A couple of weeks ago somebody challenged me to see how I’d approach a series of tasks in PowerShell, each time adding a layer of complexity. The crux of the challenge ended up being the use of jobs to parallelize the tasks rather than working through them serially one by one, which is what I’m going to share with you in this post. Before I do, let’s take a look at the use case, which started simple:
How would you remotely test if a file exists on a Windows server C: drive?
Easy start! I’d use ForEach over a list of hosts, running Test-Path against the remote administrative SMB share on each one. Logged in with a domain admin account, I can access \\server\C$\filename.txt remotely to test whether the file exists.
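A minimal sketch of that first step might look like this (the server names and file name are placeholder values):

```powershell
# Assumed inputs: a list of server names and the file to test for
$Servers  = @('SERVER01', 'SERVER02')
$FileName = 'filename.txt'

foreach ($Server in $Servers) {
    # Test-Path against the administrative C$ share; requires admin rights on the remote host
    $Exists = Test-Path -Path "\\$Server\C`$\$FileName"
    Write-Output "$Server : file exists = $Exists"
}
```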
But, what if the server is offline, how would you identify this as the cause of the failure?
I’d add a Test-Connection beforehand to ping the host (which I’d probably do anyway) to ensure it’s online before I test for the file.
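Adding the ping check, the loop might become something like this (again with placeholder names):

```powershell
$Servers  = @('SERVER01', 'SERVER02')
$FileName = 'filename.txt'

foreach ($Server in $Servers) {
    # Only test the file if the host responds to a single ping
    if (Test-Connection -ComputerName $Server -Count 1 -Quiet) {
        $Exists = Test-Path -Path "\\$Server\C`$\$FileName"
        Write-Output "$Server : file exists = $Exists"
    }
    else {
        Write-Output "$Server : offline"
    }
}
```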
Where would you get the list of hosts from?
I’d use a CSV list of IP addresses or hostnames to keep it simple with Import-CSV.
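Assuming the CSV has a header row with a ComputerName column (the path and column name here are examples), the import is a one-liner:

```powershell
# Hosts.csv is assumed to have a header row with a 'ComputerName' column
$Servers = Import-Csv -Path 'C:\Temp\Hosts.csv' |
    Select-Object -ExpandProperty ComputerName
```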
What if I wanted to use AD computers?
I’d import the ActiveDirectory module and use Get-ADComputer with a name filter to query only the computer names that need testing.
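For example, pulling every computer whose name matches a pattern (the "SRV*" filter is just an illustration):

```powershell
Import-Module ActiveDirectory

# Query only the AD computers matching the specified name filter
$Servers = Get-ADComputer -Filter 'Name -like "SRV*"' |
    Select-Object -ExpandProperty Name
```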
What if I want to search for a line of content within the file and pull specific data from it?
Wow, you want a lot! I’d use a combination of Get-Content and Select-String within the jobs and return the result to be collated.
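Inside each job, the content search could be sketched like this (server, file, and pattern are all placeholder values):

```powershell
$Server   = 'SERVER01'
$FileName = 'filename.txt'
$Pattern  = 'ERROR \d+'   # example regex to search for

# Read the remote file and pull the first line matching the pattern
$Match = Get-Content -Path "\\$Server\C`$\$FileName" |
    Select-String -Pattern $Pattern |
    Select-Object -First 1

if ($Match) { $Match.Line }
```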
What if I needed to do this for 10,000 servers, wouldn’t ForEach take forever to run?
I agree it would take ages at that scale. To speed it up I’d combine ForEach with Start-Job, arguments, and a scriptblock to parallelize the operation rather than performing the tasks serially. I’d then use Get-Job with Receive-Job to collate the results into a table.
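A sketch of that pattern, combining the ping and file check inside a job scriptblock (all variable names and the job name are assumptions):

```powershell
$Servers  = @('SERVER01', 'SERVER02')
$FileName = 'filename.txt'

$ScriptBlock = {
    param ($Server, $FileName)
    # Ping first, then test the file, returning one result object per host
    if (Test-Connection -ComputerName $Server -Count 1 -Quiet) {
        [PSCustomObject]@{
            Server     = $Server
            Online     = $true
            FileExists = Test-Path -Path "\\$Server\C`$\$FileName"
        }
    }
    else {
        [PSCustomObject]@{ Server = $Server; Online = $false; FileExists = $false }
    }
}

# Kick off one job per server rather than waiting on each in turn
foreach ($Server in $Servers) {
    Start-Job -Name 'FileCheck' -ScriptBlock $ScriptBlock `
        -ArgumentList $Server, $FileName | Out-Null
}

# Collate the results into a table once the jobs finish
$Results = Get-Job -Name 'FileCheck' | Wait-Job | Receive-Job
$Results | Format-Table Server, Online, FileExists
```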
How would you ensure that you aren’t overloading the system and network by running 10,000 simultaneous jobs?
I’d specify a maximum concurrent job variable and give every job the same name, then count the running jobs with Get-Job inside a Do Until loop. I’d place this check after each Start-Job, before iterating to the next item in the ForEach, so the script never exceeds the specified maximum. Combining it with an If statement and a sleep interval makes it wait before checking again, and stops it from pausing when the maximum hasn’t been hit.
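That throttling pattern might look roughly like this (the cap of 50, the job name, and the ping scriptblock are placeholders):

```powershell
$Servers = @('SERVER01', 'SERVER02')   # assumed host list
$MaxJobs = 50                          # assumed concurrency cap

foreach ($Server in $Servers) {
    Start-Job -Name 'FileCheck' -ScriptBlock {
        param ($Server)
        Test-Connection -ComputerName $Server -Count 1 -Quiet
    } -ArgumentList $Server | Out-Null

    # Throttle: pause here until the running-job count drops below the cap
    do {
        $Running = @(Get-Job -Name 'FileCheck' |
            Where-Object { $_.State -eq 'Running' }).Count
        if ($Running -ge $MaxJobs) { Start-Sleep -Seconds 2 }
    } until ($Running -lt $MaxJobs)
}
```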
How would you know when all the jobs are completed?
As I’m using the same job name, a simple Get-Job -Name $JobName | Where-Object {$_.State -ne "Completed"} inside a Do Until $null loop, run after every job has been created by the ForEach loop, would suffice.
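Put together, the wait-and-collect step could be sketched as follows (job name and sleep interval are assumptions):

```powershell
# Block until every job with our name has finished
do {
    $Incomplete = Get-Job -Name 'FileCheck' |
        Where-Object { $_.State -ne 'Completed' }
    if ($Incomplete) { Start-Sleep -Seconds 5 }
} until ($null -eq $Incomplete)

# Gather the output, then clean up the finished jobs
$Results = Get-Job -Name 'FileCheck' | Receive-Job
Get-Job -Name 'FileCheck' | Remove-Job
$Results
```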
At this point, I had a distinct feeling the challenger either didn’t quite believe what I was saying was possible, or that I couldn’t do it. Challenge accepted! To see how you can leverage PowerShell jobs to complete the above tasks you can download my example scripts below:
Included are four examples: two using jobs to ping hosts from a CSV list or an AD computer query, and two combining the ping with the file checks. Extract to C:\UsingPowerShellJobsv1\, configure your variables, and away you go! FYI, I ran the AD scripts from my domain controller.
You can easily take these examples of using Start-Job, Get-Job, and Remove-Job to parallelize any task you desire while controlling the maximum simultaneous jobs. If you found this useful please like and share. Happy scripting,
Joshua
Take a look at runspaces for PowerShell; I find them more efficient than Start-Job.