About Performance
This past November, I had the privilege of speaking with Nathan Ziehnert at MMS Jazz Edition in New Orleans, LA about PowerShell Performance. If you haven’t heard of his blog (Z-Nerd), you should definitely check it out. I’ve been meaning to write more blog posts, and I thought that our topic would make a great post.
PowerShell script performance is often an afterthought, if it’s even a thought at all. Many times, we look at a script and say “It works! My job here is done.” And in some instances, that’s good enough. Perhaps I’m just writing a single-run script to do a one-time task and then it’ll get tossed. But what about the rest of our PowerShell scripts? What about all the scripts that run our automation, task sequences, and daily tasks?
In my opinion, performance can be broken down into three categories: Cost, Maintainability, and Speed. Cost in this regard is not how much money it costs to run a script in, say, Azure Automation, but rather the resource cost on the machine that’s running it. Think CPU, memory, and disk utilization. Maintainability is the time it takes to maintain or modify the script. As our environment evolves, solutions get upgraded, or modules and snap-ins get code updates, we need to give our script the occasional tune-up. This is where formatting and good coding practices come into play. Lastly, we have speed. This first blog post will focus on speed, or how fast our scripts run.
Disclaimer
What I’m covering here is general. While it may apply to 99% of use cases, be sure to test for your individual script. There are a lot of factors that can affect the speed of a script including, but not limited to, OS, PowerShell version, hardware, and run behavior. What works for one script may not work for the next. What works on one machine may not work on the rest. What works when you tested it will never work in production (Murphy’s Law).
What Affects Performance?
There are quite a few things that can affect performance. Some of them we can control, others we can’t. For the things we can’t control, there are often ways to mitigate or work around the constraints. I’ll take a look at some of both in this post such as:
- Test Order
- Loop Execution
- .NET Methods vs. Native Cmdlets
- Strong Typing/Typecasting
- Syntax
- Output
Testing method
For testing/measuring, I wrote a couple of functions. The first is a function named Test-Performance. It returns a PSObject with all of our test data that we can use to generate visualizations. The second, Get-Winner, is a simple function that takes the mean or median results from two sets, compares them, and reports a winner along with how much faster it was. The functions:

function Test-Performance {
    [CmdletBinding()]
    Param(
        [Parameter(Mandatory=$true,Position=1)]
        [ValidateRange(5,50000)]
        [int]$Count,
        [Parameter(Mandatory=$true,Position=2)]
        [ScriptBlock]$ScriptBlock
    )

    $Private:Occurrence = [System.Collections.Generic.List[Double]]::new()
    $Private:Sorted = [System.Collections.Generic.List[Double]]::new()
    $Private:ScriptBlockOutput = [System.Collections.Generic.List[string]]::new()
    [Double]$Private:Sum = 0
    [Double]$Private:Mean = 0
    [Double]$Private:Median = 0
    [Double]$Private:Minimum = 0
    [Double]$Private:Maximum = 0
    [Double]$Private:Range = 0
    [Double]$Private:Variance = 0
    [Double]$Private:StdDeviation = 0
    $Private:ReturnObject = '' | Select-Object Occurrence,Sorted,Sum,Mean,Median,Minimum,Maximum,Range,Variance,StdDeviation,Output

    #Gather Results
    for ($i = 0; $i -lt $Count; $i++) {
        $Timer = [System.Diagnostics.Stopwatch]::StartNew()
        #$Private:Output = Invoke-Command -ScriptBlock $ScriptBlock
        $Private:Output = $ScriptBlock.Invoke()
        $Timer.Stop()
        $Private:Result = $Timer.Elapsed
        $Private:Sum += $Private:Result.TotalMilliseconds
        [void]$Private:ScriptBlockOutput.Add($Private:Output)
        [void]$Private:Occurrence.Add($Private:Result.TotalMilliseconds)
        [void]$Private:Sorted.Add($Private:Result.TotalMilliseconds)
    }
    $Private:ReturnObject.Sum = $Private:Sum
    $Private:ReturnObject.Occurrence = $Private:Occurrence

    #Only keep the output when it is something other than true/false/null
    if (($Private:ScriptBlockOutput -notcontains "true") -and ($Private:ScriptBlockOutput -notcontains "false") -and ($Private:ScriptBlockOutput -notcontains $null)) {
        $Private:ReturnObject.Output = $Private:ScriptBlockOutput
    } else {
        $Private:ReturnObject.Output = $null
    }

    #Sort
    $Private:Sorted.Sort()
    $Private:ReturnObject.Sorted = $Private:Sorted

    #Statistical Calculations
    #Mean (Average)
    $Private:Mean = $Private:Sum / $Count
    $Private:ReturnObject.Mean = $Private:Mean

    #Median (Sorted is zero-based)
    if (($Count % 2) -eq 1) {
        #Odd count: the single middle element
        $Private:Median = $Private:Sorted[([int][Math]::Floor($Count / 2))]
    } else {
        #Even count: the average of the two middle elements
        $Private:Middle = [int]($Count / 2)
        $Private:Median = (($Private:Sorted[$Private:Middle - 1]) + ($Private:Sorted[$Private:Middle])) / 2
    }
    $Private:ReturnObject.Median = $Private:Median

    #Minimum
    $Private:Minimum = $Private:Sorted[0]
    $Private:ReturnObject.Minimum = $Private:Minimum

    #Maximum
    $Private:Maximum = $Private:Sorted[$Count - 1]
    $Private:ReturnObject.Maximum = $Private:Maximum

    #Range
    $Private:Range = $Private:Maximum - $Private:Minimum
    $Private:ReturnObject.Range = $Private:Range

    #Variance
    for ($i = 0; $i -lt $Count; $i++) {
        $x = ($Private:Sorted[$i] - $Private:Mean)
        $Private:Variance += ($x * $x)
    }
    $Private:Variance = $Private:Variance / $Count
    $Private:ReturnObject.Variance = $Private:Variance

    #Standard Deviation
    $Private:StdDeviation = [Math]::Sqrt($Private:Variance)
    $Private:ReturnObject.StdDeviation = $Private:StdDeviation

    return $Private:ReturnObject
}

Function Get-Winner {
    [CmdletBinding()]
    Param(
        [Parameter(Mandatory=$true,Position=1)]
        [ValidateNotNullOrEmpty()]
        [string]$AName,
        [Parameter(Mandatory=$true,Position=2)]
        [ValidateNotNullOrEmpty()]
        [Double]$AValue,
        [Parameter(Mandatory=$true,Position=3)]
        [ValidateNotNullOrEmpty()]
        [string]$BName,
        [Parameter(Mandatory=$true,Position=4)]
        [ValidateNotNullOrEmpty()]
        [Double]$BValue
    )

    #$ClearBetweenTests, $OutToFile, $OutFileName and $PauseBetweenTests are read from the calling scope
    if ($ClearBetweenTests) {
        Clear-Host
    }

    #Build the banner border
    $blen = $AName.Length + $BName.Length + 12
    $Border = ''
    for ($i = 0; $i -lt $blen; $i++) {
        $Border += '*'
    }

    if ($OutToFile) {
        Out-File -FilePath $OutFileName -Append -Encoding utf8 -InputObject $Border
        Out-File -FilePath $OutFileName -Append -Encoding utf8 -InputObject ([string]::Format('** {0} vs {1} **', $AName, $BName))
        Out-File -FilePath $OutFileName -Append -Encoding utf8 -InputObject $Border
    }
    Write-Host $Border -ForegroundColor White
    Write-Host ([string]::Format('** {0} vs {1} **', $AName, $BName)) -ForegroundColor White
    Write-Host $Border -ForegroundColor White

    #Pick the winner; anything under 5% apart is called a tie
    if ($AValue -lt $BValue) {
        $Faster = $BValue / $AValue
        if ($Faster -lt 1.05) {
            $Winner = 'Tie'
            $AColor = [ConsoleColor]::White
            $BColor = [ConsoleColor]::White
        } else {
            $Winner = $AName
            $AColor = [ConsoleColor]::Green
            $BColor = [ConsoleColor]::Red
        }
    } elseif ($AValue -gt $BValue) {
        $Faster = $AValue / $BValue
        if ($Faster -lt 1.05) {
            $Winner = 'Tie'
            $AColor = [ConsoleColor]::White
            $BColor = [ConsoleColor]::White
        } else {
            $Winner = $BName
            $AColor = [ConsoleColor]::Red
            $BColor = [ConsoleColor]::Green
        }
    } else {
        $Winner = 'Tie'
        $AColor = [ConsoleColor]::White
        $BColor = [ConsoleColor]::White
        $Faster = 0
    }

    #Pad the shorter name so the values line up
    $APad = ''
    $BPad = ''
    if ($AName.Length -gt $BName.Length) {
        $LenDiff = $AName.Length - $BName.Length
        for ($i = 0; $i -lt $LenDiff; $i++) {
            $BPad += ' '
        }
    } else {
        $LenDiff = $BName.Length - $AName.Length
        for ($i = 0; $i -lt $LenDiff; $i++) {
            $APad += ' '
        }
    }

    $AValue = [Math]::Round($AValue, 2)
    $BValue = [Math]::Round($BValue, 2)
    $Faster = [Math]::Round($Faster, 2)

    if ($OutToFile) {
        Out-File -FilePath $OutFileName -Append -Encoding utf8 -InputObject ([string]::Format('{0}: {1}{2}ms', $AName, $APad, $AValue))
        Out-File -FilePath $OutFileName -Append -Encoding utf8 -InputObject ([string]::Format('{0}: {1}{2}ms', $BName, $BPad, $BValue))
        #Double quotes so that `r`n is written as a real line break
        Out-File -FilePath $OutFileName -Append -Encoding utf8 -InputObject ([string]::Format("WINNER: {0} {1}x Faster`r`n", $Winner, $Faster))
    }
    Write-Host ([string]::Format('{0}: {1}{2}ms', $AName, $APad, $AValue)) -ForegroundColor $AColor
    Write-Host ([string]::Format('{0}: {1}{2}ms', $BName, $BPad, $BValue)) -ForegroundColor $BColor
    Write-Host ([string]::Format('WINNER: {0} {1}x Faster', $Winner, $Faster)) -ForegroundColor Yellow

    if ($PauseBetweenTests -eq $true) {
        Pause
    }
}
Now, you may be wondering why I went through all of that trouble when there is a perfectly good Measure-Command cmdlet available. The reason is twofold. One, I wanted the statistics to be calculated without having to call a separate function. Two, Measure-Command does not handle output, and I wanted to be able to test and capture output if needed. If you haven’t tried to use Write-Output within a Measure-Command script block before, let me show you what I’m talking about:

PS C:\> Measure-Command {Write-Host "Write-Host"}
Write-Host

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 12
Ticks             : 128366
TotalDays         : 1.48571759259259E-07
TotalHours        : 3.56572222222222E-06
TotalMinutes      : 0.000213943333333333
TotalSeconds      : 0.0128366
TotalMilliseconds : 12.8366

PS C:\> Measure-Command {Write-Output "Write-Output"}

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 0
Ticks             : 4005
TotalDays         : 4.63541666666667E-09
TotalHours        : 1.1125E-07
TotalMinutes      : 6.675E-06
TotalSeconds      : 0.0004005
TotalMilliseconds : 0.4005

Notice that when we call Write-Output, we don’t see the output? With my Test-Performance function, we can still capture that output. The output from the two functions looks like this:

Occurrence   : {4.0353, 1.0091, 0, 0…}
Sorted       : {0, 0, 0, 1.0091…}
Sum          : 5.0444
Mean         : 1.00888
Median       : 1.0091
Minimum      : 0
Maximum      : 4.0353
Range        : 4.0353
Variance     : 2.4425469256
StdDeviation : 1.56286497356618
Output       : {Write-Output, Write-Output, Write-Output, Write-Output…}

******************************
** Filter vs Where-Object **
******************************
Filter:       22ms
Where-Object: 1007.69393ms
WINNER: Filter 45.80x Faster
One other thing to mention is that I did not simply use $Start = Get-Date, $Stop = (Get-Date)-$Start. Why? Because some things happen so fast that I need to measure the speed in ticks or microseconds. In practice, Get-Date is only good to about a millisecond at best because it reads the system clock, so anything shorter than that tends to show up as either 0 or roughly 1 millisecond. Stopwatch uses a high-resolution timer when the hardware provides one.
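To make the timing approach concrete, here is a minimal sketch of timing a script block with Stopwatch while keeping its output. The script block and the variable names ($Block, $Captured, $ElapsedMs, $ElapsedTicks) are illustrative only and are not part of the functions above.

```powershell
# Illustrative script block; any code that emits output would do.
$Block = { Write-Output "Write-Output" }

# Stopwatch uses the high-resolution performance counter when one is available.
$Timer = [System.Diagnostics.Stopwatch]::StartNew()
$Captured = & $Block                            # output is captured, not discarded
$Timer.Stop()

$ElapsedMs = $Timer.Elapsed.TotalMilliseconds   # fractional milliseconds
$ElapsedTicks = $Timer.Elapsed.Ticks            # 100-nanosecond units

"Captured: $Captured"
"Elapsed:  $ElapsedMs ms ($ElapsedTicks ticks)"
"High-resolution timer: $([System.Diagnostics.Stopwatch]::IsHighResolution)"
```

The exact numbers will differ from run to run and from machine to machine, which is why Test-Performance repeats the measurement and reports statistics instead of a single value.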
Test Order
With that out of the way, let’s look at test order first. Test order is the order in which conditions are evaluated within a flow control block such as if/else or do/while. The engine will evaluate the conditions from left to right while respecting the order of operations. Why is this important? Let’s say you have the following if statement:

if (($haystack -contains "needle") -and ($x -eq 5)) {
    Do-Stuff
}

We have two conditions: does $haystack contain the string “needle”, and does $x equal 5. With the -and operator, we tell the engine that both must be true to meet the conditions of the if statement. The engine will evaluate the first condition and, if it is true, will continue through the remaining conditions until it reaches either a false condition or has evaluated all of them. Let’s take a quick look at how long it takes to evaluate a few different types of conditions.

$v = 5000
$a = 1..10000
$s = 'reg.exe'
$as = (Get-ChildItem -Path C:\Windows\System32 -Filter '*.exe').Name

Test-Performance -Count 100 -ScriptBlock {$v -eq 5000}
Test-Performance -Count 100 -ScriptBlock {$a -contains 5000}
Test-Performance -Count 100 -ScriptBlock {$s -eq 'reg.exe'}
Test-Performance -Count 100 -ScriptBlock {$as -contains 'reg.exe'}
That gives me the following output:
Occurrence   : {0.9741, 0, 0, 5.9834…}
Sorted       : {0, 0, 0, 0…}
Sum          : 40.8437
Mean         : 0.408437
Median       : 0
Minimum      : 0
Maximum      : 5.9834
Range        : 5.9834
Variance     : 0.538503541531
StdDeviation : 0.733828005414757
Output       :

Occurrence   : {0.9977, 0.9969, 0, 0.9971…}
Sorted       : {0, 0, 0, 0…}
Sum          : 67.7895
Mean         : 0.677895
Median       : 0.9934
Minimum      : 0
Maximum      : 3.9467
Range        : 3.9467
Variance     : 0.450743557675
StdDeviation : 0.671374379668304
Output       :

Occurrence   : {0.989, 0.9973, 0, 0…}
Sorted       : {0, 0, 0, 0…}
Sum          : 52.9174
Mean         : 0.529174
Median       : 0
Minimum      : 0
Maximum      : 8.9804
Range        : 8.9804
Variance     : 1.222762781524
StdDeviation : 1.10578604690238
Output       :

Occurrence   : {0.997, 0, 0, 1.0292…}
Sorted       : {0, 0, 0, 0…}
Sum          : 74.7425
Mean         : 0.747425
Median       : 0.957
Minimum      : 0
Maximum      : 6.9484
Range        : 6.9484
Variance     : 1.391867727275
StdDeviation : 1.1797744391514
Output       :
What we see is that checking whether one value equals another is faster than checking whether an array contains an object. Now, I know, you’re thinking it’s just a fraction of a millisecond, but going by the means, checking if $v is equal to 5000 is almost twice as fast as checking if $as contains “reg.exe”. Keep in mind that, depending on where in the array our match is and how big our array is, that number can go up or down quite a bit. I’m just doing some simple synthetic tests to illustrate that there is a difference. When writing conditional statements like this, try to have your quicker conditions evaluated first. In an -and chain, put the conditions that are most likely to fail first; in an -or chain, put the ones most likely to succeed first, since evaluation stops as soon as the result is known. Example:

Test-Performance -Count 100 -ScriptBlock {
    if (($v -eq 100) -or ($a -contains -5090) -or ($s -eq 'test.fake') -or ($as -contains 'reg.exe')) {
        $t = Get-Random
    }
}
Test-Performance -Count 100 -ScriptBlock {
    if (($as -contains 'reg.exe') -or ($v -eq 100) -or ($a -contains -5090) -or ($s -eq 'test.fake')) {
        $t = Get-Random
    }
}
Gives me the following:
Occurrence   : {0.9858, 0, 0.9969, 0…}
Sorted       : {0, 0, 0, 0…}
Sum          : 36.8537
Mean         : 0.368537
Median       : 0
Minimum      : 0
Maximum      : 3.9959
Range        : 3.9959
Variance     : 0.390509577731
StdDeviation : 0.624907655362774
Output       :

Occurrence   : {0.9974, 0, 0.9971, 0…}
Sorted       : {0, 0, 0, 0…}
Sum          : 54.8193
Mean         : 0.548193
Median       : 0.4869
Minimum      : 0
Maximum      : 3.9911
Range        : 3.9911
Variance     : 0.425326705251
StdDeviation : 0.652170763873236
Output       :
We can see that by changing the order of the evaluated conditions, the first version ran in about 2/3 the time of the second, going by the means. These timings are well under a millisecond, so treat the exact ratio loosely. Again, these are generic tests to illustrate the effect that test order has on execution time, but they illustrate some basic guidelines that should apply to most situations. Be sure to test your code.
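The reason order matters at all is that -and and -or stop evaluating as soon as the result is known. Here is a small sketch that makes this visible. Test-Condition and $Evaluated are hypothetical names made up for this illustration; the helper simply records that it was called and hands back the value it was given.

```powershell
# Track which conditions actually get evaluated.
$Evaluated = [System.Collections.Generic.List[string]]::new()

# Hypothetical helper: records its name, then returns the value it was given.
function Test-Condition {
    param([string]$Name, [bool]$Result)
    $Evaluated.Add($Name)
    return $Result
}

# -and stops at the first false condition, so 'Expensive' is never called.
$AndResult = (Test-Condition 'Cheap' $false) -and (Test-Condition 'Expensive' $true)
$AndOrder = $Evaluated -join ', '

$Evaluated.Clear()

# -or stops at the first true condition, so 'Unlikely' is never called.
$OrResult = (Test-Condition 'Likely' $true) -or (Test-Condition 'Unlikely' $false)
$OrOrder = $Evaluated -join ', '

"-and evaluated: $AndOrder (result: $AndResult)"
"-or  evaluated: $OrOrder (result: $OrResult)"
```

In both cases only the first condition runs, which is why a cheap condition that settles the outcome early can save the cost of everything after it.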
Loop Execution
Now that I’ve gotten test order out of the way, let’s move on to loop execution. Sometimes when we are working with a loop, say stepping through an array, we don’t need to do something for every element. Sometimes, we are looking for a specific element and don’t care about anything after that. In these cases, break is our friend. For our first example, I’ll create an array of all the years in the 1900s. I’ll then loop through each one and write some output when I find 1950.

$Decade = 1900..1999
$TargetYear = 1950

$NoBreakResult = Test-Performance -Count 100 -ScriptBlock {
    for ($i = 0; $i -lt $Decade.Count; $i++) {
        if ($Decade[$i] -eq 1950) {
            Write-Output "Found 1950"
        }
    }
}
$BreakResult = Test-Performance -Count 100 -ScriptBlock {
    for ($i = 0; $i -lt $Decade.Count; $i++) {
        if ($Decade[$i] -eq 1950) {
            Write-Output "Found 1950"
            break
        }
    }
}
Get-Winner "No Break" $NoBreakResult.Median "Break" $BreakResult.Median
Our output looks as follows:
*************************
** No Break vs Break **
*************************
No Break: 0.38ms
Break:    0.28ms
WINNER: Break 1.35x Faster

$NoBreakResult

Occurrence   : {0.8392, 0.4704, 0.4566, 0.444…}
Sorted       : {0.3425, 0.3438, 0.3442, 0.3445…}
Sum          : 47.8028
Mean         : 0.478028
Median       : 0.38175
Minimum      : 0.3425
Maximum      : 2.6032
Range        : 2.2607
Variance     : 0.127489637016
StdDeviation : 0.357056910052165
Output       : {Found 1950, Found 1950, Found 1950, Found 1950…}

$BreakResult

Occurrence   : {3.2739, 0.3445, 0.32, 0.3167…}
Sorted       : {0.2657, 0.266, 0.266, 0.2662…}
Sum          : 40.0342
Mean         : 0.400342
Median       : 0.2871
Minimum      : 0.2657
Maximum      : 3.2739
Range        : 3.0082
Variance     : 0.182262889036
StdDeviation : 0.426922579674582
Output       : {Found 1950, Found 1950, Found 1950, Found 1950…}
As expected, the version with the break statement was faster, finishing in about 25% less time. Next I’ll take a look at a different method that I don’t see in too many people’s code, the While/Do-While loop.

$DoWhileResult = Test-Performance -Count 100 -ScriptBlock {
    $i = 0
    $Year = 0
    do {
        $Year = $Decade[$i]
        if ($Year -eq 1950) {
            Write-Output "Found 1950"
        }
        $i++
    } While ($Year -ne 1950)
}
Which nets me the following:
****************************
** Do-While vs No Break **
****************************
Do-While: 0.24ms
No Break: 0.38ms
WINNER: Do-While 1.57x Faster

$DoWhileResult

Occurrence   : {0.9196, 0.313, 0.2975, 0.2933…}
Sorted       : {0.2239, 0.224, 0.2242, 0.2243…}
Sum          : 33.8217
Mean         : 0.338217
Median       : 0.2436
Minimum      : 0.2239
Maximum      : 5.0187
Range        : 4.7948
Variance     : 0.262452974211
StdDeviation : 0.512301643771519
Output       : {Found 1950, Found 1950, Found 1950, Found 1950…}
As we can see, Do-While is also faster than running through the entire array. My example above does not have any safety mechanism for running beyond the end of the array or not finding the element I’m searching for. In practice, be sure to include such a condition/catch in your loop. Next, I’m going to compare the performance between a few different types of loops. Each loop will run through an array of numbers from 1 to 10,000 and calculate the square root of each. I’ll use the basic for loop as the baseline to compare against the other methods.
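As a rough sketch of the kind of safety condition mentioned above, the loop below stops either when it finds the target or when it reaches the end of the array. The $Found flag is my own addition for this illustration, and this version was not part of the timed tests.

```powershell
$Decade = 1900..1999
$TargetYear = 1950

$i = 0
$Found = $false
do {
    if ($Decade[$i] -eq $TargetYear) {
        $Found = $true
    }
    $i++
} while ((-not $Found) -and ($i -lt $Decade.Count))   # stop on a match or at the end

if ($Found) {
    Write-Output "Found $TargetYear at index $($i - 1)"
} else {
    Write-Output "$TargetYear was not found"
}
```

If $TargetYear is set to a value that is not in the array, the loop ends after the last element instead of running past it.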
$ForLoop = Test-Performance -Count 100 -ScriptBlock {
    $ForArray = 1..10000
    for ($i = 0; $i -lt 10000; $i++) {
        $sqrt = [Math]::Sqrt($ForArray[$i])
    }
}
$ForEachLoop = Test-Performance -Count 100 -ScriptBlock {
    $ForEachArray = 1..10000
    foreach ($item in $ForEachArray) {
        $sqrt = [Math]::Sqrt($item)
    }
}
$DotForEachLoop = Test-Performance -Count 100 -ScriptBlock {
    $DotForEachArray = 1..10000
    $DotForEachArray.ForEach{
        $sqrt = [Math]::Sqrt($_)
    }
}
$ForEachObjectLoop = Test-Performance -Count 100 -ScriptBlock {
    $ForEachObjectArray = 1..10000
    $ForEachObjectArray | ForEach-Object {
        $sqrt = [Math]::Sqrt($_)
    }
}
So how do they fare?
***********************
** For vs For-Each **
***********************
For:      3ms
For-Each: 1.99355ms
WINNER: For-Each 1.50x Faster

***********************
** For vs .ForEach **
***********************
For:      3ms
.ForEach: 1150.95495ms
WINNER: For 383.65x Faster

*****************************
** For vs ForEach-Object **
*****************************
For:            3ms
ForEach-Object: 1210.7644ms
WINNER: For 403.59x Faster

Quite a bit of difference. Let’s take a look at the statistics for each of them.

$ForLoop

Occurrence   : {38.8952, 3.9984, 2.9752, 2.966…}
Sorted       : {0, 0, 1.069, 1.9598…}
Sum          : 330.7618
Mean         : 3.307618
Median       : 2.9731
Minimum      : 0
Maximum      : 38.8952
Range        : 38.8952
Variance     : 26.100085284676
StdDeviation : 5.10882425658546
Output       :

$ForEachLoop

Occurrence   : {7.0133, 1.9972, 1.9897, 0.9678…}
Sorted       : {0.9637, 0.9678, 0.9927, 0.9941…}
Sum          : 187.5277
Mean         : 1.875277
Median       : 1.99355
Minimum      : 0.9637
Maximum      : 7.0133
Range        : 6.0496
Variance     : 0.665303603371
StdDeviation : 0.815661451443551
Output       :

$DotForEachLoop

Occurrence   : {1225.7258, 1169.9073, 1147.9007, 1146.9384…}
Sorted       : {1110.0618, 1110.0688, 1113.9906, 1114.0656…}
Sum          : 114948.9291
Mean         : 1149.489291
Median       : 1150.95495
Minimum      : 1110.0618
Maximum      : 1225.7258
Range        : 115.664
Variance     : 534.931646184819
StdDeviation : 23.1285893686757
Output       :

$ForEachObjectLoop

Occurrence   : {1217.7802, 1241.7037, 1220.686, 1249.688…}
Sorted       : {1181.8081, 1188.8231, 1188.8291, 1191.7818…}
Sum          : 121345.8078
Mean         : 1213.458078
Median       : 1210.7644
Minimum      : 1181.8081
Maximum      : 1274.6289
Range        : 92.8208000000002
Variance     : 318.356594303116
StdDeviation : 17.8425501065043
Output       :
If you look at the for and for-each loops, the first run is significantly slower than the other entries (in fact, it’s the slowest run in the batch for each), whereas the method and cmdlet versions are much more consistent relative to their totals. My best guess is that this is a one-time warm-up cost, which stands out when each run only takes a few milliseconds. The for and for-each statements work against a collection that is already fully in memory and run the loop body inline. The ForEach-Object cmdlet and the .ForEach() method instead invoke a script block once per element, and the cmdlet also pushes each item through the pipeline, so the per-item overhead adds up over 10,000 iterations. The for loop is slightly slower than for-each because it has to evaluate the condition and index into the array before proceeding through each iteration.
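One way to see the difference in how items are delivered is to count how many items a producer actually generates in each style. This sketch does not measure speed; it only shows that a pipeline hands items over one at a time while the foreach statement works from a collection that has already been gathered. Get-Number and $Produced are hypothetical names used only for this illustration.

```powershell
# Records every number the producer actually generates.
$Produced = [System.Collections.Generic.List[int]]::new()

# Hypothetical producer: logs each number, then emits it.
function Get-Number {
    foreach ($n in 1..5) {
        $Produced.Add($n)
        $n
    }
}

# Pipeline: items flow one at a time, so Select-Object can stop the producer early.
$Streamed = Get-Number | ForEach-Object { $_ * 2 } | Select-Object -First 2
$ProducedByPipeline = $Produced.Count

$Produced.Clear()

# foreach statement: the whole collection is gathered before the loop starts.
$All = Get-Number
$Collected = foreach ($n in $All) {
    if ($n -gt 2) { break }
    $n * 2
}
$ProducedByForeach = $Produced.Count

"Pipeline produced $ProducedByPipeline item(s); foreach produced $ProducedByForeach item(s)"
```

Both versions end up with the same two results, but the pipeline version stops the producer early while the foreach version has already paid for all five items. That per-item handoff is convenient for large or slow sources, and it is also part of the overhead seen in the timings above.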
.NET Methods vs. Cmdlets
Next, I’ll take a look at some of the differences between some of the common “native” PowerShell cmdlets and their .NET counterparts. I’m going to start with what will likely be the most common thing that you’ll encounter in your scripts or scripts that you use: the array. We’ve probably all used them, and maybe even continue to use them. But should you? First, let’s look at adding items to an array. We frequently start with blank arrays and add items to them as we go along.

$ArrayResult = Test-Performance -Count 100 -ScriptBlock {
    $Array = @()
    for ($i = 0; $i -lt 10000; $i++) {
        $Array += $i
    }
}
$ListResult = Test-Performance -Count 100 -ScriptBlock {
    $List = [System.Collections.Generic.List[PSObject]]::new()
    for ($i = 0; $i -lt 10000; $i++) {
        [void]$List.Add($i)
    }
}
Get-Winner "Array" $ArrayResult.Median "List" $ListResult.Median
*********************
** Array vs List **
*********************
Array: 2274ms
List:  2.97945ms
WINNER: List 763.23x Faster

$ArrayResult

Occurrence   : {2407.5676, 2311.8239, 2420.5336, 2268.9383…}
Sorted       : {2190.1917, 2200.1205, 2219.1807, 2223.0887…}
Sum          : 228595.7729
Mean         : 2285.957729
Median       : 2274.42135
Minimum      : 2190.1917
Maximum      : 2482.3996
Range        : 292.2079
Variance     : 2527.01010551066
StdDeviation : 50.2693754239165
Output       :

$ListResult

Occurrence   : {24.9343, 19.9729, 3.9623, 5.9836…}
Sorted       : {0.9974, 1.9776, 1.9925, 1.994…}
Sum          : 373.999
Mean         : 3.73999
Median       : 2.97945
Minimum      : 0.9974
Maximum      : 51.8617
Range        : 50.8643
Variance     : 37.0781465771
StdDeviation : 6.0891827511662
Output       :
Modifying an existing array can be a VERY expensive operation. Arrays are fixed length and cannot be expanded or contracted. When we add or remove an element, the engine first has to create a new array of size n + 1 or n – 1 and then copy each of the elements from the old array into the new one. This is slow, and can consume a lot of memory while the new array is being created and the contents copied over. Lists, on the other hand, are not statically sized. A generic List keeps its elements in an internal array with spare capacity and only reallocates when that capacity runs out, growing in larger steps each time, so most calls to Add() do not copy anything. The advantage of an array, however, is that it has a smaller memory footprint. Since an array is stored as a whole consecutively in memory, its size can roughly be calculated as SizeOf(TypeOf(Element)) * NumElements. A list is also stored consecutively, but it can hold unused capacity beyond its current count, so its size is roughly SizeOf(TypeOf(Element)) * Capacity, where Capacity is greater than or equal to NumElements. You might be thinking, well, if an array is stored in consecutive memory blocks, once the array is established, it should be faster to work with, right? I’ll test.
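To see how a generic List grows, the sketch below adds a few items and records each distinct value of the list's Capacity property along the way. As far as I know, the list reallocates its internal buffer only when Count would exceed Capacity, but the exact growth steps are an implementation detail, so I print them instead of assuming them. The variable names are illustrative.

```powershell
$List = [System.Collections.Generic.List[int]]::new()
$Capacities = [System.Collections.Generic.List[int]]::new()

for ($i = 0; $i -lt 20; $i++) {
    $List.Add($i)
    # Record the capacity each time the list has grown its internal buffer.
    if (-not $Capacities.Contains($List.Capacity)) {
        $Capacities.Add($List.Capacity)
    }
}

"Count: $($List.Count), Capacity: $($List.Capacity)"
"Capacities seen: $($Capacities -join ', ')"

# Optional: reserve the space up front when the final size is known.
$Sized = [System.Collections.Generic.List[int]]::new(10000)
"Preallocated capacity: $($Sized.Capacity)"
```

Twenty adds trigger only a handful of reallocations, whereas `$Array += $i` allocates and copies a new array on every single add. If you know the final size ahead of time, passing it to the constructor avoids even those few reallocations.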
[int[]]$Array = 1..10000
$List = [System.Collections.Generic.List[int]]::new()
for ($i = 1; $i -lt 10001; $i++) {
    [void]$List.Add($i)
}

$ArrayForEachResult = Test-Performance -Count 100 -ScriptBlock {
    foreach ($int in $Array) {
        $int = 5
    }
}
$ListForEachResult = Test-Performance -Count 100 -ScriptBlock {
    foreach ($int in $List) {
        $int = 5
    }
}

First, we create an array of 10,000 elements with the numbers 1 through 10,000 inclusive. We declare the array as an integer array to ensure we are comparing two objects of the same type, so to speak. We then create a List<int> and fill it with the same values. So how do they fare?
***************************************
** Array For-Each vs List For-Each **
***************************************
Array For-Each: 1.47ms
List For-Each:  0.83ms
WINNER: List For-Each 1.77x Faster

$ArrayForEachResult

Occurrence   : {4.643, 1.4673, 1.4029, 1.3336…}
Sorted       : {1.3194, 1.3197, 1.3255, 1.3272…}
Sum          : 156.4036
Mean         : 1.564036
Median       : 1.47125
Minimum      : 1.3194
Maximum      : 4.643
Range        : 3.3236
Variance     : 0.136858367504
StdDeviation : 0.369943735592319
Output       : {, , , …}

$ListForEachResult

Occurrence   : {7.233, 1.723, 0.8305, 0.8632…}
Sorted       : {0.6174, 0.6199, 0.6203, 0.6214…}
Sum          : 164.8467
Mean         : 1.648467
Median       : 0.83335
Minimum      : 0.6174
Maximum      : 71.705
Range        : 71.0876
Variance     : 50.017547074011
StdDeviation : 7.07230846852787
Output       : {, , , …}
As we can see, the list still out-performs the array going by the median, although by much less of a margin than it did during the manipulation test. I suspect this comes down to how the foreach statement enumerates each collection type rather than to memory layout, since both store their elements contiguously, so test it against your own data. Now let’s compare .NET Regex vs the PowerShell method. For this, I’m going to be replacing text instead of just checking for the match. Let’s look at the code.

$Haystack = "The Quick Brown Fox Jumped Over the Lazy Dog 5 Times"
$Needle = "\ ([\d]{1})\ "

$NetRegexResult = Test-Performance -Count 1000 -ScriptBlock {
    [regex]::Replace($Haystack, $Needle, " $(Get-Random -Minimum 2 -Maximum 9) ")
    Write-Output $Haystack
}
$PoshRegexResult = Test-Performance -Count 1000 -ScriptBlock {
    $Haystack -replace $Needle, " $(Get-Random -Minimum 2 -Maximum 9) "
    Write-Output $Haystack
}
Get-Winner ".NET RegEx" $NetRegexResult.Median "PoSh RegEx" $PoshRegexResult.Median
Nothing too fancy here. We take our haystack (the sentence), look for the needle (the number of times the fox jumped over the dog) and replace it with a new single digit random number.
********************************
** .NET RegEx vs PoSh RegEx **
********************************
.NET RegEx: 0.23ms
PoSh RegEx: 0.23ms
WINNER: Tie 1x Faster

$NetRegexResult

Occurrence   : {0.7531, 0.2886, 0.3572, 0.3181…}
Sorted       : {0.2096, 0.2106, 0.211, 0.2189…}
Sum          : 282.3331
Mean         : 0.2823331
Median       : 0.23035
Minimum      : 0.2096
Maximum      : 2.2704
Range        : 2.0608
Variance     : 0.03407226235439
StdDeviation : 0.184586733961003
Output       : {The Quick Brown Fox Jumped Over the Lazy Dog 5 Times The Quick Brown Fox Jumped Over the Lazy Dog 5 Times, The Quick Brown Fox Jumped Over the Lazy Dog 8 Times The Quick Brown Fox Jumped Over the Lazy Dog 5 Times, The Quick Brown Fox Jumped Over the Lazy Dog 4 Times The Quick Brown Fox Jumped Over the Lazy Dog 5 Times, The Quick Brown Fox Jumped Over the Lazy Dog 4 Times The Quick Brown Fox Jumped Over the Lazy Dog 5 Times…}

$PoshRegexResult

Occurrence   : {0.7259, 0.2546, 0.2513, 0.2486…}
Sorted       : {0.2208, 0.2209, 0.2209, 0.2211…}
Sum          : 279.0913
Mean         : 0.2790913
Median       : 0.231
Minimum      : 0.2208
Maximum      : 2.1124
Range        : 1.8916
Variance     : 0.03001781767431
StdDeviation : 0.173256508317321
Output       : {The Quick Brown Fox Jumped Over the Lazy Dog 8 Times The Quick Brown Fox Jumped Over the Lazy Dog 5 Times, The Quick Brown Fox Jumped Over the Lazy Dog 6 Times The Quick Brown Fox Jumped Over the Lazy Dog 5 Times, The Quick Brown Fox Jumped Over the Lazy Dog 7 Times The Quick Brown Fox Jumped Over the Lazy Dog 5 Times, The Quick Brown Fox Jumped Over the Lazy Dog 2 Times The Quick Brown Fox Jumped Over the Lazy Dog 5 Times…}
Surprisingly, or not surprisingly, the two methods are pretty much dead even. This is one of the cases where I suspect the PowerShell operator is pretty much just a wrapper for the corresponding .NET equivalent. You might ask yourself why you would use the .NET methods if the PowerShell operators and cmdlets net the same performance. The answer (and this applies in a lot of cases) is that the PowerShell versions don’t always offer the same options as their .NET counterparts. I’m going to use String.Split as an example. Take a look at the two documentation pages for the -Split operator and String.Split. You may notice that they aren’t entirely the same. Far from it actually. In most cases, they will give you the same results, but they don’t both support the same options. For example, if you want to remove blank entries as part of the split itself, you’ll need to use the .Split() method. But what about performance?

$SplitString = "one,two,three,four,five,six,seven,eight,nine,ten,"

$DashSplitResult = Test-Performance -Count 10000 -ScriptBlock {
    $SplitArray = $SplitString -Split ','
}
$DotSplitResult = Test-Performance -Count 10000 -ScriptBlock {
    $SplitArray = $SplitString.Split(',')
}
Get-Winner "-Split" $DashSplitResult.Median ".Split()" $DotSplitResult.Median
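Here is a small sketch of the blank-entry difference mentioned above. The input string is made up for the illustration. I cast the delimiter to [char[]] so that the same .Split() overload is chosen on both Windows PowerShell and newer PowerShell versions.

```powershell
$SplitString = "one,two,,three,"

# -split treats the delimiter as a regular expression and keeps empty entries.
$DashSplit = $SplitString -split ','

# .Split() with StringSplitOptions drops the empty entries during the split.
$DotSplit = $SplitString.Split([char[]]',', [System.StringSplitOptions]::RemoveEmptyEntries)

# The operator can reach the same result with an extra filtering step.
$DashFiltered = $SplitString -split ',' | Where-Object { $_ -ne '' }

"-split:            $($DashSplit.Count) entries"
".Split():          $($DotSplit.Count) entries"
"-split + filter:   $($DashFiltered.Count) entries"
```

The operator keeps the two empty strings (one between the doubled commas and one after the trailing comma), while .Split() with RemoveEmptyEntries returns only the three words.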
**************************
** -Split vs .Split() **
**************************
-Split:   0.13ms
.Split(): 0.12ms
WINNER: .Split() 1.1x Faster

$DashSplitResult

Occurrence   : {0.4855, 0.1837, 0.2387, 0.1916…}
Sorted       : {0.1128, 0.113, 0.1131, 0.1131…}
Sum          : 1613.99049999999
Mean         : 0.161399049999999
Median       : 0.125
Minimum      : 0.1128
Maximum      : 2.1112
Range        : 1.9984
Variance     : 0.0234987082460975
StdDeviation : 0.153292883872988
Output       : {, , , …}

$DotSplitResult

Occurrence   : {0.552, 0.1339, 0.1245, 0.1227…}
Sorted       : {0.1052, 0.1056, 0.1056, 0.1057…}
Sum          : 1485.38330000001
Mean         : 0.148538330000001
Median       : 0.1162
Minimum      : 0.1052
Maximum      : 1.9226
Range        : 1.8174
Variance     : 0.0186857188798111
StdDeviation : 0.136695716391594
Output       : {, , , …}
Pretty close, but .Split does edge out -Split by just a hair. Is it worth re-writing all of your code? Doubtful. But if you use string splitting methods frequently, it may be worth doing some testing with your common use cases to see if there could be an impact. And with that, I’m going to wrap up the first part of this post.