This code works. I just want to see how much faster someone can make it work.
Back up your Windows 10 batch file in case something goes wrong. Find all instances of the string {LINE2 1-9999} and replace them with {LINE2 "line number the code is on"}. Overwrite the file, encoding as ASCII.
If _61.bat is:
TITLE %TIME% NO "%zmyapps1%\*.*" ARCHIVE ATTRIBUTE LINE2 1243
TITLE %TIME% DOC/SET YQJ8 LINE2 1887
SET ztitle=%TIME%: WINFOLD LINE2 2557
TITLE %TIME% _*.* IN WINFOLD LINE2 2597
TITLE %TIME% %%ZDATE1%% YQJ25 LINE2 3672
TITLE %TIME% FINISHED. PRESS ANY KEY TO SHUTDOWN ... LINE2 4922
Results:
TITLE %TIME% NO "%zmyapps1%\*.*" ARCHIVE ATTRIBUTE LINE2 1
TITLE %TIME% DOC/SET YQJ8 LINE2 2
SET ztitle=%TIME%: WINFOLD LINE2 3
TITLE %TIME% _*.* IN WINFOLD LINE2 4
TITLE %TIME% %%ZDATE1%% YQJ25 LINE2 5
TITLE %TIME% FINISHED. PRESS ANY KEY TO SHUTDOWN ... LINE2 6
Code:
Copy-Item $env:windir\_61.bat -d $env:temp\_61.bat
(gc $env:windir\_61.bat) | foreach -Begin {$lc = 1} -Process {
$_ -replace "LINE2 \d*", "LINE2 $lc";
$lc += 1
} | Out-File -Encoding Ascii $env:windir\_61.bat
I expect this to take less than 984 milliseconds. It takes 984 milliseconds. Can you think of anything to speed it up?
The key to better performance in PowerShell code (short of embedding C# code compiled on demand with Add-Type, which may or may not help) is to:

- avoid use of cmdlets and the pipeline in general, especially invocation of a script block ({ ... }) for each pipeline input object, such as with ForEach-Object and Where-Object. However, it isn't the pipeline per se that is to blame, it is the current inefficient implementation of these cmdlets - see GitHub issue #10982 - and there is a workaround that noticeably improves pipeline performance.
- avoiding the pipeline requires direct use of the .NET framework types as an alternative to cmdlets.
- if feasible, use switch statements for array or line-by-line file processing - switch statements generally outperform foreach loops (a small contrast is sketched after this list).

To be clear: The pipeline and cmdlets offer clear benefits, so avoiding them should only be done if optimizing performance is a must.
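For instance, here is a minimal contrast between the two styles, numbering the elements of a small stand-in array (variable names and data are purely illustrative):

$lines = 'alpha', 'beta', 'gamma'   # stand-in input

# Pipeline + ForEach-Object: a script block is invoked for every input object.
$i = 0
$viaPipeline = $lines | ForEach-Object { $i++; '{0}: {1}' -f $i, $_ }

# switch over the same array: no per-object script-block invocation overhead.
$i = 0
$viaSwitch = switch ($lines) { default { $i++; '{0}: {1}' -f $i, $_ } }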
In your case, the following code, which combines the switch statement with direct use of the .NET framework for file I/O, seems to offer the best performance. Note that the input file is read into memory as a whole, as an array of lines, and a copy of that array with the modified lines is created before it is written back to the input file.
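A sketch along those lines (the file path and the LINE2 pattern come from the question; the variable names, exact regex, and two-branch layout are illustrative rather than the code that was actually benchmarked):

$file = "$env:windir\_61.bat"

# Back up the original first, as in the question.
Copy-Item $file "$env:temp\_61.bat"

# Read the whole file into memory as an array of lines (direct .NET file I/O).
$lines = [IO.File]::ReadAllLines($file)

$lc = 0
# Wrapping the switch in & { ... } is the performance workaround discussed in the Note below.
$newLines = & { switch -Regex ($lines) {
    'LINE2 \d+' { ++$lc; $_ -replace 'LINE2 \d+', "LINE2 $lc" }  # lines carrying the marker
    default     { ++$lc; $_ }                                    # all other lines, unchanged
  } }

# Write the modified copy back to the input file in a single call, ASCII-encoded as before.
[IO.File]::WriteAllLines($file, $newLines, [Text.Encoding]::ASCII)

Incrementing $lc in both branches keeps the counter equal to the current line number even for lines that don't carry the marker.

Note: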
- Enclosing the switch statement in & { ... } is an obscure performance optimization explained in this answer.
- If case-sensitive matching is sufficient, as suggested by the sample input, you can improve performance a little more by adding the -CaseSensitive option to the switch command (see the variant after this note).

In my tests (see below), this provided a more than 4-fold performance improvement in Windows PowerShell relative to your command.
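With the sketch above, that would simply mean adding the option to the switch line (and, for consistency, using -creplace for the replacement itself):

$newLines = & { switch -Regex -CaseSensitive ($lines) {
    'LINE2 \d+' { ++$lc; $_ -creplace 'LINE2 \d+', "LINE2 $lc" }
    default     { ++$lc; $_ }
  } }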
Here's a performance comparison via the Time-Command function. The commands compared are:

- The switch command from above.
- A slightly streamlined version of your own command.
- A PowerShell Core v6.1+ alternative that uses the -replace operator with the array of lines as the LHS and a script block as the replacement expression (a sketch follows this list).

Instead of a 6-line sample file, a 6,000-line file is used. 100 runs are averaged. It's easy to adjust these parameters.
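A sketch of that third variant (illustrative names again; a [ref] counter is used so the increment survives regardless of what scope the replacement script block runs in, and note that the counter only advances on lines that actually contain the marker - which matches the sample input, where every line carries LINE2):

# PowerShell Core v6.1+ only: -replace accepts a script block as the replacement.
$file = "$env:windir\_61.bat"
$lc = [ref] 0
$newLines = (Get-Content $file) -replace 'LINE2 \d+', { 'LINE2 ' + (++$lc.Value) }
[IO.File]::WriteAllLines($file, $newLines, [Text.Encoding]::ASCII)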
Here are sample results from my Windows 10 machine (the absolute timings aren't important, but hopefully the relative performance shown in the Factor column is somewhat representative); the PowerShell Core version used is v6.2.0-preview.4.