
PowerShell

PowerShell Gotcha! - dynamic scoping

PowerShell uses dynamic scoping. Yet the about_Scopes page doesn't mention the word "dynamic".

Wird (wird - so weird that you need to misspell weird to get your point across).

tl;dr;

In PowerShell a function can read the variables of whoever called it: a lookup that fails in the current scope continues up the call stack. Assigning to such a variable creates a new local variable, so the "child" function can use your variables but can only modify its own copies. You can avoid this by declaring your variable as private ($private:varName = ...) and using Set-StrictMode -Version Latest to throw an error if a "child" function tries to access an undefined variable.

PowerShell uses dynamic scoping. What we know from most programming languages is lexical scoping.


function Do-InnerFunction  { Write-Host $t }
function Do-OutterFunction {
    $t = "hello"
    Do-InnerFunction
}

Do-OutterFunction
hello

Weird! (this is dynamic scoping)


Set-StrictMode -Version Latest
function Do-InnerFunction  { Write-Host $t }
function Do-OutterFunction {
    $t = "hello"
    Do-InnerFunction
}

Do-OutterFunction
Set-StrictMode -Off # remember to turn strict mode off for further testing
hello

Weird! (but makes sense since in PowerShell's world this is perfectly legal hence "strict" changes nothing here)


function Do-InnerFunction  { Write-Host $t }
function Do-OutterFunction {
    $private:t = "hello"
    Do-InnerFunction
}

Do-OutterFunction

Output is empty. No errors but at least $t behaves more like a variable we know from C#/F#.


Set-StrictMode -Version Latest
function Do-InnerFunction  { Write-Host $t }
function Do-OutterFunction {
    $private:t = "hello"
    Do-InnerFunction
}

Do-OutterFunction
InvalidOperation: C:\Users\...\Temp\44f5ff41-4105-482b-a134-b505049d2c61\test3.ps1:2
Line |
   2 |      Write-Host $t
     |                 ~~
     | The variable '$t' cannot be retrieved because it has not been set.

Finally!


function Do-InnerFunction {
    Write-Host $t
    $t = "world"
    Write-Host $t
}

function Do-OutterFunction {
    $t = "hello"
    Do-InnerFunction
    Write-Host $t
}

Do-OutterFunction
hello
world
hello

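Ah! So the inner function can read the caller's variable, but assigning to it creates a local copy in the inner "scope"; the caller's $t is untouched.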


function Do-InnerFunction {
    Write-Host $t
    $global:t = "world"
    Write-Host $t
}

function Do-OutterFunction {
    $t = "hello"
    Do-InnerFunction
    Write-Host $t
}

Do-OutterFunction
Write-Host $t
hello
hello
hello
world

Now we have created a global $t variable.
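
If the goal is to change the caller's variable rather than create a local copy or a global, Set-Variable with -Scope 1 is one way to do it. The sketch below is only an illustration; the function names Set-LocalCopy, Set-CallerValue and Test-Scoping are made up for this example.

```powershell
# Hypothetical helper: a plain assignment creates a new local $t in this function's scope.
function Set-LocalCopy { $t = "world" }

# Hypothetical helper: -Scope 1 targets the parent scope, i.e. the scope of the calling function.
function Set-CallerValue { Set-Variable -Name t -Value "world" -Scope 1 }

function Test-Scoping {
    $t = "hello"
    Set-LocalCopy
    $afterLocal = $t       # still "hello", only the inner copy changed
    Set-CallerValue
    $afterScoped = $t      # now "world", the caller's variable was modified
    [PSCustomObject]@{ AfterLocal = $afterLocal; AfterScoped = $afterScoped }
}

$result = Test-Scoping
$result
```

Relying on -Scope 1 ties the helper to how it is called, so returning a value and assigning it in the caller is usually easier to follow.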

This https://ig2600.blogspot.com/2010/01/powershell-is-dynamically-scoped-and.html explains it nicely.

Environment variable

but only in a specific directory

The idea - use the Prompt function to check if you're in a specific dir and set/unset an env var:

function Prompt {
    $currentDir = Get-Location
    if ("C:\git\that-special-dir" -eq $currentDir) {
        $env:THAT_SPECIAL_ENV_VAR = "./extra.cer"
    }
    else {
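        # Remove-Item reports an error when the variable is not set, so silence it
        Remove-Item Env:\THAT_SPECIAL_ENV_VAR -ErrorAction SilentlyContinue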
    }
}

Extract the special env setting/unsetting to a function:

function SetOrUnSet-DirectoryDependent-EnvironmentVariables {
    $currentDir = Get-Location
    if ("C:\git\that-special-dir" -eq $currentDir) {
        $env:THAT_SPECIAL_ENV_VAR = "./extra.cer"
    }
    else {
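        # Remove-Item reports an error when the variable is not set, so silence it
        Remove-Item Env:\THAT_SPECIAL_ENV_VAR -ErrorAction SilentlyContinue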
    }
}

function Prompt {
    SetOrUnSet-DirectoryDependent-EnvironmentVariables
}

If your Prompt function is already overwritten by e.g. oh-my-posh:

function SetOrUnSet-DirectoryDependent-EnvironmentVariables {
    $currentDir = Get-Location
    if ("C:\git\that-special-dir" -eq $currentDir) {
        $env:THAT_SPECIAL_ENV_VAR = "./extra.cer"
    }
    else {
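        # Remove-Item reports an error when the variable is not set, so silence it
        Remove-Item Env:\THAT_SPECIAL_ENV_VAR -ErrorAction SilentlyContinue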
    }
}

$promptFunction = (Get-Command Prompt).ScriptBlock

function Prompt {
    SetOrUnSet-DirectoryDependent-EnvironmentVariables
    $promptFunction.Invoke()
}
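
The same idea can be written as a reusable function. This is a minimal sketch and not the code from my profile: the name Set-DirectoryScopedEnvVar and all its parameters are hypothetical, and -CurrentDirectory exists only so the function can be tried out without changing directories.

```powershell
function Set-DirectoryScopedEnvVar {
    param(
        [Parameter(Mandatory)] [string] $Directory,   # directory in which the variable should exist
        [Parameter(Mandatory)] [string] $Name,        # environment variable name
        [Parameter(Mandatory)] [string] $Value,       # value to set while inside $Directory
        [string] $CurrentDirectory = (Get-Location).Path
    )

    if ($CurrentDirectory -eq $Directory) {
        Set-Item -Path "Env:$Name" -Value $Value
    }
    else {
        # Ignore the error Remove-Item reports when the variable is not set
        Remove-Item -Path "Env:$Name" -ErrorAction SilentlyContinue
    }
}

# Usage inside a Prompt function (the directory and file name are the ones used above):
# Set-DirectoryScopedEnvVar -Directory "C:\git\that-special-dir" -Name "THAT_SPECIAL_ENV_VAR" -Value "./extra.cer"
```

The comparison is an exact (case-insensitive) string match, so subdirectories of the special directory do not count.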

Why did I need this?

I have a repository with several JS scrapers run by Node. A few of them scrape data from misconfigured websites that don't provide the intermediate certificate for HTTPS. Your browser automatically fills in the gap for convenience, but a simple HTTP client like axios will rightfully reject the connection as it can't verify who it is talking to (see more here).

Solution?

Use NODE_EXTRA_CA_CERTS

  • You configure your production server with NODE_EXTRA_CA_CERTS.
  • When testing locally you get tired of remembering to set NODE_EXTRA_CA_CERTS.
  • You add NODE_EXTRA_CA_CERTS to your powershell profile. Now every time you run anything using NODE (like vs code) you see
    Warning: Ignoring extra certs from `./extra.cer`, load failed: error:02000002:system library:OPENSSL_internal:No such file or directory
    
  • You get annoyed and you ask yourself how to set an environment variable but only in a specific directory

I use this myself here -> the public part of my powershell-profile

PowerShell "Oopsie"

Task - remove a specific string from each line of multiple CSV files.

This task was added to the scripting exercise list.

First - let's generate some CSV files to work with:

$numberOfFiles = 10
$numberOfRows = 100

$fileNames = 1..$numberOfFiles | % { "file$_.csv" }
$csvData = 1..$numberOfRows | ForEach-Object {
    [PSCustomObject]@{
        Column1 = "Value $_"
        Column2 = "Value $($_ * 2)"
        Column3 = "Value $($_ * 3)"
    }
}

$fileNames | % { $csvData | Export-Csv -Path $_ }

The "Oopsie"

ls *.csv | % { cat $_ | % { $_ -replace "42","" } | out-file $_ -Append }

This command will never finish. Run it for a moment (and then kill it), see the result, and try to figure out what happens. Explanation below.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

The explanation

Get-Content (aka. cat) keeps the file open and reads the content that our command is appending, thus creating an infinite loop.

The fix

There are many ways to fix this "oopsie".

Perhaps the simplest one is to not write to and read from the exact same file. A sensible rule is when processing files always write to a different file:

ls *.csv | % { cat $_ | % { $_ -replace "42","" } | out-file -path "fixed$($_.Name)" }

Knowing the reason for our command hanging we can make sure the whole file is read before we overwrite it:

ls *.csv | % { (cat $_ ) | % { $_ -replace "42","" } | out-file $_ }
ls *.csv | % { (cat $_ ) -replace "42","" | out-file $_ } # we can also use -replace as an array operator
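
The read-everything-first variant can be wrapped in a small function. This is a sketch under assumptions: Remove-TextFromFile is a made-up name, and unlike the one-liners above it escapes the text so that it is matched literally rather than as a regex.

```powershell
function Remove-TextFromFile {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [string] $Text
    )

    # @( ... ) makes Get-Content read the whole file before anything is written back
    $lines = @(Get-Content -LiteralPath $Path)

    # -replace works on every element of the array; Escape() makes $Text a literal match
    $lines -replace [regex]::Escape($Text), "" | Set-Content -LiteralPath $Path
}

# Usage:
# Get-ChildItem -Filter "*.csv" | ForEach-Object { Remove-TextFromFile -Path $_.FullName -Text "42" }
```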

I'm amazed by GitHub Copilot's answer for "powershell one liner to remove a specific text from multiple CSV files":

Get-ChildItem -Filter "*.csv" | ForEach-Object { (Get-Content $_.FullName) -replace "string_to_replace", "replacement_string" | Set-Content $_.FullName }

PowerShell quirk

tl;dr

In PowerShell, if you want to return an array instead of one element of the array at a time, do this:

> @(1..2) | % { $a = "a" * $_; @($a,$_) } # wrong! will pipe/return 1 element at a time
> @(1..2) | % { $a = "a" * $_; ,@($a,$_) } # correct! will pipe/return pairs
Beware! Result of both snippets will be displayed in the exact same way even though they have different types! See below:
> @(1,2,3,4)
1
2
3
4
> @((1,2),(3,4))
1
2
3
4

To check actual types:

> $x = @(1..2) | % { $a = "a" * $_; @($a,$_) } ; $x.GetType().Name ; $x[0].GetType().Name ; $x
> $x = @(1..2) | % { $a = "a" * $_; ,@($a,$_) } ; $x.GetType().Name ; $x[0].GetType().Name ; $x
# Alternatively
> @(1..2) | % { $a = "a" * $_; @($a,$_) } | Get-Member -name GetType
> @(1..2) | % { $a = "a" * $_; ,@($a,$_) } | Get-Member -name GetType
# Get-Member only shows output for every distinct type

Longer read

Occasionally I have a fist fight with PS to return an Array instead of one element at a time. PS is a tough opponent. I think I get it now though.

The comma , in PS is both a binary and a unary operator. You can use it with one or two arguments.

> ,1 # as a unary operator the comma creates an array with 1 member
1
> 1,2 # as a binary operator the comma creates an array with 2 members
1
2

Beware that both an array[] and array[][] will be displayed the same way. $y is an array[][], it is printed the same way to the output as $x

> $x = @(1,2,3,4) ; $x.GetType().Name ; $x[0].GetType().Name ; $x
Object[]
Int32
1
2
3
4
> $y = @((1,2),(3,4)) ; $y.GetType().Name ; $y[0].GetType().Name ; $y
Object[]
Object[]
1
2
3
4

If you're trying to return an array of pairs:

> @(1..2) | % { $a = "a" * $_; @($a,$_) } # wrong
a
1
aa
2
> @(1..2) | % { $a = "a" * $_; ,@($a,$_) } # correct!
a
1
aa
2
# Even though the result looks like a flat array, this time it's an array of arrays
> @(1..2) | % { $a = "a" * $_; @($a,$_) } | Get-Member -name GetType # we get strings and ints

   TypeName: System.String

Name    MemberType Definition
----    ---------- ----------
GetType Method     type GetType()

   TypeName: System.Int32

Name    MemberType Definition
----    ---------- ----------
GetType Method     type GetType()

> @(1..2) | % { $a = "a" * $_; ,@($a,$_) } | Get-Member -name GetType # we get arrays

   TypeName: System.Object[]

Name    MemberType Definition
----    ---------- ----------
GetType Method     type GetType()

More on printing your arrays of pairs:

> @(1..2) | % { $a = "a" * $_; ,@($a,$_) } | write-output # write-output will "unwind" your array
a
1
aa
2
> @(1..2) | % { $a = "a" * $_; ,@($a,$_) } | write-host
a 1
aa 2
> @(1..2) | % { $a = "a" * $_; ,@($a,$_) } | % { write-output "$_" }
a 1
aa 2
> @(1..2) | % { $a = "a" * $_; ,@($a,$_) } | write-output -NoEnumerate # returns an array of arrays but it's printed as if it's a flat array
a
1
aa
2
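
Since the printed output can't tell the two shapes apart, counting elements is a quick check. A small sketch using the same snippets as above ($flat and $pairs are just names picked for this example):

```powershell
# Without the unary comma each inner array is unrolled into the pipeline
$flat  = @(1..2) | ForEach-Object { $a = "a" * $_; @($a, $_) }

# With the unary comma each inner array travels through the pipeline as one object
$pairs = @(1..2) | ForEach-Object { $a = "a" * $_; ,@($a, $_) }

$flat.Count     # 4 -> a, 1, aa, 2
$pairs.Count    # 2 -> (a, 1), (aa, 2)
$pairs[1][0]    # aa
```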

This https://stackoverflow.com/a/29985418/2377787 explains how @() works in PS.

> $a='A','B','C'
> $b=@($a;)
> $a
A
B
C
> $b
A
B
C
> [Object]::ReferenceEquals($a, $b)
False
Above $a; is understood as $a is a collection, collections should be enumerated and each item is passed to the pipeline. @($a;) sees 3 elements but not the original array and creates an array from the 3 elements. In PS @($collection) creates a copy of $collection. @(,$collection) - creates an array with a single element $collection.
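
The difference between @($collection) and @(,$collection) shows up in the element counts. A minimal sketch ($copy and $wrapped are names made up for this example):

```powershell
$a = 'A', 'B', 'C'

$copy    = @($a)    # $a is enumerated, the result has the same 3 elements
$wrapped = @(,$a)   # the unary comma wraps $a, the result has a single element

$copy.Count         # 3
$wrapped.Count      # 1
$wrapped[0].Count   # 3, the single element is the original array
```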

Exercises in bash/shell/scripting

Being fluent in shell/scripting allows you to improve your work by 20%. It doesn't take you to another level. You don't suddenly possess the knowledge to implement flawless distributed transactions, but some things get done much faster and with no frustration.

Here is my collection of shell/scripting exercises for others to practice shell skills.

A side note - I'm still not sure if I should learn more PowerShell, try out a different shell or do everything in F# fsx. PowerShell is just so ugly ;(

Scroll down for answers

Exercise 1

What were the arguments of DetectOrientationScript function in https://github.com/tesseract-ocr/tesseract when it was first introduced?

Exercise 2

Get Hadoop distributed file system log from https://github.com/logpai/loghub?tab=readme-ov-file

Find the ratio of (failed block serving)/(failed block serving + successful block serving) for each IP

The result should look like:

...
10.251.43.210  0.452453987730061
10.251.65.203  0.464609355865785
10.251.65.237  0.455237129089526
10.251.66.102  0.452124935995904
...

Exercise 3

This happened to me once - I had to find all http/s links to specific domains in the export of our company's messages as someone shared proprietary code on websites available publicly.

Exercise - find all distinct http/s links in https://github.com/tesseract-ocr/tesseract

Exercise 4

Task - remove the string "42" from each line of multiple CSV files.

You can use this to generate the input CSV files:

$numberOfFiles = 10
$numberOfRows = 100

$fileNames = 1..$numberOfFiles | % { "file$_.csv" }
$csvData = 1..$numberOfRows | ForEach-Object {
    [PSCustomObject]@{
        Column1 = "Value $_"
        Column2 = "Value $($_ * 2)"
        Column3 = "Value $($_ * 3)"
    }
}

$fileNames | % { $csvData | Export-Csv -Path $_ }

Exercise 5

Just like me you created tens of repositories while writing code katas. Now you would like to keep all katas in a single repository. Write a script to move several repositories to a single repository. Each repo's content will end up in a dedicated directory in the new "master" repo. Remember to merge unrelated histories in the "master" repo.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Exercise 1 - answer

Answer:

bool DetectOrientationScript(int& orient_deg, float& orient_conf, std::string& script, float& script_conf);

[PowerShell]
> git log -S DetectOrientationScript # get sha of oldest commit
> git show bc95798e011a39acf9778b95c8d8c5847774cc47 | sls DetectOrientationScript

[bash]
> git log -S DetectOrientationScript # get sha of oldest commit
> git show bc95798e011a39acf9778b95c8d8c5847774cc47 | grep DetectOrientationScript

One-liner:

[PowerShell]
> git log -S " DetectOrientationScript" -p | sls DetectOrientationScript | select -Last 1

[bash]
> git log -S " DetectOrientationScript" -p | grep DetectOrientationScript | tail -1

Bonus - execution times

[PowerShell 7.4]
> measure-command { git log -S " DetectOrientationScript" -p | sls DetectOrientationScript | select -Last 1 }
...
TotalSeconds      : 3.47
...

[bash]
> time git log -S " DetectOrientationScript" -p | grep DetectOrientationScript | tail -1
...
real    0m3.471s
...

Without git log -S doing the heavy lifting, the times look different:

[PowerShell 7.4]
> @(1..10) | % { Measure-Command { git log -p | sls "^\+.*\sDetectOrientationScript" } } | % { $_.TotalSeconds } | Measure-Object -Average

Count    : 10
Average  : 9.27122774
[PowerShell 5.1]
> @(1..10) | % { Measure-Command { git log -p | sls "^\+.*\sDetectOrientationScript" } } | % { $_.TotalSeconds } | Measure-Object -Average

Count    : 10
Average  : 27.33900077
[bash]
> seq 10 | xargs -I '{}' bash -c "TIMEFORMAT='%3E' ; time git log -p | grep -E '^\+.*\sDetectOrientationScript' > /dev/null" 2> times
> awk '{s+=$1} END {print s}' times
6.7249 # For convenience I moved the dot one place to the left

Reflections

Bash is faster than PowerShell. PowerShell 7 is much faster than PowerShell 5. It was surprisingly easy to get the average with Measure-Object in PowerShell and surprisingly difficult in bash.
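
The averaging pattern in isolation, as a sketch: the Start-Sleep call is only a stand-in for the command being measured, and five runs is an arbitrary choice.

```powershell
# Collect the duration of each run in seconds
$timings = 1..5 | ForEach-Object {
    (Measure-Command { Start-Sleep -Milliseconds 10 }).TotalSeconds
}

# Measure-Object can report several statistics in one go
$stats = $timings | Measure-Object -Average -Minimum -Maximum
$stats | Select-Object Count, Average, Minimum, Maximum
```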

Exercise 2 - answer

[PowerShell 7.4]
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log | % { $_.Matches[0].Value -replace "Served block.*/","ok/" -replace "Got exception while serving.*/","nk/" -replace ":","" } | % { $_ -replace "(ok|nk)/(.*)", "`${2} `${1}"} | sort > sorted
> cat .\sorted | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; ,@($_.name, ($g.Length/$_.count)) } | write-host

This is how I got to the answer:

> sls "Served block" -Path .\HDFS.log | select -first 10
> sls "Served block|Got exception while serving" -Path .\HDFS.log | select -first 10
> sls "Served block|Got exception while serving" -Path .\HDFS.log | select -first 100
> sls "Served block|Got exception while serving" -Path .\HDFS.log | select -first 1000
> sls "Served block.*|Got exception while serving" -Path .\HDFS.log | select -first 1000
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log | select -first 1000
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log -raw | select -first 1000
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log -raw | select matches -first 1000
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log -raw | select Matches -first 1000
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log -raw | select Matches
> $a = sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log -raw
> $a[0]
> get-type $a[0]
> Get-TypeData $a
> $a[0]
> $a[0].Matches[0].Value
> $a = sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log
> $a[0]
> $a[0].Matches[0].Value
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log | % { $_.Matches[0].Value -replace "Served block.*/","ok/" }
> "asdf" -replace "a","b"
> "asdf" -replace "a","b" -replace "d","x"
> "asdf" -replace "a.","b" -replace "d","x"
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log | % { $_.Matches[0].Value -replace "Served block.*/","ok/" -replace "Got exception while serving.*/","nk" }
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log | % { $_.Matches[0].Value -replace "Served block.*/","ok/" -replace "Got exception while serving.*/","nk/" }
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log | % { $_.Matches[0].Value -replace "Served block.*/","ok/" -replace "Got exception while serving.*/","nk/" -replace ":","" }
> "aaxxaa" -replace "a.","b"
> "aaxxaa" -replace "a.","b$0"
> "aaxxaa" -replace "a.","b$1"
> "aaxxaa" -replace "a.","b${1}"
> "aaxxaa" -replace "a.","b${0}"
> "aaxxaa" -replace "a.","b`${0}"
> "okaaxxokaa" -replace "(ok|no)aa","_`{$1}_"
> "okaaxxokaa" -replace "(ok|no)aa","_`${1}_"
> "okaaxxokaa" -replace "(ok|no)aa","_`${1}_`${0}"
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log | % { $_.Matches[0].Value -replace "Served block.*/","ok/" -replace "Got exception while serving.*/","nk/" -replace ":","" } | % { $_ -replace "(ok|nk)/(.*)", "`${2} `${1}"}
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log | % { $_.Matches[0].Value -replace "Served block.*/","ok/" -replace "Got exception while serving.*/","nk/" -replace ":","" } | % { $_ -replace "(ok|nk)/(.*)", "`${2} `${1}"} | sort
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log | % { $_.Matches[0].Value -replace "Served block.*/","ok/" -replace "Got exception while serving.*/","nk/" -replace ":","" } | % { $_ -replace "(ok|nk)/(.*)", "`${2} `${1}"} | sort > sorted
> cat .\sorted -First 10
> cat | group
> cat | group -Property {$_}
> cat .\sorted | group -Property {$_}
> cat .\sorted -Head 10 | group -Property {$_}
> cat .\sorted -Head 100 | group -Property {$_}
> cat .\sorted -Head 1000 | group -Property {$_}
> cat .\sorted -Head 10000 | group -Property {$_}
> cat .\sorted -Head 10000 | group -Property {$_} | select name,count
> cat .\sorted | group -Property {$_} | select name,count
> cat .\sorted | group -Property {$_ -replace "nk|ok",""}
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""}
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $_.name, $g.Length / $_.count }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $_.name, $g.Length, $_.count }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $_.name, $g.Length / $_.count }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $_.name, $g.Length, $_.count }
> $__
> $__[0]
> $__[1]
> $__[2]
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $_.name, $g.Length, $_.count }
> $a = cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $_.name, $g.Length, $_.count }
> $a[0]
> $a[1]
> $a[2]
> $a[1].GetType()
> $a[2].GetType()
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $_.name, ($g.Length) / ($_.count) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $_.name, (($g.Length) / ($_.count)) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; ,$_.name, (($g.Length) / ($_.count)) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; @($_.name, (($g.Length) / ($_.count))) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; ,@($_.name, (($g.Length) / ($_.count))) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; return ,@($_.name, (($g.Length) / ($_.count))) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; [Array] ,@($_.name, (($g.Length) / ($_.count))) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; [Array]@($_.name, (($g.Length) / ($_.count))) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; return ,$_.name, (($g.Length) / ($_.count)) }
> $a = cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; return ,$_.name, (($g.Length) / ($_.count)) }
> $a[0]
> $a = cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; return ,($_.name, (($g.Length) / ($_.count))) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; return ,($_.name, (($g.Length) / ($_.count))) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; return ,@($_.name, (($g.Length) / ($_.count))) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; ,@($_.name, (($g.Length) / ($_.count))) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $x = @($_.name, (($g.Length) / ($_.count))) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $x = @($_.name, (($g.Length) / ($_.count))); $x }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $x = @($_.name, (($g.Length) / ($_.count))); ,$x }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $x = @($_.name, (($g.Length) / ($_.count))); return ,$x }
> $a = cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $x = @($_.name, (($g.Length) / ($_.count))); return ,$x }
> $a[0]
> $a[0][0]
> $a = cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $x = @($_.name, (($g.Length) / ($_.count))); return ,$x } | % { wirte-output "$_[0]" }
> $a = cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $x = @($_.name, (($g.Length) / ($_.count))); return ,$x } | % { write-output "$_[0]" }
> $a = cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $x = @($_.name, (($g.Length) / ($_.count))); return ,$x } | % { write-output "$_[0]" }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $x = @($_.name, (($g.Length) / ($_.count))); return ,$x } | % { write-output "$_[0]" }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $x = @($_.name, (($g.Length) / ($_.count))); return ,$x } | % { write-output "$_" }
> cat .\sorted | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $x = @($_.name, (($g.Length) / ($_.count))); return ,$x } | % { write-output "$_" }

[F#]
open System.IO
open System.Text.RegularExpressions

let lines = File.ReadAllLines("HDFS.log")

let a =
    lines
    |> Array.filter (fun x -> x.Contains("Served block") || x.Contains("Got exception while serving"))

a
// |> Array.take 10000
|> Array.map (fun x ->
    let m = Regex.Match(x, "(Served block|Got exception while serving).*/(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})")
    m.Groups[2].Value,
    match m.Groups[1].Value with
    | "Served block"                -> true
    | "Got exception while serving" -> false )
|> Array.groupBy fst
|> Array.map (fun (key, group) ->
    let total = group.Length
    let failed = group |> Array.map snd |> Array.filter not |> Array.length
    key, (decimal failed)/(decimal total)
    )
|> Array.sortBy fst
|> Array.map (fun (i,m) -> sprintf "%s  %.15f" i m)
|> fun x -> File.AppendAllLines("fsout", x)
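
For comparison, the F# approach translates fairly directly back to PowerShell with Group-Object. This is a sketch, not the one-liner I actually used: Get-FailureRatio is a made-up name, the regex is the one from the F# version, and the sample lines in the usage are invented to resemble the log, so the real HDFS.log format may need adjustments.

```powershell
function Get-FailureRatio {
    param([Parameter(Mandatory)] [string[]] $Lines)

    # Same pattern as in the F# version: group 1 is the outcome, group 2 is the IP
    $pattern = '(Served block|Got exception while serving).*/(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})'

    $Lines |
        ForEach-Object {
            $m = [regex]::Match($_, $pattern)
            if ($m.Success) {
                [PSCustomObject]@{
                    Ip     = $m.Groups[2].Value
                    Failed = $m.Groups[1].Value -eq 'Got exception while serving'
                }
            }
        } |
        Group-Object -Property Ip |
        Sort-Object -Property Name |
        ForEach-Object {
            # @( ... ) keeps .Count meaningful when zero or one item matches
            $failed = @($_.Group | Where-Object { $_.Failed }).Count
            [PSCustomObject]@{ Ip = $_.Name; Ratio = $failed / $_.Count }
        }
}

# Usage with invented sample lines:
$sample = @(
    'Served block blk_1 to /10.0.0.1',
    'Got exception while serving blk_2 to /10.0.0.1:',
    'Served block blk_3 to /10.0.0.2'
)
$ratios = @(Get-FailureRatio -Lines $sample)
$ratios

# For the real file: Get-FailureRatio -Lines (Get-Content .\HDFS.log)
```

Emitting objects instead of pairs sidesteps the array-unrolling fight described earlier.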

Exercise 3 - answer

[PowerShell 7.4]
> ls -r -file | % { sls -path $_.FullName -pattern https?:.* -CaseSensitive } | % { $_.Matches[0].Value } | sort | select -Unique

# finds 234 links
[bash]
> find . -type f -not -path './.git/*' | xargs grep -E 'https?:.*' -ho | sort | uniq

# finds 234 links
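
Both answers take everything from the link to the end of the line and only the first match per line. A possible refinement, sketched here on invented sample text: -AllMatches returns every match on a line, and the character class (an assumption about what may not appear inside a link) stops the match at whitespace, quotes, angle brackets or a closing parenthesis.

```powershell
# Invented sample input standing in for file contents
$text = @(
    'see https://a.example/x and http://b.example',
    'again https://a.example/x'
)

$links = @(
    $text |
        Select-String -Pattern 'https?://[^\s"''<>)]+' -AllMatches |
        ForEach-Object { $_.Matches.Value } |
        Sort-Object -Unique
)
$links
```

With a tighter pattern the number of distinct links found in the repository will likely differ from the counts above.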

Exercise 4 - answer

[PowerShell 7.4]
ls *.csv | % { (cat $_ ) -replace "42","" | out-file $_ }

[bash]
> sed -i 's/42//g' *.csv
> sed -ibackup 's/42//g' *.csv # creates backup files
This is neat; perhaps Unix people had wisdom that is lost now.

Exercise 5 - answer

$repos = @(
    @("https://github.com/inwenis/kata.sortingitout", "sortingitout", "kata_sorting_it_out"  ),
    @("https://github.com/inwenis/anagrams_kata2",    "anagrams2",    "kata_anagrams2"  ),
    @("https://github.com/inwenis/anagram_kata",      "anagrams",     "kata"  )
)

$repos | ForEach-Object {
    $repo, $branch, $dir = $_
    $repoName = $repo.Split("/")[-1]
    git clone $repo
    pushd $repoName
    git checkout -b $branch
    $all = Get-ChildItem
    mkdir $dir
    $all | ForEach-Object {
        Move-Item $_ -Destination $dir
    }
    git add -A
    git commit -am "move"
    git remote add kata https://github.com/inwenis/kata
    git push -u kata $branch
    popd
    Remove-Item $repoName -Recurse -Force
    Read-Host "Press Enter to continue"
}