
F Sharp

Json

Should I use System.Text.Json (STJ) or Newtonsoft.Json (previously Json.NET)?

Use STJ. Newtonsoft.Json is no longer being enhanced with new features; its author now works at Microsoft on non-JSON projects.

JamesNK reddit comment

Terms

marshal - assemble and arrange (a group of people, especially troops) in order.

"the general marshalled his troops"

marshalling (UK spelling; US: marshaling) (in computer science) - getting parameters/data from here to there

serialization - transforming data into a format suitable for storage or for transmission over a network

https://stackoverflow.com/questions/770474/what-is-the-difference-between-serialization-and-marshaling

JSON - JavaScript Object Notation - a data interchange format. https://www.json.org/json-en.html

Why this post

While analysing some logs I used FSharp.Data's JsonProvider. Only a few properties were relevant, but JsonProvider keeps the whole parsed JSON in memory. With 10 GB of logs to analyse I quickly ran out of memory.

Let's do some testing!

open System
open System.IO
open System.Text.Json

fsi.AddPrinter<DateTimeOffset>(fun dt -> dt.ToString("O"))

Environment.CurrentDirectory <- __SOURCE_DIRECTORY__ // ensures the script runs from the directory it's located in
// -------------------------------------------------------------------------

// sample log entry for testing
type LogEntry = {
    Timestamp       : DateTimeOffset
    Level           : string
    Message         : string
}

// only the properties we're interested in
type LogEntryRecord = {
    Timestamp: DateTimeOffset
    Level    : string
}

let random = Random()
let levels = [ "INFO"; "WARN"; "ERROR"; "DEBUG" ]

let generateLogEntry () =
    {
        Timestamp = DateTimeOffset.Now.AddSeconds(-random.Next(0, 10000))
        Level     = levels.[random.Next(levels.Length)]
        Message   = String.replicate(random.Next(10, 100)) "x" // random string to simulate redundant content
    }

List.init 7_000_000 (fun _ -> generateLogEntry()) // 7M entries is around 1GB of data
|> List.map (fun entry -> JsonSerializer.Serialize(entry))
|> fun lines -> File.WriteAllLines("./logs.json", lines)

let lines = File.ReadAllLines "./logs.json"

let runWithMemoryCheck lines singleLineParser =
    GC.Collect()
    let before = GC.GetTotalMemory(true)
    let x = lines |> Array.map singleLineParser
    GC.Collect()
    let after = GC.GetTotalMemory(true)
    let m = ((after - before) |> float) / 1024. / 1024. / 1024. // GB
    x, m

#time
// -------------------------------------------------------------------------

#r "nuget: FSharp.Data"
open FSharp.Data
open System.Text.Json.Nodes

type LogEntryJsonProvider = JsonProvider<"""
{
    "Timestamp"        : "2024-12-23T20:51:18.2020753+01:00",
    "Level"            : "ERROR",
    "Message"          : "File not found"
}""">

let fSharpDataJsonProvider = LogEntryJsonProvider.Parse
let fSharpDataJsonNode (x:string) = // note: this one uses FSharp.Data's JsonValue, not System.Text.Json's JsonNode
    let line = x |> FSharp.Data.JsonValue.Parse
    let t = line.GetProperty("Timestamp").AsDateTimeOffset()
    let l = line.GetProperty("Level").AsString()
    { Timestamp = t; Level = l }
let jsonSerializer (x:string) = JsonSerializer.Deserialize<LogEntryRecord>(x)
let jsonNode (line:string) =
    let line = line |> JsonNode.Parse
    let t = line.["Timestamp"].GetValue<DateTimeOffset>()
    let l = line.["Level"].GetValue<string>()
    { Timestamp = t; Level = l }
let jsonDocument (x:string) =
    use doc = x |> JsonDocument.Parse
    let t = doc.RootElement.GetProperty("Timestamp").GetDateTimeOffset()
    let l = doc.RootElement.GetProperty("Level").GetString()
    { Timestamp = t; Level = l }

runWithMemoryCheck lines fSharpDataJsonProvider |> snd |> printfn "Memory used: %f GB"
// Memory used: 4.420363 GB
// Real: 00:00:35.829, CPU: 00:02:07.312, GC gen0: 84, gen1: 25, gen2: 8

runWithMemoryCheck lines fSharpDataJsonNode     |> snd |> printfn "Memory used: %f GB"
//Memory used: 0.521624 GB
//Real: 00:00:16.557, CPU: 00:00:35.281, GC gen0: 29, gen1: 10, gen2: 4

runWithMemoryCheck lines jsonSerializer         |> snd |> printfn "Memory used: %f GB"
// Memory used: 0.521555 GB
// Real: 00:00:10.823, CPU: 00:00:44.453, GC gen0: 11, gen1: 6, gen2: 4

runWithMemoryCheck lines jsonNode               |> snd |> printfn "Memory used: %f GB"
// Memory used: 0.521419 GB
// Real: 00:00:09.533, CPU: 00:00:27.359, GC gen0: 16, gen1: 7, gen2: 4

runWithMemoryCheck lines jsonDocument           |> snd |> printfn "Memory used: %f GB"
// Memory used: 0.521525 GB
// Real: 00:00:06.208, CPU: 00:00:17.546, GC gen0: 5, gen1: 4, gen2: 4

Conclusion

  • FSharp.Data's JsonProvider is by far the worst option here - the slowest, and roughly 8x the memory of the alternatives, since it keeps the whole parsed document around
  • STJ's JsonDocument is the speed winner.

System.Text.Json cheat sheet

open System
open System.Text.Json

// The System.Text.Json namespace contains all the entry points and the main types.
// The System.Text.Json.Serialization namespace contains attributes and APIs for advanced scenarios and customization specific to serialization and deserialization.

fsi.AddPrinter<DateTimeOffset>(fun dt -> dt.ToString("O"))

// System.Text.Json.JsonSerializer -> is a static class
//                                 -> you can instantiate and reuse JsonSerializerOptions

let jsonString = """{
    "PropertyName1" : "dummyValue",
    "PropertyName2" : 42,
    "PropertyName3" : "2024-12-29T10:31:36.3774099+01:00",
    "PropertyName4" : {"NestedProperty" : 42},
    "PropertyName5" : [
        42,
        11
    ]
}"""

type InnerType = {
    NestedProperty: int
}

type DummyType = {
    PropertyName1: string
    PropertyName2: int
    PropertyName3: DateTimeOffset
    PropertyName4: InnerType
    PropertyName5: int list
}

type LogEntryRecord = {
    Timestamp: DateTimeOffset
    Level    : string
}


// # JsonSerializer.Deserialize

// JsonSerializer.Deserialize<'Type>(jsonString)
// JsonSerializer.Deserialize<'Type>(jsonString, options)
// JsonSerializer.DeserializeAsync(stream, ...) <- only streams can be parsed async, because parsing an in-memory string is purely CPU-bound

// Deserialization behaviour:
//  - By default, property name matching is case-sensitive. You can specify case-insensitivity.
//  - Non-public constructors are ignored by the serializer.
//  - Deserialization to immutable objects or properties that don't have public set accessors is supported but not enabled by default.
//    ^ F# records nevertheless work fine, presumably because they expose a single public constructor the serializer can bind to
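A minimal sketch of the stream-based overload (the helper name and file path are made up for illustration):

```fsharp
open System
open System.IO
open System.Text.Json

type LogEntryRecord = {
    Timestamp: DateTimeOffset
    Level    : string
}

// DeserializeAsync consumes a Stream, so a large file or network response
// never has to be materialized as a single string first
let readEntryAsync (path: string) = task {
    use stream = File.OpenRead(path)
    return! JsonSerializer.DeserializeAsync<LogEntryRecord>(stream)
}
```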

JsonSerializer.Deserialize<LogEntryRecord>(jsonString)
// { Timestamp = 0001-01-01T00:00:00.0000000+00:00 Level = null }
// no properties match but JsonSerializer just returns default values

JsonSerializer.Deserialize<DummyType>(jsonString)
// val it: DummyType = { PropertyName1 = "dummyValue"
//                       PropertyName2 = 42
//                       PropertyName3 = 2024-12-29T10:31:36.3774099+01:00
//                       PropertyName4 = { NestedProperty = 42 }
//                       PropertyName5 = [42; 11] }

// Deserialization is case sensitive by default!
let jsonString2 = """{
    "propertyName1" : "dummyValue",
    "propertyName2" : 42
}"""
JsonSerializer.Deserialize<DummyType>(jsonString2)
// val it: DummyType = { PropertyName1 = null
//                       PropertyName2 = 0
//                       PropertyName3 = 0001-01-01T00:00:00.0000000+00:00
//                       PropertyName4 = null
//                       PropertyName5 = null }
let options = new JsonSerializerOptions()
options.PropertyNameCaseInsensitive <- true
JsonSerializer.Deserialize<DummyType>(jsonString2, options)
// val it: DummyType = { PropertyName1 = "dummyValue"
//                       PropertyName2 = 42
//                       PropertyName3 = 0001-01-01T00:00:00.0000000+00:00
//                       PropertyName4 = null
//                       PropertyName5 = null }


// # JsonSerializer.Serialize

// let's pretty print during testing
// by default the json is minified
let options = new JsonSerializerOptions()
options.WriteIndented <- true

JsonSerializer.Serialize(options, options)
//val it: string =
//  "{
//  "Converters": [],
//  "TypeInfoResolver": {},
//  "TypeInfoResolverChain": [
//    {}
//  ],
//  "AllowOutOfOrderMetadataProperties": false,
//  "AllowTrailingCommas": false,
//  "DefaultBufferSize": 16384,
//  "Encoder": null,
//  "DictionaryKeyPolicy": null,
//  "IgnoreNullValues": false,
//  "DefaultIgnoreCondition": 0,
//  ...

// Serialization behaviour:
//  - by default, all public properties are serialized. You can specify properties to ignore. You can also include private members.
//  - by default, JSON is minified. You can pretty-print the JSON.
//  - by default, casing of JSON names matches the .NET names. You can customize JSON name casing.
//  - by default, fields are ignored. You can include fields.


// # JsonNode and JsonDocument

// Should you use JsonNode or JsonDocument? see link below
// https://learn.microsoft.com/en-us/dotnet/standard/serialization/system-text-json/use-dom#json-dom-choices

// JsonDocument -> immutable, faster; IDisposable because it rents buffers from a shared memory pool
// JsonNode     -> mutable DOM
// https://learn.microsoft.com/en-us/dotnet/standard/serialization/system-text-json/use-dom

open System.Text.Json.Nodes
let x = JsonNode.Parse(jsonString) // JsonObject
x.ToJsonString()
x.["PropertyName3"].GetValue<DateTimeOffset>()
x.["PropertyName3"].GetPath()
x.["PropertyName4"].["NestedProperty"].GetPath()
x.["PropertyName2"] |> int
// x.["PropertyName3"] |> DateTimeOffset // won't compile: unlike `int`, F# has no DateTimeOffset conversion function,
//                                       // and the bare type name only resolves to constructors (none take a JsonNode)

x["PropertyName4"].GetValueKind() |> string // "Object"
x["NonExistingProperty"] // null
x["NonExistingProperty"].GetValue<int>() // err - System.NullReferenceException
x["PropertyName5"].AsArray() |> Seq.map (fun a -> a.GetValue<int>()) // ok
x["PropertyName5"].AsArray() |> Seq.map int // ok
x["PropertyName5"].[0].GetValue<int>() // ok

// create a json object
let m = new JsonObject()
m["TimeStamp"] <- DateTimeOffset.Now
m.ToJsonString() // {"TimeStamp":"2024-12-29T16:06:17.046746+01:00"}
m["SampleProperty"] <- new JsonArray(1,2)
m.Remove("TimeStamp")

let a = JsonNode.Parse("""{"x":{"y":[1,2,3]}}""")
a.["x"] // this is a JsonNode
a.["x"].AsObject() // this returns a JsonObject
a.["x"].AsObject() |> Seq.iter (fun kv -> printfn "%A" kv) // iterate over the object's properties (Seq.iter, not Seq.map, so it actually runs)
a.["x"].ToJsonString() // you can serialize subsection of the json
// {"y":[1,2,3]}

JsonNode.DeepEquals(x, a) // comparison

F# types

open System.Text.Json

// Record - OK
type DummyRecord = {
    Text: string
    Num:  int
    }

let r = { Text = "asdf"; Num = 1 }

JsonSerializer.Serialize(r) |> JsonSerializer.Deserialize<DummyRecord>

let tuple = (42, "asdf")
JsonSerializer.Serialize(tuple) |> JsonSerializer.Deserialize<int * string>

type TupleAlias = int * string
let tuple2 = (43, "sfdg") : TupleAlias
JsonSerializer.Serialize(tuple2) |> JsonSerializer.Deserialize<TupleAlias>

// Discriminated Union :(
type SampleDiscriminatedUnion =
    | A of int
    | B of string
    | C of int * string
let x = A 1
JsonSerializer.Serialize(x) // fails - System.Text.Json has no built-in support for F# discriminated unions

// Option - OK
JsonSerializer.Serialize(Some 42) |> JsonSerializer.Deserialize<int option>
JsonSerializer.Serialize(None) |> JsonSerializer.Deserialize<int option>
open System
type RecordTest2 = {
    Timestamp: DateTimeOffset
    Level: string
    TestOp: int option
    }

// Discriminated Union is supported in FSharp.Json
// https://github.com/fsprojects/FSharp.Json
#r "nuget: FSharp.Json"
open FSharp.Json
let data = C (42, "The string")
let json = Json.serialize data
// val json: string = "{
//   "C": [
//     42,
//     "The string"
//   ]
// }

let deserialized = Json.deserialize<SampleDiscriminatedUnion> json
// val deserialized: SampleDiscriminatedUnion = C (42, "The string")

More on FSharp.Data JsonValue

#r "nuget:FSharp.Data"
open FSharp.Data

let j = JsonValue.Parse("""{"x":{"y":[1,2,3]}}""")
j.Properties()
// val it: (string * JsonValue) array =
//   [|("x", {
//   "y": [
//     1,
//     2,
//     3
//   ]
// })|]
j.["x"].["y"].AsArray()
j.TryGetProperty "x"

// JsonValue is a discriminated union
// union JsonValue =
//   | String  of string
//   | Number  of decimal
//   | Float   of float
//   | Record  of properties: (string * JsonValue) array
//   | Array   of elements: JsonValue array
//   | Boolean of bool
//   | Null
//
// docs:
// https://fsprojects.github.io/FSharp.Data/reference/fsharp-data-jsonvalue.html
// https://fsprojects.github.io/FSharp.Data/library/JsonValue.html <- if you'll be working with JsonValue read this
//
// there are also extension methods:
// https://fsprojects.github.io/FSharp.Data/reference/fsharp-data-jsonextensions.html
//
// AsArray doesn't fail if the value is not an array, as opposed to the other As* extension methods
// See below how extension methods are defined
// source: https://github.com/fsprojects/FSharp.Data/blob/main/src/FSharp.Data.Json.Core/JsonExtensions.fs
open System.Globalization
open System.Runtime.CompilerServices
open System.Runtime.InteropServices
open FSharp.Data.Runtime
open FSharp.Core

[<Extension>]
type JsonExtensions =
    /// Get all the elements of a JSON value.
    /// Returns an empty array if the value is not a JSON array.
    [<Extension>]
    static member AsArray(x: JsonValue) =
        match x with
        | (JsonValue.Array elements) -> elements
        | _ -> [||]

    /// Get a number as an integer (assuming that the value fits in integer)
    [<Extension>]
    static member AsInteger(x, [<Optional>] ?cultureInfo) =
        let cultureInfo = defaultArg cultureInfo CultureInfo.InvariantCulture

        match JsonConversions.AsInteger cultureInfo x with
        | Some i -> i
        | _ ->
            failwithf "Not an int: %s"
            <| x.ToString(JsonSaveOptions.DisableFormatting)

// construct a json object
let d =
    JsonValue.Record [|
        "event",      JsonValue.String "asdf"
        "properties", JsonValue.Record [|
            "token",       JsonValue.String "tokenId"
            "distinct_id", JsonValue.String "123123"
        |]
    |]

d.ToString().Replace("\r\n", "").Replace(" ", "")

// if you want to process the json object
for (k, v) in d.Properties() do
    printfn "Property: %s" k
    match v with
    | JsonValue.Record props -> printfn "\t%A" props
    | JsonValue.String s     -> printfn "\t%A" s
    | JsonValue.Number n     -> printfn "\t%A" n
    | JsonValue.Float f      -> printfn "\t%A" f
    | JsonValue.Array a      -> printfn "\t%A" a
    | JsonValue.Boolean b    -> printfn "\t%A" b
    | JsonValue.Null         -> printfn "\tnull"

Serialize straight to UTF-8

JsonSerializer.SerializeToUtf8Bytes(value, options) <- why does this one exist?

Strings in .NET are stored in memory as UTF-16, so if you don't ultimately need a string you can serialize straight to UTF-8 bytes with this method (it's 5-10% faster, see link) https://learn.microsoft.com/en-us/dotnet/standard/serialization/system-text-json/how-to#serialize-to-utf-8
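A minimal sketch (the `Ping` type is made up for illustration):

```fsharp
open System
open System.Text.Json

type Ping = { Name: string; Count: int }

// serialize straight to UTF-8 bytes - no intermediate UTF-16 string;
// the byte array can go directly to a file or socket
let bytes = JsonSerializer.SerializeToUtf8Bytes({ Name = "ping"; Count = 1 })

Text.Encoding.UTF8.GetString(bytes) // {"Name":"ping","Count":1}
```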

https://stu.dev/a-look-at-jsondocument/

https://blog.ploeh.dk/2023/12/18/serializing-restaurant-tables-in-f/

https://devblogs.microsoft.com/dotnet/try-the-new-system-text-json-apis/?ref=stu.dev

a post from when they introduced the new json API

TODO for myself - watch these maybe

<3 regex

https://regex101.com/r/RdCR7j/1 - set the global flag (g) to get all matches

https://www.debuggex.com/ - haven't played with this a lot but I might give it a try; looks like a decent learning tool

regex - use static Regex.Matches() or instantiate a Regex()?

By default use static method.

The .NET regex engine caches patterns used through the static methods (15 by default - see Regex.CacheSize).

Are you using more than 15 regexes, using them frequently, are they complex, and do you care about performance?

Investigate instantiating Regex() with RegexOptions.Compiled (and, on .NET 7+, the [GeneratedRegex] source generator).

Test performance before you optimize

https://learn.microsoft.com/en-us/dotnet/standard/base-types/best-practices-regex#static-regular-expressions

https://learn.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regexoptions?view=net-9.0
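A quick sketch of both styles (the pattern is arbitrary):

```fsharp
open System.Text.RegularExpressions

// static call - the compiled pattern is cached internally, keyed by (pattern, options)
Regex.CacheSize                        // default cache size: 15
Regex.IsMatch("one two three", @"\w+") // true

// instance - keeps its own compiled pattern, no cache lookup per call;
// RegexOptions.Compiled trades one-time compilation cost for faster matching
let wordRe = Regex(@"\w+", RegexOptions.Compiled)
wordRe.Matches("one two three").Count  // 3
```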

What is the whole fuss about backtracking?

Microsoft's documentation does a bad job explaining backtracking.

Read about backtracking here - https://www.regular-expressions.info/catastrophic.html

To experience backtracking yourself - https://regex101.com/r/1rWKNN/1 - keep adding "x" to the input and watch the execution time increase - with 35 "x"s it takes 5 seconds for the regex to find out it doesn't match!
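The same experiment sketched in .NET, with a timeout as a guard (note: newer .NET versions optimize away some of these pathological patterns, so the timeout may or may not fire):

```fsharp
open System
open System.Text.RegularExpressions

// (x+)+y on a long run of 'x' with no 'y' forces a backtracking engine to try
// exponentially many ways of splitting the run between the two loops
let input = String.replicate 35 "x"

try
    Regex.IsMatch(input, "(x+)+y", RegexOptions.None, TimeSpan.FromMilliseconds(250.))
    |> printfn "matched: %b"
with :? RegexMatchTimeoutException ->
    printfn "gave up - catastrophic backtracking"
```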

Code

These are the methods you need:

open System
open System.Text.RegularExpressions


Regex.Matches("input", "pattern")
Regex.Matches("input", "pattern", RegexOptions.IgnoreCase ||| RegexOptions.Singleline)
Regex.Matches("input", "pattern", RegexOptions.IgnoreCase ||| RegexOptions.Singleline, TimeSpan.FromSeconds(10.)) // you can use a timeout to prevent a DoS attack with malicious inputs
Regex.Match()
Regex.IsMatch()
Regex.Replace()
Regex.Split()
Regex.Count()

let r = new Regex("pattern") // instance Regex offers the same methods
r.Matches("input")
Regex class - https://learn.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regex?view=net-9.0

Sample:

let matches = Regex.Matches("Lorem ipsum dolor sit amet, consectetur adipiscing elit", @"(\w)o") // verbatim string, so \w isn't treated as a string escape
matches |> Seq.iter (fun x -> printfn "%s" x.Value)
matches |> Seq.iter (fun x -> printfn "%A" x.Groups)
matches.[0].Groups.[1].Value |> printfn "%s"

// Lo             // these are the whole matches
// do             //
// lo             //
// co             //
// seq [Lo; L]    // group 0 is the whole match, group 1 is the (\w)
// seq [do; d]    //
// seq [lo; l]    //
// seq [co; c]    //
// L              // this is the letter captured by (\w)

let matches2 = Regex.Matches("Lorem ipsum dolor sit amet, consectetur adipiscing elit", @"(\w)+o")
matches2.[1].Groups.[1].Value |> printfn "%A"
matches2.[1].Groups.[1].Captures |> Seq.iter (fun c -> printfn "%s" c.Value)
// l              // gotcha! the value of the group is the last thing captured by that group
// d              // here the (\w)+ group captures 3 times
// o              //
// l              //
Match object properties (on success | on failure):

Match.Success -> bool   | true      | false
Match.Value   -> string | the match | String.Empty

let match3 = Regex.Match("Lorem ipsum dolor sit amet, consectetur adipiscing elit", "Lorem i[a-z ]+i")
match3.Success |> printfn "%A"
match3.Value   |> printfn "%A"
// true
// "Lorem ipsum dolor si"

let match4 = Regex.Match("Lorem ipsum dolor sit amet, consectetur adipiscing elit", "Lorem i[A-Z ]+i")
match4.Success            |> printfn "%A"
match4.Value              |> printfn "%A"
match4.Groups.Count       |> printfn "%A"
match4.Groups.[0].Success |> printfn "%A"
// false
// ""    // notice this is String.empty not <null>
// 1     // even for a failed match there is always at least one group
// false

let mutable m = Regex.Match("Lorem ipsum dolor sit amet, consectetur adipiscing elit", @"\wo")
while m.Success do
    printfn "%s" m.Value
    m <- m.NextMatch()

let lines = [
    "The next day the children were ready to go to the plum thicket in the"
    "peach orchard as soon as they had their breakfast, but while they were"
    "talking about it a new trouble arose. It grew out of a question asked by"
    "Drusilla."
]

lines
|> List.filter (fun line -> Regex.IsMatch(line, "the"))
|> List.map    (fun line -> Regex.Replace(line, @"(\w+) the", "the $1"))

let text =
    "don't we all love\n" +
    "dealing with different\r\n" +
    "line endings\n" +
    "it's so much fun"
Regex.Split(text, "\r?\n")
|> Array.iter (printfn "%s")

open System.Net.Http
let book = (new HttpClient()).GetStringAsync("https://www.gutenberg.org/cache/epub/74886/pg74886.txt").Result
Regex.Count(book, @"[^\w]\w{3}[^\w]") |> printfn "%d" // count 3-letter words

regex - Quick Reference (Microsoft)

https://learn.microsoft.com/en-us/dotnet/standard/base-types/regular-expression-language-quick-reference

Cheat sheet

Character escapes

\t     matches a tab \u0009
\r     match a carriage return \u000D
\n     new line \u000A
\unnnn match a unicode character by hexadecimal representation, exactly 4 digits
\.     match a literal dot (don't interpret . as "any character")
\*     match a literal asterisk (don't interpret * as a quantifier)

Character classes

[character_group]       /[ae]/ will match "a" in "gray"
[^not_character_group]
[a-z] [A-Z] [a-z0-9A-Z] character ranges
.                       wildcard - any character except \n (with RegexOptions.Singleline it matches \n too)
\w                      word character - letters, digits and the underscore
\W                      non word character
\s                      white-space character
\S                      non whitespace character
\d                      digit
\D                      non digit

Anchors

^   $ beginning and end of a string (in multiline mode beginning and end of a line)

Grouping

(subexpression)               (\w)\1 - match a character and the same character again - "aa" in "xaax"
(?<name>subexpression)        named group (?<double>\w)\k<double> - same as above
(?:subexpression)             noncapturing group - Write(?:Line)? - will match both Write and WriteLine in a string
                              (?:Mr\. |Ms\. |Mrs\. )?\w+\s\w+ -> match first name, last name and an optional preceding title
(?imnsx-imnsx: subexpression) turn options on or off for a group
(?=subexp)                    zero-width positive lookahead assertion
(?!subexp)                    negative lookahead
(?<=subexp)                   positive lookbehind assertion
(?<!subexp)                   negative lookbehind assertion
                              make sure a subexp is/is not present (but don't match it, i.e. don't consume the characters)

Quantifiers

*     0...n (all these are greedy by default -> match as many as possible)
+     1...n
?     0...1
{n}   exactly n
{n,}  at least n
{n,m} n...m
*?
+?
??
{n,}?
{n,m}? question mark makes the match non-greedy (match as few as possible)
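Greedy vs lazy in action:

```fsharp
open System.Text.RegularExpressions

Regex.Match("<a><b>", "<.+>").Value  // "<a><b>" - greedy, grabs as much as possible
Regex.Match("<a><b>", "<.+?>").Value // "<a>"    - lazy, stops as early as possible
```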

Backreference

\number   match the value of a previous subexpression - (\w)\1 - matches the same \w character twice
\k<name>  backreference using group name

Alternation Constructs

| - any element separated by | - th(e|is|at) and the|this|that both match "the" "this" "that"
    ala|ma|kota - match "ala" or "ma" or "kota"
    ala ma (kota|psa) - match "ala ma kota" or "ala ma psa"
(?(expression)yes|no)         conditional match - if expression matches, match yes, else match no

Substitution

$number use numbered group
${name} use named group
$$      literal $
$&      whole match
$`      text before the match
$'      text after the match
$+      last group
$_      entire input string
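A couple of the substitutions above in use:

```fsharp
open System.Text.RegularExpressions

Regex.Replace("price: 42 EUR", @"\d+", "[$&]")        // "price: [42] EUR" - $& is the whole match
Regex.Replace("John Smith", @"(\w+) (\w+)", "$2, $1") // "Smith, John"    - numbered groups
Regex.Replace("abc", "b", "$`/$'")                    // "aa/cc"          - text before / after the match
```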

Inline options

(?imnsx-imnsx)               use like this at the beginning of the pattern
(?imnsx-imnsx:subexpression) use for a group
i                            case insensitive
m                            multiline - ^ and $ match the beginning and end of a line
n                            do not capture unnamed groups
s                            single line - . matches \n too
More options are available via the RegexOptions enum

Practice regex

https://regex101.com/quiz

https://regexcrossword.com/

https://alf.nu/RegexGolf

Tutorial:

I recall reading this tutorial years ago and I liked it - https://www.regular-expressions.info/tutorial.html

Misc

https://blog.codinghorror.com/regular-expressions-now-you-have-two-problems/

I love regex.

However, I used to say "if you solve a problem with regex, now you have two problems".

Not knowing how this quote came to be, I repeated it for years. I'll smack the next person who repeats this quote without elaborating.

If regex did not exist, it would be necessary to invent it.

Why does .Matches() return a custom collection instead of List<Match>?

Historic reasons. Regex shipped in .NET 1.0, before generics were a thing.

https://github.com/dotnet/runtime/discussions/74919

I used (?<!\[.*?)(?<!\(")https?://\S+ with replace [$&]($&) to linkify links in this post

My lovely regex helpers

let regexExtract  regex                      text = Regex.Match(text, regex).Value
let regexExtractg regex                      text = Regex.Match(text, regex).Groups.[1].Value
let regexExtracts regex                      text = Regex.Matches(text, regex) |> Seq.map (fun x -> x.Value)
let regexReplace  regex (replacement:string) text = Regex.Replace(text, regex, replacement)
let regexRemove   regex                      text = Regex.Replace(text, regex, String.Empty)

Exercises in bash/shell/scripting

Being fluent in shell/scripting improves your work by maybe 20%. It doesn't take you to another level - you don't suddenly possess the knowledge to implement flawless distributed transactions - but some things get done much faster and with less frustration.

Here is my collection of shell/scripting exercises for others to practice shell skills.

A side note - I'm still not sure if I should learn more PowerShell, try out a different shell or do everything in F# fsx. PowerShell is just so ugly ;(

Scroll down for answers

Exercise 1

What were the arguments of DetectOrientationScript function in https://github.com/tesseract-ocr/tesseract when it was first introduced?

Exercise 2

Get Hadoop distributed file system log from https://github.com/logpai/loghub?tab=readme-ov-file

Find the ratio of (failed block serving)/(failed block serving + successful block serving) for each IP

The result should look like:

...
10.251.43.210  0.452453987730061
10.251.65.203  0.464609355865785
10.251.65.237  0.455237129089526
10.251.66.102  0.452124935995904
...

Exercise 3

This happened to me once - I had to find all http/s links to specific domains in an export of our company's messages, because someone had shared proprietary code on publicly available websites.

Exercise - find all distinct http/s links in https://github.com/tesseract-ocr/tesseract

Exercise 4

Task - remove the string "42" from each line of multiple CSV files.

You can use this to generate the input CSV files:

$numberOfFiles = 10
$numberOfRows = 100

$fileNames = 1..$numberOfFiles | % { "file$_.csv" }
$csvData = 1..$numberOfRows | ForEach-Object {
    [PSCustomObject]@{
        Column1 = "Value $_"
        Column2 = "Value $($_ * 2)"
        Column3 = "Value $($_ * 3)"
    }
}

$fileNames | % { $csvData | Export-Csv -Path $_ }

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Exercise 1 - answer

Answer:

bool DetectOrientationScript(int& orient_deg, float& orient_conf, std::string& script, float& script_conf);

[PowerShell]
> git log -S DetectOrientationScript # get sha of oldest commit
> git show bc95798e011a39acf9778b95c8d8c5847774cc47 | sls DetectOrientationScript

[bash]
> git log -S DetectOrientationScript # get sha of oldest commit
> git show bc95798e011a39acf9778b95c8d8c5847774cc47 | grep DetectOrientationScript

One-liner:

[PowerShell]
> git log -S " DetectOrientationScript" -p | sls DetectOrientationScript | select -Last 1

[bash]
> git log -S " DetectOrientationScript" -p | grep DetectOrientationScript | tail -1

Bonus - execution times

[PowerShell 7.4]
> measure-command { git log -S " DetectOrientationScript" -p | sls DetectOrientationScript | select -Last 1 }
...
TotalSeconds      : 3.47
...

[bash]
> time git log -S " DetectOrientationScript" -p | grep DetectOrientationScript | tail -1
...
real    0m3.471s
...

Without git log -S doing the heavy lifting, the times look different:

[PowerShell 7.4]
> @(1..10) | % { Measure-Command { git log -p | sls "^\+.*\sDetectOrientationScript" } } | % { $_.TotalSeconds } | Measure-Object -Average

Count    : 10
Average  : 9.27122774
[PowerShell 5.1]
> @(1..10) | % { Measure-Command { git log -p | sls "^\+.*\sDetectOrientationScript" } } | % { $_.TotalSeconds } | Measure-Object -Average

Count    : 10
Average  : 27.33900077
[bash]
> seq 10 | xargs -I '{}' bash -c "TIMEFORMAT='%3E' ; time git log -p | grep -E '^\+.*\sDetectOrientationScript' > /dev/null" 2> times
> awk '{s+=$1} END {print s}' times
6.7249 # awk printed the sum of 10 runs; for convenience I moved the dot one place to the left to show the average

Reflections

Bash is faster than PowerShell. PowerShell 7 is much faster than PowerShell 5. It was surprisingly easy to get the average with Measure-Object in PowerShell and surprisingly difficult in bash.

Exercise 2 - answer

[PowerShell 7.4]
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log | % { $_.Matches[0].Value -replace "Served block.*/","ok/" -replace "Got exception while serving.*/","nk/" -replace ":","" } | % { $_ -replace "(ok|nk)/(.*)", "`${2} `${1}"} | sort > sorted
> cat .\sorted | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; ,@($_.name, ($g.Length/$_.count)) } | write-host

This is how I got to the answer:

> sls "Served block" -Path .\HDFS.log | select -first 10
> sls "Served block|Got exception while serving" -Path .\HDFS.log | select -first 10
> sls "Served block|Got exception while serving" -Path .\HDFS.log | select -first 100
> sls "Served block|Got exception while serving" -Path .\HDFS.log | select -first 1000
> sls "Served block.*|Got exception while serving" -Path .\HDFS.log | select -first 1000
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log | select -first 1000
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log -raw | select -first 1000
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log -raw | select matches -first 1000
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log -raw | select Matches -first 1000
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log -raw | select Matches
> $a = sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log -raw
> $a[0]
> get-type $a[0]
> Get-TypeData $a
> $a[0]
> $a[0].Matches[0].Value
> $a = sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log
> $a[0]
> $a[0].Matches[0].Value
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log | % { $_.Matches[0].Value -replace "Served block.*/","ok/" }
> "asdf" -replace "a","b"
> "asdf" -replace "a","b" -replace "d","x"
> "asdf" -replace "a.","b" -replace "d","x"
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log | % { $_.Matches[0].Value -replace "Served block.*/","ok/" -replace "Got exception while serving.*/","nk" }
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log | % { $_.Matches[0].Value -replace "Served block.*/","ok/" -replace "Got exception while serving.*/","nk/" }
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log | % { $_.Matches[0].Value -replace "Served block.*/","ok/" -replace "Got exception while serving.*/","nk/" -replace ":","" }
> "aaxxaa" -replace "a.","b"
> "aaxxaa" -replace "a.","b$0"
> "aaxxaa" -replace "a.","b$1"
> "aaxxaa" -replace "a.","b${1}"
> "aaxxaa" -replace "a.","b${0}"
> "aaxxaa" -replace "a.","b`${0}"
> "okaaxxokaa" -replace "(ok|no)aa","_`{$1}_"
> "okaaxxokaa" -replace "(ok|no)aa","_`${1}_"
> "okaaxxokaa" -replace "(ok|no)aa","_`${1}_`${0}"
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log | % { $_.Matches[0].Value -replace "Served block.*/","ok/" -replace "Got exception while serving.*/","nk/" -replace ":","" } | % { $_ -replace "(ok|nk)/(.*)", "`${2} `${1}"}
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log | % { $_.Matches[0].Value -replace "Served block.*/","ok/" -replace "Got exception while serving.*/","nk/" -replace ":","" } | % { $_ -replace "(ok|nk)/(.*)", "`${2} `${1}"} | sort
> sls "Served block.*|Got exception while serving.*" -Path .\HDFS.log | % { $_.Matches[0].Value -replace "Served block.*/","ok/" -replace "Got exception while serving.*/","nk/" -replace ":","" } | % { $_ -replace "(ok|nk)/(.*)", "`${2} `${1}"} | sort > sorted
> cat .\sorted -First 10
> cat | group
> cat | group -Property {$_}
> cat .\sorted | group -Property {$_}
> cat .\sorted -Head 10 | group -Property {$_}
> cat .\sorted -Head 100 | group -Property {$_}
> cat .\sorted -Head 1000 | group -Property {$_}
> cat .\sorted -Head 10000 | group -Property {$_}
> cat .\sorted -Head 10000 | group -Property {$_} | select name,count
> cat .\sorted | group -Property {$_} | select name,count
> cat .\sorted | group -Property {$_ -replace "nk|ok",""}
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""}
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $_.name, $g.Length / $_.count }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $_.name, $g.Length, $_.count }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $_.name, $g.Length / $_.count }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $_.name, $g.Length, $_.count }
> $__
> $__[0]
> $__[1]
> $__[2]
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $_.name, $g.Length, $_.count }
> $a = cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $_.name, $g.Length, $_.count }
> $a[0]
> $a[1]
> $a[2]
> $a[1].GetType()
> $a[2].GetType()
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $_.name, ($g.Length) / ($_.count) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $_.name, (($g.Length) / ($_.count)) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; ,$_.name, (($g.Length) / ($_.count)) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; @($_.name, (($g.Length) / ($_.count))) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; ,@($_.name, (($g.Length) / ($_.count))) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; return ,@($_.name, (($g.Length) / ($_.count))) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; [Array] ,@($_.name, (($g.Length) / ($_.count))) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; [Array]@($_.name, (($g.Length) / ($_.count))) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; return ,$_.name, (($g.Length) / ($_.count)) }
> $a = cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; return ,$_.name, (($g.Length) / ($_.count)) }
> $a[0]
> $a = cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; return ,($_.name, (($g.Length) / ($_.count))) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; return ,($_.name, (($g.Length) / ($_.count))) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; return ,@($_.name, (($g.Length) / ($_.count))) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; ,@($_.name, (($g.Length) / ($_.count))) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $x = @($_.name, (($g.Length) / ($_.count))) }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $x = @($_.name, (($g.Length) / ($_.count))); $x }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $x = @($_.name, (($g.Length) / ($_.count))); ,$x }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $x = @($_.name, (($g.Length) / ($_.count))); return ,$x }
> $a = cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $x = @($_.name, (($g.Length) / ($_.count))); return ,$x }
> $a[0]
> $a[0][0]
> $a = cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $x = @($_.name, (($g.Length) / ($_.count))); return ,$x } | % { wirte-output "$_[0]" }
> $a = cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $x = @($_.name, (($g.Length) / ($_.count))); return ,$x } | % { write-output "$_[0]" }
> $a = cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $x = @($_.name, (($g.Length) / ($_.count))); return ,$x } | % { write-output "$_[0]" }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $x = @($_.name, (($g.Length) / ($_.count))); return ,$x } | % { write-output "$_[0]" }
> cat .\sorted -Head 10000 | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $x = @($_.name, (($g.Length) / ($_.count))); return ,$x } | % { write-output "$_" }
> cat .\sorted | group -Property {$_ -replace "nk|ok",""} | % { $g = $_.group | ? {$_.contains("nk") }; $x = @($_.name, (($g.Length) / ($_.count))); return ,$x } | % { write-output "$_" }

[F#]
open System.IO
open System.Text.RegularExpressions

let lines = File.ReadAllLines("HDFS.log")

let a =
    lines
    |> Array.filter (fun x -> x.Contains("Served block") || x.Contains("Got exception while serving"))

a
// |> Array.take 10000
|> Array.map (fun x ->
    let m = Regex.Match(x, @"(Served block|Got exception while serving).*/(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})")
    m.Groups[2].Value,
    match m.Groups[1].Value with
    | "Served block"                -> true
    | "Got exception while serving" -> false )
|> Array.groupBy fst
|> Array.map (fun (key, group) ->
    let total = group.Length
    let failed = group |> Array.map snd |> Array.filter not |> Array.length
    key, (decimal failed)/(decimal total)
    )
|> Array.sortBy fst
|> Array.map (fun (i,m) -> sprintf "%s  %.15f" i m)
|> fun x -> File.AppendAllLines("fsout", x)
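
`File.ReadAllLines` materializes the whole log up front, which is exactly the memory problem that motivated this post. A lazy variant with `File.ReadLines` streams line by line; this is a sketch of the same pipeline (the name `failureRates` is made up, and `Seq.groupBy` still buffers the extracted pairs, just not the raw log lines):

```fsharp
open System.IO
open System.Text.RegularExpressions

// streamed variant of the pipeline above: File.ReadLines yields lines
// lazily instead of materializing the whole file as an array
let failureRates (path: string) =
    File.ReadLines(path)
    |> Seq.filter (fun x -> x.Contains("Served block") || x.Contains("Got exception while serving"))
    |> Seq.map (fun x ->
        let m = Regex.Match(x, @"(Served block|Got exception while serving).*/(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})")
        m.Groups[2].Value, m.Groups[1].Value = "Served block")
    |> Seq.groupBy fst                 // note: groupBy still buffers the (ip, flag) pairs
    |> Seq.map (fun (ip, group) ->
        let group = Seq.toArray group
        let failed = group |> Array.filter (snd >> not) |> Array.length
        ip, decimal failed / decimal group.Length)
    |> Seq.sortBy fst
```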

Exercise 3 - answer

[PowerShell 7.4]
> ls -r -file | % { sls -path $_.FullName -pattern https?:.* -CaseSensitive } | % { $_.Matches[0].Value } | sort | select -Unique

# finds 234 links
[bash]
> find . -type f -not -path './.git/*' | xargs grep -Eho 'https?:.*' | sort | uniq

# finds 234 links

Exercise 4 - answer

[PowerShell 7.4]
> ls *.csv | % { (cat $_ ) -replace "42","" | out-file $_ }

[bash]
> sed -i 's/42//g' *.csv
> sed -ibackup 's/42//g' *.csv # creates backup files
This is neat; perhaps the unix people had wisdom that is now lost.
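
For symmetry, a hypothetical F# take on the same exercise (`scrubCsvs` is a made-up name; same caveat as the one-liners above, it rewrites the files in place):

```fsharp
open System.IO

// remove every "42" from each .csv file in a directory, in place
let scrubCsvs dir =
    for file in Directory.EnumerateFiles(dir, "*.csv") do
        let text = File.ReadAllText(file)
        File.WriteAllText(file, text.Replace("42", ""))
```

Usage would be `scrubCsvs "."` for the current directory.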

F# async - be mindful of what you put in async {}

open System

let r = Random()

let m () =
  let random_num = r.Next()
  async {
    printfn "%i" random_num
  }

m () |> Async.RunSynchronously // prints a random number
m () |> Async.RunSynchronously // prints another random number
let x = m ()
x |> Async.RunSynchronously // prints another random number
x |> Async.RunSynchronously // prints same number as above
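
The `async` block is only a description of work: its body runs every time the description is executed, while anything outside the braces ran once, when `m ()` was called. A mutable counter makes this deterministic (a small sketch, not from the original code):

```fsharp
let mutable count = 0

let m2 () =
  async {
    count <- count + 1 // inside the block: runs on every execution
  }

let x2 = m2 () // building the async runs nothing; count is still 0
x2 |> Async.RunSynchronously
x2 |> Async.RunSynchronously
// count is now 2 - the body ran once per RunSynchronously
```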

Why does it matter that the last two lines print the same number?

Let's consider the following code:

// We're sending http requests and if they fail we'd like to retry them

#r "System.Net.Http"
open System.Net.Http

let HTTP_CLIENT = new HttpClient()

let send url =
  let httpRequest = new HttpRequestMessage()
  httpRequest.RequestUri <- Uri url

  async {
    let! r =
      HTTP_CLIENT.SendAsync httpRequest
      |> Async.AwaitTask
    return r
  }

send "http://test" |> Async.RunSynchronously
send "http://test" |> Async.RunSynchronously
let y = send "http://test"
y |> Async.RunSynchronously
y |> Async.RunSynchronously

let retry computation =
  async {
    try
      let! r = computation
      return r
    with
    | e ->
      printfn "oops, error, let's retry"
      let! r2 = computation
      return r2
  }

send "http://test" |> retry |> Async.RunSynchronously
// retrying will always fail with "The request message was already sent. Cannot send the same request message multiple times."
// Just like the repeated runs above printed the same number, retry here re-sends the exact same request object, and that's not allowed.
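
An alternative is to make retry take a factory, so each attempt builds a fresh computation (and, used with `send`, a fresh request message). A sketch with a hypothetical `retryWith`, not the fix the post settles on:

```fsharp
// take a factory instead of a computation: every attempt gets a
// freshly built Async to run
let retryWith (mkComputation: unit -> Async<'a>) =
  async {
    try
      return! mkComputation ()
    with _ ->
      printfn "oops, error, let's retry"
      return! mkComputation ()
  }

// usage would look like: retryWith (fun () -> send "http://test")
```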

The fix

let send2 url =
  async {
    let httpRequest = new HttpRequestMessage()
    httpRequest.RequestUri <- Uri url
    let! r =
      HTTP_CLIENT.SendAsync httpRequest
      |> Async.AwaitTask
    return r
  }

send2 "http://test" |> retry |> Async.RunSynchronously