Building a Raspberry Pi Powered Brewery Control Panel That Reports to Azure Log Analytics

The Why?

I’ve been home brewing for over 10 years. I got my start after helping a colleague brew on his system in his garage and jumped straight into an all-grain 3 vessel setup the next day. Well, not exactly the next day. It took me about a month to build my first system. I did what a lot of homebrewers do and used three 1/2 bbl kegs as brew pots with propane as the heat source. This system served me very well for those 10 years and is still fully functional today. But after 10 years of brewing in my garage, I decided it was finally time to upgrade some components, switch over to a fully electric setup, and brew in my basement where the temperature is 74℉ year round thanks to the server rack that sits in the corner. I’ve gone back and forth over the years leading up to the build as to whether or not to use a PLC of some sort, or a simple PID controlled setup. I really liked the simplicity of the PID controllers, however, the prospect of flexibility and future upgrade-ability eventually won me over and I decided to go with a Raspberry Pi powered solution. Since there are tons of articles around the net detailing how to build a panel, I’ll just touch on some of the bits without going into super detail on every aspect and instead focus on the Raspberry Pi and the code that runs everything.

My existing propane setup. Apart from adding a RIMS tube a few years ago, the setup is the same as the day I started brewing with it.

Build Objectives

For the build, there were a few things that I wanted:

  • Full stainless
  • Sanitary Fittings (Tri-Clover)
  • DIN Rail mounted components
  • 1/2 bbl (15 gal) batch size
  • Ability to run both HLT and BK at the same time

Hardware

I’ve included a full build sheet at the bottom of this article that includes places that I purchased items from and prices. The main difference between this panel, and a PID controlled one comes down to just four components:

  • Raspberry Pi
  • Touchscreen
  • Hats (expansion boards)
  • Power Supplies

For the Raspberry Pi, I went with the 4GB model 4B. I figured that the additional memory couldn’t hurt, and since everything is part of the board, the only way to upgrade down the road is to completely replace the Pi. I also opted to go with a larger touchscreen than what I used for my Fermentation Controller build. Instead of the 7″, I chose a 10″ screen to allow for an easier to read (and control) interface. We will also need a 5v and 12v power supply to power the Pi and touchscreen.

For the hats, I opted to utilise the DAQC2plate and THERMOplate from Pi-Plates. With the fermentation controller build, I simply used a DAQCplate to do both the control and temperature measurement. One key difference between the DAQCplate and THERMOplate is that the DAQCplate does the temperature readings when you ask for them (which take about a second each), as opposed to the THERMOplate which does them in the background and then just returns the last read value when you request it. For a fermentation controller, waiting one second for each temperature measurement is no big deal as we aren’t making changes rapidly or drastically. For the brewing process, we need something a bit more responsive.

Layout

The front of the panel. Element controls on the left, pumps on the right. Master switch and E-Stop on the top row.
Back side of the front panel showing the switches and wires.
The interior of the panel

I wanted to keep the inside of my panel as tidy as possible. The DIN rails and cable raceways helped with that tremendously. The input is a 50A circuit which comes into a DPST relay (RLY1). From there, it is split to two 32A breakers (BRKR1 & 2) to ensure that a single element can’t pull more than that and to reduce down-stream wiring requirements. Each breaker feeds into a bank of terminal blocks to distribute the current to the various components on each leg. All of the 110v components (Relay coils and pumps) run off of a single breaker and phase. Two other relays (RLY2/3) are controlled via switches on the front panel to over-ride power to the heating elements. The transformers at the top power the Raspberry Pi, Pi-Plates, and Touchscreen.

Code

There are three parts to the code. The UI, the controller, and finally, the log writer. The UI is a fairly simple webpage. It makes use of css for the layout, and javascript to update the elements. PHP powers the back end to generate the responses for the AJAX calls. The controller is a python script which monitors the temperature of various components, controls the SSRs and heatsink fans, and writes metrics to a file that is used to update the UI and send data to Azure. All of the code for this is posted to my github repository.

UI

  1. Hot Liquor Tank SSR Heatsink Temperature – Monitored using a LM35DZ chip.
  2. CPU Temperature – pulled via the gpiozero python module
  3. Boil Kettle SSR Heatsink Temperature – Monitored using an LM35DZ chip.
  4. HLT Set Value – When in AUTO Mode, the controller will attempt to maintain the indicated temperature +/- the value of the Hysteresis.
  5. HLT Process Value – The value currently read from the DS18B20 probe in the HLT.
  6. Mash Tun SV – Currently, this is only used to toggle the color of the background of the MT temperature. Should I add more advanced logic or automation in the future, this can be used to control more of the process.
  7. MT PV – This is fed by another DS18B20 probe just like the HLT, only in the MT.
  8. BK SV – In this instance, the Set Value is indicated by a percentage. Instead of using temperature to control the heating element, we simply run it for a portion of the interval (in this case, 50%). The interval length can be configured to adjust the length of time the element is on or off.
  9. BK PV – Another DS18B20, only this one is in the… wait for it… Boil Kettle.
  10. HLT Mode – This can be toggled between AUTO or Temperature Mode, and MAN or Manual mode. Auto mode is controlled based on the temperature and hysteresis to determine whether the element is on or off.
  11. BK Mode – Same as #10, but for the BK.
  12. Total Runtime – How long the controller has been running (HH:mm:ss)
  13. UI Refresh – Allows the UI to be force refreshed at will.
  14. Toggles the Log Analytics Ingstion on or off.
  15. Toggles the Temperature controller on or off.
  16. Config.
  1. HLT Cycle – How many seconds between each evaluation of the temperature to determine whether or not the element should
  2. HLT Delta (Hysteresis). If the Pv is above or below the Sv by more than the Hysteresis, the element is cycled on or off.
  3. MT Delta
  4. BK Cycle Time
  5. BK Delta
  6. Shutdown System
  7. Restart System
  8. Cancel
  9. Confirm

azure.py

The azure.py script is what sends the data to Log Analytics. It consists of 4 main portions. The settings are loaded by reading the conf.json file and reading the WorkspaceId, WorkspaceKey, and LogName parameters. The WorkspaceId is the guid for the Log Analytics workspace. The Key is either the primary or secondary key. The keys and workspace ID can all be found under Agent Management (formerly advanced settings). LogName is the name of the log that will show up in Log Analytics. Like all other custom log sources, ‘_CL’ will be automatically appended to the end of whatever name you use. Our first function builds the signature to authenticate our POST requests.

def build_signature(customer_id, shared_key, date, content_length, method, content_type, resource):
    x_headers = 'x-ms-date:' + date
    string_to_hash = method + "\n" + str(content_length) + "\n" + content_type + "\n" + x_headers + "\n" + resource
    bytes_to_hash = bytes(string_to_hash).encode('utf-8')  
    decoded_key = base64.b64decode(shared_key)
    encoded_hash = base64.b64encode(hmac.new(decoded_key, bytes_to_hash, digestmod=hashlib.sha256).digest())
    authorization = "SharedKey {}:{}".format(customer_id,encoded_hash)
    return authorization

The build signature function takes a few parameters. All of the parameters are passed to it by the next function. build_signature will build a SHA256 hash based on our keys, the date and time (RFC 1123 format), and the length of the content being passed. This allows us to generate a 1-way hash that can be used by Azure to authenticate the payload once it arrives without having to pass our key.

def post_data(customer_id, shared_key, body, log_type):
    method = 'POST'
    content_type = 'application/json'
    resource = '/api/logs'
    rfc1123date = datetime.datetime.utcnow().strftime('%a, %d %b %Y %H:%M:%S GMT')
    content_length = len(body)
    signature = build_signature(customer_id, shared_key, rfc1123date, content_length, method, content_type, resource)
    uri = 'https://' + customer_id + '.ods.opinsights.azure.com' + resource + '?api-version=2016-04-01'

    headers = {
        'content-type': content_type,
        'Authorization': signature,
        'Log-Type': log_type,
        'x-ms-date': rfc1123date
    }

    response = requests.post(uri,data=body, headers=headers)

post_data is the function that we interact with directly to send the information to Azure. It will call build_signature and form the payload. It takes our Workspace ID, Workspace Key, Body (JSON formated string), and name of the log. Since we store our data or payload in JSON format, we don’t have to do much else other than read the json file and pass it along.

def OnKill(signum, frame):
    global run
    run = False

signal.signal(signal.SIGINT, OnKill)
signal.signal(signal.SIGTERM, OnKill)

This is our exit handler. Since the script runs in the background and we don’t directly interact with it, we need a way to tell it to exit and what to do when we tell it to do so. We use the signal library to handle these. First we define our function that gets called (OnKill) and then we register handlers to call it depending on which type of signal was sent to the application. The first registers it to handle “INT” signals. In our case, this would be like hitting Ctrl+C from an SSH terminal to kill the running application or using kill -2 <pid>. The second intercepts “TERM” signals or kill -15 <pid>. If either of these signals is sent to our program, tell our program to set the run variable to false which causes our loop in the next code block to stop looping.

while run:
    try:
        #Read Current Data Points
        f = open('/var/www/html/py/data.json', 'r')
        data = json.load(f)
        f.close()

        body = json.dumps(data)

        post_data(customer_id, shared_key, body, log_type)
    except Exception as e:
        now = datetime.datetime.now()
        print(e)
        f = open('/var/www/html/python_errors.log', 'a')
        f.write("%s - AZURE [%i] - %s\n" % (now.strftime("%Y-%m-%d %H:%M:%S"), sys.exc_info()[-1].tb_lineno, e))
        f.close()
    time.sleep(LoopDelay)

This loop will continue to run as long as run is set to True. If we receive one of the signals mentioned earlier (SIGINT or SIGTERM) we set run to False and kill the loop. The loop is simple. It pulls the latest data from the data.json file and passes it as the payload to Log Analytics using the REST API. Our data.json file looks like this:

{
    "MtAuto": 152,
    "HltAuto": 130,
    "BkStatus": 0,
    "BkMode": "M",
    "HltStatus": 1,
    "HltHsTemp": 25.93,
    "BkAuto": 8.7,
    "MtTemp": 66.65,
    "HltDelta": 1,
    "BkTemp": 66.988,
    "MtDelta": 1,
    "HltMan": 50,
    "BkMan": 50,
    "CpuTemp": 46.738,
    "BkHsTemp": 25.88,
    "HltCycle": 3,
    "BkCycle": 5,
    "HltTemp": 66.313,
    "HltMode": "A",
    "BkDelta": 1
}

This is the data that we pass to Log Analytics which can then be queried. In the below example, we pull a count of all entries where the measured HLT Temp was higher than the set point + delta and bin it in 5 minute intervals. These would be periods where we overshoot our temps.

BrewController_CL
| where HltTemp > (HltAuto + HltDelta)
| summarize count() by bin(TimeGenerated, 5m)

If you’re thinking to yourself that the azure code looks familiar, yes, it probably does. It’s the code that’s available on the Azure Monitor Data Collector API page. I’m using the python 2 version of the code.

tempcontrol.py

The temp control script is where most of the magic happens. This is where we read our sensor data, act on it (if needed) and write the data to a file for the UI or Azure script to ingest and process.

 while run:
        currTime = round(time.time(), 1)

        #Update existing values
        if UpdateTemps:
            #Load Data
            f = open('/var/www/html/py/data.json', 'r')
            data = json.load(f)
            f.close()

            #Update Data
            data['HltHsTemp'] = GetHeatsink(0)
            data['CpuTemp'] = GetCPU()
            data['BkHsTemp'] = GetHeatsink(1)
            data['HltTemp'] = THERMO.getTEMP(0,11)
            data['MtTemp'] = THERMO.getTEMP(0,10)
            data['BkTemp'] = THERMO.getTEMP(0,9)
            UpdateTemps = False

            #Update Uptime File
            f = open('/var/www/html/py/uptime', 'w')
            f.write(str(currTime - startTime))
            f.close()

            #Check for Updated Targets
                #Mode
            if os.path.exists('/var/www/html/py/mode.json'):
                f = open('/var/www/html/py/mode.json', 'r')
                NewData = json.load(f)
                f.close()
                if NewData['Target'] == 'hlt':
                    data['HltMode'] = NewData['NewMode']
                elif NewData['Target'] == 'bk':
                    data['BkMode'] = NewData['NewMode']
                os.remove('/var/www/html/py/mode.json')

                #Temp
            if os.path.exists('/var/www/html/py/temp.json'):
                f = open('/var/www/html/py/temp.json', 'r')
                NewData = json.load(f)
                f.close()
                if NewData['Target'] == 'hlt' and NewData['Mode'] == 'a':
                    data['HltAuto'] = NewData['Value']
                elif NewData['Target'] == 'hlt' and NewData['Mode'] == 'm':
                    data['HltMan'] = NewData['Value']
                elif NewData['Target'] == 'mt' and NewData['Mode'] == 'a':
                    data['MtAuto'] = NewData['Value']
                elif NewData['Target'] == 'bk' and NewData['Mode'] == 'a':
                    data['BkAuto'] = NewData['Value']
                elif NewData['Target'] == 'bk' and NewData['Mode'] == 'm':
                    data['BkMan'] = NewData['Value']
                os.remove('/var/www/html/py/temp.json')

                #Settings
            if os.path.exists('/var/www/html/py/settings.json'):
                f = open('/var/www/html/py/settings.json', 'r')
                NewData = json.load(f)
                f.close()
                data['HltCycle'] = NewData['HltCycle']
                data['HltDelta'] = NewData['HltDelta']
                data['MtDelta'] = NewData['MtDelta']
                data['BkCycle'] = NewData['BkCycle']
                data['BkDelta'] = NewData['BkDelta']
                os.remove('/var/www/html/py/settings.json')

        else:
            UpdateTemps = True

        #HLTControl
        if data['HltMode'] == 'A':
            HLTOn = ActionCount + 1
            if data['HltTemp'] < (data['HltAuto'] - data['HltDelta']):
                TurnHLTOn()
            else:
                TurnHLTOff()
        else:
            if HLTOn == ActionCount:
                TurnHLTOn()
                HltCycleLen = data['HltCycle'] * 2
                HLTOn = ActionCount + HltCycleLen
                HLTOff = ActionCount + ((float(data['HltMan']) / 100) * HltCycleLen)
            if HLTOff == ActionCount:
                TurnHLTOff()
            elif HLTOn < ActionCount and HLTOff < ActionCount:
                HLTOn = ActionCount + 1
        
        #BKControl
        if data['BkMode'] == 'A':
            BKOn = ActionCount + 1
            if data['BkTemp'] < (data['BkAuto'] - data['BkDelta']):
                TurnBKOn()
            else:
                TurnBKOff()
        else:
            if BKOn == ActionCount:
                TurnBKOn()
                BkCycleLen = data['BkCycle'] * 2
                BKOn = ActionCount + BkCycleLen
                BKOff = ActionCount + ((float(data['BkMan']) / 100) * BkCycleLen)
            if BKOff == ActionCount:
                TurnBKOff()
            elif BKOn < ActionCount and BKOff < ActionCount:
                BKOn = ActionCount + 1

        ActionCount += 1

        #Save Data
        f = open('/var/www/html/py/data.json', 'w')
        json.dump(data, f)
        f.close()

        #Sleep
        time.sleep(0.5)
except Exception as e:
    now = datetime.now()
    print(e)
    f = open('/var/www/html/python_errors.log', 'a')
    f.write("%s - TEMP CONTROL [%i] - %s\n" % (now.strftime("%Y-%m-%d %H:%M:%S"), sys.exc_info()[-1].tb_lineno, e))
    f.close()

We start with reading the temp probe data starting on line 5. Since our DS18B20 probes only update once per second, and our loop runs every 1/2 second, there is no need to update them every loop. We use a simple bool flag to run this portion of code every other loop. Next, we check our HLT and BK to determine whether to cycle them on, off, or do nothing. If we’re in automatic mode, we check the current temp (Pv) and compare it against the target temp (Sv) and hysteresis (Delta). If Pv < (Sv – Delta) then we turn the element on. If not, we turn it off. On a true PID controller, the element might remain on longer in order to stabilise the temperature and reduce the variance. So far, I’ve found that I haven’t needed the more advanced logic of a true PID controller to keep my temps where I need them, so I’ve opted to keep things simple. I may end up switching it to PID logic should I find that not the case, but the first couple batches that I’ve done on the system have been really good. If the elements are in manual mode, we take a look at the cycle length, compare it to on percentage and calculate the number of loops the element should be on and off. We then set each into a variable that tells it when to turn on.

Azure Monitor Workbook

So, we’re brewing beer and putting data into Azure. How do we use that data? I have a TV in my brew area that I pull the workbook up on on brew day. This allows me to monitor data and trends and make adjustments whether it be for that brew session, or the next one.

Our Brewery Controller workbook

What does this mean?

I’ve marked a few points on the image above which show some interesting points in our brew day.

  1. You can see the blue line on the HLT graph set to our strike temperature which was 165℉ for this recipe. You can also see the greenish line that indicates that actual temperature of the water. For this, I was heating around 18 Gal of water from room temp (about 68℉) to our strike temperature.
  2. At point 2, we hit our strike temp and can perform our mash in. You’ll notice that a few minutes after we hit temp, it plumets to around 130℉ and then starts to rise again. After I mashed in, I added another 5 Gal of water to the HLT to ensure I had enough for mash out/sparge. Since the water I added was also at room temp, it dropped the temperature of the HLT down which then gets heated back up. You’ll also notice that I reduced the set point of the HLT at this time. I make use of a HERMS coil to maintain mash temperatures, so once mashing in is taken care of, I set the temperature of my HLT to whatever temperature I want my mash to be and then re-circulate it through the HERMS (pictures below).
  3. The area around point 3 is where I ran into a problem. Some of the rice hulls that I was using as a filter bed got sucked into the hoses. It took me a while to figure out what was going on. The sprayer that I was using to evenly distribute water across the top of the grain bed got clogged by the rice hulls. Water was still coming out, but it wasn’t re-circulating enough water to affect the mash temperature. I swapped out the end of my re-circulation sprayer with a different style which included a larger opening. Once in place, temperatures were back where I wanted them.
  4. Number 4 is where I performed my mash out. I raised to the HLT temp to 168℉ to halt conversion and began sparging. The point where the temperature starts to drop is the point that I shut the element override off from the switch on the control panel. The controller doesn’t see this which is why we have a divergence from the set value and process value.
  5. Point 5 is where I added and heated up some additional water to use for cleaning.
  6. Point 6 calls out the light blue lines. These are used to indicate when the element is “on” as far as the controller is concerned. On the HLT it’s pretty easy to see the periods where the element is on and off. On the BK, it’s a bit more condensed since it’s in manual mode and automatically firing quite rapidly. I have found so far that with a 5 second cycle length and the element on 50%, I can maintain a nice boil without boiling over.
  7. You’ll notice at Point 7, there is a sharp hook, a dip, and then a climb before it levels off. Before this point, I had my element set to something like 75% or 80%. This was a bit too aggressive and caused a near boil-over. The spike is the point where I turned the BK override off, made some adjustments, and turned it back on.

Dashboard Template

{
  "version": "Notebook/1.0",
  "items": [
    {
      "type": 9,
      "content": {
        "version": "KqlParameterItem/1.0",
        "parameters": [
          {
            "id": "ee40ef7f-bcea-4d1c-a3da-fca1355319cd",
            "version": "KqlParameterItem/1.0",
            "name": "TimeRange",
            "label": "Time Range",
            "type": 4,
            "isRequired": true,
            "value": {
              "durationMs": 43200000,
              "endTime": "2020-12-27T01:35:00.000Z"
            },
            "typeSettings": {
              "selectableValues": [
                {
                  "durationMs": 300000
                },
                {
                  "durationMs": 900000
                },
                {
                  "durationMs": 1800000
                },
                {
                  "durationMs": 3600000
                },
                {
                  "durationMs": 14400000
                },
                {
                  "durationMs": 43200000
                },
                {
                  "durationMs": 86400000
                }
              ],
              "allowCustom": true
            },
            "timeContext": {
              "durationMs": 86400000
            }
          },
          {
            "id": "9c69ebb2-0553-43b3-9662-162cd0a650b0",
            "version": "KqlParameterItem/1.0",
            "name": "FocusTime",
            "label": "Focused Time (min)",
            "type": 1,
            "isRequired": true,
            "value": "10",
            "typeSettings": {
              "paramValidationRules": [
                {
                  "regExp": "[\\d]*",
                  "match": true,
                  "message": "Must be a Number"
                }
              ]
            },
            "timeContext": {
              "durationMs": 86400000
            }
          }
        ],
        "style": "pills",
        "queryType": 0,
        "resourceType": "microsoft.operationalinsights/workspaces"
      },
      "name": "parameters - 5"
    },
    {
      "type": 3,
      "content": {
        "version": "KqlItem/1.0",
        "query": "BrewDay_CL\r\n| top 1 by TimeGenerated\r\n| extend Title = \"Last Message:\"\r\n| project TimeGenerated, Title",
        "size": 4,
        "timeContext": {
          "durationMs": 43200000,
          "endTime": "2020-12-27T01:35:00.000Z"
        },
        "timeContextFromParameter": "TimeRange",
        "queryType": 0,
        "resourceType": "microsoft.operationalinsights/workspaces",
        "visualization": "tiles",
        "tileSettings": {
          "titleContent": {
            "columnMatch": "Title"
          },
          "leftContent": {
            "columnMatch": "TimeGenerated",
            "formatter": 6
          },
          "showBorder": false,
          "size": "full"
        }
      },
      "name": "query - 4"
    },
    {
      "type": 3,
      "content": {
        "version": "KqlItem/1.0",
        "query": "BrewDay_CL\r\n| project TimeGenerated, HltAuto_d, HltMan_d, OnOff = HltStatus_d * 100, HltTemp_d, HltCycle_d, HltDelta_d",
        "size": 0,
        "aggregation": 5,
        "showAnalytics": true,
        "title": "Hot Liquor Tank",
        "timeContext": {
          "durationMs": 43200000,
          "endTime": "2020-12-27T01:35:00.000Z"
        },
        "timeContextFromParameter": "TimeRange",
        "queryType": 0,
        "resourceType": "microsoft.operationalinsights/workspaces",
        "visualization": "linechart"
      },
      "customWidth": "33",
      "showPin": true,
      "name": "HltOverall"
    },
    {
      "type": 3,
      "content": {
        "version": "KqlItem/1.0",
        "query": "BrewDay_CL\r\n| where (TimeGenerated > {TimeRange:start}) and (TimeGenerated < {TimeRange:end})\r\n| project TimeGenerated, MtAuto_d, MtTemp_d, MtDelta_d",
        "size": 0,
        "showAnalytics": true,
        "title": "Mash Tun",
        "timeContext": {
          "durationMs": 43200000,
          "endTime": "2020-12-27T01:35:00.000Z"
        },
        "timeContextFromParameter": "TimeRange",
        "queryType": 0,
        "resourceType": "microsoft.operationalinsights/workspaces",
        "visualization": "linechart"
      },
      "customWidth": "33",
      "showPin": true,
      "name": "MtOverall"
    },
    {
      "type": 3,
      "content": {
        "version": "KqlItem/1.0",
        "query": "BrewDay_CL\r\n| project TimeGenerated, BkAuto_d, BkMan_d, OnOff = BkStatus_d * 100, BkTemp_d, BkCycle_d, BkDelta_d",
        "size": 0,
        "aggregation": 5,
        "showAnalytics": true,
        "title": "Boil Kettle",
        "timeContext": {
          "durationMs": 43200000,
          "endTime": "2020-12-27T01:35:00.000Z"
        },
        "timeContextFromParameter": "TimeRange",
        "queryType": 0,
        "resourceType": "microsoft.operationalinsights/workspaces",
        "visualization": "linechart"
      },
      "customWidth": "33",
      "showPin": true,
      "name": "BkOverall"
    },
    {
      "type": 3,
      "content": {
        "version": "KqlItem/1.0",
        "query": "BrewDay_CL\r\n| where TimeGenerated > ago({FocusTime:value}m)\r\n| project TimeGenerated, HltAuto_d, HltMan_d, OnOff = HltStatus_d * 100, HltTemp_d, HltCycle_d, HltDelta_d",
        "size": 0,
        "title": "Hot Liquor Tank - Last {FocusTime:value} min",
        "queryType": 0,
        "resourceType": "microsoft.operationalinsights/workspaces",
        "visualization": "linechart"
      },
      "customWidth": "33",
      "name": "HltFocused"
    },
    {
      "type": 3,
      "content": {
        "version": "KqlItem/1.0",
        "query": "BrewDay_CL\r\n| where TimeGenerated > ago({FocusTime:value}m)\r\n| project TimeGenerated, MtAuto_d, MtTemp_d, MtDelta_d",
        "size": 0,
        "title": "Mash Tun - Last {FocusTime:value} min",
        "queryType": 0,
        "resourceType": "microsoft.operationalinsights/workspaces",
        "visualization": "linechart"
      },
      "customWidth": "33",
      "name": "MtFocused"
    },
    {
      "type": 3,
      "content": {
        "version": "KqlItem/1.0",
        "query": "BrewDay_CL\r\n| where TimeGenerated > ago({FocusTime:value}m)\r\n| project TimeGenerated, BkAuto_d, BkMan_d, OnOff = BkStatus_d * 100, BkTemp_d, BkCycle_d, BkDelta_d",
        "size": 0,
        "title": "Boil Kettle - Last {FocusTime:value} Min",
        "queryType": 0,
        "resourceType": "microsoft.operationalinsights/workspaces",
        "visualization": "linechart"
      },
      "customWidth": "33",
      "name": "BkLastFive"
    },
    {
      "type": 3,
      "content": {
        "version": "KqlItem/1.0",
        "query": "BrewDay_CL\r\n| project TimeGenerated, Target=HltAuto_d, Temperature=HltTemp_d, Manual=HltMan_d",
        "size": 0,
        "aggregation": 5,
        "showAnalytics": true,
        "title": "Hot Liquor Tank",
        "timeContext": {
          "durationMs": 172800000
        },
        "queryType": 0,
        "resourceType": "microsoft.operationalinsights/workspaces",
        "visualization": "linechart",
        "chartSettings": {
          "seriesLabelSettings": [
            {
              "seriesName": "Target",
              "color": "blue"
            },
            {
              "seriesName": "Temperature",
              "color": "turquoise"
            },
            {
              "seriesName": "Manual",
              "color": "greenDark"
            }
          ]
        }
      },
      "showPin": true,
      "name": "query - 8"
    },
    {
      "type": 3,
      "content": {
        "version": "KqlItem/1.0",
        "query": "BrewDay_CL\r\n| project TimeGenerated, Target=MtAuto_d, Temperature=MtTemp_d",
        "size": 0,
        "aggregation": 5,
        "showAnalytics": true,
        "title": "Mash Tun",
        "timeContext": {
          "durationMs": 172800000
        },
        "queryType": 0,
        "resourceType": "microsoft.operationalinsights/workspaces",
        "visualization": "linechart",
        "chartSettings": {
          "seriesLabelSettings": [
            {
              "seriesName": "Target",
              "color": "redBright"
            },
            {
              "seriesName": "Temperature",
              "color": "yellow"
            }
          ]
        }
      },
      "showPin": true,
      "name": "query - 9"
    },
    {
      "type": 3,
      "content": {
        "version": "KqlItem/1.0",
        "query": "BrewDay_CL\r\n| project TimeGenerated, Target = BkAuto_d, Temperature = BkTemp_d, Manual = BkMan_d",
        "size": 0,
        "aggregation": 5,
        "showAnalytics": true,
        "title": "Boil Kettle",
        "timeContext": {
          "durationMs": 172800000
        },
        "queryType": 0,
        "resourceType": "microsoft.operationalinsights/workspaces",
        "visualization": "linechart",
        "chartSettings": {
          "seriesLabelSettings": [
            {
              "seriesName": "Target",
              "color": "purple"
            },
            {
              "seriesName": "Temperature",
              "color": "green"
            },
            {
              "seriesName": "Manual",
              "color": "blue"
            }
          ]
        }
      },
      "showPin": true,
      "name": "query - 10"
    }
  ],
  "fallbackResourceIds": [
    "/subscriptions/<SUBSCRIPTIONID>/resourcegroups/<RESOURCEGROUP>/providers/Microsoft.OperationalInsights/workspaces/<WORKSPACENAME>"
  ],
  "$schema": "https://github.com/Microsoft/Application-Insights-Workbooks/blob/master/schema/workbook.json"
}

You’ll need to replace <SUBSCRIPTIONID>, <RESOURCEGROUP>, and <WORKSPACENAME> with appropriate values should you want to utilize this in your environment. From a data standpoint, I’ve found that a single brew day uses less than 200KB of quota. Even if you brew every day, you’ll likely never exceed the 5GB of free usage to incur a charge (based on current pricing plans as of 2021-01-01).

Brewery Pictures

This is the finished result. From left to right, HLT, MT, BK. Two pumps (March 809’s with Center Inlet TC Conversion heads from brewershardware.com and the plate chiller from my old setup with the connection fittings swapped to TC. For the pumps, I found that I was able to remove the nuts that hold the top of the table on, and use couplers and some stainless M6 threaded rod to create a “shelf” for the pumps to attach to. This allowed me to attach the pumps and chiller without drilling into or permanently altering the table.
This is the inside of my HLT. You can see the stainless wavy element below the HERMS coil. I ended up going with a 3 coil parallel one from brewpi. This reduces the height of the coil while keeping the surface area of a larger coil. The reduced height allows me to keep it submerged with substantially less water.
The inside of the Mash Tun. False bottom on the…. bottom.
This is what I’ve devised for a sparge arm/recirc arm. It’s a center pickup from Spike that I took a bit of the stainless that I cut from the control panel and fashioned a little flapper thing to spread the water over the top. The rubber band currently holds the flapper on. I may try to use some sort of a screw or something in the side to hold it up, but my machining capabilities for metal are a bit limited.

Build Costs & Parts List

The total cost of the build wasn’t actually as bad as I thought it might have been. The panel ended up a just over $1,600 and the brewery itself was just over $4,100. Those prices do not include shipping, but for most items, shipping was free. I would add probably another $100 to the overall total for shipping from places that it wasn’t free. The build sheet can be found on my OneDrive. There are two separate tabs; one for the panel, and one for the brewery. On the brewery side of things, there were many pieces of equipment that I already had from my last brewery, so this is not a comprehensive list for everything you need if you have nothing (things like hydrometers, scales, grain mill, etc). All of the source code is on my GitHub.

Thanks

I’d like to give a special thanks to a few people.

  • Mike Cymerman from Spike Brewing who helped with the design and port locations for the custom kettles.
  • Jerry Wasinger from Pi-Plates who helped troubleshoot some of the initial issues I ran into getting the code working for the THERMOplate.
  • Matt Dunlop for drinking with me and keeping me company on my first brew day with the new system and telling me it was cool.

Performance PowerShell: Part 1


About Performance

This past November, I had the privilege of speaking with Nathan Ziehnert at MMS Jazz Edition in New Orleans, LA about PowerShell Performance. If you haven’t heard of his blog (Z-Nerd), you should definitely check it out. I’ve been meaning to write more blog posts, and I though that our topic would make a great post.

PowerShell script performance is often an afterthought, if it’s even a thought at all. Many times, we look at a script and say “It’s Works! My job here is done.” And for some instances, that’s good enough. Perhaps I’m just writing a single run script to do a one time task and then it’ll get tossed. But what about the rest of our PowerShell scripts? What about all the scripts that run our automation, task sequences, and daily tasks?

In my opinion, performance can be broken down into three categories: Cost, Maintainability, and Speed. Cost in this regard is not how much money it costs to run a script in say Azure Automation, but rather what is the resource cost on the machine that’s running it. Think CPU, memory, and disk utilization. Maintainability is the time it takes to maintain or modify the script. As our environment evolves, solutions get upgraded, or modules and snap-ins get code updates, we need to give our script the occasional tune-up. This is where format, and good coding practices come into play. Lastly, we have speed. This first blog post will focus on speed, or how fast our scripts run.

Disclaimer

What I’m covering here is general. While it may apply to 99% of use cases, be sure to test for your individual script. There are a lot of factors that can affect the speed of a script including, but not limited to, OS, PowerShell version, hardware, and run behavior. What works for one script may not work for the next. What works on one machine, may not work on the rest. What works when you tested it, will never work in production (Murphy’s Law)

What Affects Performance?

There are quite a few things that can affect performance. Some of them we can control, others we can’t. For the things we can’t control, there are often ways to mitigate or work around the constraints. I’ll take a look at some of both in this post such as:

  • Test Order
  • Loop Execution
  • .NET Methods vs. Native Cmdlets
  • Strong Typed/Typecasting
  • Syntax
  • Output
Testing method

For testing/measuring, I wrote a couple of functions. The first is a function named Test-Performance. It returns PSObject with all of our test data that we can use to generate visualizations. The second is a simple function that takes the mean or median results from two sets, compares them, and returns a winner along with how much faster it was. The functions:

function Test-Performance {
    [CmdletBinding()]
    Param(
        [Parameter(Mandatory=$true,Position=1)]
        [ValidateRange(5,50000)]
        [int]$Count,
        [Parameter(Mandatory=$true,Position=2)]
        [ScriptBlock]$ScriptBlock
    )
    $Private:Occurrence = [System.Collections.Generic.List[Double]]::new()
    $Private:Sorted = [System.Collections.Generic.List[Double]]::new()
    $Private:ScriptBlockOutput = [System.Collections.Generic.List[string]]::new()
    [Double]$Private:Sum = 0
    [Double]$Private:Mean = 0
    [Double]$Private:Median = 0
    [Double]$Private:Minimum = 0
    [Double]$Private:Maximum = 0
    [Double]$Private:Range = 0
    [Double]$Private:Variance = 0
    [Double]$Private:StdDeviation = 0
    $Private:ReturnObject = '' | Select-Object Occurrence,Sorted,Sum,Mean,Median,Minimum,Maximum,Range,Variance,StdDeviation,Output

    #Gather Results
    for ($i = 0; $i -lt $Count; $i++) {
        $Timer = [System.Diagnostics.Stopwatch]::StartNew()
        #$Private:Output = Invoke-Command -ScriptBlock $ScriptBlock
        $Private:Output = $ScriptBlock.Invoke()
        $Timer.Stop()
        $Private:Result = $Timer.Elapsed
        $Private:Sum += $Private:Result.TotalMilliseconds
        [void]$Private:ScriptBlockOutput.Add($Private:Output)
        [void]$Private:Occurrence.Add($Private:Result.TotalMilliseconds)
        [void]$Private:Sorted.Add($Private:Result.TotalMilliseconds)
    }
    $Private:ReturnObject.Sum = $Private:Sum
    $Private:ReturnObject.Occurrence = $Private:Occurrence
    if (($Private:ScriptBlockOutput -notcontains "true") -and ($Private:ScriptBlockOutput -notcontains "false") -and ($Private:ScriptBlockOutput -notcontains $null)) {
        $Private:ReturnObject.Output = $Private:ScriptBlockOutput
    } else {
        $Private:ReturnObject.Output = $null
    }
    #Sort
    $Private:Sorted.Sort()
    $Private:ReturnObject.Sorted = $Private:Sorted

    #Statistical Calculations
    #Mean (Average)
    $Private:Mean = $Private:Sum / $Count
    $Private:ReturnObject.Mean = $Private:Mean

    #Median
    if (($Count % 2) -eq 1) {
        $Private:Median = $Private:Sorted[([Math]::Ceiling($Count / 2))]
    } else {
        $Private:Middle = $Count / 2
        $Private:Median = (($Private:Sorted[$Private:Middle]) + ($Private:Sorted[$Private:Middle + 1])) / 2
    }
    $Private:ReturnObject.Median = $Private:Median

    #Minimum
    $Private:Minimum = $Private:Sorted[0]
    $Private:ReturnObject.Minimum = $Private:Minimum

    #Maximum
    $Private:Maximum = $Private:Sorted[$Count - 1]
    $Private:ReturnObject.Maximum = $Private:Maximum

    #Range
    $Private:Range = $Private:Maximum - $Private:Minimum
    $Private:ReturnObject.Range = $Private:Range

    #Variance
    for ($i = 0; $i -lt $Count; $i++) {
        $x = ($Private:Sorted[$i] - $Private:Mean)
        $Private:Variance += ($x * $x)
    }
    $Private:Variance = $Private:Variance / $Count
    $Private:ReturnObject.Variance = $Private:Variance

    #Standard Deviation
    $Private:StdDeviation = [Math]::Sqrt($Private:Variance)
    $Private:ReturnObject.StdDeviation = $Private:StdDeviation
    
    return $Private:ReturnObject
}
Function Get-Winner {
    [CmdletBinding()]
    Param(
        [Parameter(Mandatory=$true,Position=1)]
        [ValidateNotNullOrEmpty()]
        [string]$AName,
        [Parameter(Mandatory=$true,Position=2)]
        [ValidateNotNullOrEmpty()]
        [Double]$AValue,
        [Parameter(Mandatory=$true,Position=3)]
        [ValidateNotNullOrEmpty()]
        [string]$BName,
        [Parameter(Mandatory=$true,Position=4)]
        [ValidateNotNullOrEmpty()]
        [Double]$BValue
    )
    if ($ClearBetweenTests) {
        Clear-Host
    }

    $blen = $AName.Length + $BName.Length + 12
    $Border = ''
    for ($i = 0; $i -lt $blen; $i++) {
        $Border += '*'
    }

    if ($OutToFile) {
        Out-File -FilePath $OutFileName -Append -Encoding utf8 -InputObject $Border
        Out-File -FilePath $OutFileName -Append -Encoding utf8 -InputObject ([string]::Format('**  {0} vs {1}  **', $AName, $BName))
        Out-File -FilePath $OutFileName -Append -Encoding utf8 -InputObject $Border
    }
    Write-Host $Border -ForegroundColor White
    Write-Host ([string]::Format('**  {0} vs {1}  **', $AName, $BName)) -ForegroundColor White
    Write-Host $Border -ForegroundColor White

    if ($AValue -lt $BValue) {
        $Faster = $BValue / $AValue
        if ($Faster -lt 1.05) {
            $Winner = 'Tie'
            $AColor = [ConsoleColor]::White
            $BColor = [ConsoleColor]::White
        } else {
            $Winner = $AName
            $AColor = [ConsoleColor]::Green
            $BColor = [ConsoleColor]::Red
        }
    } elseif ($AValue -gt $BValue) {
        $Faster = $AValue / $BValue
        if ($Faster -lt 1.05) {
            $Winner = 'Tie'
            $AColor = [ConsoleColor]::White
            $BColor = [ConsoleColor]::White
        } else {
            $Winner = $BName
            $AColor = [ConsoleColor]::Red
            $BColor = [ConsoleColor]::Green
        }
    } else {
        $Winner = 'Tie'
        $AColor = [ConsoleColor]::White
        $BColor = [ConsoleColor]::White
        $Faster = 0
    }
    
    $APad = ''
    $BPad = ''
    if ($AName.Length -gt $BName.Length) {
        $LenDiff = $AName.Length - $BName.Length
        for ($i = 0; $i -lt $LenDiff; $i++) {
            $BPad += ' '
        }
    } else {
        $LenDiff = $BName.Length - $AName.Length
        for ($i = 0; $i -lt $LenDiff; $i++) {
            $APad += ' '
        }
    }

    $AValue = [Math]::Round($AValue, 2)
    $BValue = [Math]::Round($BValue, 2)
    $Faster = [Math]::Round($Faster, 2)
    
    if ($OutToFile) {
        Out-File -FilePath $OutFileName -Append -Encoding utf8 -InputObject ([string]::Format('{0}:  {1}{2}ms', $AName, $APad, $AValue))
        Out-File -FilePath $OutFileName -Append -Encoding utf8 -InputObject ([string]::Format('{0}:  {1}{2}ms', $BName, $BPad, $BValue))
        Out-File -FilePath $OutFileName -Append -Encoding utf8 -InputObject ([string]::Format('WINNER: {0} {1}x Faster`r`n', $Winner, $Faster))
    }
    Write-Host ([string]::Format('{0}:  {1}{2}ms', $AName, $APad, $AValue)) -ForegroundColor $AColor
    Write-Host ([string]::Format('{0}:  {1}{2}ms', $BName, $BPad, $BValue)) -ForegroundColor $BColor
    Write-Host ([string]::Format('WINNER: {0} {1}x Faster', $Winner, $Faster)) -ForegroundColor Yellow
    if ($PauseBetweenTests -eq $true) {
        Pause
    }
}

Now, you may be wondering why I went through all of that trouble when there is a perfectly good Measure-Command cmdlet available. The reason is two fold. One, I wanted the statistics to be calculated without having to call a separate function. Two, Measure-Command does not handle output, and I wanted to be able to test and capture output if needed. If you haven’t tried to use Write-Output withing a Measure-Command script block before let me show you what I’m talking about:

PS C:> Measure-Command {Write-Host "Write-Host"}
Write-Host


Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 12
Ticks             : 128366
TotalDays         : 1.48571759259259E-07
TotalHours        : 3.56572222222222E-06
TotalMinutes      : 0.000213943333333333
TotalSeconds      : 0.0128366
TotalMilliseconds : 12.8366



PS C:> Measure-Command {Write-Output "Write-Output"}                                                  

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 0
Milliseconds      : 0
Ticks             : 4005
TotalDays         : 4.63541666666667E-09
TotalHours        : 1.1125E-07
TotalMinutes      : 6.675E-06
TotalSeconds      : 0.0004005
TotalMilliseconds : 0.4005

Notice that when we call Write-Output that we don’t see the output? With my Test-Performance function, we can grab that output and still capture it. The output from the two functions looks like this:

 Occurrance   : {4.0353, 1.0091, 0, 0…}
 Sorted       : {0, 0, 0, 1.0091…}
 Sum          : 5.0444
 Mean         : 1.00888
 Median       : 1.0091
 Minimum      : 0
 Maximum      : 4.0353
 Range        : 4.0353
 Variance     : 2.4425469256
 StdDeviation : 1.56286497356618
 Output       : {Write-Output, Write-Output, Write-Output, Write-Output…}

 ******************************
 **  Filter vs Where-Object  **
 ******************************
 Filter:        22ms
 Where-Object:  1007.69393ms
 WINNER: Filter 45.80x Faster

One other thing to mention is that I did not simply use $Start = Get-Date, $Stop = (Get-Date)-$Start. Why? Because some things happen so fast that I need to measure the speed in ticks or microseconds. Get-Date only measures time down to the millisecond, so anything less than a millisecond will be rounded to either 0 or 1 millisecond.

Test Order

With that out of the way, let’s look at test order first. Test Order is the order in which conditions are evaluated within a flow control block such as if/then or do/while. The compiler or engine will evaluate the conditions from left to right while respecting the order of operations. Why is this important? Let’s say you have the following if statement:

if (($haystack -contains "needle") -and ($x -eq 5)) {
    Do-Stuff
}

We have two conditions: Does $Haystack contain the string “needle” and does $x equal 5. With the and statement we tell the engine that both must be true to meet the conditions of the if statement. The engine will evaluate the first statement, and if true, will continue through the remaining statements until it reaches either a false statement or has evaluated all statements. Let’s take a quick look at how long it takes to evaluate a few different types of conditions.

$v = 5000
$a = 1..10000
$s = 'reg.exe'
$as = (Get-ChildItem -Path C:\Windows\System32 -Filter '*.exe').Name

Test-Performance -Count 100 -ScriptBlock {$v -eq 5000}
Test-Performance -Count 100 -ScriptBlock {$a -contains 5000}
Test-Performance -Count 100 -ScriptBlock {$s -eq 'reg.exe'}
Test-Performance -Count 100 -ScriptBlock {$as -contains 'reg.exe'}

That gives me the following output:

 Occurrence   : {0.9741, 0, 0, 5.9834…}
 Sorted       : {0, 0, 0, 0…}
 Sum          : 40.8437
 Mean         : 0.408437
 Median       : 0
 Minimum      : 0
 Maximum      : 5.9834
 Range        : 5.9834
 Variance     : 0.538503541531
 StdDeviation : 0.733828005414757
 Output       :

 Occurrence   : {0.9977, 0.9969, 0, 0.9971…}
 Sorted       : {0, 0, 0, 0…}
 Sum          : 67.7895
 Mean         : 0.677895
 Median       : 0.9934
 Minimum      : 0
 Maximum      : 3.9467
 Range        : 3.9467
 Variance     : 0.450743557675
 StdDeviation : 0.671374379668304
 Output       :

 Occurrence   : {0.989, 0.9973, 0, 0…}
 Sorted       : {0, 0, 0, 0…}
 Sum          : 52.9174
 Mean         : 0.529174
 Median       : 0
 Minimum      : 0
 Maximum      : 8.9804
 Range        : 8.9804
 Variance     : 1.222762781524
 StdDeviation : 1.10578604690238
 Output       :

 Occurrence   : {0.997, 0, 0, 1.0292…}
 Sorted       : {0, 0, 0, 0…}
 Sum          : 74.7425
 Mean         : 0.747425
 Median       : 0.957
 Minimum      : 0
 Maximum      : 6.9484
 Range        : 6.9484
 Variance     : 1.391867727275
 StdDeviation : 1.1797744391514
 Output       :

What we see is that comparing if something is equal to something else is a lot faster than checking to see if an array contains an object. Now, I know, you’re thinking it’s just a couple milliseconds, but, checking if $v is equal to 5000 is almost twice as fast as checking if $as contains “reg.exe”. Keep in mind, that depending on where in the array our match is and how big our array is, that number can go up or down quite a bit. I’m just doing some simple synthetic tests to illustrate that there is a difference. When doing conditional statements like this, try to have your quicker conditions evaluated first and try to have statements that are most likely to fail evaluated first. Example:

Test-Performance -Count 100 -ScriptBlock {
    if (($x -eq 100) -or ($a -contains -5090) -or ($s -eq 'test.fake') -or ($as -contains 'reg.exe')) {
        $t = Get-Random
    }
}

Test-Performance -Count 100 -ScriptBlock {
    if (($as -contains 'reg.exe') -or ($x -eq 100) -or ($a -contains -5090) -or ($s -eq 'test.fake')) {
        $t = Get-Random
    }
}

Gives me the following:

Occurrence   : {0.9858, 0, 0.9969, 0…}
 Sorted       : {0, 0, 0, 0…}
 Sum          : 36.8537
 Mean         : 0.368537
 Median       : 0
 Minimum      : 0
 Maximum      : 3.9959
 Range        : 3.9959
 Variance     : 0.390509577731
 StdDeviation : 0.624907655362774
 Output       :

 Occurrence   : {0.9974, 0, 0.9971, 0…}
 Sorted       : {0, 0, 0, 0…}
 Sum          : 54.8193
 Mean         : 0.548193
 Median       : 0.4869
 Minimum      : 0
 Maximum      : 3.9911
 Range        : 3.9911
 Variance     : 0.425326705251
 StdDeviation : 0.652170763873236
 Output       :

We can see that by changing the order of the evaluated conditions, our code runs in about 2/3 the time. Again, these are generic tests to illustrate the effect that test order has on execution time, but they illustrate some basic guidelines that should be able to be applied to most situations. Be sure to test your code.

Loop Execution

Now that I’ve gotten test order out of the way, let’s start with loop execution. Sometimes when we are working with a loop, like say stepping through an array, we don’t need to do something for every element. Sometimes, we are looking for a specific element and don’t care about anything after that. In these cases, break is our friend. For our first example, I’ll create an array of all years in the 1900’s. I’ll then loop through each one and write some output when I find 1950.

$Decade = 1900..1999
$TargetYear = 1950

$NoBreakResult = Test-Performance -Count 10 -ScriptBlock {
    for ($i = 0; $i -lt $Decade.Count; $i++) {
        if ($Decade[$i] -eq 1950) {
            Write-Output "Found 1950"
        }
    }
}

$BreakResult = Test-Performance -Count 10 -ScriptBlock {
    for ($i = 0; $i -lt $Decade.Count; $i++) {
        if ($Decade[$i] -eq 1950) {
            Write-Output "Found 1950"
            break
        }
    }
}

Get-Winner "No Break" $NoBreakResult.Median "Break" $BreakResult.Median

Our output looks as follows:

 ************************* 
 **  No Break vs Break  **
 *************************
 No Break:  0.38ms
 Break:     0.28ms
 WINNER: Break 1.35x Faster

$NoBreakResult
 Occurrence   : {0.8392, 0.4704, 0.4566, 0.444…}
 Sorted       : {0.3425, 0.3438, 0.3442, 0.3445…}
 Sum          : 47.8028
 Mean         : 0.478028
 Median       : 0.38175
 Minimum      : 0.3425
 Maximum      : 2.6032
 Range        : 2.2607
 Variance     : 0.127489637016
 StdDeviation : 0.357056910052165
 Output       : {Found 1950, Found 1950, Found 1950, Found 1950…}

$BreakResult
 Occurrence   : {3.2739, 0.3445, 0.32, 0.3167…}
 Sorted       : {0.2657, 0.266, 0.266, 0.2662…}
 Sum          : 40.0342
 Mean         : 0.400342
 Median       : 0.2871
 Minimum      : 0.2657
 Maximum      : 3.2739
 Range        : 3.0082
 Variance     : 0.182262889036
 StdDeviation : 0.426922579674582
 Output       : {Found 1950, Found 1950, Found 1950, Found 1950…}

As expected, the instance with the break commandwas about 25% faster. Next I’ll take a look at a different method that I don’t see in too many peoples code, the While/Do-While loop.

$DoWhileResult = Test-Performance -Count 100 -ScriptBlock {
    $i = 0
    $Year = 0
    do {
        $Year = $Decade[$i]
        if ($Year -eq 1950) {
            Write-Output "Found 1950"
        }
        $i++
    } While ($Year -ne 1950)
}

Which nets me the following:

 ****************************
 **  Do-While vs No Break  **
 ****************************
 Do-While:  0.24ms
 No Break:  0.38ms
 WINNER: Do-While 1.57x Faster

$DoWhileResult
 Occurrence   : {0.9196, 0.313, 0.2975, 0.2933…}
 Sorted       : {0.2239, 0.224, 0.2242, 0.2243…}
 Sum          : 33.8217
 Mean         : 0.338217
 Median       : 0.2436
 Minimum      : 0.2239
 Maximum      : 5.0187
 Range        : 4.7948
 Variance     : 0.262452974211
 StdDeviation : 0.512301643771519
 Output       : {Found 1950, Found 1950, Found 1950, Found 1950…}

As we can see, Do-While is also faster than running through the entire array. My example above does not have any safety mechanism for running beyond the end of the array or not finding the element I’m searching for. In practice, be sure to include such a condition/catch in your loop. Next, I’m going to compare the performance between a few different types of loops. Each loop will run through an array of numbers from 1 to 10,000 and calculate the square root of each. I’ll use the basic for loop as the baseline to compare against the other methods.

$ForLoop = Test-Performance -Count 100 -ScriptBlock {
    $ForArray = 1..10000
    for ($i = 0; $i -lt 10000; $i++) {
        $sqrt = [Math]::Sqrt($Array[$i])
    }
}

$ForEachLoop = Test-Performance -Count 100 -ScriptBlock {
    $ForEachArray = 1..10000
    foreach ($item in $ForEachArray) {
        $sqrt = [Math]::Sqrt($item)
    }
}

$DotForEachLoop = Test-Performance -Count 100 -ScriptBlock {
    $DotForEachArray = 1..10000
    $DotForEachArray.ForEach{
        $sqrt = [Math]::Sqrt($_)
    }
}

$ForEachObjectLoop = Test-Performance -Count 100 -ScriptBlock {
    $ForEachObjectArray = 1..10000
    $ForEachObjectArray | ForEach-Object {
        $sqrt = [Math]::Sqrt($_)
    }
}

So how do they fare?

 ***********************
 **  For vs For-Each  **
 ***********************
 For:       3ms
 For-Each:  1.99355ms
 WINNER: For-Each 1.50x Faster
 
 ***********************
 **  For vs .ForEach  **
 ***********************
 For:       3ms
 .ForEach:  1150.95495ms
 WINNER: For 383.65x Faster
 
 *****************************
 **  For vs ForEach-Object  **
 *****************************
 For:             3ms
 ForEach-Object:  1210.7644ms
 WINNER: For 403.59x Faster

Quite a bit of difference. Let’s take a look at the statistics for each of them.

$ForLoop
 Occurrence   : {38.8952, 3.9984, 2.9752, 2.966…}
 Sorted       : {0, 0, 1.069, 1.9598…}
 Sum          : 330.7618
 Mean         : 3.307618
 Median       : 2.9731
 Minimum      : 0
 Maximum      : 38.8952
 Range        : 38.8952
 Variance     : 26.100085284676
 StdDeviation : 5.10882425658546
 Output       :

$ForEachLoop
 Occurrence   : {7.0133, 1.9972, 1.9897, 0.9678…}
 Sorted       : {0.9637, 0.9678, 0.9927, 0.9941…}
 Sum          : 187.5277
 Mean         : 1.875277
 Median       : 1.99355
 Minimum      : 0.9637
 Maximum      : 7.0133
 Range        : 6.0496
 Variance     : 0.665303603371
 StdDeviation : 0.815661451443551
 Output       :

$DotForEachLoop
 Occurrence   : {1225.7258, 1169.9073, 1147.9007, 1146.9384…}
 Sorted       : {1110.0618, 1110.0688, 1113.9906, 1114.0656…}
 Sum          : 114948.9291
 Mean         : 1149.489291
 Median       : 1150.95495
 Minimum      : 1110.0618
 Maximum      : 1225.7258
 Range        : 115.664
 Variance     : 534.931646184819
 StdDeviation : 23.1285893686757
 Output       :

$ForEachObjectLoop
 Occurrence   : {1217.7802, 1241.7037, 1220.686, 1249.688…}
 Sorted       : {1181.8081, 1188.8231, 1188.8291, 1191.7818…}
 Sum          : 121345.8078
 Mean         : 1213.458078
 Median       : 1210.7644
 Minimum      : 1181.8081
 Maximum      : 1274.6289
 Range        : 92.8208000000002
 Variance     : 318.356594303116
 StdDeviation : 17.8425501065043
 Output       :

If you notice on the for and for-each loops, the first run is significantly higher than the other entries (in fact, it’s the slowest run in the batch for each) whereas the method and cmdlet versions are much more consistent. This is due to the behavior of those methods. With a for and for-each loop, the method loads the entire collection into memory before processing. This causes the first run of the loop to take a bit longer, although, it’s still faster than the method or cmdlet. The cmdlet and method are slower as they load one iteration into memory at a time which is slower than loading the sum all at once (think random read/write vs sequential). The for loop is slightly slower than for-each because it has to evaluate the condition before proceeding through the next iteration.

.NET Methods vs. Cmdlets

Next, I’ll take a look at some of the differences between some of the common “native” PowerShell cmdlets and their .NET counterparts. I’m going to start with what will likely be the most common things that you’ll encounter in your scripts or scripts that you use, the array. We’ve probably all used them, and maybe even continue to use them. But should you? First, Let’s look at adding items to an array. We frequently start with blank arrays and add items to them as we go along.

$ArrayResult = Test-Performance -Count 100 -ScriptBlock {
    $Array = @()
    for ($i =0; $i -lt 10000; $i ++) {
        $Array += $i
    }
}

$ListResult = Test-Performance -Count 100 -ScriptBlock {
    $List = [System.Collections.Generic.List[PSObject]]::new()
    for ($i =0; $i -lt 10000; $i ++) {
        [void]$List.Add($i)
    }
}

Get-Winner "Array" $ArrayResult.Median "List" $ListResult.Median
 *********************
 **  Array vs List  **
 *********************
 Array:  2274ms
 List:   2.97945ms
 WINNER: List 763.23x Faster

$ArrayResult
 Occurrence   : {2407.5676, 2311.8239, 2420.5336, 2268.9383…}
 Sorted       : {2190.1917, 2200.1205, 2219.1807, 2223.0887…}
 Sum          : 228595.7729
 Mean         : 2285.957729
 Median       : 2274.42135
 Minimum      : 2190.1917
 Maximum      : 2482.3996
 Range        : 292.2079
 Variance     : 2527.01010551066
 StdDeviation : 50.2693754239165
 Output       :

$ListResult
 Occurrence   : {24.9343, 19.9729, 3.9623, 5.9836…}
 Sorted       : {0.9974, 1.9776, 1.9925, 1.994…}
 Sum          : 373.999
 Mean         : 3.73999
 Median       : 2.97945
 Minimum      : 0.9974
 Maximum      : 51.8617
 Range        : 50.8643
 Variance     : 37.0781465771
 StdDeviation : 6.0891827511662
 Output       :

Modifying an existing array can be a VERY expensive operation. Arrays are fixed length and can not be expanded or contracted. When we add or subtract an element, the engine first has to create a new array of size n + 1 or n – 1 and then copy each of the elements from the old array into the new one. This is slow, and can consume a lot of memory while the new array is being created and contents copied over. Lists on the other hand are not statically sized. The advantage of an array however is that they have a smaller memory footprint. Since an array is stored as a whole consecutively in memory, it’s size can roughly be calculated as SizeOf(TypeOf(Element))*NumElements. A Linked list on the other hand is not stored consecutively within memory and is a bit larger since each element contains a pointer to the next object. It’s size can roughly be calculated as (SizeOf(TypeOf(Element)) + SizeOf(Int)) * NumElements. You might be thinking, well, if an array is stored in a consecutive memory blocks, once the array is established, it should be faster to work with right? I’ll test.

[int[]]$Array = 1..10000
$List = [System.Collections.Generic.List[int]]::new()
for ($i = 1; $i -lt 10001; $i++) {
    [void]$List.Add($i)
}

$ArrayForEachResult = Test-Performance -Count 100 -ScriptBlock {
    foreach ($int in $Array) {
        $int = 5
    }
}

$ListForEachResult = Test-Performance -Count 100 -ScriptBlock {
    foreach ($int in $List) {
        $int = 5
    }
}

First, we create an array of 10,000 elements with the numbers 1 through 10,000 inclusive. We declare the array as an integer array to ensure we are comparing to objects of the same type so to speak. We then create a list<int> and fill with the same values. So how do they fare?

 ***************************************
 **  Array For-Each vs List For-Each  **
 ***************************************
 Array For-Each:  1.47ms
 List For-Each:   0.83ms
 WINNER: List For-Each 1.77x Faster

$ArrayForEachResult
 Occurrence   : {4.643, 1.4673, 1.4029, 1.3336…}
 Sorted       : {1.3194, 1.3197, 1.3255, 1.3272…}
 Sum          : 156.4036
 Mean         : 1.564036
 Median       : 1.47125
 Minimum      : 1.3194
 Maximum      : 4.643
 Range        : 3.3236
 Variance     : 0.136858367504
 StdDeviation : 0.369943735592319
 Output       : {, , , …}

$ListForEachResult
 Occurrence   : {7.233, 1.723, 0.8305, 0.8632…}
 Sorted       : {0.6174, 0.6199, 0.6203, 0.6214…}
 Sum          : 164.8467
 Mean         : 1.648467
 Median       : 0.83335
 Minimum      : 0.6174
 Maximum      : 71.705
 Range        : 71.0876
 Variance     : 50.017547074011
 StdDeviation : 7.07230846852787
 Output       : {, , , …}

As we can see, the list still out-performs the array, although by less of a margin than it did during the manipulation test. I suspect that this is due to the cmdlet having to load the entirety of the array as opposed to just pointers with the list. Now let’s compare .NET Regex vs the PowerShell method. For this, I’m going to be replacing text instead of just checking for the match. Let’s look at the code.

$Haystack = "The Quick Brown Fox Jumped Over the Lazy Dog 5 Times"
$Needle = "\ ([\d]{1})\ "

$NetRegexResult = Test-Performance -Count 1000 -ScriptBlock {
    [regex]::Replace($Haystack, $Needle, " $(Get-Random -Minimum 2 -Maximum 9) ")
    Write-Output $Haystack
}

$PoshRegexResult = Test-Performance -Count 1000 -ScriptBlock {
    $Haystack -replace $Needle, " $(Get-Random -Minimum 2 -Maximum 9) "
    Write-Output $Haystack
}

Get-Winner ".NET RegEx" $NetRegexResult.Median "PoSh RegEx" $PoshRegexResult.Median

Nothing too fancy here. We take our haystack (the sentence), look for the needle (the number of times the fox jumped over the dog) and replace it with a new single digit random number.

 ********************************
 **  .NET RegEx vs PoSh RegEx  **
 ********************************
 .NET RegEx:  0.23ms
 PoSh RegEx:  0.23ms
 WINNER: Tie 1x Faster

$NetRegexResult
 Occurrence   : {0.7531, 0.2886, 0.3572, 0.3181…}
 Sorted       : {0.2096, 0.2106, 0.211, 0.2189…}
 Sum          : 282.3331
 Mean         : 0.2823331
 Median       : 0.23035
 Minimum      : 0.2096
 Maximum      : 2.2704
 Range        : 2.0608
 Variance     : 0.03407226235439
 StdDeviation : 0.184586733961003
 Output       : {The Quick Brown Fox Jumped Over the Lazy Dog 5 Times The Quick Brown Fox Jumped Over the Lazy Dog 5 Times, The Quick Brown Fox Jumped Over the Lazy Dog 8 Times   
                The Quick Brown Fox Jumped Over the Lazy Dog 5 Times, The Quick Brown Fox Jumped Over the Lazy Dog 4 Times The Quick Brown Fox Jumped Over the Lazy Dog 5 Times,   
                The Quick Brown Fox Jumped Over the Lazy Dog 4 Times The Quick Brown Fox Jumped Over the Lazy Dog 5 Times…}

$PoshRegexResult
 Occurrence   : {0.7259, 0.2546, 0.2513, 0.2486…}
 Sorted       : {0.2208, 0.2209, 0.2209, 0.2211…}
 Sum          : 279.0913
 Mean         : 0.2790913
 Median       : 0.231
 Minimum      : 0.2208
 Maximum      : 2.1124
 Range        : 1.8916
 Variance     : 0.03001781767431
 StdDeviation : 0.173256508317321
 Output       : {The Quick Brown Fox Jumped Over the Lazy Dog 8 Times The Quick Brown Fox Jumped Over the Lazy Dog 5 Times, The Quick Brown Fox Jumped Over the Lazy Dog 6 Times   
                The Quick Brown Fox Jumped Over the Lazy Dog 5 Times, The Quick Brown Fox Jumped Over the Lazy Dog 7 Times The Quick Brown Fox Jumped Over the Lazy Dog 5 Times,   
                The Quick Brown Fox Jumped Over the Lazy Dog 2 Times The Quick Brown Fox Jumped Over the Lazy Dog 5 Times…}

Surprisingly, or not surprisingly, the two methods are pretty much dead even. This is one of the cases where I suspect the PowerShell cmdlet is pretty much just a wrapper/alias for the corresponding .NET equivalent. You might ask yourself why you would use the .NET methods if the PowerShell cmdlets net the same performance. The answer (and this applies in a lot of cases) is that the PowerShell Cmdlets don’t always offer the same options as their .NET counterparts. I’m going to use String.Split as an example. Take a look at the two documentation pages for “String” -Split and String.Split. You may have noticed that they aren’t entirely the same. Far from it actually. In most cases, they will return you with the same results, but they don’t both support the same options. For example, if you want to remove blank entries, you’ll need to use the .Split() method. But what about performance?

$SplitString = "one,two,three,four,five,six,seven,eight,nine,ten,"
$DashSplitResult = Test-Performance -Count 10000 -ScriptBlock {
    $SplitArray = $SplitString -Split ','
}

$DotSplitResult = Test-Performance -Count 10000 -ScriptBlock {
    $SplitArray = $SplitString.Split(',')
}

Get-Winner "-Split" $DashSplitResult.Median ".Split()" $DotSplitResult.Median
 **************************
 **  -Split vs .Split()  **
 **************************
 -Split:    0.13ms
 .Split():  0.12ms
 WINNER: .Split() 1.1x Faster

$DashSplitResult
 Occurrence   : {0.4855, 0.1837, 0.2387, 0.1916…}
 Sorted       : {0.1128, 0.113, 0.1131, 0.1131…}
 Sum          : 1613.99049999999
 Mean         : 0.161399049999999
 Median       : 0.125
 Minimum      : 0.1128
 Maximum      : 2.1112
 Range        : 1.9984
 Variance     : 0.0234987082460975
 StdDeviation : 0.153292883872988
 Output       : {, , , …}

$DotSplitResult
 Occurrence   : {0.552, 0.1339, 0.1245, 0.1227…}
 Sorted       : {0.1052, 0.1056, 0.1056, 0.1057…}
 Sum          : 1485.38330000001
 Mean         : 0.148538330000001
 Median       : 0.1162
 Minimum      : 0.1052
 Maximum      : 1.9226
 Range        : 1.8174
 Variance     : 0.0186857188798111
 StdDeviation : 0.136695716391594
 Output       : {, , , …}

Pretty close, but .Split does edge out -Split by just a hair. Is it worth re-writing all of your code? Doubtful. But if you use string splitting methods frequently, it may be worth doing some testing with your common use cases to see if there could be an impact. And with that, I’m going to wrap up the first part of this post.

Sending TiltPi Data to Azure Log Analytics via PHP

I use a couple Tilt Hydrometers to track and log the fermentation data for my beers. While there are a few different ways that one can upload the data to the internet, I noticed that there wasn’t an automated way to upload to Azure Log Analytics. TiltPi is built on a customized variant of Raspbian. The current version utlizes Raspbian Stretch, which in turn is based on Debian Stretch. Microsoft supports Linux and has an OMS agent available for both x86 and x64 variants of Linux. They don’t, however, currently support the ARM platform which the Raspberry Pi boards are built on. Attempting to install the OMS Agent via the script from GitHub fails. While the agent is built on Ruby and Fluentd which are both compatable with ARM64, other components such as omi are not. If all you need is syslogs, there is a fluentd plugin for OMS which Microsoft supports. Unfortunately, the Tilt app does not write to the syslog. It writes to it’s own CSV formatted log. That leaves us with two options: write a daemon to forward new entries to syslog or utilize the cloudurl and write the data directly to Log Analytics using the REST API. The first sounded like more effort than I wanted to put into this project and would likely be purpose built for this and this alone. The second method though, sounded a bit easier, and more flexible. Thankfully, Microsoft has a nice bit of documentation regarding the HTTP Data Collector API on their docs website.

I was able to use this information to build a framework for writing data directly to Azure via a PHP script. Microsoft’s documentation thankfully has all the information we need.  The examples helped with some of the basic pieces that I needed to get going, however, I did have to do a bit of playing around to get the signature to work since translating from one language to another is not always straight forward.  .NET for example requires byte arrays when calling encode/decode/hash functions whereas PHP just wants the raw string.  When calling the API, part of the HTTP header is a Signature that authorizes you and tells Log Analytics where to put the data.  The basic formula is base64_encode(sha256_hmac(base64_decode(<SharedKey>), <HashData>)).  Let’s break that down.  First, we need to create our HashData.  This is a string formatted as follows:  <SendMethod>\n<PayloadLength>\n<ContentType>\nx-ms-date:<RFC1123Date>\n<SendResource>.  All of these are detailed in the documentation, but essentially we have “POST\n<PayloadLength>\napplication/json\nx-ms-date: <RFC1123Date>\n/api/logs” where <PayloadLength> is the length of our data we are going to send and <RFC1123Date> is the current date/time in RFC 1123 format.  Once we have that string, we need to ensure that it is encoded as utf8.  Once we have the utf8 encoded version of the string, we pass that to hash_hmac and have it create a sha256 hash using that and our base64 decoded key.  We then do a base64_encode on the resulting hash and create our signature; “SharedKey <WorkspaceID>:<EncodedHash>”.  After we have that, we can use the curl libraries to send the data to OMS.  The code for our function looks like this:

public function Send_Log_Analytics_Data($Payload) {
	#Get current timestamp (UTC) and Payload length
	$Rfc1123Date = gmdate('D, d M Y H:i:s T');
	$PayloadLength = strlen($Payload);
	
	#Generate the Signature
	$StringToSign = "{$this->SendMethod}\n{$PayloadLength}\n{$this->SendContentType}\nx-ms-date:{$Rfc1123Date}\n{$this->SendResource}";
	$Utf8String = utf8_encode($StringToSign);
	$Hash = hash_hmac('sha256', $Utf8String, base64_decode($this->PrimaryKey), true);
	$EncodedHash = base64_encode($Hash);
	$Signature = "SharedKey {$this->WorkspaceId}:{$EncodedHash}";
	
	#Build the URI and headers for cURL
	$Uri = "https://{$this->WorkspaceId}.ods.opinsights.azure.com{$this->SendResource}?api-version={$this->SendApiVersion}";
	$Handle = curl_init($Uri);
	$Headers = array("Authorization: {$Signature}", "Log-Type: {$this->LogType}", "x-ms-date: {$Rfc1123Date}", "time-generated-field: {$this->TimeStampField}", "Content-Type: {$this->SendContentType}");
	
	#Set cURL Options
	curl_setopt($Handle, CURLOPT_POST, true);
	curl_setopt($Handle, CURLOPT_HEADER, true) or die("Failed to set cURL Header");
	curl_setopt($Handle, CURLOPT_HTTPHEADER, $Headers) or die("Failed to set cURL Header");
	curl_setopt($Handle, CURLOPT_POSTFIELDS, $Payload);
	curl_setopt($Handle, CURLOPT_RETURNTRANSFER, true) or die("Failed to set cURL Return Transfer");
	
	#Execute cURL and return the result
	$Result = curl_exec($Handle);
	return curl_getinfo($Handle, CURLINFO_HTTP_CODE);
}

Now that we can send data, what data are we sending? The Tilt app sends the following pieces of information to whatever cloud url you specify in the logging tab:

  • Timepoint (Excel datetime value)
  • Temp (deg F)
  • SG (Specific Gravity)
  • Beer (Name of the Beer)
  • Color (Color of the Tilt)
  • Comment (blank unless you manually send data and add a comment)

We can take this post data and create our payload by converting it into a JSON formatted string.

$TimeStamp = gmdate('Y-m-d\TH:i:s.v\Z');
$Entry = "[{\"Timepoint\": \"{$TimeStamp}\", \"Temp\": {$Temp}, \"SG\": {$SG}, \"Beer\": \"{$Beer}\", \"Color\": \"{$Color}\", \"Comment\": \"{$Comment}\",\"Source\": \"TiltPHP\"}]";

Since we’re doing this in real time as data is sent, we are ignoring the timestamp that the Tilt app has created and instead using our own. This lets us skip having to convert the Excel formatted date into one that Log Analytics can use. If you need to use the Excel date, it’s just the number of days that have passed since 12:00AM 1900-01-01. The entry we created above is what we pass to the Send_Log_Analytics_Data as the $Payload parameter. And now, we can sit down, have a beer, and watch data flow into Log Analytics. If you’re interested in looking at the full code, I’ve posted it up on my GitHub space. Soon I’ll write something up on how to do something with that data 😉