Self-hosted runners (for repositories)

Introduction

This document gives an overview of deploying self-hosted Github runners that serve a single repository.

Benefits and limitations

Self-hosted runners can be used:

- to save costs when running Github Actions against private and internal repositories (these incur costs on the Github platform, whereas public repositories don’t)
- to allow actions and their scripts to access internal endpoints within the Cloud Platform cluster (modifying IP allowlists to permit all of the external Github runner IP addresses would be cumbersome)

There are, however, limitations to self-hosted runners. Because they run in Docker containers within Kubernetes pods, they cannot get root access, which is required for commands such as docker build.

What this document is about

This document will take the example of hmpps-github-actions-runner to illustrate the process of installing and configuring a self-hosted Github Runner on Cloud Platform infrastructure.

Pre-requisites

- The Github repository where the runner is built should have an associated Cloud Platforms namespace with a github service account, so that Kubernetes credentials are populated in the prod environment secrets.
- A Github App is required to authenticate. It needs the following permissions (others may be required, but I believe these are the basics; if this changes, I will update the document accordingly):
  - Access to the repository for which it will be accepting requests
  - Repository permissions:
    - Actions (Read & Write)
    - Contents (Read & Write)
    - Variables (Read)
  - Organisation permissions:
    - Self-hosted runners (Read & Write)
- The App ID (listed at the top of the app’s page within the organisation’s Developer Settings -> Github Apps).
- A valid private key for the App (generate this in the app’s settings page within the organisation’s Developer Settings -> Github Apps).
- The Github App needs to have permissions to act on the repository to which the runner will be assigned (within the Repository access section of the Github App’s settings page).

Building and deploying the Github app

Copying the `hmpps-github-actions-runner` app

Make a copy/fork of the hmpps-github-actions-runner app.

The following files/directories will need to be changed:

/helm/deploy/hmpps-github-actions-runner/
/helm/deploy/hmpps-github-actions-runner/values.yaml
/helm/deploy/hmpps-github-actions-runner/values-dev.yaml (if required)
/helm/deploy/hmpps-github-actions-runner/values-prod.yaml
/Dockerfile

Most of these changes can be made with a global rename of hmpps-github-actions-runner to the new repository name, but in detail:

Helm configuration

Rename the /helm/deploy/hmpps-github-actions-runner directory to the name of the new repository
Edit the values.yaml file:

---
generic-service:
  nameOverride: hmpps-github-actions-runner

Change the nameOverride to the repository name

  replicaCount: 2 # we can start with one and do more

If you require more or fewer runners, this is where you set it

  image:
    repository: ghcr.io/ministryofjustice/hmpps-github-actions-runner

Change the image repository to match the repository name

generic-prometheus-alerts:
  targetApplication: hmpps-github-actions-runner

Change the target application to match the repository name

values-dev.yaml and values-prod.yaml

generic-prometheus-alerts:
  alertSeverity: hmpps-sre-alerts-dev

generic-prometheus-alerts:
  alertSeverity: hmpps-sre-alerts

These represent severity labels associated with the Cloud Platforms Alertmanager configuration. Set these as appropriate for your app, following the instructions on the Cloud Platforms documentation site.
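The global rename mentioned above could be scripted along the following lines. This is a minimal sketch, not part of the hmpps-github-actions-runner repository: the function name `rename_runner` is hypothetical, and it assumes the layout listed earlier (a `Dockerfile` at the repository root and `values*.yaml` files under `helm/deploy/<name>/`). Any other file that mentions the old name would still need checking by hand.

```python
from pathlib import Path

OLD_NAME = "hmpps-github-actions-runner"


def rename_runner(repo_root: Path, new_name: str, old_name: str = OLD_NAME) -> list:
    """Rename the Helm directory, then replace old_name in the files listed above.

    Returns the paths of the files whose contents changed.
    """
    old_dir = repo_root / "helm" / "deploy" / old_name
    new_dir = old_dir.with_name(new_name)
    if old_dir.is_dir():
        old_dir.rename(new_dir)

    # Only the files named in this document are touched (an assumption of this sketch).
    candidates = [repo_root / "Dockerfile"]
    if new_dir.is_dir():
        candidates.extend(sorted(new_dir.glob("values*.yaml")))

    changed = []
    for path in candidates:
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8")
        if old_name in text:
            path.write_text(text.replace(old_name, new_name), encoding="utf-8")
            changed.append(path)
    return changed
```

For example, `rename_runner(Path("."), "my-team-actions-runner")` run from the root of the copied repository would rename the Helm directory and update the values files and Dockerfile in place.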

Dockerfiles

It’s only the first set of lines that really needs to be changed to match the runner you’re installing

LABEL org.opencontainers.image.vendor="Ministry of Justice" \
      org.opencontainers.image.authors="HMPPS DPS" \
      org.opencontainers.image.title="Actions Runner" \
      org.opencontainers.image.description="Actions Runner image for HMPPS DPS" \
      org.opencontainers.image.url="https://github.com/ministryofjustice/hmpps-github-actions-runner"

Change the names as appropriate.

Note: Github requires Actions Runner versions to be kept up to date; if an old version is deployed, there is a good chance that it will be unable to register.

    ACTIONS_RUNNER_VERSION="2.321.0" \
    ACTIONS_RUNNER_PKG_SHA="ba46ba7ce3a4d7236b16fbe44419fb453bc08f866b24f04d549ec89f1722a29e"

Use the latest version of the runner and SHA from the Github Actions Runner releases page - the checksum will be the one corresponding to actions-runner-linux-x64
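Before updating the Dockerfile, the version and checksum pair can be checked locally. The sketch below is illustrative only: the helper names are hypothetical, and the download URL follows the naming pattern used on the Github Actions Runner releases page at the time of writing, which is an assumption worth confirming against the release you pick.

```python
import hashlib
from pathlib import Path


def runner_package_url(version: str) -> str:
    """Assumed URL of the linux-x64 runner tarball for a release version."""
    return (
        "https://github.com/actions/runner/releases/download/"
        f"v{version}/actions-runner-linux-x64-{version}.tar.gz"
    )


def sha256_matches(package: Path, expected_sha: str) -> bool:
    """Hash the downloaded file in chunks and compare with the published checksum."""
    digest = hashlib.sha256()
    with package.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha.strip().lower()
```

If `sha256_matches` returns False for the tarball you downloaded, the value of ACTIONS_RUNNER_PKG_SHA does not belong to that ACTIONS_RUNNER_VERSION.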

Secrets and variables

The following repository secrets are required within the newly created repository:

GH_APP_PRIVATE_KEY (generated within the settings of your Github App)
(this is the full multiline key - it doesn’t need to be in base64)

The following repository variables are required within the newly created repository:

GH_APP_ID (found within the settings of your Github App)
GH_REPOSITORY (the repository to which the runner will be associated)
RUNNER_LABELS (the labels that will be required to match workflows to a particular set of runners)
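To illustrate how labels tie workflows to runners: Github only sends a job to a self-hosted runner that carries every label listed in the job's `runs-on`. The sketch below models that rule. It assumes RUNNER_LABELS holds a comma-separated list (this document does not state the format), and the function names are hypothetical.

```python
def parse_labels(raw: str) -> set:
    """Split a comma-separated label string, ignoring blanks and surrounding spaces."""
    return {label.strip() for label in raw.split(",") if label.strip()}


def runner_can_take_job(runs_on: list, runner_labels: set) -> bool:
    """A runner is eligible only if it has every label the job asks for."""
    return set(runs_on) <= runner_labels


# Example: a runner registered with a single custom label.
labels = parse_labels("hmpps-github-actions-runner")
eligible = runner_can_take_job(["hmpps-github-actions-runner"], labels)
```

In practice this means a workflow that should use these runners needs a `runs-on` value made up only of labels present in RUNNER_LABELS (plus any default labels the runner registers with).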

Deployment

The Github Actions Runner is deployed using Helm to the Cloud Platforms environment.
The current deployment pipeline configuration only deploys to production - since the runners are associated with a repository
and not an environment, there’s no real point in running it in dev. However, if you want to swap to a dev namespace,
simply comment out the deploy_to_prod section of .github/workflows/pipeline.yaml and uncomment the deploy_to_dev section.

  # deploy_to_dev:
  #   name: Deploy to dev
  #   uses: ./.github/workflows/deploy.yml
  #   needs: build
  #   with:
  #     environment: development
  #     version: $
  #   secrets: inherit

  # Only need to deploy to production nowadays
  deploy_to_prod:
    if: github.ref == 'refs/heads/main'
    name: Deploy to prod
    uses: ./.github/workflows/deploy.yml
    needs:
      - build
    with:
      environment: production
      version: $
    secrets: inherit

Checking the deployment

Running kubectl get pods --namespace {your_namespace} will, if the deployments have been successful,
list the runners - these should also correspond to a set of runners in the repository to which they have been assigned,
which should be labelled ‘idle’ (look in settings/actions/runners within the repo site):

kubectl get pods
NAME                                           READY   STATUS      RESTARTS       AGE
.
.
hmpps-github-actions-runner-6f64d8f98f-6sdl4   1/1     Running     0              170m
hmpps-github-actions-runner-6f64d8f98f-bvlwj   1/1     Running     0              170m
.
.
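The same check can be automated by asking kubectl for JSON output. The following is a sketch under stated assumptions: the function names are hypothetical, kubectl is assumed to be installed and authenticated against the cluster, and every pod in the namespace is inspected, not only the runners.

```python
import json
import subprocess


def unhealthy_pods(pod_list: dict) -> list:
    """Names of pods that are not Running with every container ready."""
    unhealthy = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        containers = status.get("containerStatuses", [])
        ready = bool(containers) and all(c.get("ready") for c in containers)
        if status.get("phase") != "Running" or not ready:
            unhealthy.append(pod["metadata"]["name"])
    return unhealthy


def fetch_pods(namespace: str) -> dict:
    """Run kubectl and parse its JSON output (requires cluster credentials)."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "--namespace", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling `unhealthy_pods(fetch_pods("your_namespace"))` returns an empty list when all pods are running and ready.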

Troubleshooting

If the status of the pods isn’t Running, diagnose any issues using kubectl logs {pod_name} -
the top part of the log should list the variables that have been passed in,
and whether the access token has successfully retrieved a registration token:

Runner parameters:
  Repository: ministryofjustice/hmpps-project-bootstrap
  Runner Name: hmpps-github-actions-runner-6f64d8f98f-6sdl4
  Runner Labels: hmpps-github-actions-runner
Obtaining registration token
Checking if registration token exists
Registration token obtained successfully

Don’t worry too much if the status checks (Internet Connection, Github Actions Connection)
are marked as ‘FAIL’ - these appear to be benign errors.

If the first part succeeds, but the runner fails with an error like:

# Authentication

Http response code: NotFound from 'POST https://api.github.com/actions/runner-registration' (Request Id: AA80:3E8BD1:1057DCF:13105EE:67489A88)
{"message":"Not Found","documentation_url":"https://docs.github.com/rest","status":"404"}
Response status code does not indicate success: 404 (Not Found).

…this is a sign that the Github Access token (generated by the github-app-jwt-token action during deployment) does not have the required permissions to deploy a runner to the repository.

This may be because the Github App doesn’t have access to the repository (it will need to be added specifically - access to the entire organisation’s repositories is strongly discouraged),
or the Github App doesn’t have the right permissions to run the API call successfully.
As you’ll see from this document the permissions are many and complicated;
if more permissions are required than initially described in this document, please update it with the revised collection.
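One way to narrow down the first cause is to ask Github which repositories the access token can see. The sketch below uses the REST endpoint `GET /installation/repositories`, which lists the repositories available to an installation access token. The function names are hypothetical, and only the first page of results (up to 100 repositories) is requested.

```python
import json
import urllib.request


def installation_repositories_request(access_token: str) -> urllib.request.Request:
    """Build the request that lists repositories visible to an installation token."""
    return urllib.request.Request(
        "https://api.github.com/installation/repositories?per_page=100",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {access_token}",
        },
    )


def token_can_see(repository: str, payload: dict) -> bool:
    """Check whether 'owner/name' appears in the API response."""
    names = {repo["full_name"].lower() for repo in payload.get("repositories", [])}
    return repository.lower() in names


def check_access(repository: str, access_token: str) -> bool:
    """Call the API and report whether the repository is visible to the token."""
    with urllib.request.urlopen(installation_repositories_request(access_token)) as response:
        return token_can_see(repository, json.load(response))
```

If `check_access` returns False for the repository in GH_REPOSITORY, the Github App has not been granted access to it; if it returns True, the App's permissions are the more likely cause.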

Further development

While these runners are suitable for simple self-hosted operations on a single repository, there are other tools available, such as those mentioned here - for example Kaniko.
Organisation-based runners are a possibility, too, to reduce the idle time that would be faced by runners in a single repository.
To assist with this, there is also Actions Runner Controller, which can scale runners across an organisation.

APPENDIX 1: github-app-jwt-token

Reference document: Generating an installation access token for a Github App

With previous deployments, the create-github-app-token action hosted by Github was used to generate an access token to authenticate as a Github Installation.

However, this didn’t seem to be sufficient to allow the Github App to do things like deploying a runner to another repository.

Instead, I discovered this variant - jamestrousdale/github-app-jwt-token - which generates a JWT as well as an access token that has the permissions that are required.

This is therefore recommended when attempting to authenticate using a Github App.
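For background, the flow described in the reference document has two steps: build a short-lived JWT signed with the App's private key, then exchange it for an installation access token. The sketch below shows the shape of the first step and the URL of the second. It is not the implementation of jamestrousdale/github-app-jwt-token. The RS256 signature is delegated to a hypothetical `sign_rs256` callable (the Python standard library has no RSA signing, so a cryptography library would be needed), and all function names are assumptions.

```python
import base64
import json
import time
from typing import Callable, Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jwt_signing_input(app_id: str, now: Optional[int] = None) -> str:
    """Return 'header.claims' for a Github App JWT, ready to be signed."""
    issued = int(time.time()) if now is None else now
    header = {"alg": "RS256", "typ": "JWT"}
    # iat is backdated by 60s to allow for clock drift; exp stays within the 10-minute limit.
    claims = {"iat": issued - 60, "exp": issued + 540, "iss": app_id}
    parts = (header, claims)
    return ".".join(_b64url(json.dumps(p, separators=(",", ":")).encode()) for p in parts)


def build_jwt(app_id: str, sign_rs256: Callable[[bytes], bytes], now: Optional[int] = None) -> str:
    """Append the signature produced by the supplied (hypothetical) RS256 signer."""
    signing_input = jwt_signing_input(app_id, now)
    return f"{signing_input}.{_b64url(sign_rs256(signing_input.encode('ascii')))}"


def installation_token_url(installation_id: int) -> str:
    """POST here with 'Authorization: Bearer <jwt>' to obtain an access token."""
    return f"https://api.github.com/app/installations/{installation_id}/access_tokens"
```

The installation access token returned by that endpoint expires after an hour.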

APPENDIX 2: Runner clean-up

As detailed above, the runners are currently registered using an ephemeral access token, passed through an environment variable during deployment.

Consequently, if the Kubernetes pod fails once the token has expired (after an hour) and a new one is spun up, the registration token is no longer valid, so the new runner will fail to register.

Furthermore, if a new runner is deployed (for example when the version of the Github Actions runner is updated), the old runner will not be able to use the access token to deregister itself as it terminates, since the token isn’t valid by then, either.

Alternatives

Rather than sending a token at deployment time, the APP_ID and APP_PRIVATE_KEY could be sent, and a fresh token generated by the app at startup.
Sending this sort of privileged information may be considered a low security option.

The APP_ID and APP_PRIVATE_KEY could be assigned to secrets within the namespace, and once again, the runner could request a fresh token on start-up (or termination) each time.
This idea hasn’t been pursued because at this stage, having everything in Github is quite convenient.
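Under either alternative, the runner would need to request its own short-lived runner token once it holds a valid access token. A minimal sketch of that request follows, using the repository-level REST endpoints `registration-token` (for start-up) and `remove-token` (for termination). The function names are hypothetical, and nothing like this exists in the runner today.

```python
import json
import urllib.request

TOKEN_KINDS = ("registration-token", "remove-token")


def runner_token_request(repository: str, access_token: str, kind: str) -> urllib.request.Request:
    """Build the POST request for a registration or removal token."""
    if kind not in TOKEN_KINDS:
        raise ValueError(f"kind must be one of {TOKEN_KINDS}")
    return urllib.request.Request(
        f"https://api.github.com/repos/{repository}/actions/runners/{kind}",
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {access_token}",
        },
    )


def fetch_runner_token(repository: str, access_token: str, kind: str = "registration-token") -> str:
    """Call the API and return the short-lived token from the response."""
    with urllib.request.urlopen(runner_token_request(repository, access_token, kind)) as response:
        return json.load(response)["token"]
```

A start-up script could call `fetch_runner_token(repo, token)` before registering the runner, and a termination hook could call it with `kind="remove-token"` before deregistering it.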

The runner cleanup script

Removing offline runners is currently carried out by the actions/runner-cleanup action within the runner’s repository.
This authenticates using the access token, and runs a small Python script that creates a list of the offline runners associated with the repository referred to in the GH_REPOSITORY variable.

It’s run right at the end of the deployment stage.
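The general approach can be sketched as follows. This is not the script used by actions/runner-cleanup, only an illustration of the same idea using the REST endpoints for listing and deleting a repository's self-hosted runners. The function names are hypothetical, and only the first page of runners (up to 100) is considered.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def _headers(access_token: str) -> dict:
    return {
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {access_token}",
    }


def offline_runner_ids(payload: dict) -> list:
    """Pick out the IDs of runners that Github reports as offline."""
    return [r["id"] for r in payload.get("runners", []) if r.get("status") == "offline"]


def delete_runner_request(repository: str, runner_id: int, access_token: str) -> urllib.request.Request:
    """Build the DELETE request that removes one runner from the repository."""
    return urllib.request.Request(
        f"{API_ROOT}/repos/{repository}/actions/runners/{runner_id}",
        method="DELETE",
        headers=_headers(access_token),
    )


def remove_offline_runners(repository: str, access_token: str) -> list:
    """List the repository's runners, delete the offline ones, return their IDs."""
    list_request = urllib.request.Request(
        f"{API_ROOT}/repos/{repository}/actions/runners?per_page=100",
        headers=_headers(access_token),
    )
    with urllib.request.urlopen(list_request) as response:
        removed = offline_runner_ids(json.load(response))
    for runner_id in removed:
        urllib.request.urlopen(delete_runner_request(repository, runner_id, access_token)).close()
    return removed
```

Here `repository` would be the value of the GH_REPOSITORY variable, in `owner/name` form.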

This page was last reviewed on 28-Nov-2024, next review will be on 28-Feb-2025.