In 2022 I hacked together a hybrid setup for @SuperSeriousBot: keep GitHub’s managed x86 runners, bolt on my own arm64 box over SSH, and let buildx juggle them both. It was janky, but it delivered a 10x speedup over emulating arm64 locally.
Now that GitHub ships first-party arm64 runners, the obvious question: can I ditch the self-hosted machine and still smash my multi-arch build times?
Short answer: yes. The old workflow took 10 minutes 20 seconds. The new one lands at 2 minutes 45 seconds, with no extra hardware to babysit.
Where things stood
To support `linux/amd64` + `linux/arm64` images, I previously leaned on QEMU emulation and a single Buildx invocation. It looked like this:
```yaml
name: Release

on:
  push:
    branches:
      - "master"

jobs:
  publish:
    name: Build Docker Image
    runs-on: ubuntu-latest
    permissions:
      packages: write
      contents: read
    steps:
      - uses: actions/checkout@v4
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3
      - name: Set up Docker Buildx
        id: builder
        uses: docker/setup-buildx-action@v3
      - name: Login to GHCR
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: .
          platforms: linux/amd64,linux/arm64
          file: ./Dockerfile
          push: true
          tags: |
            ghcr.io/obviyus/gotm-remix:${{ github.sha }}
            ghcr.io/obviyus/gotm-remix:latest
          cache-from: type=gha
          cache-to: type=gha,mode=max
```
It got the job done, but every arm64 layer had to run under emulation. Even with aggressive caching (the `cache-to` export hit 1.2 GB at one point), each release still idled for ten minutes while QEMU ground through Bun + Remix builds.
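For comparison's sake, the emulated path is easy to reproduce locally; a minimal sketch, assuming Docker with Buildx, the `tonistiigi/binfmt` helper image to register QEMU handlers, and a throwaway tag:

```bash
# One-time: register QEMU binfmt handlers for arm64 on an x86 host
docker run --privileged --rm tonistiigi/binfmt --install arm64

# Build the arm64 image under emulation; every RUN instruction in the
# Dockerfile executes through QEMU's software translation layer
docker buildx build --platform linux/arm64 -t gotm-remix:arm64-local .
```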
Enter GitHub’s arm64 runners
GitHub now offers ARM-powered hosts via `runs-on: ubuntu-24.04-arm`. That means I can schedule a real `linux/arm64` job without self-hosting or SSH tunnels. Pair that with the existing x86 fleet and we can parallelise the build matrix properly.
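A quick way to convince yourself the label lands on native hardware is to print the machine architecture; a throwaway sketch (hypothetical workflow, nothing from the real setup):

```yaml
name: Arch check
on: workflow_dispatch
jobs:
  arch-check:
    runs-on: ubuntu-24.04-arm
    steps:
      - run: uname -m   # should print "aarch64" on a native arm64 host
```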
The upgraded workflow splits the heavy lifting into two stages: build each architecture on native metal, then stitch manifests together.
```yaml
name: Release

on:
  push:
    branches:
      - master

env:
  IMAGE: ghcr.io/obviyus/gotm-remix

jobs:
  build:
    strategy:
      fail-fast: false
      matrix:
        include:
          - platform: linux/amd64
            runner: ubuntu-latest
            artifact: linux-amd64
          - platform: linux/arm64
            runner: ubuntu-24.04-arm
            artifact: linux-arm64
    runs-on: ${{ matrix.runner }}
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v5
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/setup-buildx-action@v3
      - name: Build & push by digest
        id: build
        uses: docker/build-push-action@v6
        with:
          context: .
          file: ./Dockerfile
          platforms: ${{ matrix.platform }}
          outputs: type=image,name=${{ env.IMAGE }},push-by-digest=true,name-canonical=true,push=true
          cache-from: type=gha,scope=${{ github.ref_name }}-gotm-remix
          cache-to: type=gha,mode=max,scope=${{ github.ref_name }}-gotm-remix
          provenance: mode=max
          sbom: true
      - name: Export digest
        run: |
          mkdir -p ${{ runner.temp }}/digests
          echo "${{ steps.build.outputs.digest }}" | sed 's/^sha256://' | xargs -I{} touch "${{ runner.temp }}/digests/{}"
      - uses: actions/upload-artifact@v4
        with:
          name: digests-${{ matrix.artifact }}
          path: ${{ runner.temp }}/digests/*
          retention-days: 1

  merge:
    needs: build
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/setup-buildx-action@v3
      - uses: actions/download-artifact@v4
        with:
          path: ${{ runner.temp }}/digests
          pattern: digests-*
          merge-multiple: true
      - name: Create and push manifest
        working-directory: ${{ runner.temp }}/digests
        run: |
          docker buildx imagetools create \
            -t $IMAGE:latest \
            -t $IMAGE:${{ github.sha }} \
            $(printf "$IMAGE@sha256:%s " *)
      - name: Inspect
        run: docker buildx imagetools inspect $IMAGE:latest
```
Why this is faster
- Native arm64 silicon — Bun's compiler and Vite's asset pipeline execute on Neoverse cores instead of QEMU's TCG interpreter, so every syscall and file watch stays in-kernel rather than jumping through software translation.
- Parallel execution — BuildKit instances run on separate runners, letting `docker/build-push-action` fan out the layer graph; pushes complete once per architecture with no cross-platform coordination inside a single builder.
- Digest-first publishing — `push-by-digest=true` writes OCI payloads once per platform and defers manifest creation; the merge step replays `imagetools create` against cached digests, so a cache miss on arm64 no longer forces a full multi-arch upload (a standalone sketch follows this list).
- Scoped cache — cache exporters are keyed to `${{ github.ref_name }}-gotm-remix`, keeping branch builds sandboxed and retaining the hot path for `master` instead of thrashing a global cache namespace.
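To make the digest-first flow concrete, here is roughly the same sequence as plain CLI calls outside Actions; a sketch assuming a logged-in `ghcr.io` session, with `<amd64-digest>` and `<arm64-digest>` as placeholders for the values each build prints:

```bash
IMAGE=ghcr.io/obviyus/gotm-remix

# Push each platform's layers and config by digest only; no tag exists yet
docker buildx build --platform linux/amd64 \
  --output "type=image,name=$IMAGE,push-by-digest=true,name-canonical=true,push=true" .
docker buildx build --platform linux/arm64 \
  --output "type=image,name=$IMAGE,push-by-digest=true,name-canonical=true,push=true" .

# Stitch the two per-platform digests into a single tagged manifest list
docker buildx imagetools create -t "$IMAGE:latest" \
  "$IMAGE@sha256:<amd64-digest>" \
  "$IMAGE@sha256:<arm64-digest>"
```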
The end result is a 3.7x improvement (for this specific case!) versus the already-optimised 2022 setup—without the maintenance overhead of a bespoke runner.
Caveats worth noting
- The arm64 pool is still smaller than the x86 fleet. Queue times have been fine so far, but I expect longer waits around big releases or US daytime.
- `docker/build-push-action@v6` leans on newer BuildKit features (SBOM and provenance attestations) that older registries and tooling might not expect; there's a sketch of the opt-outs after this list.
- The final artifact juggling looks awkward, but it keeps things clean: each job uploads just the digest string, so the merge job only needs a login to the target registry and the digests; it doesn't need the per-arch builders or local images.
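If those attestations upset an older registry or downstream tooling, both have explicit opt-outs on the same action; a minimal sketch of the relevant inputs:

```yaml
- uses: docker/build-push-action@v6
  with:
    context: .
    push: true
    provenance: false   # skip the provenance attestation
    sbom: false         # skip SBOM generation
```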