#18228 · @jerome-benoit · opened Mar 19, 2026 at 11:17 AM UTC · last updated Mar 21, 2026 at 11:29 AM UTC

fix: add concurrency control to nix-hashes workflow

appfix
82
+4 −0 · 1 file

Score breakdown

Impact: 9.0
Clarity: 10.0
Urgency: 9.0
Ease Of Review: 9.0
Guidelines: 9.0
Readiness: 9.0
Size: 10.0
Trust: 10.0
Traction: 2.0

Summary

This PR fixes a critical race condition in the nix-hashes workflow, which previously caused stale hashes and build failures by overwriting correct hash values with older ones. The fix implements workflow-level concurrency control to ensure only the most recent run's hashes are committed.

Description

Issue for this PR

Closes #18227

Type of change

  • [x] Bug fix
  • [ ] New feature
  • [ ] Refactor / code improvement
  • [ ] Documentation

What does this PR do?

Adds a workflow-level concurrency group with cancel-in-progress: true to the nix-hashes workflow to prevent a race condition where two concurrent runs can overwrite each other's hashes.

The problem: When two pushes happen in quick succession (e.g. two commits within 2 minutes that both modify bun.lock or package.json), both trigger a nix-hashes run. The 4 matrix compute-hash jobs run at different speeds across platforms (darwin runners are significantly slower than linux), so the runs do not necessarily finish in the order they started. Whichever update-hashes job finishes last commits its hashes last, and those hashes are not necessarily from the most recent commit.

Concrete evidence from 2026-03-18: Run 1 (SHA 81be5449, triggered 23:52) and Run 2 (SHA 5d2f8d77, triggered 23:54) ran concurrently. Run 2's update-hashes completed at 00:07:35, but Run 1's update-hashes completed at 00:08:33 — overwriting Run 2's correct hashes with stale ones from an older commit. This left x86_64-linux with hash sha256-yfA50QKqylmaioxi+6d++W8Xv4Wix1hl3hEF6Zz7Ue0= when the correct value is sha256-b0IXNtTj5geRLZGtCI5DxOXyqBJoxuwVf++bUgY3dco=.
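The timeline above can be replayed as a minimal last-writer-wins sketch. The SHAs and clock times come from the evidence above; the `Run` class and the two helper functions are hypothetical names used only for illustration. The model assumes each update-hashes job simply commits over whatever is on the branch when it finishes, and that the 00:0x completions fall on the day after the 23:5x triggers.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Run:
    sha: str
    triggered: timedelta  # offset from midnight of the trigger day
    finished: timedelta   # when update-hashes completed (next day, hence days=1)


# Values taken from the evidence above.
run1 = Run("81be5449", timedelta(hours=23, minutes=52),
           timedelta(days=1, minutes=8, seconds=33))
run2 = Run("5d2f8d77", timedelta(hours=23, minutes=54),
           timedelta(days=1, minutes=7, seconds=35))


def final_committer(runs):
    """Without concurrency control, the run that finishes last overwrites the rest."""
    return max(runs, key=lambda r: r.finished)


def newest(runs):
    """The run whose hashes should win: the one triggered last."""
    return max(runs, key=lambda r: r.triggered)


winner = final_committer([run1, run2])
expected = newest([run1, run2])
print(winner.sha, expected.sha, winner is expected)
```

In this model the hashes left on the branch belong to Run 1 even though Run 2 was triggered later, which matches the stale x86_64-linux hash reported above.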

The fix: concurrency.cancel-in-progress: true at workflow level cancels the entire older run (matrix jobs + update-hashes) when a newer push triggers the workflow. Combined with the existing git pull --rebase defense in the commit step, this eliminates the race condition. Workflow-level (not job-level) concurrency is used because job-level concurrency on update-hashes alone would still allow stale matrix results to queue up.
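The effect of workflow-level cancellation can be sketched with a small model. This is not how GitHub Actions is implemented; `runs_reaching_commit`, `cancel_older`, and the `GROUP` placeholder are hypothetical names, and the model assumes every earlier run in a group is still in progress when the next one is triggered.

```python
def runs_reaching_commit(triggers, cancel_older):
    """Return, per concurrency group, the SHAs whose runs are not cancelled.

    `triggers` is a list of (sha, group) pairs in trigger order.
    cancel_older=False models the workflow before this PR (no concurrency
    block); cancel_older=True models cancel-in-progress: true.
    """
    alive = {}
    for sha, group in triggers:
        if cancel_older:
            alive[group] = [sha]  # the newer run cancels the in-progress one
        else:
            alive.setdefault(group, []).append(sha)  # both keep running
    return alive


GROUP = "example-group"  # placeholder for the evaluated group expression
triggers = [("81be5449", GROUP), ("5d2f8d77", GROUP)]

before_fix = runs_reaching_commit(triggers, cancel_older=False)
after_fix = runs_reaching_commit(triggers, cancel_older=True)
print(before_fix[GROUP])
print(after_fix[GROUP])
```

Before the fix both runs can reach the commit step, so their finishing order decides the result. With cancellation only the newest run in the group survives, so finishing order no longer matters.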

How did you verify your code works?

  • Audited the GitHub Actions concurrency model documentation: workflow-level cancel-in-progress sends SIGINT to all running jobs, and needs: dependency prevents update-hashes from starting if compute-hash is cancelled
  • Verified the concurrency.group uses github.workflow + github.ref so dev and beta runs don't cancel each other
  • Confirmed workflow_dispatch triggers are also covered by the concurrency group (same workflow name + ref)
  • Cross-validated against production patterns in other repos using matrix → aggregate → commit workflows
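As a rough illustration of the second and third checks, the group expression can be evaluated by hand for each branch. The workflow name `nix-hashes` is an assumption here (`github.workflow` resolves to whatever `name:` the workflow file declares), and `concurrency_group` is a hypothetical helper that only mirrors the string interpolation.

```python
def concurrency_group(workflow: str, ref: str) -> str:
    # Mirrors the expression ${{ github.workflow }}-${{ github.ref }}
    return f"{workflow}-{ref}"


WORKFLOW = "nix-hashes"  # assumed workflow name, not confirmed by the diff

push_dev = concurrency_group(WORKFLOW, "refs/heads/dev")
push_beta = concurrency_group(WORKFLOW, "refs/heads/beta")
# A workflow_dispatch run started on dev carries the same ref as a push to dev.
dispatch_dev = concurrency_group(WORKFLOW, "refs/heads/dev")

print(push_dev)
print(push_beta)
print(push_dev == dispatch_dev)
```

Runs on dev and beta land in different groups and leave each other alone, while a manual dispatch on dev shares a group with pushes to dev and is subject to the same cancellation.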

Screenshots / recordings

N/A — CI workflow change only.

Checklist

  • [x] I have tested my changes locally
  • [x] I have not included unrelated changes in this PR

Linked Issues

#18227 Race condition in nix-hashes workflow causes stale hashes

Comments

PR comments

b0o

+1. I think this might explain why certain opencode nix builds have been failing recently for me.

Changed Files

.github/workflows/nix-hashes.yml

+4 −0
@@ -17,6 +17,10 @@ on:
       - "patches/**"
       - ".github/workflows/nix-hashes.yml"

+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+
 jobs:
   # Native runners required: bun install cross-compilation flags (--os/--cpu)
   # do not produce byte-identical node_modules as native installs.