The Starting Point
When you're working with the DICOM standard, you end up accumulating a collection of related libraries and tools. modalware keeps the following repositories as forks for reference and development:
| Repository | What it is |
|---|---|
| modalware/dicom-validator-ts | TypeScript DICOM validator |
| modalware/cornerstone3D | Web-based medical image viewer |
| modalware/dcmjs | DICOM for the browser |
| modalware/dwv | DICOM Web Viewer |
| modalware/dcmtk | The classic DICOM toolkit |
| modalware/dicomweb-client | DICOMweb client library |
| modalware/dicom-validator | DICOM conformance validator |
| modalware/pydicom | Python DICOM library |
| modalware/pynetdicom | Python DICOM networking |
| modalware/dicomParser | Lightweight DICOM parser |
| modalware/DVTk | DICOM Validation Toolkit |
Left unattended, forks drift from their upstreams. Comparing PRs or reading the latest source becomes increasingly awkward. The manual alternative — opening each repo, clicking "Sync fork," repeating — doesn't scale.
The solution: a dedicated management repository, modalware/fork-sync, that handles everything through GitHub Actions. The forks themselves stay untouched.
Part 1: Syncing Real Forks
gh repo sync --force
GitHub CLI ships a gh repo sync command that aligns a fork with its upstream in one shot. The --force flag matters here: without it, the command refuses to overwrite if the fork has any divergent commits. Since these repos are meant to be pure mirrors, force is the right default.
gh repo sync modalware/dicom-validator-ts --force
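Before wiring this into a workflow, the same command can be tried from a local shell across several repositories. The following is a minimal sketch, assuming gh is installed and authenticated; sync_forks and the DRY_RUN switch are hypothetical names used only for illustration, and by default the function just prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: sync several forks from a local shell.
# sync_forks and DRY_RUN are hypothetical names, not part of gh.
set -eu

sync_forks() {
  local owner="$1"; shift
  local repo
  for repo in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      # Dry run: print the command instead of running it
      echo "gh repo sync $owner/$repo --force"
    else
      # Keep going if one repository fails to sync
      gh repo sync "$owner/$repo" --force || echo "sync failed: $owner/$repo" >&2
    fi
  done
}

sync_forks modalware dicom-validator-ts dcmjs pydicom
```

Setting DRY_RUN=0 runs the syncs for real. The repository names could also come from gh repo list modalware --fork --json name --jq '.[].name' rather than a hard-coded list.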
The workflow
Eleven repositories, one workflow, matrix strategy. fail-fast: false ensures that if one sync fails (say, a transient network issue), the other ten still run.
# .github/workflows/sync-forks.yml
name: Sync Upstream Forks
on:
  schedule:
    - cron: '0 2 * * *' # Daily at UTC 02:00 (JST 11:00)
  workflow_dispatch: # Manual trigger available
jobs:
  sync:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        repo:
          - dicom-validator-ts
          - cornerstone3D
          - dcmjs
          - dwv
          - dcmtk
          - dicomweb-client
          - dicom-validator
          - pydicom
          - pynetdicom
          - dicomParser
          - DVTk
      fail-fast: false
      max-parallel: 4
    name: Sync ${{ matrix.repo }}
    steps:
      - name: Sync fork with upstream
        run: gh repo sync modalware/${{ matrix.repo }} --force
        env:
          GH_TOKEN: ${{ secrets.SYNC_TOKEN }}
GITHUB_TOKEN — the default token a workflow gets automatically — only has write access to its own repository. Writing to other repos in the organization requires a Personal Access Token, stored here as SYNC_TOKEN.
The Workflows scope you'll miss if you don't look for it
The first test run failed on most repositories with:
Upstream commits contain workflow changes, which require the `workflow` scope
or permission to merge.
A Fine-grained PAT with only Contents: Write isn't enough. When upstream commits touch .github/workflows/ files, GitHub requires an additional Workflows permission. This separation is intentional: workflow files define what code runs in CI, so GitHub draws a deliberate permission boundary around them. Easy to miss because most token documentation doesn't highlight it.
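Whether a given sync will run into this boundary can be checked ahead of time by looking at which paths the incoming commits touch. The sketch below is one way to do that, not something the fork-sync repository is known to contain; touches_workflows is a hypothetical helper that reads changed paths from stdin, one per line.

```shell
#!/usr/bin/env bash
# Sketch: report whether a list of changed paths includes workflow files.
# touches_workflows is a hypothetical helper name.
set -eu

touches_workflows() {
  # Succeeds if at least one path on stdin starts with .github/workflows/
  grep '^\.github/workflows/' >/dev/null
}

if printf '%s\n' 'README.md' '.github/workflows/ci.yml' | touches_workflows; then
  echo "sync would need the Workflows permission"
fi
```

In a local clone with an upstream remote, the path list could come from something like git diff --name-only HEAD upstream/main; the remote and branch names there are assumptions.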
Part 2: The Repo That Wasn't a Fork
The organization has a twelfth repository, dicom3tools, which turned out not to be a GitHub fork at all. It was created by manually extracting an upstream tarball and pushing the contents. Running gh repo sync against it made that clear immediately:
can't determine source repository for modalware/dicom3tools because repository is not fork
What dicom3tools is
dicom3tools is a suite of DICOM utilities maintained by David Clunie, who has been involved in shaping the DICOM standard itself for decades. There is no official GitHub repository. Instead, snapshots are published on his site as .tar.bz2 files with timestamps baked into the filename:
dicom3tools_1.00.snapshot.20250525134203.tar.bz2
dicom3tools_1.00.snapshot.20250526102624.tar.bz2
dicom3tools_1.00.snapshot.20260320044638.tar.bz2
modalware/dicom3tools started from one of these snapshots. The goal is to keep it current as new snapshots appear — each one becoming its own commit tagged 1.00.snapshot.YYYYMMDDHHMMSS.
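The mapping from snapshot filename to tag and timestamp is plain string trimming. As a minimal sketch using shell parameter expansion (snapshot_tag and snapshot_ts are hypothetical helper names):

```shell
#!/usr/bin/env bash
# Sketch: derive the git tag and the timestamp from a snapshot filename.
# snapshot_tag and snapshot_ts are hypothetical helper names.
set -eu

snapshot_tag() {
  local name="$1"
  name="${name#dicom3tools_}"   # drop the fixed prefix
  name="${name%.tar.bz2}"       # drop the archive extension
  printf '%s\n' "$name"
}

snapshot_ts() {
  local tag
  tag="$(snapshot_tag "$1")"
  printf '%s\n' "${tag##*.}"    # keep only what follows the last dot
}

snapshot_tag dicom3tools_1.00.snapshot.20250525134203.tar.bz2   # 1.00.snapshot.20250525134203
snapshot_ts  dicom3tools_1.00.snapshot.20250525134203.tar.bz2   # 20250525134203
```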
The snapshot workflow
The approach: scrape the index page for available snapshot filenames, compare against the latest git tag in the repository, download and apply any newer ones in chronological order.
# .github/workflows/sync-dicom3tools.yml
name: Sync dicom3tools Snapshots
on:
  schedule:
    - cron: '0 3 1 * *' # Monthly on the 1st at UTC 03:00 (JST 12:00)
  workflow_dispatch:
jobs:
  sync:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout dicom3tools
        uses: actions/checkout@v4
        with:
          repository: modalware/dicom3tools
          token: ${{ secrets.SYNC_TOKEN }}
          fetch-depth: 0
      - name: Find and apply new snapshots
        shell: bash
        run: |
          set -euo pipefail
          BASE_URL="https://dclunie.com/dicom3tools/workinprogress"
          # Get the timestamp of the latest tag already in the repo
          LATEST_TS=$(git tag --sort=version:refname | grep -oE '[0-9]{14}' | tail -1 || echo "")
          # Scrape available snapshots from the index page
          SNAPSHOTS=$(curl -sf "$BASE_URL/index.html" \
            | grep -oE 'dicom3tools_1\.00\.snapshot\.[0-9]{14}\.tar\.bz2' \
            | sort -u)
          # Keep only snapshots newer than the latest tag
          NEW_SNAPSHOTS=""
          while IFS= read -r snap; do
            TS=$(echo "$snap" | grep -oE '[0-9]{14}')
            if [ -z "$LATEST_TS" ] || [[ "$TS" > "$LATEST_TS" ]]; then
              NEW_SNAPSHOTS+="$snap"$'\n'
            fi
          done <<< "$SNAPSHOTS"
          # grep exits 1 when nothing is left; "|| true" keeps that from
          # aborting the run under pipefail when there are no new snapshots
          NEW_SNAPSHOTS=$(printf '%s' "$NEW_SNAPSHOTS" | { grep -v '^$' || true; } | sort)
          [ -z "$NEW_SNAPSHOTS" ] && echo "No new snapshots." && exit 0
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          while IFS= read -r snap; do
            [ -z "$snap" ] && continue
            # Tag name is the filename without prefix and extension
            TAG=$(echo "$snap" | sed 's/^dicom3tools_//; s/\.tar\.bz2$//')
            curl -fL --retry 3 --retry-delay 10 --max-time 120 \
              -o "$snap" "$BASE_URL/$snap"
            # Temporarily disable pipefail — see note below
            set +o pipefail
            TOP_DIR=$(tar tf "$snap" 2>/dev/null | head -1 | sed 's|/.*||')
            set -o pipefail
            # Replace the working tree with the snapshot contents
            git rm -rf --quiet . || true
            tar xf "$snap"
            rm -f "$snap"
            # Flatten the archive's top-level directory into the repo root
            if [ -n "$TOP_DIR" ] && [ -d "$TOP_DIR" ]; then
              shopt -s dotglob nullglob
              mv "$TOP_DIR"/* . 2>/dev/null || true
              rmdir "$TOP_DIR" 2>/dev/null || true
              shopt -u dotglob nullglob
            fi
            git add -A
            git commit -m "snapshot: $TAG"
            git tag "$TAG"
          done <<< "$NEW_SNAPSHOTS"
          git push origin HEAD --tags
The first run caught up ten snapshots in one go. Subsequent monthly runs will pick up only what's newer than the latest tag, so reruns are safe.
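The "only what's newer than the latest tag" rule can be looked at in isolation. Below is a minimal sketch of that filter, written as a standalone function rather than copied from the workflow: newer_snapshots is a hypothetical name, and it compares the 14-digit timestamps numerically where the workflow uses a string comparison. Both give the same ordering because the timestamps have a fixed width.

```shell
#!/usr/bin/env bash
# Sketch: keep only snapshot filenames newer than the latest known timestamp.
# newer_snapshots is a hypothetical helper; filenames arrive on stdin.
set -eu

newer_snapshots() {
  local latest="$1" snap ts
  while IFS= read -r snap; do
    [ -n "$snap" ] || continue
    ts="${snap%.tar.bz2}"   # strip the extension
    ts="${ts##*.}"          # keep the trailing 14-digit timestamp
    # An empty "latest" means no tags yet, so everything counts as new
    if [ -z "$latest" ] || [ "$ts" -gt "$latest" ]; then
      printf '%s\n' "$snap"
    fi
  done | sort               # fixed-width timestamps sort chronologically
}

printf '%s\n' \
  dicom3tools_1.00.snapshot.20260320044638.tar.bz2 \
  dicom3tools_1.00.snapshot.20250525134203.tar.bz2 \
  dicom3tools_1.00.snapshot.20250526102624.tar.bz2 \
  | newer_snapshots 20250526102624
# prints dicom3tools_1.00.snapshot.20260320044638.tar.bz2
```

Running the filter again once the newest snapshot has been tagged yields nothing, which is what makes a rerun a no-op.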
The pipefail trap
Three consecutive runs failed with exit code 2. The download was succeeding — curl's progress bar confirmed the full 1 MB received — but the very next step fell over:
tar: stdout: write error
##[error]Process completed with exit code 2.
The culprit was this line:
TOP_DIR=$(tar tf "$snap" | head -1 | sed 's|/.*||')
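head -1 reads one line and exits, closing the read end of the pipe. tar is still writing the rest of the file listing, so its next write hits a broken pipe. Normally that delivers SIGPIPE and kills the writer outright (exit status 141); the tar: stdout: write error message suggests the signal was ignored in this environment, so the write failed with EPIPE instead and tar exited with its own error code, 2. Either way, that stage exits non-zero. With set -o pipefail active, a non-zero exit from any stage of a pipeline fails the whole pipeline. The command substitution inherits pipefail, so under set -e the failed assignment kills the script.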
The fix is to bracket the offending pipeline with a temporary pipefail suspension:
set +o pipefail
TOP_DIR=$(tar tf "$snap" 2>/dev/null | head -1 | sed 's|/.*||')
set -o pipefail
This pattern applies to any pipeline whose reader exits before the writer is done: head -N, grep -q, awk 'NR==1{exit}'. If pipefail is on, that early exit will kill your script unless you guard around it.
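An alternative to toggling pipefail is to use a reader that consumes the whole stream, so the writer never sees a closed pipe. The sketch below is that alternative, not the fix the workflow uses; top_dir is a hypothetical helper, and the demo archive is a throwaway created only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: find an archive's top-level directory with pipefail left on.
# top_dir is a hypothetical helper name.
set -euo pipefail

top_dir() {
  # awk prints the first path component of line 1 but keeps reading to the
  # end of input, so tar never writes into a closed pipe
  tar tf "$1" | awk -F/ 'NR==1 { print $1 }'
}

# Demo with a throwaway archive
work=$(mktemp -d)
mkdir -p "$work/demo-1.0/src"
touch "$work/demo-1.0/src/a.c" "$work/demo-1.0/README"
tar cf "$work/demo.tar" -C "$work" demo-1.0
top_dir "$work/demo.tar"   # prints demo-1.0
rm -rf "$work"
```

The trade-off is that the reader processes the full listing instead of stopping after one line, which is negligible for an archive of this size.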
Setting Up the Personal Access Token
Both workflows need write access to repositories in the modalware organization. The right tool is a Fine-grained Personal Access Token.
Token configuration
| Setting | Value |
|---|---|
| Token name | modalware-fork-sync (or anything descriptive) |
| Resource owner | modalware — select the organization, not your personal account |
| Repository access | Only select repositories → pick all 12 target repos |
| Contents | Read and write |
| Workflows | Read and write |
Registering the secret
gh secret set SYNC_TOKEN --repo modalware/fork-sync --body "github_pat_..."
Or add it manually via the repository's Settings → Secrets and variables → Actions.
Cost
GitHub Actions pricing in brief:
| Condition | Cost |
|---|---|
| Public repositories | Free, unlimited |
| Private repositories | 2,000 minutes/month free; $0.008/minute beyond that |
fork-sync is a public repository, so all Actions runs are free. In practice, syncing all eleven forks takes around 16 seconds. The monthly dicom3tools workflow adds a handful of seconds on months where new snapshots appear. Total compute cost: zero.
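For comparison, here is a rough estimate of what the same schedule would consume if fork-sync were private. This is a sketch under stated assumptions: each job is billed in whole minutes, rounded up, so a job that finishes in a few seconds still counts as one minute, and every job here fits within one minute. billed_minutes is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: estimate billed minutes per month for a private repository.
# Assumption: each job is billed as whole minutes, rounded up per job.
set -eu

# billed_minutes JOBS_PER_RUN MINUTES_PER_JOB RUNS_PER_MONTH (hypothetical helper)
billed_minutes() {
  echo $(( $1 * $2 * $3 ))
}

daily=$(billed_minutes 11 1 30)    # eleven matrix jobs, once a day
monthly=$(billed_minutes 1 1 1)    # one dicom3tools job, once a month
echo "estimated: $(( daily + monthly )) of 2000 free minutes"
```

Under those assumptions the total stays well inside the free allowance even for a private repository.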
The Short Version
- Put the automation in a dedicated management repo; leave the target repos untouched.
- gh repo sync owner/repo --force handles any genuine GitHub fork in one step.
- A Fine-grained PAT needs both Contents: Write and Workflows: Write. The workflow scope is a separate permission boundary, and upstreams frequently have .github/ changes.
- Repos created from manually pushed tarballs aren't forks. For those, scrape the upstream source, compare timestamps against git tags, download and commit in order.
- set -o pipefail + head -1 = SIGPIPE = exit code 2. Wrap with set +o pipefail / set -o pipefail around any pipeline you intentionally truncate.
The full source is at modalware/fork-sync.
About Tatsuhiko Arai (新井 竜彦)
Embedded software engineer (Qt, C/C++, Python). Medical imaging (DICOM) contractor. AWS All Certifications Engineer – Japan (2024–2025).

