Forgejo setup — Phase 6A¶
Stand up self-hosted Forgejo on a dedicated LXC, front it with Caddy, migrate four GitHub repos with push-mirrors going back to GitHub, and tie in Discord notifications + docs. Forgejo becomes the source of truth; GitHub stays as a private mirror.
Phase 6A complete — 2026-05-04 / 2026-05-05
Forgejo 11.0.13 live at https://git.rampancy.cloud (CT 109, .249) — fronted by Caddy on edge (CT 107), covered by CrowdSec, behind the wildcard *.rampancy.cloud LE cert. CF DNS token Rolled (closed the 7D leak follow-up). All 4 repos (arrstack, homelab-ansible, mediabot, meat-helmet) imported with full history and configured with per-repo push-mirrors back to GitHub (fine-grained PATs, sync_on_commit: true). Local origins flipped on WSL; github retained as fallback remote. Discord push webhooks deferred as low-value for single-operator setup (procedure preserved below for future collaboration).
Stages¶
| Stage | Scope | Hold points |
|---|---|---|
| 6A.1 | LXC stand-up + role apply | After pct create; before role apply; before declaring done |
| 6A.2 | Edge integration + CF DNS token rotation | Before old token revocation (irreversible) |
| 6A.3 | Repo migrations + push mirrors | Per-repo PAT verification; before flipping local remotes |
| 6A.4 | Discord webhook + docs sweep | None high-risk |
Cross-phase decisions¶
- Deployment: native LXC (not Docker). Matches CT 105 / 106 / 107 precedent — one role per LXC.
- DB: SQLite. Switch to postgres only if scale ever demands it (won't for single-user).
- SSH push: deferred. HTTPS + PAT only.
- OIDC: deferred to Phase 7E (Pocket-ID). Local accounts now.
- Forgejo Actions: deferred. meat-helmet's GH-side Actions still fire on the mirrored push.
- PATs: 4 fine-grained GitHub PATs (one per repo), each scoped Contents: Read and write + Metadata: Read on its single target repo. (GitHub fine-grained PATs only offer No access / Read / Read+Write, with no write-only tier; on a single-repo-scoped token the practical blast radius is the same.) Lowest blast radius achievable.
- Mirror interval: 1h fleet-wide (Forgejo also pushes on each commit; the interval is fallback).
Pre-flight gates¶
- CT 109 ID is free (pct list on proxfold)
- 192.168.1.249 is free (was VM 102 / nginx, decom'd 2026-05-03)
- vault_forgejo_admin_password added to vault (used in wizard, stored for rebuild docs)
- 4 fine-grained GitHub PATs created (vault_github_pat_mirror_arrstack, _homelab_ansible, _mediabot, _meat_helmet); each scoped Contents: Read and write + Metadata: Read on its one repo
- CF DNS token rotation plan understood (issue new → push → validate → revoke old; never zero-token window)
Stage 6A.1 — LXC stand-up¶
Create the LXC on proxfold¶
Mirrors CT 107 (edge) sizing — bump rootfs to 8 GB for repo headroom:
# on proxfold
pct create 109 local:vztmpl/debian-13-standard_13.1-2_amd64.tar.zst \
--hostname forgejo \
--cores 1 --memory 1024 --swap 512 \
--rootfs local-zfs:8 \
--net0 name=eth0,bridge=vmbr0,ip=192.168.1.249/24,gw=192.168.1.1 \
--features nesting=1 \
--unprivileged 1 \
--onboot 1 \
--ssh-public-keys /root/.ssh/authorized_keys \
--start 1
Hold point — confirm CT reachable¶
pct exec 109 -- hostname # → forgejo
ssh root@192.168.1.249 'hostname' # → forgejo (via SSH key inherited from host's authorized_keys)
Apply the role from CT 104¶
Per the drift-checks-from-CT104 rule:
# on CT 104
cd ~/homelab-ansible
ansible-playbook playbooks/forgejo.yml --check --diff --limit forgejo # preview
ansible-playbook playbooks/forgejo.yml --limit forgejo # apply
Hold point — service up¶
ssh root@192.168.1.249 'systemctl status forgejo' # active (running)
ssh root@192.168.1.249 'ss -tlnp | grep 3000' # forgejo listening
curl -sI http://192.168.1.249:3000/ | head -1 # 200 or 302 to /install
First-run wizard¶
Open http://192.168.1.249:3000 in a browser:
- Database: SQLite (default).
- Server settings: leave paths default. Forgejo Base URL: http://192.168.1.249:3000/ (flip to https://git.rampancy.cloud/ in Stage 6A.2).
- Optional settings → Admin Account Settings: set username and paste password from vault_forgejo_admin_password.
- Install Forgejo.
Validate 6A.1 done¶
- Web UI loads at http://192.168.1.249:3000/
- Admin login works
- forgejo doctor check --all clean (pct exec 109 -- su forgejo -s /bin/bash -c "forgejo doctor check --all -c /etc/forgejo/app.ini"; the Debian 13 template ships no sudo)
- Test repo created via UI; git clone http://192.168.1.249:3000/<admin>/test.git from CT 104 succeeds
- Re-run of role under --check --diff reports changed=0
Escape¶
Stage 6A.2 — Edge + CF DNS token rotation¶
Add the vhost¶
In inventory/host_vars/edge.yml, append to caddy_proxy_hosts:
- name: git
fqdn: git.rampancy.cloud
upstream_scheme: http
upstream_host: 192.168.1.249
upstream_port: 3000
Apply with ansible-playbook playbooks/edge.yml --limit edge --tags caddy. Reload-caddy handler fires; wildcard cert covers the new vhost — no new cert request.
Add CF DNS record¶
CF dashboard → DNS → Records → Add record:
| Field | Value |
|---|---|
| Type | CNAME |
| Name | git |
| Target | rampancy.cloud |
| Proxy status | DNS only (gray cloud) |
| TTL | Auto |
Mirrors the existing per-subdomain CNAME pattern (kosync, n8n, dash, requests). No wildcard A record exists at the zone — each subdomain is added individually.
Roll the CF DNS token (the irreversible bit)¶
CF dashboard → My Profile → API Tokens → existing Caddy ... token → ⋯ → Roll. Atomic swap: same token ID + scopes, new secret. Old secret invalidated immediately.
Why Roll over create-new-then-revoke:
- Faster invalidation of the leaked value (the whole reason for rotation). Roll = zero-overlap window. Create-new-then-revoke leaves the leaked secret valid for the entire validation period.
- One CF action instead of two; same audit trail.
- Tradeoff: no overlap safety net. Mitigated by current cert validity (~50+ days), so we have weeks to recover if the new token is somehow misconfigured.
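The "weeks to recover" margin in the tradeoff above can be checked rather than estimated. A minimal sketch, assuming the certificate's notAfter string has already been read (for example from the notAfter field of ssl.SSLSocket.getpeercert()); the dates below are hypothetical values for illustration only:

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` and a certificate notAfter string.

    `not_after` uses the getpeercert() format, e.g. "Jun 24 12:00:00 2026 GMT".
    """
    expiry = ssl.cert_time_to_seconds(not_after)  # seconds since the epoch (UTC)
    return (expiry - now.timestamp()) / 86400


# Hypothetical dates, not taken from the real certificate.
now = datetime(2026, 5, 5, 12, 0, 0, tzinfo=timezone.utc)
print(round(days_until_expiry("Jun 24 12:00:00 2026 GMT", now)))  # → 50
```

If the number is small, the overlap-free Roll carries more risk and the create-new-then-revoke path may be the safer choice.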
Capture the new token to /dev/shm/cf_token (RAM, mode 0600) so it never appears in chat:
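A minimal sketch of one way to do that capture, assuming Python 3 on the control host. The helper names write_secret and capture_token are hypothetical, not part of the repo. The file is created with mode 0600 from the start, and getpass reads the pasted value without echoing it:

```python
import getpass
import os


def write_secret(path: str, secret: str) -> None:
    """Write `secret` (trailing whitespace stripped) to `path` with mode 0600."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.fchmod(fd, 0o600)  # tighten the mode if the file already existed
        os.write(fd, secret.strip().encode())
    finally:
        os.close(fd)


def capture_token(path: str = "/dev/shm/cf_token") -> None:
    """Prompt for the token on the terminal (no echo) and store it."""
    write_secret(path, getpass.getpass("CF token: "))
```

Call capture_token() from an interactive session and paste the value at the prompt; nothing is written to shell history or scrollback.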
Substitute in vault (no-leak script)¶
The new value substitutes the existing vault_caddy_cf_dns_token: line:
cd ~/homelab-ansible
TMPV=/dev/shm/v.tmp
TMPV2=/dev/shm/v.tmp.new
trap "shred -u $TMPV $TMPV2 2>/dev/null || rm -f $TMPV $TMPV2" EXIT
ansible-vault view inventory/group_vars/all/vault.yml > "$TMPV"
awk '
BEGIN { getline tok < "/dev/shm/cf_token"; sub(/[ \t\r\n]+$/, "", tok) }
/^vault_caddy_cf_dns_token:/ { print "vault_caddy_cf_dns_token: \"" tok "\""; next }
{ print }
' "$TMPV" > "$TMPV2"
# substitution sanity — exactly 1 occurrence
[ "$(grep -c '^vault_caddy_cf_dns_token:' "$TMPV2")" = "1" ] || { echo ERR; exit 1; }
ansible-vault encrypt --output=inventory/group_vars/all/vault.yml.new "$TMPV2"
mv inventory/group_vars/all/vault.yml.new inventory/group_vars/all/vault.yml
Apply caddy role + validate token via API¶
ansible-playbook playbooks/edge.yml --limit edge --tags caddy
# expect: env file changed; Restart caddy handler runs
Validate the new token without forcing a cert renewal — exercise the same scope Caddy needs (Zone:DNS:Edit) by writing+deleting a probe TXT record:
ssh root@192.168.1.244 "set -a; source /etc/caddy/caddy.env; set +a; python3 -" <<'PY'
import os, json, urllib.request
token = os.environ["CADDY_CLOUDFLARE_API_TOKEN"]
hdrs = {"Authorization": "Bearer " + token, "Content-Type": "application/json"}
# Token verify + zone lookup (Zone:Read)
zid = json.loads(urllib.request.urlopen(urllib.request.Request(
"https://api.cloudflare.com/client/v4/zones?name=rampancy.cloud",
headers=hdrs)).read())["result"][0]["id"]
# DNS:Edit probe — create + delete a test TXT record
data = json.dumps({"type":"TXT","name":"_dnsedit-probe","content":"validation","ttl":120}).encode()
r = json.loads(urllib.request.urlopen(urllib.request.Request(
"https://api.cloudflare.com/client/v4/zones/{}/dns_records".format(zid),
data=data, headers=hdrs, method="POST")).read())
print("CREATE TXT:", r.get("success"))
rid = r["result"]["id"]
r = json.loads(urllib.request.urlopen(urllib.request.Request(
"https://api.cloudflare.com/client/v4/zones/{}/dns_records/{}".format(zid, rid),
headers=hdrs, method="DELETE")).read())
print("DELETE TXT:", r.get("success"))
PY
Both True = token good. Shred /dev/shm/cf_token once validated.
If the API probe fails, re-Roll and try again — you haven't broken anything yet.
Update Forgejo for the new external URL¶
ssh root@192.168.1.249 "cp /etc/forgejo/app.ini /etc/forgejo/app.ini.6a2-backup && \
sed -i \
-e 's|^DOMAIN = .*|DOMAIN = git.rampancy.cloud|' \
-e 's|^ROOT_URL = .*|ROOT_URL = https://git.rampancy.cloud/|' \
-e 's|^DISABLE_SSH = .*|DISABLE_SSH = true|' \
/etc/forgejo/app.ini && \
systemctl restart forgejo"
DISABLE_SSH = true hides the SSH clone URL in the UI (we deferred SSH exposure; better to hide than show a broken git@<lan-ip>:... URL).
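One caveat with the sed approach: each expression only rewrites a line that already exists, so a key missing from app.ini is silently left unset. A minimal sketch of a post-edit check, assuming the file contents have been read into a string; the helper name mismatched_keys is hypothetical, and for simplicity it ignores which INI section a key sits in:

```python
import re

EXPECTED = {
    "DOMAIN": "git.rampancy.cloud",
    "ROOT_URL": "https://git.rampancy.cloud/",
    "DISABLE_SSH": "true",
}


def mismatched_keys(ini_text: str, expected: dict) -> dict:
    """Return {key: found_value_or_None} for every key that does not match."""
    bad = {}
    for key, want in expected.items():
        pattern = rf"^{re.escape(key)}[ \t]*=[ \t]*(.*?)[ \t]*$"
        m = re.search(pattern, ini_text, re.MULTILINE)
        got = m.group(1) if m else None
        if got != want:
            bad[key] = got
    return bad


# Sample text for illustration: DISABLE_SSH was never present, so sed skipped it.
sample = """[server]
DOMAIN = git.rampancy.cloud
ROOT_URL = https://git.rampancy.cloud/
HTTP_PORT = 3000
"""
print(mismatched_keys(sample, EXPECTED))  # → {'DISABLE_SSH': None}
```

An empty dict means all three keys carry the intended values.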
Validate from cellular¶
Per hairpin-NAT memory, LAN-side reachability tests are unreliable — use cellular:
curl -sI https://git.rampancy.cloud/ | head -1 # 200 or 302
echo | openssl s_client -servername git.rampancy.cloud \
-connect git.rampancy.cloud:443 2>/dev/null | openssl x509 -noout -subject
# → subject=CN = *.rampancy.cloud
Browser → https://git.rampancy.cloud/ should load Forgejo. The session cookie from the LAN-IP wizard run is domain-bound to 192.168.1.249, so a fresh login is required at the new hostname.
Validate 6A.2 done¶
- git.rampancy.cloud resolves (CNAME → apex → public IP)
- https://git.rampancy.cloud/ loads, valid wildcard cert (subject=CN = *.rampancy.cloud)
- CrowdSec covers the new vhost (Caddy directive ordering preserved by template)
- CF token rolled; new secret validated via API probe (token verify + DNS:Edit TXT round-trip)
- Forgejo [server] DOMAIN + ROOT_URL updated; DISABLE_SSH = true; service restarted
- vault_caddy_cf_dns_token marked rotated in this runbook + 7D follow-up + roadmap
- Cellular curl + browser login at https://git.rampancy.cloud (cosmetic: operator validates that CrowdSec doesn't false-positive a clean external request and that the new session-cookie-domain login works)
Escape¶
- Caddy: revert the caddy_proxy_hosts edit, re-apply.
- CF token: re-Roll generates a fresh secret (any time, any number of re-Rolls). Roll's history is in CF audit log.
- Forgejo URL change: restore from /etc/forgejo/app.ini.6a2-backup + restart forgejo.
Stage 6A.3 — Migrations + push mirrors¶
API-driven for repeatability and speed. Required tokens, dumped to /dev/shm/<name> (mode 0600) before running:
| File | Source | Scope | Lifetime |
|---|---|---|---|
| gh_pat_arrstack | GitHub fine-grained PAT | Contents: Read and write on arrstack only | Long-lived |
| gh_pat_homelab_ansible | GitHub fine-grained PAT | Same, on homelab-ansible | Long-lived |
| gh_pat_mediabot | GitHub fine-grained PAT | Same, on mediabot | Long-lived |
| gh_pat_meat_helmet | GitHub fine-grained PAT | Same, on meat-helmet | Long-lived |
| gh_classic_import | GitHub classic PAT | repo (full) | Ephemeral; delete after step 4 |
| forgejo_admin_token | Forgejo PAT (avatar → Settings → Applications → Manage Access Tokens) | Repository: Read+Write | Long-lived |
GitHub fine-grained PATs only offer No access / Read / Read and write — there's no write-only tier. "Read and write" is the lowest scope that includes push capability; on a single-repo-scoped token the practical blast radius is the same.
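Before running the steps below, it can help to confirm that every token file is present, non-empty, and mode 0600. A minimal sketch, assuming the file names from the table above; the helper name token_file_problems is hypothetical:

```python
import os
import stat

TOKEN_FILES = [
    "gh_pat_arrstack", "gh_pat_homelab_ansible", "gh_pat_mediabot",
    "gh_pat_meat_helmet", "gh_classic_import", "forgejo_admin_token",
]


def token_file_problems(directory: str, names: list) -> list:
    """Return one human-readable problem per bad file (empty list = all good)."""
    problems = []
    for name in names:
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            problems.append(f"{name}: missing")
            continue
        mode = stat.S_IMODE(os.stat(path).st_mode)
        if mode != 0o600:
            problems.append(f"{name}: mode {mode:04o}, expected 0600")
            continue
        with open(path) as f:
            if not f.read().strip():
                problems.append(f"{name}: empty")
    return problems
```

Usage: token_file_problems("/dev/shm", TOKEN_FILES) should return an empty list before Step 1 starts.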
Step 1 — Import all 4 repos via Forgejo migrate API¶
python3 - <<'PY'
import json, urllib.request, urllib.error, time
FJ = "https://git.rampancy.cloud/api/v1"
FJ_OWNER = "rampancy"
GH_OWNER = "rampantlemming"
REPOS = ["arrstack", "homelab-ansible", "mediabot", "meat-helmet"]
with open("/dev/shm/forgejo_admin_token") as f: fj_token = f.read().strip()
with open("/dev/shm/gh_classic_import") as f: gh_classic = f.read().strip()
def fj(method, path, body=None):
    req = urllib.request.Request(f"{FJ}{path}",
        data=(json.dumps(body).encode() if body else None), method=method)
    req.add_header("Authorization", f"token {fj_token}")
    req.add_header("Content-Type", "application/json")
    try:
        with urllib.request.urlopen(req, timeout=120) as r:
            raw = r.read(); return r.status, (json.loads(raw) if raw else None)
    except urllib.error.HTTPError as e:
        return e.code, (json.loads(e.read()) if e.fp else None)
for repo in REPOS:
    body = {"auth_token": gh_classic, "clone_addr": f"https://github.com/{GH_OWNER}/{repo}.git",
            "issues": True, "labels": True, "lfs": True, "milestones": True,
            "mirror": False, "private": True, "pull_requests": True, "releases": True,
            "repo_name": repo, "repo_owner": FJ_OWNER, "service": "github", "wiki": True,
            "description": f"Mirror of github.com/{GH_OWNER}/{repo}"}
    code, resp = fj("POST", "/repos/migrate", body)
    print(f"{repo}: HTTP {code} {'OK ' + resp.get('html_url') if code in (200,201) else resp}")
PY
mirror: false makes a standalone copy (we'll add the OUTBOUND push-mirror in step 2; Forgejo's "mirror: true" is a different feature — pull-mirror that follows GitHub). All 4 imports completed in ~50s on the 2026-05-05 run.
Step 2 — Configure push-mirror per repo¶
python3 - <<'PY'
import json, urllib.request, urllib.error
FJ = "https://git.rampancy.cloud/api/v1"
FJ_OWNER, GH_OWNER = "rampancy", "rampantlemming"
REPOS = ["arrstack", "homelab-ansible", "mediabot", "meat-helmet"]
with open("/dev/shm/forgejo_admin_token") as f: fj_token = f.read().strip()
def fj(m, p, b=None):
    req = urllib.request.Request(f"{FJ}{p}", data=(json.dumps(b).encode() if b else None), method=m)
    req.add_header("Authorization", f"token {fj_token}"); req.add_header("Content-Type", "application/json")
    try:
        with urllib.request.urlopen(req, timeout=30) as r: raw = r.read(); return r.status, (json.loads(raw) if raw else None)
    except urllib.error.HTTPError as e: return e.code, (json.loads(e.read()) if e.fp else None)
for repo in REPOS:
    # token file names use underscores: gh_pat_homelab_ansible, gh_pat_meat_helmet
    with open(f"/dev/shm/gh_pat_{repo.replace('-','_')}") as f: pat = f.read().strip()
    body = {"remote_address": f"https://github.com/{GH_OWNER}/{repo}.git",
            "remote_username": GH_OWNER, "remote_password": pat,
            "interval": "1h0m0s", "sync_on_commit": True}
    code, resp = fj("POST", f"/repos/{FJ_OWNER}/{repo}/push_mirrors", body)
    print(f"{repo}: HTTP {code}")
PY
Step 3 — Trigger initial sync + verify credentials¶
python3 - <<'PY'
import json, urllib.request, time
FJ, FJ_OWNER = "https://git.rampancy.cloud/api/v1", "rampancy"
REPOS = ["arrstack", "homelab-ansible", "mediabot", "meat-helmet"]
with open("/dev/shm/forgejo_admin_token") as f: t = f.read().strip()
# trigger a sync on every repo first
for repo in REPOS:
    req = urllib.request.Request(f"{FJ}/repos/{FJ_OWNER}/{repo}/push_mirrors-sync", method="POST")
    req.add_header("Authorization", f"token {t}")
    print(repo, "trigger:", urllib.request.urlopen(req).status)
time.sleep(10)
# then read back the first push-mirror entry of each repo
for repo in REPOS:
    req = urllib.request.Request(f"{FJ}/repos/{FJ_OWNER}/{repo}/push_mirrors")
    req.add_header("Authorization", f"token {t}")
    m = json.loads(urllib.request.urlopen(req).read())[0]
    print(f"{repo}: last_update={m['last_update']} last_error={m.get('last_error') or 'none'}")
PY
last_error=none across all four = credentials validated. last_update populated within the 10s sleep window confirms async sync ran.
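The fixed 10s sleep works when syncs are fast. A minimal sketch of a polling alternative, under the assumption that last_update changes once a sync completes; the helper name wait_for_sync and the fetch_mirror callable are hypothetical, with fetch_mirror standing in for the GET .../push_mirrors call above:

```python
import time


def wait_for_sync(fetch_mirror, before, timeout=60.0, interval=2.0, sleep=time.sleep):
    """Poll one push-mirror entry until it errors, updates, or times out.

    fetch_mirror: zero-argument callable returning the mirror dict.
    before: the last_update value recorded before the sync was triggered.
    Returns (ok, detail).
    """
    deadline = time.monotonic() + timeout
    while True:
        m = fetch_mirror()
        if m.get("last_error"):
            return False, m["last_error"]
        if m.get("last_update") != before:
            return True, m.get("last_update")
        if time.monotonic() >= deadline:
            return False, "timed out waiting for last_update to change"
        sleep(interval)
```

Record last_update per repo before calling push_mirrors-sync, then pass it as before; a stale last_error from an earlier attempt will also surface as a failure, which is usually what you want to see.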
Step 4 — Vault long-lived PATs, delete ephemeral¶
Add the 5 long-lived tokens (4 fine-grained + 1 Forgejo) to vault. The classic import PAT is not vaulted — delete it from GitHub after this step:
cd ~/homelab-ansible
TMPV=/dev/shm/v.tmp; TMPV2=/dev/shm/v.tmp.new
trap "shred -u $TMPV $TMPV2 2>/dev/null" EXIT
ansible-vault view inventory/group_vars/all/vault.yml > "$TMPV"
awk '
function read_strip(p, line) { getline line < p; close(p); sub(/[ \t\r\n]+$/, "", line); return line }
{ print }
END {
print ""
print "# Phase 6A.3 — Forgejo admin PAT + per-repo GitHub mirror PATs"
print "vault_forgejo_admin_api_token: \"" read_strip("/dev/shm/forgejo_admin_token") "\""
print "vault_github_pat_mirror_arrstack: \"" read_strip("/dev/shm/gh_pat_arrstack") "\""
print "vault_github_pat_mirror_homelab_ansible: \"" read_strip("/dev/shm/gh_pat_homelab_ansible") "\""
print "vault_github_pat_mirror_mediabot: \"" read_strip("/dev/shm/gh_pat_mediabot") "\""
print "vault_github_pat_mirror_meat_helmet: \"" read_strip("/dev/shm/gh_pat_meat_helmet") "\""
}
' "$TMPV" > "$TMPV2"
ansible-vault encrypt --output=inventory/group_vars/all/vault.yml.new "$TMPV2"
mv inventory/group_vars/all/vault.yml.new inventory/group_vars/all/vault.yml
Then delete the ephemeral classic GH PAT in GitHub Settings → Developer settings → Personal access tokens (classic).
Step 5 — Flip local remotes¶
For each working clone (homelab-ansible, arrstack, meat-helmet — mediabot lives on the arrstack VM, not on WSL):
git remote rename origin github
git remote add origin https://git.rampancy.cloud/rampancy/<repo>.git
git fetch origin
git branch --set-upstream-to=origin/main main
git remote rename updates branch.main.remote to github automatically — explicit --set-upstream-to=origin/main is required to make git push default to Forgejo.
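To confirm where main is tracking after the flip, the quickest check is git config --get branch.main.remote. The same lookup as a minimal Python sketch over the contents of .git/config, for use inside a script; the helper name branch_remote and the sample config text are hypothetical:

```python
import re


def branch_remote(config_text: str, branch: str):
    """Return the `remote` value of [branch "<branch>"] in git config text, or None."""
    in_section = False
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            in_section = stripped == f'[branch "{branch}"]'
            continue
        if in_section:
            m = re.match(r"remote\s*=\s*(\S+)", stripped)
            if m:
                return m.group(1)
    return None


# What .git/config looks like right after `git remote rename origin github`.
after_rename = (
    '[remote "github"]\n'
    "\turl = https://github.com/rampantlemming/arrstack.git\n"
    '[branch "main"]\n'
    "\tremote = github\n"
    "\tmerge = refs/heads/main\n"
)
print(branch_remote(after_rename, "main"))  # → github
```

The value should read origin once --set-upstream-to has been applied.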
Step 6 — HTTPS credentials for git.rampancy.cloud¶
Forgejo has DISABLE_SSH = true (set in 6A.2 since SSH push exposure is deferred). HTTPS push needs an authenticated remote. Use the credential.helper=store path with the Forgejo PAT:
git config --global credential.helper store
awk 'BEGIN { getline t < "/dev/shm/forgejo_admin_token"; sub(/[ \t\r\n]+$/, "", t); print "https://rampancy:" t "@git.rampancy.cloud" > "/home/<user>/.git-credentials" }' /dev/null
chmod 600 ~/.git-credentials
The PAT is read by awk directly from /dev/shm, written to ~/.git-credentials without going through a shell variable. Single-user dev machine: this is the same trust boundary as the vault.
Step 7 — Shred temps¶
Validate per repo¶
- git push origin <branch> from WSL works (HTTPS + cred.helper)
- Push appears on GitHub within ~10s (sync_on_commit fires immediately on Forgejo's receive-pack hook; 1h is the fallback for missed syncs)
- meat-helmet: GH-side scheduled Actions still fire against the mirrored content (push-event triggers don't apply: meat-helmet has none, only cron / workflow_run / workflow_dispatch)
- CF Pages: arrstack docs build still triggered via webhook on GitHub push (mirror push triggers it the same as a direct push)
Escape¶
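- Per-repo local remote revert: git remote remove origin && git remote rename github origin, then git branch --set-upstream-to=origin/main main (removing a remote also drops the branch tracking config that pointed at it)
- Per-repo Forgejo mirror disable: DELETE /api/v1/repos/{owner}/{repo}/push_mirrors/{name}, where {name} is the mirror's remote_name as returned by GET .../push_mirrors
- GitHub repos untouched throughout; re-imports overwrite Forgejo state, not GitHub.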
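A minimal sketch of building the mirror-disable calls from a push_mirrors listing. It assumes each listed entry carries a remote_name field and that the delete endpoint addresses the mirror by that name; check both against the instance's API docs before relying on it. The helper name mirror_delete_urls and the sample entry are hypothetical:

```python
from urllib.parse import quote


def mirror_delete_urls(base: str, owner: str, repo: str, mirrors: list) -> list:
    """Return one DELETE URL per push-mirror entry in `mirrors`."""
    return [
        f"{base}/repos/{owner}/{repo}/push_mirrors/{quote(m['remote_name'], safe='')}"
        for m in mirrors
    ]


# Hypothetical listing entry; real names are generated by Forgejo.
listing = [{"remote_name": "remote_mirror_example"}]
for url in mirror_delete_urls("https://git.rampancy.cloud/api/v1", "rampancy", "arrstack", listing):
    print("DELETE", url)
```

Each URL is then sent with method="DELETE" and the same Authorization header used in Step 2.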
Stage 6A.4 — Phase 6A close-out¶
Discord webhooks per repo — deferred (not skipped)¶
The original 6A.4 scope included a Discord webhook per repo (push events → Phase 5B #homelab-ops channel). On execution we deferred these as low-value: this is a single-operator setup, the only person pushing to these repos is the operator, and a self-notification on push is pure noise. The bundle adds minutes of work and no signal.
Trigger to revisit: a collaborator joins (housemate or otherwise) and the operator wants out-of-band visibility into their pushes. At that point, the per-repo API call is preserved below.
# One-shot to add Discord webhooks across all 4 repos when collaboration warrants it
python3 - <<'PY'
import json, urllib.request, subprocess, yaml
FJ, FJ_OWNER = "https://git.rampancy.cloud/api/v1", "rampancy"
REPOS = ["arrstack", "homelab-ansible", "mediabot", "meat-helmet"]
# run from the homelab-ansible checkout so the relative vault path resolves
v = yaml.safe_load(subprocess.check_output(["ansible-vault", "view", "inventory/group_vars/all/vault.yml"]))
fj_token = v["vault_forgejo_admin_api_token"]
discord_url = v["vault_discord_webhook_homelab_ops"]
for repo in REPOS:
    body = {"type": "discord", "config": {"url": discord_url, "content_type": "json"},
            "events": ["push"], "active": True}
    req = urllib.request.Request(f"{FJ}/repos/{FJ_OWNER}/{repo}/hooks",
        data=json.dumps(body).encode(), method="POST")
    req.add_header("Authorization", f"token {fj_token}")
    req.add_header("Content-Type", "application/json")
    print(repo, urllib.request.urlopen(req).status)
PY
Docs sweep¶
- docs/ansible/roles/forgejo.md reflects actual role state
- docs/services/forgejo.md repo table populated
- docs/runbooks/forgejo-setup.md "Lessons" sections per stage
- docs/roadmap.md Phase 6A all four sub-stages ticked
- docs/changelog.md dated entries per sub-stage + Phase 6A close-out
- mkdocs.yml nav for forgejo role/service/runbook (added in 6A.1)
- homelab-ansible README refreshed (host table + role table + repo structure to current state)
- mkdocs build --strict passes; ansible-lint passes; pre-commit hooks pass
Lessons from the 2026-05-05 run (6A.3)¶
Forgejo PAT scope: Repository: Read+Write doesn't include read:user¶
The PAT we created for API ops only has Repository scope. That's enough for the migrate + push_mirror endpoints, and for individual repo reads at /repos/{owner}/{repo}. But user-listing endpoints (/users/{owner}/repos, /user/repos) require read:user — both return:
Workaround: probe individual repo names via /repos/{owner}/{repo} (returns 200 if exists, 404 if not) rather than listing. Probably not worth adding read:user to the token just for the listing convenience — repository scope is enough for the actual work.
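The per-repo probe can be sketched as follows. This is a minimal example assuming the same token header as the earlier steps; the helper name repo_exists is hypothetical, and the opener parameter exists only so the function can be exercised without network access:

```python
import urllib.error
import urllib.request


def repo_exists(base, owner, repo, token, opener=urllib.request.urlopen):
    """True on HTTP 200, False on 404; any other HTTP error is re-raised."""
    req = urllib.request.Request(f"{base}/repos/{owner}/{repo}")
    req.add_header("Authorization", f"token {token}")
    try:
        with opener(req, timeout=30) as r:
            return r.status == 200
    except urllib.error.HTTPError as e:
        if e.code == 404:
            return False
        raise  # e.g. 401/403: a token problem, not a missing repo
```

Looping this over the four repo names gives the same answer as a listing call without widening the token's scope.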
git remote rename carries upstream tracking with it¶
After git remote rename origin github, branch.main.remote automatically becomes github (git rewrites the config). Adding a NEW origin pointing at Forgejo doesn't switch the tracking back, so git push would still default to GitHub. Explicit follow-up needed after git fetch origin: git branch --set-upstream-to=origin/main main.
Without this, you keep pushing to GitHub instead of Forgejo and the mirror direction is silently broken.
Forgejo private repos + DISABLE_SSH = true = HTTPS-with-creds¶
We set DISABLE_SSH = true in 6A.2 (SSH push deferred). All Forgejo fetch/push goes via HTTPS, which needs auth for private repos. git config --global credential.helper store + ~/.git-credentials is the simplest path; the file is mode 0600 in $HOME (same trust boundary as the vault on a single-user dev machine).
Pattern that doesn't surface the PAT to scrollback:
git config --global credential.helper store
awk 'BEGIN {
getline t < "/dev/shm/forgejo_admin_token"
sub(/[ \t\r\n]+$/, "", t)
print "https://rampancy:" t "@git.rampancy.cloud" > "/home/<user>/.git-credentials"
}' /dev/null
chmod 600 ~/.git-credentials
mirror: false on the Forgejo migrate API¶
Forgejo's import API has both "mirror": true (creates a Forgejo pull-mirror that follows GitHub one-way) and "mirror": false (standalone copy). For our setup we want false — Forgejo is the new source of truth, and we add a separate outbound push-mirror (a different feature) that propagates Forgejo → GitHub. Easy to confuse the two.
Lessons from the 2026-05-05 run (6A.2)¶
CF API Token Roll preferred over create-new-then-revoke¶
The drafted procedure used create-new-then-revoke (with a hold point on revocation). On execution we switched to Roll because:
- Faster invalidation of the leaked secret. Roll = atomic; the leaked value stops working at click-time. Create-new-then-revoke leaves the leaked secret valid for the entire validation period.
- One CF action vs two; same audit trail (token ID + name preserved).
- Tradeoff: no overlap safety net. Mitigated by the wildcard cert's remaining validity (~50+ days at execution), so we have weeks of buffer to recover from a misconfigured new token.
Updated procedure now uses Roll. The "hold point before revocation" admonition no longer applies — Roll has no separate revocation step.
API-probe validation in lieu of forced cert renewal¶
The drafted procedure forced a wildcard cert renewal as the validation step (delete cert state files + reload). On execution we switched to a CF API probe that directly exercises Zone:DNS:Edit by creating + deleting a test TXT record. Reasons:
- Non-disruptive: Caddy keeps serving with its currently-loaded wildcard cert during validation.
- Strong signal: the probe creates a real DNS record using exactly the permission Caddy needs for DNS-01 challenges — failure is conclusive.
- Avoids the cert-state-removal class of bugs (Caddy may not behave well with partial cert state on disk).
If the API probe fails, re-Roll and retry — no harm done.
CNAME-per-subdomain, no wildcard A record¶
Worth noting for future vhosts: each subdomain in *.rampancy.cloud has its own CNAME pointing to the apex (rampancy.cloud), which has the public A record. There's no wildcard *.rampancy.cloud DNS record — adding a new vhost requires a CNAME entry alongside the Caddy config change. Caddy issues a single wildcard cert covering all subdomains regardless.
Forgejo cookie domain change forces re-login¶
Switching DOMAIN from 192.168.1.249 to git.rampancy.cloud invalidates the existing session cookie at the new hostname. Browser session from the wizard run won't carry over — fresh login at https://git.rampancy.cloud required after the app.ini change. Cosmetic, but worth knowing.
Lessons from the 2026-05-04 run (6A.1)¶
playbooks/forgejo.yml missing beszel_agent — operator-noticed gap¶
The drafted playbook for 6A.1 included common + security + forgejo but missed the beszel_agent trailing role that every other per-host playbook (arrstack, plex, pbs, edge, n8n, vintage, control) carries. Result: CT 109 didn't show up in the Beszel hub, and the operator manually installed the agent before the gap was caught (2026-05-05).
Fix: added beszel_agent to playbooks/forgejo.yml. --check --diff post-fix flagged only one drift: the role wants the upstream installer cached at /root/install-beszel-agent.sh for audit-trail/version-bump purposes — the manual install bypassed this. Reconciled cleanly; idempotent post-apply at ok=19 changed=0.
Generalised reminder: every new per-host playbook should end with beszel_agent as the trailing role. See beszel role doc for the canonical pattern.
auto_updates not picked up by per-host bring-up — drift catches it next morning¶
playbooks/forgejo.yml runs common + security + forgejo against the new host, but does not include the fleet auto-updates.yml play (which is hosts: all, imported once at the foot of site.yml). Result: CT 109 came up without unattended-upgrades configured, and the next-morning drift run flagged it as a changed host until reconciled with site.yml --limit edge,forgejo.
Fix (already applied): new-host bring-up convention captured in Ansible Playbooks doc — bring up new hosts with site.yml --limit <newhost> (not the per-host playbook) so fleet roles land alongside the host-specific ones. Documented after the fact rather than added to this runbook's Stage 6A.1 commands; future runs of this runbook should follow the new convention.
apt_repository module breaks on fresh Debian 13 LXC — needs apt-key/gpg¶
First role apply failed at the source-list step: Either apt-key or gpg binary is required, but neither could be found. The apt-key CLI has been deprecated and removed in modern Debian and derivatives, you might want to use "deb822_repository" instead.
The standard Debian 13 LXC template ships neither apt-key (deprecated) nor gnupg (not pulled in by default). ansible.builtin.apt_repository runs a sanity check that fails closed even when only writing a sources.list line.
Fix: switched to a templated .sources file in deb822 format via ansible.builtin.copy, with an explicit cache-refresh task gated on the source's changed state. The cache-refresh has to run mid-play (before the forgejo-sqlite install task), so a handler-flush pattern doesn't fit — it's annotated # noqa: no-handler with a citation comment per the homelab-ansible CLAUDE.md exception path.
forgejo-sqlite package ships a stub app.ini — file-existence is a useless probe¶
After install, /etc/forgejo/app.ini exists with five lines:
# Empty default config file for forgejo-deb
# Forgejo's installer will populate this file with appropriate defaults.
# See also: https://forgejo.org/docs/latest/admin/config-cheat-sheet
The role's first-run-reminder gate (when: not app_ini.stat.exists) was therefore meaningless — it skipped the reminder on a brand-new install where the wizard hadn't actually run yet.
Fix: real signal is INSTALL_LOCK = true in app.ini, which the wizard writes at completion. Switched to a shell: grep -q "^INSTALL_LOCK *= *true" /etc/forgejo/app.ini probe with check_mode: false + failed_when: false (grep returns 1 when not found — that's the "not yet locked" non-error state). Reminder now fires correctly pre-wizard, silent post-wizard. Idempotency verified at ok=16 changed=0 post-wizard.
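The difference between the two probes is easy to see outside Ansible. A minimal sketch applying the same regular expression as the grep to a stub and to a post-wizard file; the helper name wizard_completed and both sample strings are illustrative, not copied from a real app.ini:

```python
import re

# Same pattern as the grep probe: line-anchored, optional spaces around "=".
LOCK_RE = re.compile(r"^INSTALL_LOCK *= *true", re.MULTILINE)


def wizard_completed(app_ini_text: str) -> bool:
    """True once the first-run wizard has written INSTALL_LOCK = true."""
    return LOCK_RE.search(app_ini_text) is not None


stub = "# Empty default config file for forgejo-deb\n"
configured = "[security]\nINSTALL_LOCK = true\n"
print(wizard_completed(stub), wizard_completed(configured))  # → False True
```

A file-existence test returns true for both inputs, which is why the original gate never fired.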
Debian 13 LXC template doesn't ship sudo¶
Verified manually with sudo -u forgejo forgejo doctor check from an SSH session — sudo: command not found. Use su forgejo -s /bin/bash -c "..." for ad-hoc forgejo-user invocations. Not worth adding sudo to common_packages fleet-wide just for runbook ergonomics.
Stale known_hosts on .249 from decom'd VM 102¶
The IP was previously nginx (decom'd 2026-05-03). WSL's ~/.ssh/known_hosts had a stale ECDSA entry that conflicted with CT 109's new ED25519 host key. Cleaned with ssh-keygen -R 192.168.1.249. Note: ansible itself doesn't care (host_key_checking is off in ansible.cfg), but raw SSH from WSL or scripts will fail until the entry is removed.
auto_updates not picked up by per-host bring-up — drift catches it next morning¶
Bring-up via playbooks/forgejo.yml standalone applied common, security, and forgejo, but not auto_updates — that role lives in its own fleet-wide play (playbooks/auto-updates.yml: hosts: all) imported only at the foot of site.yml. The drift check the next morning surfaced three changes on forgejo: install unattended-upgrades + 12 deps, drop the reboot-required postinst hook, render 90-ansible-unattended-upgrades. Reconciled 2026-05-05 with ansible-playbook playbooks/site.yml --limit edge,forgejo.
This is by design — the single-source-of-policy comment in auto-updates.yml is intentional. The convention going forward: bring up new hosts with ansible-playbook playbooks/site.yml --limit <newhost> rather than the per-host playbook directly, so the fleet-wide policy plays at the bottom of site.yml apply on first run. See also Playbooks — new host bring-up.