Role: caddy
Caddy reverse proxy on the edge LXC (CT 107, 192.168.1.244). Single static binary, Caddyfile templated from inventory, automatic HTTPS via Let's Encrypt DNS-01 against Cloudflare, no Docker.
Hosts: edge (only).
Phase 5D executed — 2026-05-02
Role landed alongside the edge LXC bring-up. Caddy serving four hosts (requests, dash, n8n, kosync) with a single LE wildcard *.rampancy.cloud. Replaces the hand-clicked NPM install on retired nginx VM 102. See edge-cutover runbook for the cutover procedure and execution-time lessons.
Validation gap
Apply succeeded against edge from WSL on 2026-05-02. Site-wide --check --diff from CT 104 has not yet run as of this docs commit; it is pending the homelab-ansible commit + CT 104 pull. It is expected to report changed=0 once the code lands on the drift runner.
Architecture
flowchart LR
Internet[Public client] --> UDM[UDM<br/>port-forward 80/443]
UDM --> Caddy[Caddy on edge<br/>192.168.1.244]
Caddy -->|TLS terminate<br/>LE wildcard| n8n[n8n VM<br/>:5678]
Caddy --> beszel[Beszel hub<br/>:8090]
Caddy --> seerr[Overseerr<br/>arrstack:5055]
Caddy --> kosync[korrosync<br/>arrstack:3030]
Caddy <-.->|ACME DNS-01| CF[Cloudflare API<br/>TXT record]
Cert issuance and renewal happen via Cloudflare DNS-01: Caddy creates a TXT record at _acme-challenge.rampancy.cloud through the Cloudflare API, Let's Encrypt validates it, and Caddy deletes the record. Renewal does not depend on inbound port 80.
Tasks
| Task | Tag |
|---|---|
| Ensure caddy group + user (system, nologin) | caddy, install |
| Create /etc/caddy, /var/lib/caddy, /var/log/caddy | caddy, install |
| Install pinned binary from roles/caddy/files/caddy-<version> | caddy, install |
| Render systemd unit (no --environ; see Gotchas) | caddy, service |
| Render /etc/caddy/caddy.env with vaulted CF token (no_log: true) | caddy, config |
| Render /etc/caddy/Caddyfile from inventory | caddy, config |
| Validate Caddyfile (with env var passed via environment:) | caddy, config |
| Enable + start caddy.service | caddy, service |
Key variables
Defaults at roles/caddy/defaults/main.yml. Per-host overrides at inventory/host_vars/edge.yml.
| Variable | Source | Default / value |
|---|---|---|
| caddy_version | defaults | 2.11.2 |
| caddy_build_revision | defaults | cs1 (bump when xcaddy plugin set changes; see Binary build flow) |
| caddy_binary_local | defaults | caddy-{{ caddy_version }}-{{ caddy_build_revision }} (in roles/caddy/files/) |
| caddy_install_dir | defaults | /usr/local/bin |
| caddy_config_dir | defaults | /etc/caddy |
| caddy_data_dir | defaults | /var/lib/caddy |
| caddy_acme_email | defaults | newlingd@gmail.com |
| caddy_acme_dns_provider | defaults | cloudflare |
| caddy_listen_alt_ports | defaults | true (bind 8080/8443); flipped to false in host_vars/edge.yml post-cutover |
| caddy_cf_dns_token_var | defaults | vault_caddy_cf_dns_token |
| caddy_crowdsec_enabled | defaults | false; flipped per-host. Gates the bouncer block in Caddyfile.j2 and the CROWDSEC_BOUNCER_API_KEY line in caddy.env.j2. |
| caddy_crowdsec_bouncer_key_var | defaults | vault_caddy_crowdsec_bouncer_key |
| caddy_crowdsec_lapi_url | defaults | http://127.0.0.1:8080 (stock LAPI listen address). The agent's local_api_credentials.yaml is hardcoded to this by the installer; moving LAPI requires re-templating that too, so don't optimise for hypothetical alt-port collisions. |
| caddy_zone | host_vars | rampancy.cloud |
| caddy_proxy_hosts | host_vars | list of {name, fqdn, upstream_scheme, upstream_host, upstream_port} |
| caddy_matrix_enabled | defaults | false; flipped to true on edge for Phase 6E. Gates an apex {{ caddy_zone }} block (well-known matrix delegation) plus a matrix.{{ caddy_zone }} block in the template, neither expressible via caddy_proxy_hosts (apex needs its own cert; matrix subdomain needs path-routed CrowdSec exclusion on /_matrix/federation/*). |
| caddy_matrix_upstream | host_vars | 192.168.1.243:81 on edge; single upstream because matrix_federation_public_port: 443 in the matrix-deploy vars.yml collapses Traefik's federation entrypoint into the web entrypoint, so client + federation share the host-bind port. |
The vaulted CF API token (vault_caddy_cf_dns_token) is scoped to Zone:Read + DNS:Edit on rampancy.cloud and nothing else.
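As a rough illustration of how the per-host overrides in the table fit together, inventory/host_vars/edge.yml might look something like the sketch below. The variable names, the values, and the requests upstream are taken from this page; the file layout, the assumption that `name` becomes the matcher name, and the omission of the other three proxied hosts are illustrative guesses rather than the committed file.

```yaml
# inventory/host_vars/edge.yml (illustrative sketch, not the committed file)
caddy_zone: rampancy.cloud

# Production binding on 80/443 after the cutover (defaults bind 8080/8443).
caddy_listen_alt_ports: false

# Phase 7D: enable the bouncer block and the per-host crowdsec directive.
caddy_crowdsec_enabled: true

# Phase 6E: apex + matrix.<zone> site blocks.
caddy_matrix_enabled: true
caddy_matrix_upstream: "192.168.1.243:81"

caddy_proxy_hosts:
  - name: requests              # assumed to become the @requests matcher name
    fqdn: requests.rampancy.cloud
    upstream_scheme: http
    upstream_host: 192.168.1.252
    upstream_port: 5055
  # ... remaining hosts (dash, n8n, kosync) omitted here ...
```

Anything not overridden here falls back to roles/caddy/defaults/main.yml.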
Binary build flow
The binary is not built on the target host. Pre-built on WSL via xcaddy, version-pinned, committed to roles/caddy/files/:
mkdir -p ~/build/caddy && cd ~/build/caddy
docker run --rm -v "$PWD:/output" -w /output caddy:2.11.2-builder \
xcaddy build v2.11.2 \
--with github.com/caddy-dns/cloudflare \
--with github.com/hslatman/caddy-crowdsec-bouncer/http \
--output caddy
./caddy version # confirm v2.11.2
./caddy list-modules | grep cloudflare # confirm dns.providers.cloudflare
./caddy list-modules | grep crowdsec # confirm http.handlers.crowdsec + crowdsec app
cp caddy ~/homelab-ansible/roles/caddy/files/caddy-2.11.2-cs1
Why pre-built rather than xcaddy on edge:
- Reproducibility — committed binary is bit-identical across deploys
- Speed — xcaddy build on a 1 GiB LXC is slow or OOMs during go build
- Footprint — no Go toolchain on edge (~500 MB saved)
The -cs1 filename suffix is caddy_build_revision — bumped when the xcaddy plugin set changes (e.g. adding the CrowdSec bouncer module for Phase 7D). Caddy upstream version stays accurate; the suffix is the audit trail for plugin-set changes. Ansible's copy module would also detect content changes via checksum, but a new filename makes the rebuild visible in git diff.
To bump Caddy: rebuild via xcaddy with a new version, drop the binary into roles/caddy/files/, bump caddy_version in defaults/main.yml, re-apply. The role's Install caddy binary task will detect the change and notify the Restart caddy handler.
To change the plugin set without bumping Caddy: rebuild with the new --with flags, drop the new file into roles/caddy/files/, bump caddy_build_revision in defaults/main.yml. Old binary stays in tree for one revision (handy for rollback).
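A minimal sketch of what the install task could look like, assuming it uses ansible.builtin.copy as the text above implies. The task name, handler name, and the destination path ({{ caddy_install_dir }}/caddy) appear elsewhere on this page; ownership and mode are assumptions.

```yaml
# Sketch only: owner, group, and mode are assumed, not taken from the role.
- name: Install caddy binary
  ansible.builtin.copy:
    # Resolves to caddy-<version>-<revision> under roles/caddy/files/
    src: "{{ caddy_binary_local }}"
    dest: "{{ caddy_install_dir }}/caddy"
    owner: root
    group: root
    mode: "0755"
  # copy compares checksums, so a new binary reports changed and fires the handler
  notify: Restart caddy
  tags: [caddy, install]
```

Because caddy_binary_local embeds both caddy_version and caddy_build_revision, bumping either variable points src at a different file, which is what makes the rebuild visible in the diff.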
Caddyfile template
roles/caddy/templates/Caddyfile.j2 produces (with caddy_crowdsec_enabled: true on edge for Phase 7D):
{
	email newlingd@gmail.com
	acme_dns cloudflare {env.CADDY_CLOUDFLARE_API_TOKEN}
	# CrowdSec bouncer. order is required -- the directive isn't in
	# Caddy's standard order list. api_key uses {$ENV} (parse-time
	# substitution) -- {env.X} ends up as a literal in this module's
	# loaded config (verified via /config/ admin API).
	order crowdsec first
	crowdsec {
		api_url http://127.0.0.1:8080
		api_key {$CROWDSEC_BOUNCER_API_KEY}
		ticker_interval 15s
	}
}
*.rampancy.cloud {
	# HTTP access log -- site-block `log`, not global. Goes to journal.
	log {
		output stdout
		format json
	}
	@requests host requests.rampancy.cloud
	handle @requests {
		crowdsec
		reverse_proxy http://192.168.1.252:5055
	}
	# ... one matcher per host ...
	handle {
		respond "Host not configured" 404
	}
}
The *.rampancy.cloud block + per-host matchers is the modern idiom for "one wildcard cert covers many hosts." Each matcher (@name host fqdn) routes on the request's Host header; a final handle { respond ... 404 } catches unconfigured subdomains. Caddy issues a single LE wildcard cert for *.rampancy.cloud rather than one per host.
The crowdsec directive in each handle block is gated on caddy_crowdsec_enabled in the Jinja template — flip it false in host_vars/edge.yml to disable enforcement without touching the engine. See the crowdsec_engine role page for the engine side.
Indented with tabs to match caddy fmt canonical style. If caddy fmt /etc/caddy/Caddyfile reports differences after a template change, mirror the changes back into the Jinja template.
Adding a new proxied host
- Add an entry to caddy_proxy_hosts in inventory/host_vars/edge.yml.
- Add the corresponding CNAME in Cloudflare DNS (point to apex rampancy.cloud, gray-cloud).
- Re-apply: ansible-playbook playbooks/edge.yml --limit edge --tags caddy.
The Caddyfile reload is non-disruptive (no connection drop). The wildcard cert already covers the new hostname — no additional ACME flow.
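For the first step, a new entry might look like the sketch below. The field names are the ones listed for caddy_proxy_hosts under Key variables; the wiki host, its address, and its port are hypothetical placeholders.

```yaml
# inventory/host_vars/edge.yml
caddy_proxy_hosts:
  # ... existing entries stay as they are ...
  - name: wiki                    # hypothetical host, used only for illustration
    fqdn: wiki.rampancy.cloud     # must match the CNAME added in Cloudflare
    upstream_scheme: http
    upstream_host: 192.168.1.50   # placeholder upstream address
    upstream_port: 8080           # placeholder upstream port
```

Each entry is expected to render as one host matcher plus one handle block inside the *.rampancy.cloud site block.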
Gotchas captured during execution
Phase 6E: caddy_matrix_enabled special-case isn't reusable via caddy_proxy_hosts
Matrix routing has two characteristics that don't fit the caddy_proxy_hosts data model:
- Apex site block: rampancy.cloud (no subdomain prefix) needs its own LE cert; the wildcard *.rampancy.cloud doesn't cover the apex. Has to be a sibling Caddy site block, not a @host matcher inside the wildcard block.
- Path-routed CrowdSec exclusion: federation traffic to /_matrix/federation/* is server-to-server (other homeservers connecting). Putting CrowdSec on it risks blocking legitimate federation peers whose IPs land on community blocklists. CrowdSec on /_matrix/federation/* is off; on every other path it's on. This requires path-based matchers, not the single-reverse_proxy-per-host shape caddy_proxy_hosts enforces.
The role's template adds both blocks behind the caddy_matrix_enabled gate, kept out of the standard wildcard handler. If a similar future service has the same shape (apex serving statics + a subdomain with path-routed middleware exclusion), copy this pattern rather than trying to generalise caddy_proxy_hosts.
Caddy upstream systemd unit ships --environ flag, which leaks env vars to journalctl
Caddy's official systemd unit includes --environ on ExecStart. With EnvironmentFile=/etc/caddy/caddy.env containing the CF API token, the token gets printed to journalctl on every restart. Detected during the 5D cutover; the role's systemd template drops --environ for this reason.
If you need to debug env-var resolution, run Caddy manually with --environ once (e.g. sudo -u caddy CADDY_CLOUDFLARE_API_TOKEN=test caddy run --environ --config /etc/caddy/Caddyfile) rather than baking it into the systemd unit.
caddy validate doesn't see EnvironmentFile
systemd's EnvironmentFile= directive only applies to processes systemd starts for the unit (ExecStart, ExecStop, ExecReload). Ad-hoc caddy validate invocations from a shell don't pick up caddy.env, so the {env.CADDY_CLOUDFLARE_API_TOKEN} in the Caddyfile resolves to empty and Caddy errors out with API token '' appears invalid. The same applies to CROWDSEC_BOUNCER_API_KEY once the bouncer is enabled; this was first hit during Phase 7D on 2026-05-04 with crowdsec API key must not be empty.
The role's Validate Caddyfile task passes both tokens explicitly via Ansible's environment: parameter:
- name: Validate Caddyfile
  ansible.builtin.command: >
    {{ caddy_install_dir }}/caddy validate
    --config {{ caddy_config_dir }}/Caddyfile
    --adapter caddyfile
  environment:
    CADDY_CLOUDFLARE_API_TOKEN: "{{ lookup('vars', caddy_cf_dns_token_var) }}"
    CROWDSEC_BOUNCER_API_KEY: "{{ lookup('vars', caddy_crowdsec_bouncer_key_var) if caddy_crowdsec_enabled else '' }}"
  changed_when: false
  check_mode: false
  when: not ansible_check_mode
Why a separate command task rather than the template module's validate: parameter? A first-run --check --diff would fail before the binary is in place. Keeping validate separate (and gated with when: not ansible_check_mode) keeps the role idempotent across check-mode and real runs.
{env.X} works for some directives, {$X} is the safe choice for module config
Caddy resolves {env.X} at runtime, but only in config fields whose module expands placeholders (the ACME acme_dns argument does; the crowdsec module's api_key field, as observed here, does not). With {env.CROWDSEC_BOUNCER_API_KEY}, the placeholder is never expanded and ends up as the literal string in the loaded config, verified via curl localhost:2019/config/apps/crowdsec/ showing the literal {env.X} text.
The fix: use {$X} (parse-time substitution by Caddy's caddyfile adapter), which expands before the module sees its config. The Caddyfile template uses {$CROWDSEC_BOUNCER_API_KEY} for this reason. The CF token stays on {env.X} because the ACME directive supports it and runtime substitution is preferred there (no token in the on-disk JSON config dump).
no_log: true on the env-file render task
The env-file template task renders the CF API token. With --diff enabled (the default for site-wide drift runs), Ansible would print the rendered template to stdout — including the token. no_log: true suppresses the diff output. Trade-off: you can't see what changed in the env file from a drift run; if you need to debug, run the task manually with -v and inspect the file directly.
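A minimal sketch of such a render task, assuming the template is named caddy.env.j2 (as referenced under Key variables) and lands in caddy_config_dir. The task name, ownership, mode, and the notify line are assumptions for illustration.

```yaml
# Sketch only: task name, owner/group/mode, and notify are assumed.
- name: Render caddy.env
  ansible.builtin.template:
    src: caddy.env.j2
    dest: "{{ caddy_config_dir }}/caddy.env"
    owner: root
    group: caddy
    mode: "0640"          # readable by the caddy group, not world-readable
  no_log: true            # keeps the rendered token out of --diff and -v output
  notify: Restart caddy
  tags: [caddy, config]
```

With no_log set, a drift run still reports whether the file changed, just not what changed inside it.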
Alt-port mode (Phase 1 parallel-run)
caddy_listen_alt_ports: true in defaults makes Caddy bind 8080/8443 instead of 80/443. Used during the cutover so Caddy could parallel-run alongside NPM (which still owned 80/443 on .249 until the UDM swap). Flipped to false in host_vars/edge.yml for production binding. Leave the toggle in defaults — useful for any future migration that wants the same parallel-run pattern.
Validation path post-deploy
For ongoing drift, the user's pattern is ansible-playbook playbooks/site.yml --check --diff from CT 104 — see Drift Detection and Ansible homelab-ansible CLAUDE.md. The first-time apply was from WSL (vault + keys present, edge brand-new). Once homelab-ansible commits land on CT 104, drift will validate the role end-to-end.
Related
- edge-cutover runbook — cutover procedure + lessons
- Roadmap §Phase 5D
- Network overview — public-services map
- crowdsec_engine role — Phase 7D engine side; pairs with the bouncer module loaded into the Caddy binary
- Accepted risk: edge security gap until CrowdSec — why no orange-cloud
- Roadmap §Phase 7D — CrowdSec on edge — closes the gap
- Caddy upstream docs · caddy-dns/cloudflare module