
Role: crowdsec_engine

CrowdSec security engine on the edge LXC (CT 107, 192.168.1.244). Installs the engine from upstream packagecloud, installs the crowdsecurity/caddy collection, keeps LAPI on its stock port (127.0.0.1:8080; see Gotchas), and ships a Restart=on-failure drop-in. Pairs with the caddy role, whose hslatman bouncer module checks each request against LAPI decisions.

Hosts: edge (only).

Phase 7D executed — 2026-05-04

Engine + Caddy bouncer module live on edge. Bouncer registration is a one-time operator step (see Bootstrap), not an Ansible task — cscli bouncers add is non-idempotent. End-to-end validation via cellular phone confirmed: blocked IP got 403, removed IP got 200. See crowdsec-validation runbook for the proof procedure and the lessons captured during first execution.

Why standalone (no Wazuh forwarding, no Lynis)

Per project memory project_phase7d_scoping.md (decisions 2026-05-04):

  • Wazuh forwarding deferred — the original roadmap §7D bundled CrowdSec ↔ Wazuh integration. Wazuh (7A/B) is gated on Phase 4B (CPU + RAM upgrade), which is deprioritised. Edge gap is more urgent than the broader SIEM build; closing 7D standalone unblocks it. The forwarding hook is documented here as a future one-line add when Wazuh exists.
  • Lynis split out — host-hardening audit is functionally orthogonal to edge bouncing. Moved into the Wazuh phase scope, where it pairs naturally with Wazuh's SCA module.

Architecture

flowchart LR
    Internet[Public client] --> UDM[UDM<br/>port-forward 80/443]
    UDM --> Caddy[Caddy on edge<br/>+ hslatman bouncer module]
    Caddy -.->|per-request IP check<br/>http://127.0.0.1:8080| LAPI[CrowdSec LAPI<br/>on edge]
    LAPI -.->|decisions stream| Caddy
    LAPI -->|community pull| CSCAPI[CrowdSec CAPI<br/>federated reputation]
    Caddy -->|verdict: allow| Upstream[Upstream apps<br/>requests/dash/n8n/kosync]
    Caddy -->|verdict: deny| Block[403 Forbidden]

The bouncer module checks each incoming HTTP request's source IP against an in-memory cache of the local LAPI's decisions. Decisions arrive via streaming pull (default 15s ticker). If LAPI is unreachable the module fails open: enable_hard_fails defaults to off in the upstream README. That is acceptable for our risk model, since a CrowdSec outage shouldn't black-hole the public apps.

Tasks

| Task | Tags |
| --- | --- |
| Ensure /etc/apt/keyrings exists | crowdsec, install |
| Stat-gated fetch of packagecloud GPG keyring | crowdsec, install |
| Render apt source list (/etc/apt/sources.list.d/crowdsec_crowdsec.list) | crowdsec, install |
| Refresh apt cache (only when source changed; changed_when: false) | crowdsec, install |
| apt install crowdsec | crowdsec, install |
| Render LAPI override (/etc/crowdsec/config.yaml.local) | crowdsec, config |
| Render Restart=on-failure systemd drop-in | crowdsec, config |
| Stat-gated cscli collections install for the configured collection list | crowdsec, config |
| Enable + start crowdsec.service | crowdsec, service |

Key variables

Defaults at roles/crowdsec_engine/defaults/main.yml. Per-host overrides at inventory/host_vars/edge.yml.

| Variable | Source | Default / value |
| --- | --- | --- |
| crowdsec_engine_apt_repo_url | defaults | https://packagecloud.io/crowdsec/crowdsec/any (path component must be any; see Gotchas) |
| crowdsec_engine_apt_repo_suite | defaults | any (workaround for trixie 404; see Gotchas) |
| crowdsec_engine_keyring_url | defaults | https://packagecloud.io/crowdsec/crowdsec/gpgkey |
| crowdsec_engine_keyring_path | defaults | /etc/apt/keyrings/crowdsec_crowdsec-archive-keyring.gpg |
| crowdsec_engine_collections | defaults | [crowdsecurity/caddy] (additive: cscli setup auto-installs more on first install; see Auto-discovery) |

Auto-discovery via cscli setup

CrowdSec's apt install post-install runs cscli setup once, which inspects the host for known services (caddy, sshd, syslog, postfix, etc.) and writes generated acquisition snippets into /etc/crowdsec/acquis.d/setup.<service>.yaml. It also installs the matching hub collections.

On edge as of 2026-05-04 this covered:

| Auto-generated acquis | Auto-installed collections |
| --- | --- |
| setup.caddy.yaml (journald, _SYSTEMD_UNIT=caddy.service) | crowdsecurity/caddy, crowdsecurity/base-http-scenarios, crowdsecurity/http-cve |
| setup.sshd.yaml (journald, _SYSTEMD_UNIT=ssh.service) | crowdsecurity/sshd |
| setup.linux.yaml (file, /var/log/{messages,syslog,kern.log}) | crowdsecurity/linux |
| setup.postfix.yaml (file, /var/log/mail.log) | crowdsecurity/postfix (harmless if no postfix; nothing to parse) |

Plus crowdsecurity/whitelist-good-actors for default-whitelisting common false-positive sources (uptime monitors, search engines).

The crowdsec_engine_collections role default is therefore additive — it explicitly ensures crowdsecurity/caddy is present (because the auto-discovery is opaque and we want the dependency expressed in code), but the rest of the common stack arrives automatically. To opt out of an auto-installed collection, cscli collections remove <name> on the host and remove its setup.<name>.yaml file. Drift won't reinstall it (the role only checks the explicit list).

The auto-generated files have a cscli-checksum: header — cscli won't overwrite them on subsequent setup runs unless you delete the checksum line. Safe to leave Ansible-out-of-the-loop here; the auto-setup is opinionated but reasonable.
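To see which generated acquisition files still carry the checksum header, a small helper along these lines may be useful. It is a minimal sketch: the function name and the overridable ACQUIS_DIR variable are hypothetical, and it only greps for the cscli-checksum: marker described above.

```shell
# Sketch: report which setup.*.yaml files still carry the cscli-checksum
# header. ACQUIS_DIR is a hypothetical override so the check can be tried
# against a scratch directory instead of the live host.
ACQUIS_DIR="${ACQUIS_DIR:-/etc/crowdsec/acquis.d}"

list_cscli_managed() (
    for f in "$ACQUIS_DIR"/setup.*.yaml; do
        [ -f "$f" ] || continue          # glob matched nothing
        if grep -q 'cscli-checksum:' "$f"; then
            echo "managed: ${f##*/}"
        else
            echo "unmanaged: ${f##*/}"
        fi
    done
)
```

Running list_cscli_managed on edge should list every setup.<service>.yaml file; any file reported as unmanaged has lost its checksum line and may be overwritten by a later setup run.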

Bootstrap (one-time, before flipping caddy_crowdsec_enabled: true)

The bouncer LAPI key is generated by cscli bouncers add, which is non-idempotent — the key is printed only on first creation. Putting that in a task would either break --check --diff (cscli writes state) or silently drift (re-running creates a second key). The supported pattern is a one-time operator step:

# 1. Apply the role with caddy_crowdsec_enabled still false. The engine
#    installs and starts; Caddy keeps running without the bouncer block.
ansible-playbook playbooks/edge.yml --limit edge

# 2. SSH to edge and register the bouncer once.
ssh root@192.168.1.244
cscli bouncers add caddy-edge -o raw
# Prints a single line -- the API key. Copy it.

# 3. Add the key to the vault.
ansible-vault edit inventory/group_vars/all/vault.yml
# Add: vault_caddy_crowdsec_bouncer_key: "<the key>"

# 4. Flip caddy_crowdsec_enabled: true in inventory/host_vars/edge.yml.

# 5. Re-apply the caddy role only -- engine is already up.
ansible-playbook playbooks/edge.yml --limit edge --tags caddy,config

After step 5, cscli bouncers list should show caddy-edge with a recent last_pull. Validation procedure: see crowdsec-validation runbook.

Rollback

Fastest first:

  1. Disable the bouncer block in Caddy: flip caddy_crowdsec_enabled: false in host_vars/edge.yml, re-apply with --tags caddy,config. Caddyfile re-renders without bouncer; caddy reload fires. Engine keeps running, no enforcement. ETA ~10s + Ansible run.
  2. In-place panic button: SSH to edge, comment out the crowdsec line(s) in /etc/caddy/Caddyfile, systemctl reload caddy. Will be reverted by the next drift run.
  3. Stop engine: systemctl stop crowdsec. Bouncer fails open per upstream default; traffic continues. Use only if the engine itself is misbehaving.

Gotchas captured during execution

packagecloud debian/trixie returns 404 — use any/any

The packagecloud repo for the trixie codename is broken upstream (returns 404 Not Found on the Release file). Confirmed live 2026-05-04: trixie 404, bookworm 302, any resolves via apt. Tracked in crowdsec issue #3909 (unresolved as of role install). Upstream's own install docs recommend the any/any repo line as the workaround.

The path component and the suite must both be any: crowdsec/crowdsec/any + suite any resolves; crowdsec/crowdsec/debian + suite any returns HTTP 422 / "repository not signed". Verified empirically during first apply (caught after the first apt-update attempt failed; one-line role default fix). Don't switch to a different distro path expecting the any suite to still work.
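The working pairing can be illustrated by rendering the source line with both components as parameters. This is a sketch only: the helper name is hypothetical, and the "deb [signed-by=...] URL suite main" layout is assumed from the usual packagecloud format rather than copied from the role template. The keyring path is the one listed under Key variables.

```shell
# Hypothetical helper: render the packagecloud apt source line.
# Assumption: the standard "deb [signed-by=KEYRING] URL SUITE main" layout.
render_crowdsec_source() (
    distro_path="$1"   # path component of the repo URL, e.g. "any"
    suite="$2"         # apt suite, e.g. "any"
    keyring="/etc/apt/keyrings/crowdsec_crowdsec-archive-keyring.gpg"
    printf 'deb [signed-by=%s] https://packagecloud.io/crowdsec/crowdsec/%s %s main\n' \
        "$keyring" "$distro_path" "$suite"
)

# Both components set to "any", the pairing that resolved on edge.
render_crowdsec_source any any
```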

LAPI stays on stock port 8080 — don't move it

Stock LAPI listens on 127.0.0.1:8080. Tempting to move it (e.g. to 6060) to dodge a hypothetical collision with Caddy's caddy_listen_alt_ports: true default. Don't. The CrowdSec agent process registers itself with LAPI on first install via /etc/crowdsec/local_api_credentials.yaml, which the installer hardcodes to 127.0.0.1:8080. Overriding LAPI's listen_uri via config.yaml.local without also re-templating the agent's credentials file leaves the agent unable to authenticate, and the engine fails to start with connection refused.

If a future Caddy alt-port flip really collides with 8080, change Caddy's alt-port (rare, scoped) rather than LAPI (touches engine + agent + bouncer + every doc). Caught + reverted during 2026-05-04 first apply.
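If the port ever does have to move, a pre-flight comparison of the two files could catch the mismatch before the engine is restarted. The sketch below is hypothetical (function name included) and assumes the override contains a listen_uri: key and the credentials file a url: key, each on its own line; it uses sed rather than a YAML parser, so it is only a rough check.

```shell
# Hypothetical pre-flight check: does the agent's credentials URL still point
# at the address LAPI listens on? Assumes "listen_uri:" and "url:" each sit
# on their own line in the respective file.
lapi_endpoints_match() (
    config_file="$1"   # e.g. /etc/crowdsec/config.yaml.local
    creds_file="$2"    # e.g. /etc/crowdsec/local_api_credentials.yaml
    listen=$(sed -n 's/^[[:space:]]*listen_uri:[[:space:]]*//p' "$config_file" | head -n 1)
    url=$(sed -n 's/^[[:space:]]*url:[[:space:]]*//p' "$creds_file" | head -n 1)
    # Normalise the credentials URL to host:port for comparison.
    url=${url#http://}
    url=${url%/}
    [ -n "$listen" ] && [ "$listen" = "$url" ]
)
```

A non-zero exit status means the agent would try to authenticate against an address LAPI no longer listens on, which is the failure described above.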

cscli is non-idempotent — gate with stat-markers

cscli bouncers add and cscli collections install both mutate state on every invocation. The role's collection-install task gates on the on-disk hub manifest (/etc/crowdsec/hub/collections/<author>/<name>.yaml) and only runs when absent. Bouncer registration isn't in the role at all — it's bootstrap (above).
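The gate itself is simple enough to show in shell. This is a minimal sketch of the same idea, not the role's actual task: the function name and the HUB_DIR override are hypothetical, while the manifest path layout is the one given above.

```shell
# Sketch of the stat-gate: install a collection only when its hub manifest
# is absent. HUB_DIR is a hypothetical override so the logic can be
# exercised off-host.
HUB_DIR="${HUB_DIR:-/etc/crowdsec/hub/collections}"

ensure_collection() (
    collection="$1"                        # "<author>/<name>", e.g. crowdsecurity/caddy
    manifest="$HUB_DIR/$collection.yaml"
    if [ -f "$manifest" ]; then
        echo "skip: $collection already installed"
    else
        cscli collections install "$collection"
    fi
)
```

Calling ensure_collection crowdsecurity/caddy twice in a row should run cscli only the first time, which is what keeps repeated applies from reporting changes.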

Bouncer key bootstrap is once-and-only-once

If someone re-runs cscli bouncers add caddy-edge, you get a second registered bouncer with a different key. The first key keeps working, but the vault entry doesn't match the second one: silent drift. To rotate a key, run cscli bouncers delete caddy-edge, then re-add and re-vault.
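A rotation can be wrapped so the delete always precedes the add. The wrapper below is a hypothetical sketch built only from the two cscli commands named in this document; it discards the delete command's standard output so that the captured text is just the new key.

```shell
# Sketch of a key rotation: delete first so only one registration exists,
# then re-add and print the new key for the vault.
rotate_bouncer_key() (
    name="${1:-caddy-edge}"
    # Stop if the delete fails, rather than adding a second registration.
    cscli bouncers delete "$name" >/dev/null || exit 1
    cscli bouncers add "$name" -o raw
)
```

After running it on edge, update vault_caddy_crowdsec_bouncer_key with the printed key and re-apply with --tags caddy,config.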

Restart=on-failure drop-in is essential

Stock crowdsec.service doesn't restart on crash. Without the drop-in (rendered to /etc/systemd/system/crowdsec.service.d/restart-on-failure.conf), an engine crash strands Caddy with an unreachable LAPI — the bouncer fails open by default so traffic still flows, but enforcement stops silently and you don't notice unless you're checking cscli metrics.
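For reference, the drop-in can be as small as two lines. The sketch below writes it under a configurable root so it can be tried in a scratch directory; the function name and the root parameter are hypothetical, and the role's rendered file may contain more than this.

```shell
# Sketch: write the Restart=on-failure drop-in. An empty root targets the
# live filesystem; pass a scratch directory to inspect the result first.
render_restart_dropin() (
    root="${1:-}"
    dir="$root/etc/systemd/system/crowdsec.service.d"
    mkdir -p "$dir"
    cat > "$dir/restart-on-failure.conf" <<'EOF'
[Service]
Restart=on-failure
EOF
)
```

When written by hand on a live host, systemd only picks the drop-in up after systemctl daemon-reload.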

Validate edge bouncing from outside the LAN — hairpin NAT lies

Hairpin NAT on the UDM rewrites the source IP for any LAN-internal request that hits the public hostname (requests.rampancy.cloud etc. all resolve to the WAN IP, which port-forwards back to edge). Caddy's access log shows client_ip: 192.168.1.1 (the UDM's LAN IP), not your real WAN IP. The bouncer correctly allows that LAN traffic, so a cscli decisions add --ip <my-WAN-IP> followed by a curl from WSL/CT104 will always look like the bouncer isn't working — it is.

Always validate from outside the LAN: cellular phone with WiFi off, a remote SSH host, or any source whose egress IP isn't your home WAN. Caught after ~30 min of false-lead debugging during 2026-05-04 first apply; runbook now leads with this as a hard constraint.
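One quick sanity check before trusting a test request is to look at the client_ip Caddy logged for it: if that address is private, the request was hairpinned and says nothing about the bouncer. The helper below is a hypothetical sketch that matches the RFC 1918 IPv4 ranges only.

```shell
# Hypothetical helper: succeed when the given IPv4 address is RFC 1918
# private, i.e. the logged request came from inside the LAN.
is_private_ipv4() {
    case "$1" in
        10.*|192.168.*) return 0 ;;
        172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, is_private_ipv4 192.168.1.1 succeeds, so a test whose access-log entry shows that client_ip should be repeated from a cellular connection or a remote host.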

Future hooks (deferred to Phase 7A/B)

When Wazuh AIO lands:

  1. Add a crowdsec_engine_wazuh_forward: true default + wazuh_manager_url var.
  2. Render /etc/crowdsec/notifications/http.yaml with the Wazuh manager endpoint.
  3. Add a notification profile in /etc/crowdsec/profiles.yaml to route bans to the http notifier.
  4. Wazuh-side: custom decoder + rule for the JSON CrowdSec emits.

Lynis is a separate role and will land alongside the Wazuh work.