Role: matrix_deploy_notifier

Monthly fetch-and-notify timer for the vendored spantaleev/matrix-docker-ansible-deploy checkout on the control node (CT 104). Runs git fetch against upstream and POSTs any pending commits to the #homelab-updates Discord channel. Never pulls, never applies.

Host: control (CT 104) only. The role is enabled via matrix_deploy_notifier_enabled: true in the control host_vars; on every other host the variable keeps its default of false and the role is a no-op.

Why fetch-only, never apply

Three reasons, all spantaleev-specific. See maintenance-upgrading-services.md for the upstream stance.

  1. Migration-validation gate. The upstream playbook ships a matrix_playbook_migration_expected_version variable. When a breaking change lands, that value bumps and the playbook refuses to run until the operator updates matrix_playbook_migration_validated_version in vars.yml. This gate is the whole point — it forces a human to read CHANGELOG.md before re-apply. Auto-apply would defeat it.
  2. 9-container version coupling. Tuwunel, Traefik, LiveKit, lk-jwt-service, matrix-static-files, Postgres, Element Web, container-socket-proxy, exim-relay all carry pinned tags that spantaleev keeps mutually compatible. Floating any one is a regression risk.
  3. Stateful tier. Tuwunel's RocksDB is the message store. Mid-pull container bounces are tolerable but not casual; bouncing during active sessions has user impact (red-banner send failures, federation re-handshakes).

Behaviour

| Aspect | Value |
| --- | --- |
| Trigger | matrix-deploy-fetch.timer, OnCalendar=Mon *-*-01..07 09:00:00 (first Monday of each month, 09:00 local) |
| Persistence | Persistent=true (runs on next boot if the host was down at trigger time) |
| Jitter | RandomizedDelaySec=30m |
| Side effects | git fetch origin master against /root/matrix-deploy/. No pull, no checkout, no apply. |
| Notification trigger | Only if git log HEAD..origin/master returns non-empty (i.e. there is actually something to look at) |
| Channel | Discord #homelab-updates via vault_discord_webhook_homelab_updates |
| Embed payload | Up to 25 most recent pending commits (oneline format) + total count + reminder to check matrix_playbook_migration_expected_version |
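
The behaviour above can be sketched roughly as follows. This is a hypothetical illustration, not the role's actual template: the function names, the embed title, the environment-variable defaults, and the use of jq for JSON escaping are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical sketch of the fetch-and-notify flow; the real script is
# rendered by the role and may differ in every detail.

REPO_PATH="${REPO_PATH:-/root/matrix-deploy}"
BRANCH="${BRANCH:-master}"
MAX_COMMITS="${MAX_COMMITS:-25}"
WEBHOOK_URL="${WEBHOOK_URL:-}"   # assumption: filled in from the vault variable

# Build the embed description from a total count ($1) and oneline commits
# on stdin (already capped by the caller).
build_description() {
    total="$1"
    printf '%s pending commit(s) on upstream:\n' "$total"
    cat
    printf 'Check matrix_playbook_migration_expected_version before applying.\n'
}

notify_if_pending() {
    # Fetch only: no pull, no checkout, no apply.
    git -C "$REPO_PATH" fetch --quiet origin "$BRANCH" || return 0

    total=$(git -C "$REPO_PATH" rev-list --count "HEAD..origin/$BRANCH") || return 0
    [ "$total" -gt 0 ] || return 0   # nothing pending, stay silent

    desc=$(git -C "$REPO_PATH" log --oneline -n "$MAX_COMMITS" "HEAD..origin/$BRANCH" \
        | build_description "$total")

    # Assumption: jq is installed; it takes care of JSON-escaping commit subjects.
    payload=$(jq -n --arg d "$desc" \
        '{embeds: [{title: "matrix-deploy: upstream updates", description: $d}]}')

    curl -fsS -H 'Content-Type: application/json' -d "$payload" "$WEBHOOK_URL" \
        >/dev/null || return 0
}
```

Calling notify_if_pending at the end of the script would perform one run. Keeping the text formatting in build_description, separate from the git and curl calls, makes it possible to check the message shape without a repository or network access.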

Operator workflow when the notification fires

See the matrix-maintenance runbook for the full procedure. Short version:

  1. Read matrix-docker-ansible-deploy CHANGELOG.md for the date range shown.
  2. SSH to CT 104.
  3. cd /root/matrix-deploy && git pull && rm -rf roles/galaxy && ansible-galaxy install -r requirements.yml -p roles/galaxy/ --force.
  4. Compare matrix_playbook_migration_expected_version (in defaults/main.yml of the spantaleev role) against matrix_playbook_migration_validated_version (in your inventory/host_vars/matrix.rampancy.cloud/vars.yml). If they differ, the playbook will fail with an actionable error linking to the breaking-change entry; review, adapt vars, bump the validated version.
  5. ansible-playbook -i inventory/hosts setup.yml --tags=install-all,start --vault-password-file /root/.vault_pass | tee /tmp/matrix-apply.log (the tee matters — --quiet + pipeline failure silently swallowed real errors during the Phase 6E run).
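
The comparison in step 4 could be scripted as a pre-flight check along these lines. This is a minimal sketch under assumptions: the helper names are hypothetical, both file paths are supplied by the operator, and the parsing only handles a plain top-level `key: value` line (no Jinja expressions, no trailing comments).

```shell
#!/bin/sh
# Hypothetical pre-flight check for the migration-validation gate.

# Print the value of the first top-level "key: value" line in a YAML file,
# with surrounding quotes stripped. Deliberately naive.
yaml_scalar() {
    key="$1"; file="$2"
    sed -n "s/^${key}:[[:space:]]*//p" "$file" | head -n 1 | tr -d "\"'"
}

# $1: file defining matrix_playbook_migration_expected_version
# $2: vars.yml defining matrix_playbook_migration_validated_version
migration_gate_status() {
    expected=$(yaml_scalar matrix_playbook_migration_expected_version "$1")
    validated=$(yaml_scalar matrix_playbook_migration_validated_version "$2")
    if [ "$expected" = "$validated" ]; then
        echo "ok: validated version matches ($expected)"
    else
        echo "review needed: expected=$expected validated=$validated"
        return 1
    fi
}
```

A non-zero return here only signals that CHANGELOG.md needs reading before the apply; the playbook's own check remains the authoritative gate.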

Key variables

| Variable | Default | Purpose |
| --- | --- | --- |
| matrix_deploy_notifier_enabled | false | Master switch. true on control only |
| matrix_deploy_notifier_repo_path | /root/matrix-deploy | Where the spantaleev playbook is vendored on this host |
| matrix_deploy_notifier_branch | master | Upstream branch tracked |
| matrix_deploy_notifier_webhook_var | vault_discord_webhook_homelab_updates | Vault key; separate from the homelab_ops webhook so revocation is independent |
| matrix_deploy_notifier_oncalendar | "Mon *-*-01..07 09:00:00" | systemd OnCalendar expression: first Monday of the month, 09:00 local |
| matrix_deploy_notifier_max_commits | 25 | Cap on commits in the embed (Discord's embed description limit is 4096 characters; 25 commit lines fit comfortably) |

Implementation notes

  • The script (/usr/local/sbin/matrix-deploy-fetch.sh) embeds the webhook URL at render time, so the template task uses no_log: true to keep the secret out of --check --diff output. Same pattern as auto_updates' reboot-required-notify.
  • If the webhook vault var is empty or undefined, the script exits 0 silently — the role can be applied before the vault is populated without erroring.
  • If the repo isn't a git checkout (.git missing), the script exits 0 silently.
  • All git operations use --quiet and tolerate failure (|| exit 0) — a flaky network on Monday morning shouldn't generate a noisy error embed.
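
The silent-exit guards described in these notes might look something like the sketch below. The function name and argument order are hypothetical; only the two conditions (empty webhook, missing .git) come from the notes above.

```shell
#!/bin/sh
# Hypothetical guard helper: return 0 when a run should proceed,
# 1 when it should be skipped quietly.
should_run() {
    webhook="$1"; repo="$2"
    [ -n "$webhook" ] || return 1      # vault var empty or undefined at render time
    [ -e "$repo/.git" ] || return 1    # path is not a git checkout
    return 0
}
```

A script would typically call it as `should_run "$WEBHOOK_URL" "$REPO_PATH" || exit 0`, so a skipped run still exits 0 and the systemd unit is not marked as failed.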