Role: pbs

Proxmox Backup Server — runs in a dedicated LXC on proxfold, with the datastore on an NFS mount from the QNAP TS-269L.

Hosts: pbs (CT 105, 192.168.1.246)

Phase 5A executed — 2026-04-23

PBS LXC live, datastore + schedules + PBS user all codified, first full backup and verify both clean (~95 GiB on NAS across 5 guests). Daily 02:00 backup job runs from proxfold via the proxmox role's pbs_client.yml subfile.

Why these choices

  • LXC over VM — PBS is a standalone Debian service with no Docker dependency. Fits the established LXC-for-standalone / VM-for-Docker split.
  • Privileged LXC — sidesteps the uid/gid shift for the backup user (uid 34) writing to the NFS share with no_root_squash. Matches the CT 100 (plex) pattern.
  • Debian 13 template — PBS 4.x is trixie-only. The pct create call requires features: nesting=1 or systemd 257 leaves core units (journald, logind, networkd, gettys) failed on first boot.
  • Datastore on NFS, not CIFS — the TS-269L exports both, and NFSv3 is materially better-behaved than CIFS as a PBS datastore backend (no uid/gid remapping, no reported .chunks inode collisions).
  • Datastore on NAS, not on proxfold — DR requires a second failure domain. A local datastore would die with proxfold or stash.
  • PBS not on the NAS itself — TS-269L is Atom D2701 with ≤3GB RAM on EOL QTS 4.3.4. Can't run PBS 4.x.

Tasks

| Task | Tags | Notes |
|---|---|---|
| Install nfs-common | pbs, install | |
| Stat Proxmox archive keyring | pbs, repo | gates the next task; prevents check-mode drift |
| Fetch Proxmox archive keyring | pbs, repo | from enterprise.proxmox.com/debian/proxmox-release-trixie.gpg |
| Add PBS no-subscription deb822 repo | pbs, repo | |
| Disable PBS enterprise repo (.list + .sources) | pbs, repo | both formats handled |
| Install proxmox-backup-server | pbs, install | |
| Mask zfs-zed, zfs-mount, zfs-share | pbs, install | non-functional inside LXC; prevents noise in systemctl --failed / Beszel |
| Ensure datastore mount point (backup:backup owner) | pbs, datastore | uid 34 |
| NFS mount via /etc/fstab | pbs, datastore | nfsvers=3, x-systemd.automount, nofail |
| Create PBS datastore (idempotent via .chunks/ stat) | pbs, datastore | one-shot |
| Query datastore configuration | pbs, schedule | for GC idempotency |
| Set datastore GC schedule | pbs, schedule | only if different from desired |
| List + create prune job | pbs, schedule | 03:00 daily, gated on job ID lookup |
| List + create verify job | pbs, schedule | sun 04:00, gated on job ID lookup |
| List PBS users | pbs, auth | for user-create idempotency |
| Create PBS client user | pbs, auth | throwaway password; real auth is the token |
| Ensure datastore ACLs for client user | pbs, auth | changed_when: false; PBS acl update is idempotent |
| Token creation reminder | pbs, auth | debug-only; token generation is deliberately manual |
| Include notifications.yml (Discord webhook endpoint + match-all matcher) | pbs, notifications | gated on pbs_discord_endpoint_name is defined |
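
Several of these tasks gate a one-shot command on a cheap probe. As a rough shell equivalent of the datastore-create gate (the role itself does this with Ansible tasks, which are not reproduced here), the sketch below assumes the create command is `proxmox-backup-manager datastore create <name> <path>`. The function name `ensure_datastore` and the `RUN` dry-run hook are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# Sketch of the ".chunks/ stat" idempotency gate; not the role's actual task.
# RUN is a hypothetical dry-run hook: it defaults to "echo" (print only).
# Set RUN="" to execute the command for real on the PBS host.
RUN=${RUN-echo}

# ensure_datastore NAME PATH
# Creates the datastore only when PATH/.chunks does not exist yet.
ensure_datastore() {
    name=$1
    path=$2
    if [ -d "$path/.chunks" ]; then
        # Datastore already initialised: nothing to do.
        return 0
    fi
    $RUN proxmox-backup-manager datastore create "$name" "$path"
}
```

In dry-run mode, `ensure_datastore nas-primary /mnt/pbs-datastore` prints the create command on a fresh mount and prints nothing once `.chunks/` exists, which is the behaviour the "one-shot" note describes.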

Manual step on first install: generate the API token

The PBS API token is shown only once on creation — it cannot be retrieved later. After the role finishes, SSH to the PBS host and run:

proxmox-backup-manager user generate-token pbs-pve@pbs pve
proxmox-backup-manager cert info | grep -i Fingerprint

Capture the value field into vault_pbs_token_secret and the fingerprint into vault_pbs_fingerprint, then re-run the proxmox role's pbs tag to register storage + create the backup job.
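
Because the secret cannot be re-read, it may help to sanity-check what was captured before it goes into the vault. The helpers below are a minimal sketch, not part of the role: `extract_fingerprint` and `is_token_id` are hypothetical names, and the only assumptions are that the fingerprint is a SHA-256 digest printed as 32 colon-separated hex bytes and that a token id has the shape `user@realm!tokenname`.

```shell
#!/bin/sh
# Sketch: pull a SHA-256 fingerprint (32 colon-separated hex bytes) out of
# whatever text is piped in. The exact layout of the `cert info` output is
# not assumed; only the shape of the fingerprint itself is.
extract_fingerprint() {
    grep -i 'fingerprint' \
        | grep -oE '([0-9A-Fa-f]{2}:){31}[0-9A-Fa-f]{2}' \
        | head -n 1
}

# Sketch: succeed only if the argument looks like user@realm!tokenname.
is_token_id() {
    case $1 in
        ?*@?*'!'?*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On the PBS host this could be used as `proxmox-backup-manager cert info | extract_fingerprint`, and `is_token_id 'pbs-pve@pbs!pve'` guards against storing the bare user id in vault_pbs_token_id by mistake.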

Key variables

| Variable | Source | Value or description |
|---|---|---|
| pbs_datastore_name | defaults | nas-primary |
| pbs_datastore_path | defaults | /mnt/pbs-datastore |
| pbs_nfs_server | defaults | 192.168.1.253 |
| pbs_nfs_export | defaults | /backup/pbs-datastore |
| pbs_nfs_options | defaults | defaults,nofail,x-systemd.automount,nfsvers=3 |
| pbs_prune_keep_daily | defaults | 7 |
| pbs_prune_keep_weekly | defaults | 4 |
| pbs_prune_keep_monthly | defaults | 6 |
| pbs_schedule_prune | defaults | 03:00 (plain HH:MM; PBS rejects daily HH:MM) |
| pbs_schedule_verify | defaults | sun 04:00 |
| pbs_schedule_gc | defaults | mon 04:00 |
| pbs_prune_job_id | defaults | nas-primary-prune |
| pbs_verify_job_id | defaults | nas-primary-verify |
| pbs_archive_keyring_url | defaults | https://enterprise.proxmox.com/debian/proxmox-release-trixie.gpg |
| pbs_archive_keyring_path | defaults | /usr/share/keyrings/proxmox-archive-keyring.gpg |
| pbs_client_userid | defaults | pbs-pve@pbs |
| pbs_client_firstname | defaults | PVE Client (proxfold) |
| pbs_client_roles | defaults | [DatastoreBackup, DatastoreAudit] |
| pbs_discord_endpoint_name | defaults | discord-ops |
| pbs_discord_matcher_name | defaults | ops-all |
| vault_discord_webhook_homelab_ops | vault | shared with PVE + Beszel + ZED |
| vault_pbs_token_id | vault | PVE → PBS client token id (e.g. pbs-pve@pbs!pve) |
| vault_pbs_token_secret | vault | PVE → PBS client token secret |
| vault_pbs_fingerprint | vault | PBS TLS cert fingerprint for PVE trust |
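
To make the relationship between the four NFS-related defaults concrete, the sketch below prints the /etc/fstab entry they describe. It is an illustration only: the role may render the entry differently, `fstab_line` is a hypothetical helper, and the trailing `0 0` (dump and fsck pass fields) is an assumption rather than something taken from the role.

```shell
#!/bin/sh
# Defaults copied from the key variables listed above.
pbs_nfs_server=192.168.1.253
pbs_nfs_export=/backup/pbs-datastore
pbs_datastore_path=/mnt/pbs-datastore
pbs_nfs_options=defaults,nofail,x-systemd.automount,nfsvers=3

# Print the fstab entry these four values describe.
# The trailing "0 0" (dump / fsck pass) is an assumption, not from the role.
fstab_line() {
    printf '%s:%s %s nfs %s 0 0\n' \
        "$pbs_nfs_server" "$pbs_nfs_export" \
        "$pbs_datastore_path" "$pbs_nfs_options"
}

fstab_line
```

With the defaults unchanged this prints `192.168.1.253:/backup/pbs-datastore /mnt/pbs-datastore nfs defaults,nofail,x-systemd.automount,nfsvers=3 0 0`.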

Gotchas captured during execution

  • PBS 4.x ACLs need BOTH user AND token auth-ids on the same path. Corrected 2026-05-06 during Phase 5E execution after a longstanding misreading. The original belief — that granting on the user alone caused the token to inherit — is wrong: the user grant resolves correctly for the user auth-id but the token auth-id resolves to {} permissions until it is also explicitly granted. The Phase 5A pbs-pve ACL works because /etc/proxmox-backup/acl.cfg has both authids on one comma-separated line (pbs-pve@pbs,pbs-pve@pbs!pve:DatastoreAudit,DatastoreBackup) — the original 5A operator manually added the second grant. Fixed in the role 2026-05-06: a new Ensure datastore ACLs for PBS client TOKEN auth-id task grants the same role to {{ pbs_client_userid }}!{{ pbs_client_token_name }}, gated on vault_pbs_token_id is defined (so it's a no-op on the first role run before the operator generates the token, then activates on re-run). New default pbs_client_token_name: "pve" controls the token name. Diagnostic: proxmox-backup-manager user permissions <authid> --path <p> --output-format json returns {} when broken vs the role dict when working.
  • DatastoreBackup alone is not enough for pvesm add pbs. The PVE-side registration probes /admin/datastore/<name>/status which needs Datastore.Audit. Grant both DatastoreBackup and DatastoreAudit.
  • PBS schedule parser rejects daily HH:MM. daily is a systemd OnCalendar macro (= *-*-* 00:00:00) and can't be combined with a time. Use plain HH:MM (interpreted as *-*-* HH:MM:00). Weekday + time (sun 04:00) works fine.
  • Debian 13 LXCs need features: nesting=1. Without it, systemd 257 fails ~24 core units on first boot (gettys, journald, logind, networkd, tmpfiles). pct create prints a warning but doesn't auto-apply.
  • get_url in check mode reports changed even with force: false. Stat-gate the fetch on file absence (same pattern as the docker role's armored-key task).
  • proxmox-backup-server pulls in zfsutils-linux, which ships zfs-zed, zfs-mount, and zfs-share systemd units. Inside an LXC there are no zpools visible to the container, so all three exit 1/FAILURE on start. Mask them (the role handles this) or they show as failed in systemctl --failed and trip Beszel's failed-services alerting.
  • PBS notification CLI quirks. proxmox-backup-manager notification endpoint webhook create takes <name> as a positional argument (not --name), and idempotency probes use the per-type endpoint webhook list (not a global list). The CLI help describes --body only as "The HTTP body to send. Supports templating." with no mention of encoding, but the body must still be base64-encoded: otherwise the endpoint is stored and then fails at delivery with "could not decode base64 value" (same quirk as PVE). The --header value is also base64. Covered in roles/pbs/tasks/notifications.yml.
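
Two of the gotchas above lend themselves to quick shell checks. The sketch below is not taken from the role. `acl_line_has_authid` and `b64` are hypothetical helpers, and the acl.cfg line layout `acl:<propagate>:<path>:<authids>:<roles>` is an assumption extrapolated from the comma-separated fragment quoted in the ACL gotcha.

```shell
#!/bin/sh
# acl_line_has_authid LINE AUTHID
# Assumes acl.cfg lines look like "acl:<propagate>:<path>:<authids>:<roles>"
# with <authids> comma-separated (this layout is an assumption).
# Succeeds only if AUTHID appears as a whole entry in the auth-id list.
acl_line_has_authid() {
    authids=$(printf '%s\n' "$1" | cut -d: -f4)
    case ",$authids," in
        *",$2,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# b64 STRING
# Base64-encode a string onto a single line, the form the webhook --body and
# --header values are reported to need. tr removes any line wrapping that
# the local base64 implementation adds.
b64() {
    printf '%s' "$1" | base64 | tr -d '\n'
}
```

For example, running `acl_line_has_authid "$line" 'pbs-pve@pbs!pve'` against the datastore's ACL line fails when only the user auth-id was granted, which is the broken state described in the first gotcha. Keep the token auth-id in single quotes so an interactive shell does not treat `!` as history expansion.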

Handlers

  • restart proxmox-backup-proxy
  • restart proxmox-backup

Multi-user / multi-namespace pattern (post-Phase 5E)

The role currently codifies a single PBS user (pbs_client_userid) for the PVE-side guest backup integration. Phase 5E (host-level file backup, 2026-05-06) introduced a second PBS user (host-backup@pbs) scoped to a separate namespace (host/proxfold) — created manually as a one-shot during bootstrap rather than generalising this role into a list-of-users abstraction. Rationale: one extra user didn't justify the refactor surface.

If a third PBS client appears (e.g. a second host backing up its own configs, or an off-site sync target), revisit this decision: turn pbs_client_userid / pbs_client_roles into a list pbs_client_users[] and iterate the user/ACL/notification tasks over it. The bootstrap procedure for the manual second-user pattern lives in the backup-restore runbook, §host-level file backup bootstrap.

The role's existing verify-job is datastore-wide (no --ns filter), so it covers the new host/proxfold namespace automatically. The prune-job is namespace-scoped — nas-primary-prune only touches the root namespace; the host namespace gets its own nas-primary-host-prune with longer retention created during the Phase 5E bootstrap.