Role: pbs
Proxmox Backup Server — runs in a dedicated LXC on proxfold, with the datastore on an NFS mount from the QNAP TS-269L.
Hosts: pbs (CT 105, 192.168.1.246)
Phase 5A executed — 2026-04-23
PBS LXC live, datastore + schedules + PBS user all codified, first full backup and verify both clean (~95 GiB on NAS across 5 guests). Daily 02:00 backup job runs from proxfold via the proxmox role's pbs_client.yml subfile.
Why these choices
- LXC over VM: PBS is a standalone Debian service with no Docker dependency. Fits the established LXC-for-standalone / VM-for-Docker split.
- Privileged LXC: sidesteps the uid/gid shift for the `backup` user (uid 34) writing to the NFS share with `no_root_squash`. Matches the CT 100 (plex) pattern.
- Debian 13 template: PBS 4.x is trixie-only. The `pct create` call requires `features: nesting=1` or systemd 257 leaves core units (journald, logind, networkd, gettys) failed on first boot.
- Datastore on NFS, not CIFS: the TS-269L exports both, and NFSv3 is materially better-behaved than CIFS as a PBS datastore backend (no uid/gid remapping, no reported `.chunks` inode collisions).
- Datastore on NAS, not on proxfold: DR requires a second failure domain. A local datastore would die with proxfold or `stash`.
- PBS not on the NAS itself: TS-269L is Atom D2701 with ≤3GB RAM on EOL QTS 4.3.4. Can't run PBS 4.x.
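The nesting requirement can be checked from the Proxmox host before the container's first boot. A minimal sketch, assuming `pct config <vmid>` prints a `features:` line for the container; the helper name `has_nesting` is hypothetical and not part of the role:

```shell
#!/bin/sh
# has_nesting: read `pct config <vmid>` output on stdin and succeed only if
# the features line enables nesting. Illustrative helper, not role code.
has_nesting() {
    grep -Eq '^features:.*nesting=1(,|[[:space:]]|$)'
}

# Usage on the Proxmox host (CT 105 is the PBS container):
#   pct config 105 | has_nesting || echo "CT 105: nesting=1 is missing"
```

If the check fails, `pct set 105 --features nesting=1` should apply the feature to the existing container.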
Tasks
| Task | Tag | Notes |
|---|---|---|
| Install `nfs-common` | pbs, install | |
| Stat Proxmox archive keyring | pbs, repo | gates the next task; prevents check-mode drift |
| Fetch Proxmox archive keyring | pbs, repo | from enterprise.proxmox.com/debian/proxmox-release-trixie.gpg |
| Add PBS no-subscription deb822 repo | pbs, repo | |
| Disable PBS enterprise repo (`.list` + `.sources`) | pbs, repo | both formats handled |
| Install `proxmox-backup-server` | pbs, install | |
| Mask `zfs-zed`, `zfs-mount`, `zfs-share` | pbs, install | non-functional inside LXC; prevents noise in `systemctl --failed` / Beszel |
| Ensure datastore mount point (`backup:backup` owner) | pbs, datastore | uid 34 |
| NFS mount via `/etc/fstab` | pbs, datastore | `nfsvers=3`, `x-systemd.automount`, `nofail` |
| Create PBS datastore (idempotent via `.chunks/` stat) | pbs, datastore | one-shot |
| Query datastore configuration | pbs, schedule | for GC idempotency |
| Set datastore GC schedule | pbs, schedule | only if different from desired |
| List + create prune job | pbs, schedule | 03:00 daily, gated on job ID lookup |
| List + create verify job | pbs, schedule | sun 04:00, gated on job ID lookup |
| List PBS users | pbs, auth | for user-create idempotency |
| Create PBS client user | pbs, auth | throwaway password; real auth is the token |
| Ensure datastore ACLs for client user | pbs, auth | `changed_when: false`; PBS `acl update` is idempotent |
| Token creation reminder | pbs, auth | debug-only; token generation is deliberately manual |
| Include `notifications.yml` (Discord webhook endpoint + match-all matcher), gated on `pbs_discord_endpoint_name is defined` | pbs, notifications | |
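The stat-then-fetch pairing in the two keyring rows follows a common idempotency pattern: download only when the file is absent. A rough shell analogue, assuming `curl` is available; the role itself does this with Ansible tasks, and the function name `fetch_if_absent` is hypothetical:

```shell
#!/bin/sh
# fetch_if_absent DEST URL
# Download URL to DEST only when DEST does not already exist, so repeated
# runs change nothing. Illustrative helper, not part of the role.
fetch_if_absent() {
    dest=$1
    url=$2
    if [ -e "$dest" ]; then
        return 0    # already present: nothing to do
    fi
    # -f: fail on HTTP errors, -sS: quiet but report errors, -L: follow redirects.
    # Write to a temporary name first so a failed download leaves no partial file.
    curl -fsSL -o "$dest.tmp" "$url" && mv "$dest.tmp" "$dest"
}

# Usage with the role's defaults:
#   fetch_if_absent /usr/share/keyrings/proxmox-archive-keyring.gpg \
#       https://enterprise.proxmox.com/debian/proxmox-release-trixie.gpg
```

Gating on file absence keeps the download step out of check-mode diffs, which is the drift the Notes column refers to.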
Manual step on first install: generate the API token
The PBS API token is shown only once on creation — it cannot be retrieved later. After the role finishes, SSH to the PBS host and run:
proxmox-backup-manager user generate-token pbs-pve@pbs pve
proxmox-backup-manager cert info | grep -i Fingerprint
Capture the value field into vault_pbs_token_secret and the fingerprint into vault_pbs_fingerprint, then re-run the proxmox role's pbs tag to register storage + create the backup job.
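To copy the fingerprint into the vault without the label text, the `grep` output can be trimmed further. A small sketch, assuming the matching line has the form `Fingerprint (sha256): aa:bb:...` (the exact label is an assumption about the `cert info` output; `extract_fingerprint` is a hypothetical helper):

```shell
#!/bin/sh
# extract_fingerprint: read certificate info on stdin and print only the
# fingerprint value from the first line that mentions "fingerprint".
extract_fingerprint() {
    # Strip everything up to and including the first colon plus any spaces.
    grep -i 'fingerprint' | head -n 1 | sed 's/^[^:]*:[[:space:]]*//'
}

# Usage on the PBS host:
#   proxmox-backup-manager cert info | extract_fingerprint
```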
Key variables
| Variable | Source | Value |
|---|---|---|
| `pbs_datastore_name` | defaults | `nas-primary` |
| `pbs_datastore_path` | defaults | `/mnt/pbs-datastore` |
| `pbs_nfs_server` | defaults | `192.168.1.253` |
| `pbs_nfs_export` | defaults | `/backup/pbs-datastore` |
| `pbs_nfs_options` | defaults | `defaults,nofail,x-systemd.automount,nfsvers=3` |
| `pbs_prune_keep_daily` | defaults | `7` |
| `pbs_prune_keep_weekly` | defaults | `4` |
| `pbs_prune_keep_monthly` | defaults | `6` |
| `pbs_schedule_prune` | defaults | `03:00` (plain `HH:MM`; PBS rejects `daily HH:MM`) |
| `pbs_schedule_verify` | defaults | `sun 04:00` |
| `pbs_schedule_gc` | defaults | `mon 04:00` |
| `pbs_prune_job_id` | defaults | `nas-primary-prune` |
| `pbs_verify_job_id` | defaults | `nas-primary-verify` |
| `pbs_archive_keyring_url` | defaults | `https://enterprise.proxmox.com/debian/proxmox-release-trixie.gpg` |
| `pbs_archive_keyring_path` | defaults | `/usr/share/keyrings/proxmox-archive-keyring.gpg` |
| `pbs_client_userid` | defaults | `pbs-pve@pbs` |
| `pbs_client_firstname` | defaults | `PVE Client (proxfold)` |
| `pbs_client_roles` | defaults | `[DatastoreBackup, DatastoreAudit]` |
| `pbs_discord_endpoint_name` | defaults | `discord-ops` |
| `pbs_discord_matcher_name` | defaults | `ops-all` |
| `vault_discord_webhook_homelab_ops` | vault | shared with PVE + Beszel + ZED |
| `vault_pbs_token_id` | vault | PVE → PBS client token id (e.g. `pbs-pve@pbs!pve`) |
| `vault_pbs_token_secret` | vault | PVE → PBS client token secret |
| `vault_pbs_fingerprint` | vault | PBS TLS cert fingerprint for PVE trust |
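The four NFS variables combine into a single `/etc/fstab` entry. A sketch of that composition, assuming the standard six-field fstab layout with the dump and pass fields set to `0`; the role's actual mount task may render it differently, and `fstab_line` is a hypothetical helper:

```shell
#!/bin/sh
# fstab_line SERVER EXPORT MOUNTPOINT OPTIONS
# Print one fstab entry: <server>:<export> <mountpoint> nfs <options> 0 0
fstab_line() {
    printf '%s:%s %s nfs %s 0 0\n' "$1" "$2" "$3" "$4"
}

# With the role defaults for pbs_nfs_server, pbs_nfs_export,
# pbs_datastore_path and pbs_nfs_options:
fstab_line 192.168.1.253 /backup/pbs-datastore /mnt/pbs-datastore \
    defaults,nofail,x-systemd.automount,nfsvers=3
# prints:
# 192.168.1.253:/backup/pbs-datastore /mnt/pbs-datastore nfs defaults,nofail,x-systemd.automount,nfsvers=3 0 0
```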
Gotchas captured during execution
- PBS 4.x ACLs need BOTH user AND token auth-ids on the same path. Corrected 2026-05-06 during Phase 5E execution after a longstanding misreading. The original belief (that granting on the user alone caused the token to inherit) is wrong: the user grant resolves correctly for the user auth-id but the token auth-id resolves to `{}` permissions until it is also explicitly granted. The Phase 5A `pbs-pve` ACL works because `/etc/proxmox-backup/acl.cfg` has both authids on one comma-separated line (`pbs-pve@pbs,pbs-pve@pbs!pve:DatastoreAudit,DatastoreBackup`); the original 5A operator manually added the second grant. Fixed in the role 2026-05-06: a new `Ensure datastore ACLs for PBS client TOKEN auth-id` task grants the same role to `{{ pbs_client_userid }}!{{ pbs_client_token_name }}`, gated on `vault_pbs_token_id is defined` (so it's a no-op on the first role run before the operator generates the token, then activates on re-run). New default `pbs_client_token_name: "pve"` controls the token name. Diagnostic: `proxmox-backup-manager user permissions <authid> --path <p> --output-format json` returns `{}` when broken vs the role dict when working.
- `DatastoreBackup` alone is not enough for `pvesm add pbs`. The PVE-side registration probes `/admin/datastore/<name>/status` which needs `Datastore.Audit`. Grant both `DatastoreBackup` and `DatastoreAudit`.
- PBS schedule parser rejects `daily HH:MM`. `daily` is a systemd OnCalendar macro (= `*-*-* 00:00:00`) and can't be combined with a time. Use plain `HH:MM` (interpreted as `*-*-* HH:MM:00`). Weekday + time (`sun 04:00`) works fine.
- Debian 13 LXCs need `features: nesting=1`. Without it, systemd 257 fails ~24 core units on first boot (gettys, journald, logind, networkd, tmpfiles). `pct create` prints a warning but doesn't auto-apply.
- `get_url` in check mode reports `changed` even with `force: false`. Stat-gate the fetch on file absence (same pattern as the docker role's armored-key task).
- `proxmox-backup-server` pulls in `zfsutils-linux`, which ships `zfs-zed`, `zfs-mount`, and `zfs-share` systemd units. Inside an LXC there are no zpools visible to the container, so all three exit `1/FAILURE` on start. Mask them (the role handles this) or they show as failed in `systemctl --failed` and trip Beszel's failed-services alerting.
- PBS notification CLI quirks: `proxmox-backup-manager notification endpoint webhook create` takes `<name>` as a positional argument (not `--name`), uses per-type `endpoint webhook list` for idempotency probes (not a global `list`), and despite the CLI help text saying `--body` is "The HTTP body to send. Supports templating." (no mention of encoding), the body must still be base64-encoded or the endpoint stores but fails at delivery with "could not decode base64 value" (same quirk as PVE). `--header value` is also base64. Covered in `roles/pbs/tasks/notifications.yml`.
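The base64 requirement for the webhook body is easy to miss, so here is a minimal sketch of preparing the value before it is passed to `--body`. It assumes the `base64` tool from coreutils as shipped on Debian; the JSON payload and the helper name `b64_oneline` are illustrative only and are not the role's actual template:

```shell
#!/bin/sh
# b64_oneline STRING
# Base64-encode STRING and strip line wrapping so the result is one line.
b64_oneline() {
    printf '%s' "$1" | base64 | tr -d '\n'
}

# Hypothetical Discord-style payload; the role's real body may differ.
body='{"content": "PBS notification"}'
encoded=$(b64_oneline "$body")

# Round-trip check: decoding must give back the original payload.
[ "$(printf '%s' "$encoded" | base64 -d)" = "$body" ] && echo "round-trip ok"
```

The encoded string would then be supplied as the `--body` value of the `endpoint webhook create` call, and header values can be prepared the same way.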
Handlers
- `restart proxmox-backup-proxy`
- `restart proxmox-backup`
Multi-user / multi-namespace pattern (post-Phase 5E)
The role currently codifies a single PBS user (pbs_client_userid) for the PVE-side guest backup integration. Phase 5E (host-level file backup, 2026-05-06) introduced a second PBS user (host-backup@pbs) scoped to a separate namespace (host/proxfold) — created manually as a one-shot during bootstrap rather than generalising this role into a list-of-users abstraction. Rationale: one extra user didn't justify the refactor surface.
If/when a third PBS client appears (e.g. a second host backing up its own configs, or an off-site sync target), revisit: turn pbs_client_userid / pbs_client_roles into a list pbs_client_users[] and iterate the user/ACL/notification tasks. Bootstrap procedure for the manual second-user pattern lives in backup-restore runbook §host-level file backup bootstrap.
The role's existing verify-job is datastore-wide (no --ns filter), so it covers the new host/proxfold namespace automatically. The prune-job is namespace-scoped — nas-primary-prune only touches the root namespace; the host namespace gets its own nas-primary-host-prune with longer retention created during the Phase 5E bootstrap.
Related
- Roadmap: Phase 5A
- Roadmap: Phase 5E
- Role: proxmox, which owns the PVE-side storage registration (`pbs_client.yml`), guest backup job, and host file-level backup (`host_backup.yml`, Phase 5E)
- Backup & Restore runbook