Proxfold PVE 8.4 → 9.x Upgrade & ZFS RAIDZ Expansion Runsheet
Host: proxfold (Dell R430, 192.168.1.250)
Current: Proxmox VE 8.4.1 / OpenZFS 2.2.x / Kernel 6.8.x
Target: Proxmox VE 9.x / OpenZFS 2.3.x / Kernel 6.14+
Date Prepared: 2026-04-13
Executed: 2026-04-19 (Phase 0) → 2026-04-20 (Phases 1–2) — outcome: PVE 9.1.7 running, kernel 6.14.11-6-pve, Nvidia 550.163.01 via DKMS, Plex HW transcode verified. See Lessons from the April 2026 run below before re-using this runbook.
Estimated Downtime: 1–2 hours (upgrade + driver reinstall), plus ongoing I/O impact during RAIDZ expansion
Phase 0 — Pre-Flight Checks
0.1 — Confirm Current State
# Record current versions
pveversion -v | tee ~/pre-upgrade-pveversion.txt
# ZFS state
zpool status | tee ~/pre-upgrade-zpool-status.txt
zpool list -v | tee ~/pre-upgrade-zpool-list.txt
zfs list | tee ~/pre-upgrade-zfs-list.txt
# GPU state
nvidia-smi | tee ~/pre-upgrade-nvidia-smi.txt
# LXC container configs
for ct in 100 104; do
echo "=== CT $ct ===" >> ~/pre-upgrade-ct-configs.txt
cat /etc/pve/lxc/${ct}.conf >> ~/pre-upgrade-ct-configs.txt
echo "" >> ~/pre-upgrade-ct-configs.txt
done
# VM configs
for vm in 101 102; do
echo "=== VM $vm ===" >> ~/pre-upgrade-vm-configs.txt
cat /etc/pve/qemu-server/${vm}.conf >> ~/pre-upgrade-vm-configs.txt
echo "" >> ~/pre-upgrade-vm-configs.txt
done
# Network config
cat /etc/network/interfaces | tee ~/pre-upgrade-network.txt
ip a | tee ~/pre-upgrade-ip.txt
# Sysctl / modprobe customisations
cat /etc/sysctl.conf > ~/pre-upgrade-sysctl.txt 2>/dev/null
ls /etc/sysctl.d/ >> ~/pre-upgrade-sysctl.txt 2>/dev/null
cat /etc/modprobe.d/zfs.conf > ~/pre-upgrade-modprobe-zfs.txt 2>/dev/null
# Cron jobs (fstrim, etc.)
crontab -l | tee ~/pre-upgrade-crontab.txt
# Kernel parameters
cat /etc/default/grub | tee ~/pre-upgrade-grub.txt
cat /proc/cmdline | tee ~/pre-upgrade-cmdline.txt
0.2 — Backups
- Verify all CT/VM backups are current (vzdump to NAS backup target)
- Back up CT 100 (Plex) — the ZFS datasets (`stash`, `stash/plex-data`) persist independently of the upgrade and do not need a vzdump backup
- Back up VM 101 (arrstack) — verify Dockhand stacks and `.env` at `/opt/mediaserver/dockhand/stacks/homelab-ansible/.env`
- Back up VM 102 (nginx) — NPM config (no Cloudflare tunnel; tunnel was never configured here, original line was a planning artifact)
- Back up CT 104 (Ansible control node)
- Copy pre-upgrade state files off-box (SCP to FULLBRIGHT or NAS)
- Ensure `homelab-ansible` repo is pushed to GitHub with all current changes
- Image the boot drive to the NAS. The 840 PRO boot disk is a single non-mirrored LVM — if the upgrade bricks it, the only rollback is a fresh install + full restore. Boot a Clonezilla USB, image `/dev/sda` to `nasbackup`. ~20 minutes, saves hours of restore work if needed. Alternative from a live USB: `dd if=/dev/sda bs=4M status=progress | gzip > /mnt/nasbackup/proxfold-boot-pre-pve9.img.gz`
0.3 — BIOS Preparation (iDRAC)
CRITICAL for Dell R430: Kernel 6.17 has known boot issues on Dell 13G servers. These BIOS settings resolve the issue. Set them now even if initially pinning kernel 6.14.
- Access iDRAC web interface (or F2 at boot)
- Navigate to System BIOS → Processor Settings
- Set X2APIC to Enabled
- Set I/OAT DMA Engine to Enabled
- Optionally enable SR-IOV Global Enable (under Integrated Devices)
- Note current BIOS version — consider updating to latest Dell R430 BIOS if not already current
- Save and reboot to apply BIOS changes
0.4 — Pin Network Interface Names
# Prevent interface name shuffling during the Debian Trixie upgrade
pve-network-interface-pinning generate
# Review output — confirm it reflects your current interface assignments
# Link files land in /usr/local/lib/systemd/network/50-pve-<name>.link
# and take effect on the next reboot.
0.5 — Migrate sysctl Settings
# PVE 9's systemd-sysctl no longer reads /etc/sysctl.conf
# Check if you have any custom sysctl values
cat /etc/sysctl.conf
# If any non-default values exist, migrate them:
# cp /etc/sysctl.conf /etc/sysctl.d/99-custom.conf
# Then remove or comment out the entries in /etc/sysctl.conf
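A plain `cp` also carries over the stock comments. One alternative is to copy only the active settings into the drop-in. The following is a minimal sketch, not part of the original procedure; the function name `migrate_sysctl` is a hypothetical helper and the drop-in filename simply mirrors the one suggested above.

```shell
# Sketch: copy only active (non-comment, non-blank) settings from a
# sysctl.conf-style file into a drop-in file. Helper name is illustrative.
migrate_sysctl() {
    local src="$1"   # e.g. /etc/sysctl.conf
    local dst="$2"   # e.g. /etc/sysctl.d/99-custom.conf
    local active

    [ -r "$src" ] || { echo "cannot read $src" >&2; return 1; }

    # Drop blank lines and comment lines (sysctl.conf allows # and ;)
    active="$(grep -Ev '^[[:space:]]*([#;]|$)' "$src")" || true

    if [ -z "$active" ]; then
        echo "no custom settings in $src; nothing to migrate"
        return 0
    fi

    printf '%s\n' "$active" > "$dst"
    echo "wrote active settings from $src to $dst"
}
```

Typical use on the host would be `migrate_sysctl /etc/sysctl.conf /etc/sysctl.d/99-custom.conf`, followed by `sysctl --system` to load the drop-in and a manual review of the result.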
0.6 — Run Pre-Upgrade Checklist
# Update to absolute latest PVE 8.4.x first
apt update && apt dist-upgrade -y
pveversion # Must show >= 8.4.1
# Run the official upgrade checker
pve8to9 --full
- Address all FAIL items (red) before proceeding
- Review all WARN items — decide on action for each
- Confirm no Ceph installed (N/A for this host)
- Confirm no co-installed PBS (N/A for this host)
0.7 — Pre-Upgrade Readiness Checklist
- iDRAC access confirmed and working
- BIOS settings applied (X2APIC, I/OAT DMA)
- All backups verified
- `pve8to9 --full` passes with no failures
- Pre-upgrade state files copied off-box
- Nvidia `.run` installer file location noted: `/root/NVIDIA-Linux-x86_64-<VERSION>.run` (or download latest)
- NVENC patch location confirmed: `/opt/nvidia-patch`
- IPMI fan response disable command documented for re-application if needed
- At least 10GB free on root filesystem: `df -h /`
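The free-space item can be turned into a pass/fail check instead of reading `df` output by eye. This is a small sketch under the assumption that a simple threshold test is enough; `has_free_gb` is a hypothetical helper name, not something the runsheet originally used.

```shell
# Sketch: succeed only if the filesystem holding PATH has at least MIN_GB free.
has_free_gb() {
    local path="$1" min_gb="$2" avail_kb
    # df -P prints stable POSIX columns and -k reports 1024-byte blocks;
    # column 4 of the second line is the available space.
    avail_kb="$(df -Pk "$path" | awk 'NR == 2 { print $4 }')"
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}
```

For example, `has_free_gb / 10 || echo "less than 10GB free on /"` prints a warning only when the root filesystem is below the threshold.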
Phase 1 — The Upgrade
1.1 — Stop All Guests
# Gracefully stop VMs
qm shutdown 101 && qm shutdown 102
# Gracefully stop LXC containers
pct shutdown 104 && pct shutdown 100
# Verify all stopped
qm list
pct list
1.2 — Connect via iDRAC Console or tmux
# If using SSH, start a tmux session first
tmux new -s upgrade
# Do NOT use the Proxmox web console — it will disconnect during the upgrade
1.3 — Update Repository Sources
# Replace bookworm with trixie in all repo files
sed -i 's/bookworm/trixie/g' /etc/apt/sources.list
sed -i 's/bookworm/trixie/g' /etc/apt/sources.list.d/*.list 2>/dev/null
sed -i 's/bookworm/trixie/g' /etc/apt/sources.list.d/*.sources 2>/dev/null
# Verify — check for any remaining bookworm references
grep -r "bookworm" /etc/apt/sources.list /etc/apt/sources.list.d/
# If using no-subscription repo, ensure correct format:
cat > /etc/apt/sources.list.d/pve-no-subscription.sources << 'EOF'
Types: deb
URIs: http://download.proxmox.com/debian/pve
Suites: trixie
Components: pve-no-subscription
Signed-By: /usr/share/keyrings/proxmox-archive-keyring.gpg
EOF
# Remove any old .list-format repo files that have been replaced by .sources
# (Review first, then remove duplicates)
ls /etc/apt/sources.list.d/
1.4 — Perform the Upgrade
# Refresh package lists
apt update
# Pre-install kernel 6.14 as a fallback BEFORE the dist-upgrade.
# PVE 9.1 ships kernel 6.17 as the new default; the full-upgrade will
# replace the 8.4 kernel (6.8) with 6.17 and will NOT install 6.14
# unless we ask for it. Without this step, there is no fallback kernel
# in the boot menu if 6.17 fails to boot on the R430.
apt install proxmox-kernel-6.14 -y
# Dry run — review what will change
apt -s full-upgrade
# If satisfied, execute
apt full-upgrade -y
During the upgrade, respond to prompts as follows:
| Prompt | Action |
|---|---|
| `/etc/issue` | No — keep current version |
| Restart services automatically? | Yes |
| `/etc/lvm/lvm.conf` | No — keep the current file (corrected per Lesson 1; the original runsheet said Yes, install the maintainer's version) |
| `/etc/ssh/sshd_config` | No if customised, Yes if default |
| `/etc/default/grub` | No — keep current (preserves any custom kernel params) |
| `/etc/chrony/chrony.conf` | Yes if not customised |
If the upgrade tries to remove `proxmox-ve`: This usually means a conflict with `linux-image-amd64`. Run `apt remove linux-image-amd64` first, then retry `apt full-upgrade -y`.
1.5 — Reboot
Watch the iDRAC console during boot. If the system fails to boot on kernel 6.17, use the boot menu to select kernel 6.14, then pin it:
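The pin command itself did not survive in this copy of the runsheet. On a host that boots through `proxmox-boot-tool`, pinning is normally a `kernel list` followed by a `kernel pin`. The sketch below only selects the version string to pin; `latest_in_series` is a hypothetical helper, and it assumes the list output carries one version per line in the form `6.14.11-6-pve`. On a GRUB-only host, use the `GRUB_DEFAULT` approach described in Lesson 3 instead.

```shell
# Sketch: read `proxmox-boot-tool kernel list`-style output on stdin and
# print the newest installed kernel of the requested series.
# Assumes one version per line, e.g. "6.14.11-6-pve".
latest_in_series() {
    local series="$1" pattern
    # Escape the dots so "6.14" is matched literally
    pattern="^$(printf '%s' "$series" | sed 's/\./\\./g')\.[0-9]+-[0-9]+-pve$"
    # sort -V orders version strings numerically (6.14.8 before 6.14.11)
    grep -E "$pattern" | sort -V | tail -n 1
}

# Usage on the host (not executed by this sketch):
#   ver="$(proxmox-boot-tool kernel list | latest_in_series 6.14)"
#   proxmox-boot-tool kernel pin "$ver"
```

Header lines and other kernel series in the list are ignored by the pattern, so the helper prints nothing if no 6.14 kernel is installed.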
Phase 2 — Post-Upgrade Verification
2.1 — Confirm Upgrade
# Verify PVE version
pveversion -v
# Should show proxmox-ve: 9.x.x
# Verify kernel
uname -r
# Should show 6.14.x-pve or 6.17.x-pve
# Verify Debian release
cat /etc/os-release | grep VERSION
# Should show Trixie / 13
# Verify ZFS version
zfs --version
# Should show 2.3.x
2.2 — Verify ZFS Pool Health
zpool status stash
zpool list -v stash
zfs list
# Compare with pre-upgrade snapshots — pool should be ONLINE with no errors
2.3 — Reinstall Nvidia Driver
Driver compatibility with kernel 6.17 (as of April 2026)
The 550 and 570 driver branches both fail DKMS on kernel 6.17 due to DRM API changes (fb_create / drm_helper_mode_fill_fb_struct signature mismatches). Community-confirmed working combinations:
- Path A (recommended, lower risk): kernel 6.14 + driver 550.163.01 (unchanged from current setup — no driver change inside the Plex LXC either). Requires that 6.14 was pre-installed per section 1.4.
- Path B (more work): kernel 6.17 + driver 580.82.07 or later (580.95.05 and 580.105.08 confirmed building and running on 6.17.2+). T400 is Turing and is still in 580.x's supported-GPU list. Must be applied on BOTH the host AND inside the Plex LXC — the LXC userspace driver version must exactly match the host kernel module version.
Default to Path A for this upgrade; Path B can be done later as a separate exercise once the PVE 9 host has soaked.
# Verify you are on kernel 6.14 (not 6.17) before proceeding
uname -r
# Install headers for the running kernel
apt install proxmox-headers-$(uname -r) build-essential -y
# Blacklist nouveau if not already done
cat /etc/modprobe.d/blacklist-nouveau.conf
# Should contain:
# blacklist nouveau
# options nouveau modeset=0
# Run the Nvidia installer
cd /root # or wherever the .run file is
./NVIDIA-Linux-x86_64-<VERSION>.run --dkms
# If prompted about existing DKMS modules, allow overwrite
# If prompted about 32-bit compatibility libraries, skip (not needed)
# Verify
nvidia-smi
# Should show the T400 with driver version and CUDA version
2.4 — Re-Apply NVENC Patch
cd /opt/nvidia-patch
git pull # Update to latest
./patch.sh
# Verify patch applied
nvidia-smi # Should still work normally
2.5 — Verify IPMI Fan Settings
# Re-apply fan response disable for the T400 PCIe card if needed
ipmitool raw 0x30 0xce 0x00 0x16 0x05 0x00 0x00 0x00 0x05 0x00 0x01 0x00 0x00
2.6 — Start and Verify Guests
# Start LXC containers
pct start 100 # Plex LXC
pct start 104 # Ansible control node
# Start VMs
qm start 101 # arrstack VM
qm start 102 # nginx VM
# Verify all running
pct list
qm list
Per-guest checks:
CT 100 — Plex LXC
# Verify GPU passthrough
pct exec 100 -- nvidia-smi
# Should show the T400
# Verify Plex data mount
pct exec 100 -- df -h /stash/plex-data
# or check the symlink
pct exec 100 -- ls -la /mnt/plex
# Verify Plex service
pct exec 100 -- systemctl status plexmediaserver
# Test: Access Plex web UI at 192.168.1.230:32400/web
VM 101 — Arrstack
# Verify NFS media mounts
ssh root@192.168.1.252 'df -h /stash'
# Verify Dockhand/Docker
ssh root@192.168.1.252 'docker ps'
# Check arrstack compose is running
# Gluetun, qBittorrent, Sonarr, Radarr, Prowlarr, Seerr, FlareSolverr, mediabot
ssh root@192.168.1.252 'docker compose -f /opt/mediaserver/dockhand/stacks/homelab-ansible/docker-compose.yml ps'
# Verify .env file is in place
ssh root@192.168.1.252 'ls -la /opt/mediaserver/dockhand/stacks/homelab-ansible/.env'
# Verify ProtonVPN/gluetun connectivity
ssh root@192.168.1.252 'docker exec gluetun wget -qO- ifconfig.me'
VM 102 — Nginx
# Verify NPM is running
ssh root@192.168.1.249 'docker ps'
# External access goes through NPM only (no Cloudflare tunnel on this host; see 0.2)
# Test external access to rampancy.cloud services
CT 104 — Ansible Control Node
2.7 — Verify System-Level Items
# Check cron jobs survived
crontab -l
# Should show weekly pct fstrim cron (Sundays at 3am ACST)
# Verify ZFS ARC cap
cat /etc/modprobe.d/zfs.conf
# Should show options zfs zfs_arc_max=<bytes for 14GB>
# Verify sysctl settings migrated correctly
sysctl -a 2>/dev/null | grep <any custom values you had>
# Check Proxmox web UI
# Access at https://192.168.1.250:8006 — should show PVE 9.x
2.8 — Stability Soak
- Run system for at least 24-48 hours before proceeding to RAIDZ expansion
- Monitor `dmesg -w` for any kernel errors
- Verify Plex hardware transcoding works (play something requiring transcode, check for green bars bug PM-4795)
- Verify all arr services are downloading and processing correctly
- Run a ZFS scrub to confirm pool integrity on the new ZFS version:
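The scrub command for this step is missing from this copy of the runsheet. A minimal sketch of what it might look like follows, assuming the pool name `stash` used elsewhere in this document; the wrapper `scrub_and_report` and the overridable `ZPOOL` variable are illustrative additions so the sequence can be dry-run with a stub.

```shell
# Sketch: start a scrub, block until it finishes, then print pool status.
# ZPOOL is overridable only so the sequence can be rehearsed with a stub.
ZPOOL="${ZPOOL:-zpool}"

scrub_and_report() {
    local pool="$1"
    "$ZPOOL" scrub "$pool" || return 1
    # `zpool wait -t scrub` returns once the running scrub has completed
    "$ZPOOL" wait -t scrub "$pool" || return 1
    # Check the scan: line for "0 errors" and the per-disk error counters
    "$ZPOOL" status "$pool"
}
```

On the host this would be run as `scrub_and_report stash` inside tmux, since the wait can take a while on a large pool.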
Phase 3 — ZFS RAIDZ Expansion
Only proceed after Phase 2 stability soak is complete and the scrub passes clean.
3.1 — Understand the Irreversibility
WARNING: Enabling the `raidz_expansion` feature flag is a ONE-WAY operation. Once enabled, the pool CANNOT be imported on older ZFS versions (pre-2.3). There is no going back to PVE 8 after this step.
- Confirmed: PVE 9 is stable and I am committed to staying on PVE 9
- Confirmed: All backups are current and verified
3.2 — Upgrade the ZFS Pool
# Check current pool feature flags
zpool get all stash | grep feature
# Upgrade the pool to enable new features (including raidz_expansion)
zpool upgrade stash
# Verify expansion feature is enabled
zpool get all stash | grep raidz_expansion
# Should show: feature@raidz_expansion enabled local
3.3 — Install New Drives
- Physically install new drive(s) into the R430
- Boot the system (or hot-plug if supported by the backplane)
# Identify new drives
lsblk
# Get persistent identifiers (ALWAYS use by-id for ZFS)
ls -la /dev/disk/by-id/ | grep -v part
# Note the new drive IDs:
# Drive 1: /dev/disk/by-id/___________________________________
# Drive 2: /dev/disk/by-id/___________________________________ (if applicable)
# Verify drives are clean (no existing partitions)
fdisk -l /dev/sdX # replace with actual device
# If drives have existing partitions, wipe them:
# wipefs -a /dev/sdX
3.4 — Verify Current RAIDZ Layout
zpool status stash
# Note the vdev name (e.g., raidz1-0) and current member disks
# Note the current disk count in raidz1-0 (proxfold: 6 as of 2026-04-20, after first expansion from 4 → 6)
3.5 — Expand the RAIDZ Vdev
# Attach the first new disk to the existing raidz1 vdev
# Use the VDEV NAME from zpool status (e.g., raidz1-0)
zpool attach stash raidz1-0 /dev/disk/by-id/<NEW-DISK-1>
# Monitor expansion progress
watch -n 5 'zpool status stash'
# The output will show something like:
# expand: expanded raidz1-0 copied XXG in HH:MM:SS, XX% done
The expansion runs online — all services can remain running, but expect increased I/O latency. Consider running during off-peak hours.
Do NOT add a second disk until the first expansion is fully complete. The reflow must finish and a scrub will automatically run.
3.6 — Add Additional Disks (If Applicable)
# Wait for first expansion to complete
# zpool status should show no ongoing expansion
# Then attach the next disk
zpool attach stash raidz1-0 /dev/disk/by-id/<NEW-DISK-2>
# Monitor again
watch -n 5 'zpool status stash'
3.7 — Post-Expansion Verification
# Verify all disks present and ONLINE
zpool status stash
# Verify capacity increased
zpool list stash
# Compare with pre-upgrade values
cat ~/pre-upgrade-zpool-list.txt
# Run a manual scrub if one didn't auto-run
zpool scrub stash
# Verify quota on plex-data dataset still in place
zfs get quota stash/plex-data
# Should show 100G
# Verify dataset structure
zfs list -r stash
Phase 4 — Cleanup & Ansible Integration
4.1 — Update Ansible Roles
- Update `homelab-ansible` repo to reflect PVE 9 changes:
  - Kernel version references
  - Nvidia driver reinstall procedure for new kernel
  - New ZFS pool feature flags
  - sysctl migration (if applicable)
  - Any changed package names or paths
4.2 — Rotate Exposed Credentials
Now is a good checkpoint to rotate any previously exposed keys:
- Discord bot token (mediabot)
- Anthropic API key
- Sonarr / Radarr / Seerr API keys
- UniFi API key
- Update `.env` files and vault secrets accordingly
- Verify force-push to scrub plaintext secrets from repo history was completed
4.3 — Update Documentation
- Update Obsidian vault with new PVE version, kernel version, ZFS version, pool layout
- Record new drive serial numbers and `/dev/disk/by-id/` paths
- Note any BIOS changes made (X2APIC, I/OAT DMA)
4.4 — Remove Old Kernels (Optional)
# List installed kernels
dpkg -l | grep proxmox-kernel
# Remove old PVE 8 kernels if no longer needed
# apt remove proxmox-kernel-6.8.*
# Be careful — keep at least one known-good kernel as fallback
Rollback Plan
If the upgrade fails catastrophically:
- Boot failure: Use iDRAC to select previous kernel from boot menu
- Pool won't import: Boot a PVE 8.4 rescue ISO — the pool should import fine as long as `zpool upgrade` has NOT been run
- Services broken: Restore CTs/VMs from backups taken in Phase 0
- Nvidia driver won't install: Pin kernel 6.14 and retry; ensure `proxmox-headers-<kernel>` matches the running kernel exactly (on PVE 9 the meta-package is `proxmox-headers`, not `pve-headers`)
- Nuclear option: Fresh PVE 9 install from ISO, restore all CTs/VMs from backup, reimport ZFS pool
Point of no return: Running `zpool upgrade stash` in Phase 3. Before that point, reverting to PVE 8.4 is straightforward.
Quick Reference
| Item | Value |
|---|---|
| Proxmox Host IP | 192.168.1.250 |
| iDRAC IP | (check your iDRAC config) |
| CT 100 (Plex) | 192.168.1.230 |
| VM 101 (arrstack) | 192.168.1.252 |
| VM 102 (nginx) | 192.168.1.249 |
| CT 104 (Ansible) | 192.168.1.245 |
| ZFS Pool | stash (~21TB RAIDZ1 usable on 6× 3.84TB SAS SSD, ~61% full post-2026-04-20 expansion) |
| Plex Data Dataset | stash/plex-data (100GB quota) |
| Media Path | /stash/rodneystash/ |
| Nvidia Driver Location | /root/NVIDIA-Linux-x86_64-<VER>.run |
| NVENC Patch | /opt/nvidia-patch |
| Dockhand Stack Env | /opt/mediaserver/dockhand/stacks/homelab-ansible/.env |
| Ansible Repo | rampantlemming/homelab-ansible |
| Timezone | Australia/Adelaide |
Lessons from the April 2026 run
These corrections apply on re-run of this runbook or when writing any follow-up runbook against the same host. Each item is a concrete deviation between the runbook as written and what actually worked on proxfold.
1. lvm.conf — keep the old file, don't take the maintainer's version
Phase 1.4 dpkg conffile prompts will offer a new /etc/lvm/lvm.conf. The maintainer's version drops the global_filter entry that excludes ZFS zvols (/dev/zd.*) and Ceph RBD (/dev/rbd.*). Without the filter, LVM scans every guest disk at boot, which is noisy at best and can latch onto guest-internal LVM metadata at worst.
Do: answer D (or keep the current file) at the conffile prompt. Post-upgrade, diff the offered version and merge anything genuinely new into the existing file — do not blindly take theirs.
2. deb822 sources — skip Phase 1.3's .sources file creation if already in sources.list
The runbook's Phase 1.3 creates /etc/apt/sources.list.d/pve-no-subscription.sources in deb822 format. On proxfold the PVE no-subscription entry was already in /etc/apt/sources.list as a classic .list-style single line. Creating both causes apt to warn about duplicate sources (harmless but noisy) and makes the host config harder to reason about.
Do: check grep -r pve-no-subscription /etc/apt/ first. If an entry already exists, just flip its suite from bookworm to trixie in place and leave the new .sources file uncreated.
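The check can be scripted so the decision is explicit before any file is created. This is a sketch only; `find_pve_repo` is a hypothetical helper, and taking the directory as a parameter is an assumption made so it can be tried against a copy of `/etc/apt`.

```shell
# Sketch: list APT source files that already define the PVE
# no-subscription repo. Exit status is non-zero if none are found.
find_pve_repo() {
    local apt_dir="${1:-/etc/apt}"
    # -r recurse, -l print matching file names only; restrict the search to
    # real source files so editor backups and .bak copies are ignored
    grep -rl --include='*.list' --include='*.sources' \
        'pve-no-subscription' "$apt_dir" 2>/dev/null
}
```

If `find_pve_repo` prints a file, flip the suite in that file from bookworm to trixie; only when it prints nothing does creating the new `.sources` file from Phase 1.3 make sense.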
3. proxmox-boot-tool did not exist on this host (pre-4C)
Phase 1.5's kernel-pin guidance assumes proxmox-boot-tool. At the time of this run (2026-04-19) proxfold booted single-disk UEFI + GRUB and did not use proxmox-boot-tool at all.
Do (at the time): pin via GRUB — find the 6.14 menuentry under "Advanced options" in /boot/grub/grub.cfg, set GRUB_DEFAULT="gnulinux-advanced-<UUID>>gnulinux-6.14.11-6-pve-advanced-<UUID>" in /etc/default/grub, and update-grub.
Post-4C update (2026-04-22)
The Phase 4C boot-drive swap moved proxfold to a ZFS RAID1 mirror managed by proxmox-boot-tool. The runbook's original proxmox-boot-tool guidance now applies. See Proxmox Configuration — Kernel pin for the current command.
4. Nvidia cgroup major numbers shift after the upgrade — LXC passthrough breaks silently
After the Nvidia module rebuilds against the new kernel, the dynamically-allocated chrdev majors for nvidia-uvm and nvidia-caps shift. On this run: 235 → 234 (nvidia-uvm) and 238 → 237 (nvidia-caps). Major 195 (nvidia0/nvidiactl/nvidia-modeset) is static.
The failure mode is sneaky: nvidia-smi inside the Plex LXC still works (uses major 195), so passthrough looks healthy, but CUDA calls fail because /dev/nvidia-uvm is blocked by cgroup2. Plex falls back to software transcode without warning — dashboard shows transcode active but no (hw) tag.
Do (Phase 2.3, after reboot):
ls -la /dev/nvidia* /dev/nvidia-caps/
# note the majors, then update /etc/pve/lxc/100.conf accordingly
pct reboot 100
# verify: run a forced transcode in Plex and look for (hw) in the dashboard
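Reading the majors off `ls` output by hand is error-prone, so the allow lines can be generated from `/proc/devices` instead. The sketch below assumes the container config grants access with `lxc.cgroup2.devices.allow` entries and that the driver registers its character devices under names beginning with `nvidia`; neither detail is spelled out in this runsheet, and `nvidia_allow_lines` is a hypothetical helper.

```shell
# Sketch: print one cgroup2 allow line per distinct Nvidia chrdev major.
# Reads a /proc/devices-style file (defaults to the live one).
nvidia_allow_lines() {
    # $2 is the registered device name, $1 its major; seen[] drops
    # repeats such as several names sharing major 195
    awk '$2 ~ /^nvidia/ && !seen[$1]++ {
        printf "lxc.cgroup2.devices.allow: c %s:* rwm\n", $1
    }' "${1:-/proc/devices}"
}
```

Compare the output of `nvidia_allow_lines` on the host with the existing allow lines in `/etc/pve/lxc/100.conf`, correct any stale majors, then `pct reboot 100`.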
5. /opt/nvidia-patch may already exist — don't blind-clone
The runbook's NVENC step clones keylase/nvidia-patch into /opt/nvidia-patch. On proxfold this directory already existed from a prior run. git clone against an existing non-empty dir fails loudly.
Do: test -d /opt/nvidia-patch && (cd /opt/nvidia-patch && git pull) || git clone ... — idempotent.
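One caveat with the `a && b || c` form is that a failed `git pull` falls through to the clone. An explicit `if` avoids that. The following is a sketch; `sync_repo` and the overridable `GIT` variable are hypothetical names introduced here, and the repository URL in the usage note is an assumption about where keylase/nvidia-patch is hosted.

```shell
# Sketch: clone a repo if the target is not a checkout yet, otherwise pull.
# GIT is overridable only so the branching can be dry-run with a stub.
GIT="${GIT:-git}"

sync_repo() {
    local url="$1" dir="$2"
    if [ -d "$dir/.git" ]; then
        # Existing checkout: update in place, refusing non-fast-forward merges
        "$GIT" -C "$dir" pull --ff-only
    else
        "$GIT" clone "$url" "$dir"
    fi
}
```

Usage would look like `sync_repo https://github.com/keylase/nvidia-patch.git /opt/nvidia-patch`. A pull failure now surfaces as a non-zero exit status instead of triggering a clone into a non-empty directory.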
6. Kernel 6.17 (PVE 9.1 default) is installed but not used
The PVE 9.1 full-upgrade pulls in kernel 6.17.x as the default. Because Nvidia 550.x is not compatible with kernel 6.17 and the Dell 13G platform has a kernel 6.17 boot regression, the GRUB pin keeps 6.14 in charge. 6.17 is harmless while it sits installed but unbooted; leave it in place as an escape hatch.
Do NOT apt remove proxmox-kernel-6.17* unless you have a reason; you'll lose the fallback.
7. AppArmor "profile failed to load" warnings during upgrade are benign
A transient "At least one profile failed to load" error fires mid-upgrade when apparmor reloads before the apparmor-features package has finished unpacking. Self-resolves. Only worry if it persists after the upgrade completes.
8. ZFS userland/kmod split is expected on PVE 9.1
Post-upgrade you will see zfs-2.4.1-pve1 userland paired with zfs-kmod-2.3.4-pve1. zpool status will nag about upgradable features. This is deliberate and backward-compatible — do not zpool upgrade to resolve the nag; it is irreversible and locks the pool to the current kernel module's feature set. Defer until after RAIDZ expansion and soak.
9. RAIDZ expansion on all-SSD pools finishes in hours, not days
Phase 3 of this runbook warns RAIDZ expansion at 91% full will take "multiple days." That estimate was written with spinning-disk math in mind. On proxfold's all-SSD enterprise SAS pool (Samsung PM1633a, 12Gb/s) the actual reflow rate was 1.73 GB/s for drive 1 (2h08m elapsed) and 2.00 GB/s for drive 2 (1h45m elapsed). The follow-up scrub added ~47 minutes. Total Phase 3 wall time: ~5h 41m including both attach operations and the final scrub.
Do: expect hours, not days, for an all-SSD pool. Monitor with zpool status stash — the expand: line reports MB/s and ETA directly. Do not plan downtime windows around the runbook's HDD-era estimate.
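For planning the next expansion, the estimate is simply data to reflow divided by reflow rate. The helper below is a sketch with a hypothetical name, `reflow_eta_minutes`. The data volumes in the usage note are not recorded in this runsheet; they are back-calculated from the rates and elapsed times above (1.73 GB/s over 2h08m is roughly 13,286 GB, and 2.00 GB/s over 1h45m is 12,600 GB).

```shell
# Sketch: estimated reflow time in whole minutes.
#   $1 = data to reflow in GB, $2 = sustained reflow rate in GB/s
reflow_eta_minutes() {
    # awk handles the floating-point division; %.0f rounds to the nearest minute
    awk -v gb="$1" -v rate="$2" 'BEGIN { printf "%.0f\n", gb / rate / 60 }'
}
```

With the back-calculated figures, `reflow_eta_minutes 13286 1.73` gives 128 minutes and `reflow_eta_minutes 12600 2.00` gives 105 minutes, matching the 2h08m and 1h45m observed. Substitute a spinning-disk rate to see why the earlier estimate ran to days.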
10. PERC H730 may auto-set new drives to Non-RAID — check before iDRAC conversion
The runbook assumes new drives will show up as individual RAID0 virtual disks on the H730 and prescribes converting them to Non-RAID via iDRAC. On this run, both drives appeared already configured as Non-RAID on insertion — the conversion step was skipped entirely. Behaviour may depend on controller firmware or a persistent "default disk state" setting.
Do: lsblk or check iDRAC Storage → Physical Disks after insertion. If the drives already report Non-RAID (and are visible at /dev/sdX with the full raw capacity), skip the iDRAC conversion. If they appear as RAID0 VDs, fall back to the documented conversion procedure.
11. pct restart does not exist on PVE 9 — use pct reboot
The runbook and several older notes reference pct restart <CT>. On PVE 9 this fails with unknown command 'pct restart'. The correct command is pct reboot <CT>.
12. systemd ≥252 in unprivileged LXCs needs nesting=1
Unrelated to the pool expansion itself but surfaced during Phase 3 pre-flight: Plex LXC (CT 100, Ubuntu with systemd 252) was generating a continuous AppArmor denial storm for sd-mkdcreds, systemd-networkd, and systemd-logind bind mounts. Proxmox UI also warned "Systemd 252 detected. You may need to enable nesting."
Do: add nesting=1 to the features: line of the LXC config (e.g. features: mount=nfs,nesting=1) and pct reboot the container. The AppArmor denials clear immediately.
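The edit can be previewed before touching the live config. The sketch below prints a modified copy of a container config to stdout and changes nothing on disk; `with_nesting` is a hypothetical helper, and it assumes the first `features:` line in the file is the live one, with any snapshot sections following it.

```shell
# Sketch: print an LXC config with nesting=1 ensured on the features: line.
# Writes to stdout only; the input file is left untouched.
with_nesting() {
    local conf="$1"
    if grep -q '^features:' "$conf"; then
        # Touch only the first features: line, and only if it has no
        # nesting= key yet
        awk '/^features:/ && !seen {
                 if ($0 !~ /nesting=/) $0 = $0 ",nesting=1"
                 seen = 1
             }
             { print }' "$conf"
    else
        # No features: line at all; add one at the top of the file
        echo 'features: nesting=1'
        cat "$conf"
    fi
}
```

For CT 100 this would be `with_nesting /etc/pve/lxc/100.conf | diff /etc/pve/lxc/100.conf -` to review the change, then apply it by hand and `pct reboot 100`.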