It only takes a few weeks of Proxmox usage to bump into storage limits. When every VM disk, ISO, and backup sits on the default `local` and `local-lvm` pools, rollback options become murky. The opposite approach—separating snapshots from backups and deciding when to use local disks, a NAS, or Proxmox Backup Server (PBS)—turns even a tiny mini PC into an environment with real recovery habits.
This article answers the practical question: “Where should each asset live, and how do I recover when something breaks?”
How this post flows
- Review what each Proxmox storage type actually does
- Separate snapshot thinking from backup planning
- Choose and attach off-host backup targets (NAS, PBS)
- Build a recovery-first checklist
- Combine local and external storage in workable layouts
Terms introduced here
- Storage pool: A Proxmox abstraction over a specific backend (LVM-Thin, LVM, ZFS, Directory, NFS, CIFS, PBS, …). Each pool advertises which content types it can hold—for example, `local` (Directory) stores ISOs/backups only, while `local-lvm` (LVM-Thin) stores VM disks and LXC roots.
- Thin provisioning: Allocating physical space only for blocks that are actually used, regardless of the virtual disk size assigned to each VM. Thin pools share capacity, and when data or metadata usage reaches 100%, writes fail and VMs freeze.
- Proxmox Backup Server (PBS): A Proxmox-native incremental backup server with block-level deduplication, encryption, and verification.
- Recovery dry run: Rehearsing the restore process on an isolated datastore or disposable VM so you can validate backups without touching production workloads.
Assumptions
- The examples target a single Proxmox 8.x host with `vmbr0` already configured. Clustered or Ceph-backed setups require additional planning.
- You have already opened `Datacenter > Storage` and `Datacenter > Backup` at least once.
Reading card
- Estimated time: 20 minutes
- Prereqs: you have already created at least one VM or LXC guest on Proxmox
- After reading: you can map snapshots, backups, and off-host storage to the right use cases.
Start with a clear view of storage roles
A fresh install offers two default storage pools:
- `local`: Directory storage for ISOs, templates, and backups only—it cannot store VM disks.
- `local-lvm`: An LVM-Thin pool for VM disks and LXC root filesystems.
The pools are already separated, but if they sit on the same physical disk, you still share capacity and failure domain. Filling `local` with ISOs leaves less headroom for `local-lvm`, and a single disk failure wipes both pools. Start with these rules of thumb:
- Keep live VM/LXC disks on the fastest SSD available. Workloads care about latency.
- Keep ISOs and templates where management is easiest. `local` works fine, or add a shared NFS/SMB mount when multiple hosts need the same files.
- Send backups off the host. At minimum use a different disk; ideally ship them to another device or location.
The Datacenter > Storage screen lets you add LVM-Thin, ZFS, Directory, NFS, CIFS/SMB, and PBS targets, so design the mix of physical disks and network storage before the pool fills up.
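The same layout can also be inspected or edited in `/etc/pve/storage.cfg`. A sketch of the two defaults plus an off-host NFS backup target might look like this (the `backup-nas` storage ID, server IP, and export path are illustrative placeholders, not values from a real host):

```
dir: local
        path /var/lib/vz
        content iso,vztmpl,backup

lvmthin: local-lvm
        thinpool data
        vgname pve
        content rootdir,images

nfs: backup-nas
        server 192.168.10.50
        export /export/pve-backups
        content backup
        prune-backups keep-daily=7,keep-weekly=4
```

Note how the `content` line is what enforces the role separation: `local-lvm` holds only guest disks, while only `local` and the NFS target accept backups.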
Snapshots versus backups
Many newcomers postpone backups because “snapshots exist,” only to lose everything when the disk dies. Use the table below to separate intent and scope quickly.
| | Snapshots | Backups |
|---|---|---|
| Storage location | Same pool (LVM-Thin, ZFS, …) | A different disk/NAS/PBS |
| Creation speed | Seconds to minutes | Minutes to hours |
| Survives host/disk loss? | No | Yes |
| Use cases | Instant rollbacks, short-lived experiments | Hardware loss, ransomware, migration |
| Risks | Long chains hurt performance; consumes pool capacity | Longer RTO; requires retention planning |
Snapshots: rollback inside the same storage
- LVM-Thin snapshots keep changed blocks via copy-on-write. Each additional snapshot adds metadata lookups, increasing write latency.
- ZFS snapshots are read-only point-in-time copies; they have minimal write impact but keep referenced blocks allocated until you delete the snapshot.
- Capturing a running VM demands qemu-guest-agent, Windows VSS, or database-specific quiesce scripts if you want application-consistent data.
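On the CLI, the whole snapshot lifecycle runs through `qm`. A minimal sketch, assuming a VM with ID 101 (the ID and snapshot name are illustrative, and the commands are guarded so they only run on a real Proxmox host):

```shell
# Snapshot lifecycle sketch for VM ID 101 (ID and names are illustrative).
if command -v qm >/dev/null 2>&1; then
  qm snapshot 101 pre-upgrade --description "before kernel upgrade"  # create
  qm listsnapshot 101                                                # inspect the chain
  qm rollback 101 pre-upgrade                                        # roll back when needed
  qm delsnapshot 101 pre-upgrade                                     # prune to keep chains short
else
  echo "qm not found; run these commands on a Proxmox VE host"
fi
```

Deleting the snapshot as soon as the experiment succeeds is what keeps the chain-depth problem from the table above in check.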
Backups: move data elsewhere
- Use `vzdump` to create `.vma.zst` archives or push to PBS for incremental backups.
- Define the RPO (how much data loss you can tolerate) and RTO (how fast you must restore) in concrete numbers—for example, “24 h RPO / 60 min RTO.”
- Backups are not done until you verify them—use `proxmox-backup-client verify` or perform actual restores. Remember that a PBS datastore deduplicates chunks shared across many backups, so damage to the datastore can affect the whole set; verify regularly and protect the datastore as a unit.
In practice, keep one or two recent snapshots per VM and schedule at least one backup per day. Increase frequency only for workloads with stricter RTO/RPO.
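For a one-off backup outside the schedule, `vzdump` can be invoked directly. A hedged sketch, reusing the illustrative VM ID 101 and a backup storage named `backup-nas` (both placeholders):

```shell
# One-off backup sketch: archive VM 101 to a storage named "backup-nas"
# (VM ID and storage name are illustrative placeholders).
if command -v vzdump >/dev/null 2>&1; then
  vzdump 101 --storage backup-nas --mode snapshot --compress zstd
else
  echo "vzdump not found; run this on a Proxmox VE host"
fi
```

`--mode snapshot` backs up the running guest without stopping it; switch to `--mode stop` when you need the strongest consistency guarantee.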
Picking an off-host backup target
- External SSD/HDD (offline & rotated): Plug it into the host, add it via `Datacenter > Storage > Add > Directory`, run backups, then unplug and store it elsewhere. Leaving it connected 24/7 keeps it in the same blast radius; unplugging makes it both offline and off-site when stored away from the rack.
- NAS (NFS/SMB): Build or buy a NAS, export a share, and add it in Proxmox. Separate its power and network path from the hypervisor where possible (different room or UPS) so one surge can’t wipe both.
- Proxmox Backup Server: Install PBS on a spare mini PC, complete the initial wizard, then register it through `Datacenter > Storage > Add > Proxmox Backup Server`. From there, point backup jobs to PBS and let block-level deduplication, encryption, and verification run automatically.
Guiding principle: isolate the threat domain. Whoever can delete or damage the host should not be able to delete or damage the backup (or its encryption keys) in the same event. Keep NAS/PBS gear on different power circuits/UPS units and store PBS encryption keys separately.
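Registering a PBS datastore can also be done from the CLI with `pvesm`. A sketch with placeholder values throughout—the storage ID `pbs01`, server address, datastore name, user, and fingerprint are all illustrative, and you would substitute the TLS fingerprint shown in your own PBS console:

```shell
# Sketch: register a PBS instance as a storage target.
# All values below are illustrative placeholders.
if command -v pvesm >/dev/null 2>&1; then
  pvesm add pbs pbs01 \
    --server 192.168.10.60 \
    --datastore main \
    --username backup@pbs \
    --fingerprint 'aa:bb:cc:dd:ee:ff:00:11:22:33:44:55:66:77:88:99:aa:bb:cc:dd:ee:ff:00:11:22:33:44:55:66:77:88:99'
  # The PBS user's password is supplied separately (e.g. via --password).
else
  echo "pvesm not found; run this on a Proxmox VE host"
fi
```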
Building a recovery-first checklist
- Prioritize workloads: e.g., Reverse proxy > App > DB > Windows test box. Assign RTO/RPO targets to each.
- Record storage types: Whether a guest sits on LVM-Thin, ZFS, or Ceph determines how you snapshot and migrate its data.
- Snapshot policy: Limit chain depth (“keep max 2 snapshots per VM, delete after 7 days”) and specify when to create them.
- Backup cadence + retention: Define the schedule (`Datacenter > Backup` or PBS) with mode (stop/snapshot), time, and retention (7 daily, 4 weekly, etc.).
- Verification & dry runs: Restore into a new VM ID every quarter, boot it, run app-level health checks, and log the actual restore time.
Turn this into a simple table so the whole team knows what “good” looks like.
| Workload | Storage | Snapshot policy | Backup schedule | Target RTO / RPO | Last dry run |
|---|---|---|---|---|---|
| reverse-proxy | LVM-Thin | Manual before deploys (max 2) | PBS daily 02:00, retain 7 | 30 min / 24 h | 2026-03-10 (22 min) |
| app-server | LVM-Thin | Weekly manual | PBS daily 02:00, retain 14 | 60 min / 24 h | 2026-03-10 (35 min) |
| monitoring | LXC-Dir | None | NAS weekly Sun 03:00, retain 4 | 120 min / 72 h | 2026-02-25 (40 min) |
Combining local and external storage
1. Single NVMe + external HDD (rotated)
- Live workloads: NVMe `local-lvm`
- ISOs/templates: NVMe `local`
- Backups: USB HDD `Directory` (disconnect and store elsewhere after each run)
- Pros: lowest upfront cost, dead-simple layout
- Cons: manual steps every time; still vulnerable if you leave the drive connected
2. NVMe + SATA SSD + NAS
- Live workloads: NVMe LVM-Thin
- ISOs/templates: SATA SSD `Directory`
- Backups: NAS NFS share
- Pros: live data, media, and backups live on different media; NAS can sit in another room
- Cons: higher cost, requires network wiring and NAS maintenance
3. NVMe + PBS
- Live workloads: NVMe ZFS mirror (if possible) or NVMe LVM-Thin
- Backups: dedicated mini PC running PBS with 2.5" SSD or HDD
- Pros: incremental backups, verification, fast single-VM restores
- Cons: needs another machine plus disks; benefits from ≥2.5 GbE links
Pick the mix that fits your budget and space, but always confirm that live storage and backup storage are physically separate.
Common mistakes and how to prevent them
- Filling thin pools to 100%: Monitor both data and metadata usage under `Datacenter > Storage > local-lvm` or via `lvs -o+metadata_percent vg/lv`. Start cleanup or expansion around 80%—metadata exhaustion will freeze VMs even if the data pool looks healthy.
- Sharing NAS credentials broadly: Keep the backup share on its own account with narrow permissions to limit ransomware or accidental-deletion impact.
- Powering PBS and the host from the same outlet: Separate power or use independent UPS units so a single trip or surge does not take both down.
- Skipping restore tests: Perform at least annual (preferably quarterly) restores and log how long each step takes, including incremental chain verification.
- Ignoring backup alerts: Subscribe to `Datacenter > Backup` notifications or PBS reports so failed jobs don’t go unnoticed.
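The 80% guideline for thin pools is easy to script. A minimal sketch that flags a pool once data or metadata usage crosses the threshold—the here-doc stands in for real output of `lvs --noheadings -o lv_name,data_percent,metadata_percent pve`, and the numbers are made up for illustration:

```shell
# Sketch: warn when a thin pool's data or metadata usage reaches 80%.
# The here-doc below is sample input standing in for real `lvs` output.
check_pools() {
  awk -v limit=80 '{
    if ($2 + 0 >= limit || $3 + 0 >= limit)
      print $1 ": WARNING (" $2 "% data, " $3 "% metadata)"
    else
      print $1 ": ok (" $2 "% data, " $3 "% metadata)"
  }'
}
check_pools <<'EOF'
data 83.12 41.95
EOF
```

Wired to the real `lvs` command in a cron job, this gives you the early warning that the GUI gauge alone won’t push to you.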
Wrap-up
When you deliberately split storage roles and treat snapshots, backups, and recovery as separate tools, your Proxmox environment graduates from a lab toy to an operable infrastructure host. Local SSDs handle performance, while NAS or PBS targets ensure data survives hardware loss. In the next article we will take this storage foundation and walk through building a complete self-hosted web infrastructure on spare hardware—from reverse proxies to monitoring and beyond.