Everyone says “put this in a VM and that in LXC,” but beginners rarely know why. Many end up creating only VMs because it feels safer, or they place everything in LXC and then hit kernel limitations. This article uses real homelab workloads to show how a simple decision framework keeps those choices consistent.
The point is to evaluate isolation, kernel needs, statefulness, and recovery costs together so you can produce repeatable rules. Once that framework exists, adding more workloads or expanding to multiple nodes becomes straightforward.
How this post flows
- Decision framework: isolation, state, kernel needs, recovery cost
- Examples: reverse proxy, monitoring, app server, database, Docker host, Windows test box
- Build a decision table
- Common mistakes and how to avoid them
- Next: connecting the framework to a standard Linux VM template
Terms introduced here
- Kernel dependency: Requirements tied to specific kernel modules or versions.
- State density: How quickly data changes and how much of it is stored locally.
- Tuning cost: Kernel parameters, cgroup settings, or other per-workload adjustments.
- Recovery lag: Time and steps needed to bring the workload back after failure.
- Decision table: A table that aligns workloads with evaluation criteria.
Reading card
- Estimated time: 22 minutes
- Prereqs: Proxmox host with storage, bridge, and backup layout already defined
- After reading: you can justify why a workload should run as a VM or LXC guest.
Decision framework: isolation, state, kernel, recovery
Score each workload across four axes and most decisions become obvious. Define what 1, 2, and 3 mean before you start so everyone in the team interprets scores the same way.
| Axis | 1 point | 2 points | 3 points |
|---|---|---|---|
| Isolation strength | Process/file separation is enough (reverse proxy) | Occasional elevated access (log shipper, monitoring agent) | Full OS isolation needed (databases, Docker hosts, Windows) |
| State density | Settings + small caches; total data < 1 GB | Some volumes change daily; 1–10 GB | Fast-changing data or databases; tens of GB and strict RPO/RTO |
| Kernel dependency | No special modules | Needs AppArmor/eBPF tweaks but tolerates host schedule | Requires own kernel version, HugePages, or custom modules |
| Recovery cost | <5 minutes; clone from template | 5–30 minutes; scripts or partial rebuild | 30+ minutes; multi-step restore/validation |
- Isolation strength: If the entire OS must be isolated, choose a VM. If file and process boundaries are enough, LXC stays on the table.
- State density: Workloads with complex disk state and high change rates lean toward VMs. LXC snapshots and backups exist but behave differently, and they depend on the storage backend: ZFS, LVM-Thin, and btrfs support container snapshots, while dir storage on ext4/xfs does not.
- Kernel dependency: Check for special kernel modules, AppArmor, eBPF, iptables quirks, or other features. LXC shares the host kernel, so conflicts are more likely and every host kernel upgrade immediately affects all containers.
- Recovery cost: Lightweight workloads benefit from LXC’s quick lifecycle, but anything that is hard to rebuild favors VMs.
Score each axis from 1 to 3, multiply by weights if certain axes matter more (for example, state density ×1.5), and default to VMs for totals of 8 or more or whenever a single axis hits 3.
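As a minimal sketch of this rule (in Python; the axis names, weight values, and the helper itself are illustrative, not part of any Proxmox tooling):

```python
# Sketch of the scoring rule above: weighted total >= threshold, or any
# single axis at 3, defaults the workload to a VM. Names are illustrative.
WEIGHTS = {"isolation": 1.0, "state": 1.5, "kernel": 1.0, "recovery": 1.0}

def recommend(scores, weights=WEIGHTS, vm_threshold=8):
    """Return 'VM' or 'LXC' from per-axis scores in the 1..3 range."""
    # Any single axis at 3 forces a VM, regardless of the total.
    if any(v == 3 for v in scores.values()):
        return "VM"
    total = sum(scores[axis] * weights.get(axis, 1.0) for axis in scores)
    return "VM" if total >= vm_threshold else "LXC"

# Reverse proxy: all axes score 1 -> weighted total 4.5 -> LXC
print(recommend({"isolation": 1, "state": 1, "kernel": 1, "recovery": 1}))
```

Keeping the rule in one small function also makes it easy to tweak the weights or threshold later and re-evaluate every workload at once.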
Storage check before scoring
- ZFS, LVM-Thin, btrfs: LXC and VM snapshots both supported
- dir + ext4/xfs: LXC snapshots unavailable (pct snapshot fails)
- dir + btrfs: LXC snapshots work but require btrfs maintenance

Decide on storage first; it can force a VM choice even if scores are low.
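The storage rules above can be encoded as a small lookup so scripts can flag storage-forced VM choices early. A sketch, assuming the common Proxmox storage type names (`zfspool`, `lvmthin`, `btrfs`, `dir`); the helper itself is my own illustration, not a Proxmox API:

```python
# Sketch: does a given Proxmox storage layout support LXC snapshots?
# Encodes only the cases in the list above; anything else returns None.
def lxc_snapshots_supported(backend, fs=None):
    if backend in ("zfspool", "lvmthin", "btrfs"):
        return True                 # container and VM snapshots both work
    if backend == "dir":
        if fs in ("ext4", "xfs"):
            return False            # pct snapshot fails here
        if fs == "btrfs":
            return True             # works, but plan for btrfs maintenance
    return None                     # unknown combination: check manually

print(lxc_snapshots_supported("dir", "ext4"))  # → False
```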
Workload examples
Reverse proxy (Nginx, Caddy)
- Isolation: Low; only ports and TLS keys need separation.
- State: Minimal; config files and certificates.
- Kernel: No special modules required.
- Recovery: Fast; clone from a template.
Recommendation: LXC. Protect certificates with proper permissions and backups, and create the container as unprivileged so TLS key compromise does not leak host root access. If you must run privileged containers (for PCI passthrough, for example), treat isolation as 2 points instead of 1.
Monitoring stack (Prometheus, Netdata)
- Isolation: Medium; host metrics collection may need elevated access.
- State: Medium to high if you keep long retention.
- Kernel: eBPF collectors may demand kernel features.
- Recovery: Medium; historical data takes time to recover.
Recommendation: Lightweight tools like Netdata fit LXC (provided you grant additional permissions for host metrics). Long-retention Prometheus+Grafana works better in a VM with a dedicated disk so you can tune I/O schedulers and run application-level backups without touching other containers. If eBPF collectors are required, test them on the host kernel before committing to LXC.
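One way to test the host kernel before committing an eBPF collector to LXC is to read the kernel build config. A hedged sketch: the parsing helper is mine, and the exact CONFIG_ options a given collector needs vary by tool, so treat the option names as examples:

```python
# Sketch: check kernel build options (e.g. for eBPF collectors) against the
# text of /boot/config-$(uname -r). Pure parsing; no Proxmox-specific API.
def kernel_option_enabled(config_text, option):
    """True if CONFIG_<option> is built in (=y) or a module (=m)."""
    for line in config_text.splitlines():
        if line.startswith(f"{option}=") and line.split("=", 1)[1] in ("y", "m"):
            return True
    return False

sample = "CONFIG_BPF=y\nCONFIG_BPF_SYSCALL=y\n# CONFIG_FOO is not set\n"
print(kernel_option_enabled(sample, "CONFIG_BPF_SYSCALL"))  # → True
```

In practice you would read the real file with `open(f"/boot/config-{os.uname().release}")` and check each option your collector documents as required.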
Application server (Ubuntu + Docker + app)
- Isolation: High; you want Docker and the app stack separate from other services.
- State: Medium; even if data lives elsewhere, deployment artifacts remain.
- Kernel: Docker inside LXC introduces nested cgroup and privilege headaches.
- Recovery: Medium to high; manual rebuilds are painful.
Recommendation: VM. Maintain a base Ubuntu/Debian VM template for quick cloning. Docker technically runs inside LXC if you enable nesting, cgroup v2, and overlayfs support, but the combination is fragile and unsupported; keep production Docker hosts in VMs so kernel upgrades and daemon changes remain isolated.
Database (PostgreSQL, MariaDB)
- Isolation: High; you need control over filesystem and I/O scheduling.
- State: Very high; failures hit the most critical data.
- Kernel: Filesystem tuning, HugePages, or other kernel tweaks may be required.
- Recovery: Very high; data loss is unacceptable.
Recommendation: VM. Attach dedicated virtual disks (virtio-blk/scsi) and combine Proxmox backups with database-native backups so you can pick between block-level restore and logical restore depending on the incident.
Docker host (multiple containers)
- Isolation: High; nesting Docker under Proxmox LXC complicates management.
- State: Medium; container volumes persist.
- Kernel: Docker benefits from controlling its own kernel version.
- Recovery: Medium.
Recommendation: VM. Keep the Docker lifecycle independent from the host kernel. If you absolutely must use LXC, restrict it to privileged containers with nesting=1 explicitly enabled, understand the security trade-off, and be ready to move to a VM when the host kernel diverges from Docker's requirements.
Windows test box
- Isolation: Maximum; it needs its own OS.
- State: Moderate; snapshots usually suffice.
- Kernel: Windows cannot run inside LXC.
- Recovery: Low; snapshots restore fast.
Recommendation: VM with VirtIO drivers verified. Confirm virtualization extensions (VT-x/AMD-V), UEFI vs. BIOS mode, and VirtIO driver installation in Device Manager before calling the template done.
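The virtualization-extension check can be scripted instead of eyeballed. A sketch that only inspects the CPU flags Linux exposes in /proc/cpuinfo; the helper name and fallback behavior are my own choices:

```python
# Sketch: report whether the host CPU advertises VT-x (vmx) or AMD-V (svm)
# by scanning the "flags" lines of /proc/cpuinfo. Returns False if the
# file is unreadable (e.g. on non-Linux hosts).
def virt_extensions_available(cpuinfo_path="/proc/cpuinfo"):
    try:
        with open(cpuinfo_path) as f:
            text = f.read()
    except OSError:
        return False
    for line in text.splitlines():
        if line.startswith("flags") and ("vmx" in line or "svm" in line):
            return True
    return False

print(virt_extensions_available())
```

If this returns False on the Proxmox host, enable VT-x/AMD-V in firmware before building the Windows template; UEFI-vs-BIOS mode and VirtIO drivers still need the manual checks above.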
Decision table example
| Workload | Isolation | State | Kernel | Recovery | Total | Recommended |
|---|---|---|---|---|---|---|
| Reverse proxy | 1 | 1 | 1 | 1 | 4 | LXC |
| Monitoring (Netdata) | 2 | 1 | 2 | 2 | 7 | LXC (host metrics might require privileged mode or VM) |
| Monitoring (Prometheus) | 2 | 2 | 2 | 3 | 9 | VM |
| App server (Docker) | 3 | 2 | 3 | 2 | 10 | VM |
| Database | 3 | 3 | 3 | 3 | 12 | VM |
| Docker host | 3 | 2 | 3 | 2 | 10 | VM |
| Windows test box | 3 | 2 | 3 | 2 | 10 | VM |
Keep the table editable. Add columns such as “network requirements,” “compliance constraints,” or “shared storage dependency” if your environment needs them, and document why each score was chosen so the next engineer can follow the same reasoning.
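One way to keep the table editable and the reasoning attached to each score is to store it as plain data and derive the recommendation mechanically. A sketch with two rows from the table above; the field names and reason strings are my own:

```python
# Sketch: the decision table as data, with a documented reason per row so
# the next engineer can audit each call. Structure is illustrative.
TABLE = [
    {"workload": "Reverse proxy",
     "scores": {"isolation": 1, "state": 1, "kernel": 1, "recovery": 1},
     "reason": "config + certs only; clones fast from a template"},
    {"workload": "Database",
     "scores": {"isolation": 3, "state": 3, "kernel": 3, "recovery": 3},
     "reason": "critical data, kernel tuning, strict RPO/RTO"},
]

def total(scores):
    return sum(scores.values())

for row in TABLE:
    t = total(row["scores"])
    # Same rule as the framework: total >= 8 or any axis at 3 -> VM.
    rec = "VM" if t >= 8 or 3 in row["scores"].values() else "LXC"
    print(f'{row["workload"]}: total={t} -> {rec} ({row["reason"]})')
```

Extra columns like "network requirements" or "compliance constraints" then become extra keys, and the derivation stays in one place.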
Common mistakes and avoidance strategies
- Putting everything in LXC: Lightweight does not mean universally correct. Databases and Docker hosts in LXC add kernel complexity and backup headaches.
- Backing up only VMs: LXC guests deserve the same backup schedule. Containers are not automatically disposable.
- Tuning the host for a single workload: If only one workload needs a kernel tweak, move it to a VM. Host-level changes affect every LXC guest.
- Ignoring shared-kernel blast radius: Multi-tenant or compliance-bound workloads often require hardware or hypervisor isolation regardless of score. If two tenants must never share a kernel, stop the decision tree early and pick a VM.
- Skipping restore tests: Mixing up snapshots and backups leads to irreversible deletes. Practice both pct restore and qmrestore beforehand. A quick vzdump drill looks like this:

```shell
# LXC drill (backups are taken with vzdump; there is no "pct backup")
vzdump 200 --mode stop --storage local
pct stop 200 && pct destroy 200
pct restore 200 local:backup/vzdump-lxc-200.tar.zst && pct exec 200 -- hostname

# VM drill
vzdump 110 --mode snapshot --storage local
qmrestore local:backup/vzdump-qemu-110.vma.zst 111 --unique && qm start 111
```

Adjust the archive names to the timestamped files vzdump actually writes in your backup storage.
Wrap-up
Choosing between VMs and LXC should come from structured criteria, not gut feelings. Scoring isolation, state, kernel needs, and recovery cost gives consistent answers that scale as workloads grow. Adapt the table to your own homelab or small office so everyone makes the same call.
Next, we will apply this framework to the standard Linux VM cases and build a reusable Ubuntu/Debian template with sensible defaults.