Longhorn Backup Policy

Longhorn provides replicated block storage across the three bare-metal nodes. Snapshot scheduling and off-site backup to MinIO are enforced cluster-wide by a labelling script.

Label Groups

longhorn-backup-labels.sh runs as part of the weekly maintenance flow and ensures every PVC carries the correct labels. Longhorn maps these labels to recurring job groups, and each group determines which snapshot and backup jobs run against the volume.

Label | Applied to | Effect
recurring-job.longhorn.io/source=enabled | All PVCs | Opts the PVC into the recurring job system
recurring-job-group.longhorn.io/default=enabled | All PVCs | Daily local snapshots
recurring-job-group.longhorn.io/weekly=enabled | Selected namespaces | Weekly backup to MinIO S3
recurring-job-group.longhorn.io/nosnapshots=enabled | Cache / metrics PVCs | No snapshots (prometheus, redis, elasticsearch, ...)

Namespaces in the weekly group: appdaemon, bootstrap, envuassu, esphome, frigate, home-assistant, openldap, and others where data loss would be significant.
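
The labelling logic described above can be sketched roughly as follows. This is a hypothetical illustration, not the contents of longhorn-backup-labels.sh: the function names are invented, WEEKLY_NAMESPACES holds only the namespaces named above, and the nosnapshots selection is left out because this page does not say how those PVCs are identified. The function prints the kubectl commands instead of running them.

```shell
#!/bin/sh
# Hypothetical sketch, not the real longhorn-backup-labels.sh.
# WEEKLY_NAMESPACES lists only the namespaces named on this page; the real list is longer.
WEEKLY_NAMESPACES="appdaemon bootstrap envuassu esphome frigate home-assistant openldap"

# Succeeds if namespace $1 belongs to the weekly backup group.
in_weekly_group() {
    case " $WEEKLY_NAMESPACES " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print (do not run) the kubectl commands that would label PVC $2 in namespace $1.
label_commands() {
    ns="$1"
    pvc="$2"
    # Every PVC gets the source label and the default (daily snapshot) group.
    labels="recurring-job.longhorn.io/source=enabled recurring-job-group.longhorn.io/default=enabled"
    if in_weekly_group "$ns"; then
        labels="$labels recurring-job-group.longhorn.io/weekly=enabled"
    fi
    for label in $labels; do
        echo "kubectl -n $ns label pvc $pvc $label --overwrite"
    done
}
```

Calling label_commands frigate media (the PVC name is made up) prints three commands, while a namespace outside the weekly list yields only the source and default labels. Using --overwrite keeps the operation idempotent, which suits a script that runs every week.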

Backup Target

Backups ship to MinIO at minio.home.tillo.ch:30000. The backup target URL is configured in Longhorn settings as s3://longhorn-backups@mdapi/.
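
Longhorn S3 backup targets follow the pattern s3://<bucket>@<region>/<path>, so in the URL above longhorn-backups is the bucket and mdapi sits in the region position. The MinIO endpoint is not part of the URL; in a typical Longhorn setup it comes from the credential secret, which is assumed here rather than stated on this page. A small sketch that splits such a URL (parse_backup_target is a hypothetical helper name):

```shell
#!/bin/sh
# Split a Longhorn S3 backup target of the form s3://<bucket>@<region>/<path>.
# parse_backup_target is a hypothetical helper, not part of Longhorn.
parse_backup_target() {
    url="$1"
    rest="${url#s3://}"          # strip the scheme
    bucket="${rest%%@*}"         # everything before the first @
    after_at="${rest#*@}"        # region plus optional path
    region="${after_at%%/*}"     # everything before the first /
    case "$after_at" in
        */*) path="${after_at#*/}" ;;
        *)   path="" ;;
    esac
    echo "bucket=$bucket region=$region path=/$path"
}

parse_backup_target 's3://longhorn-backups@mdapi/'
```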

VM Backups

Longhorn also backs up KubeVirt VMs (e.g. the CipherTrust Manager appliance) as standard Longhorn volumes. This is what makes the CipherTrust Manager recoverable: a Longhorn snapshot restore brings the entire VM disk back, so Akeyless does not need to be reconfigured.

Monitoring Thresholds

The weekly_infra_health Windmill flow monitors backup ages:

Job group | Expected cadence | Alert threshold
default (daily) | Daily | 35 days without snapshot
weekly (MinIO) | Weekly | 70 days without backup

Thresholds are intentionally wider than the cadence to absorb missed runs without false-positive alerts.
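
The threshold comparison can be sketched as below. This is an illustration only and not the weekly_infra_health flow itself: the function names are hypothetical, timestamps are passed in as epoch seconds so the sketch does not depend on how the flow obtains them, and it assumes an alert fires only once the age strictly exceeds the threshold.

```shell
#!/bin/sh
# Hypothetical sketch of the age check; thresholds come from the table above.
threshold_days() {
    case "$1" in
        default) echo 35 ;;
        weekly)  echo 70 ;;
        *) return 1 ;;
    esac
}

# backup_status GROUP LAST_EPOCH NOW_EPOCH
# Prints OK or ALERT; returns 1 on alert, 2 on an unknown group.
backup_status() {
    group="$1"
    last="$2"
    now="$3"
    limit="$(threshold_days "$group")" || { echo "unknown group: $group" >&2; return 2; }
    age_days=$(( (now - last) / 86400 ))   # whole days since the last snapshot or backup
    if [ "$age_days" -gt "$limit" ]; then
        echo "ALERT $group age=${age_days}d limit=${limit}d"
        return 1
    fi
    echo "OK $group age=${age_days}d limit=${limit}d"
}
```

With a last run 40 days in the past, the default group alerts (40 > 35) while the weekly group is still within its 70-day limit.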

PV Reclaim Policy

All production PVs must use the Retain reclaim policy (persistentVolumeReclaimPolicy: Retain on the PV). If a PVC is deleted accidentally, Retain leaves the PV (and its Longhorn volume) intact for manual recovery.

Default is Delete

Longhorn PVs are provisioned with Delete by default. Any PVC that could contain irreplaceable data should have its PV reclaim policy patched immediately after provisioning:

kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

The pv_reclaim_policy_analysis Windmill flow audits this weekly.
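
A minimal sketch of such an audit, assuming nothing about how the Windmill flow is actually implemented: the hypothetical non_retain_pvs filter reads "NAME POLICY" pairs on stdin and prints every PV whose policy is not Retain.

```shell
#!/bin/sh
# Hypothetical audit filter, not the pv_reclaim_policy_analysis flow itself.
# Reads lines of "NAME POLICY" on stdin and prints the names of PVs that are not Retain.
non_retain_pvs() {
    awk 'NF >= 2 && $2 != "Retain" { print $1 }'
}

# Possible usage against a live cluster:
#   kubectl get pv --no-headers \
#     -o custom-columns=NAME:.metadata.name,POLICY:.spec.persistentVolumeReclaimPolicy \
#     | non_retain_pvs
```

Each name it prints is a candidate for the kubectl patch command shown above.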