# Storage Architecture

## Overview
Three storage tiers serve different needs: Longhorn for replicated block storage, TrueNAS via democratic-csi for large NFS/iSCSI volumes, and MinIO for S3-compatible object storage.
```mermaid
graph LR
    subgraph k8s["Kubernetes workloads"]
        pvc_lh["Longhorn PVCs\nharvester-longhorn-2replicas"]
        pvc_nfs["NFS PVCs\nsalt-nfs"]
        pvc_iscsi["iSCSI PVCs\nsalt-iscsi"]
        s3["S3 clients\n(GitLab, Rancher backup)"]
    end
    subgraph longhorn["Longhorn"]
        lh_ctrl["Longhorn controller\n2-replica default"]
        qui_d["qui — local disks"]
        quo_d["quo — local disks"]
        qua_d["qua — local disks"]
        lh_snap["Snapshot scheduler\nMinIO backup target"]
    end
    subgraph dcsi["democratic-csi"]
        nfs_driver["NFS CSI driver"]
        iscsi_driver["iSCSI CSI driver"]
    end
    subgraph nas["External NAS"]
        salt["salt — TrueNAS CORE\nZFS RAIDZ1"]
        pepper["pepper — TrueNAS"]
    end
    subgraph obj["Object Storage"]
        minio["MinIO\nminio.home.tillo.ch:30000"]
    end
    pvc_lh --> lh_ctrl
    lh_ctrl --> qui_d & quo_d & qua_d
    lh_ctrl -->|"backup snapshots"| minio
    pvc_nfs --> nfs_driver --> salt
    pvc_iscsi --> iscsi_driver --> salt
    s3 --> minio
```
## Storage Classes
| Class | Type | Replication | Use case |
|---|---|---|---|
| `harvester-longhorn-2replicas` | Longhorn block (RWO) | 2× across nodes | All production PVCs |
| `longhorn` | Longhorn block (RWO) | 1× | Non-critical / ephemeral |
| `harvester-longhorn-2replicas-notmigratable` | Longhorn block (RWX) | 2× fixed nodes | TV namespace shared volume |
| `salt-nfs` | NFS v4 via democratic-csi | ZFS on salt | Large shared volumes |
| `salt-iscsi` | iSCSI via democratic-csi | ZFS on salt | Block storage from TrueNAS |
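Workloads request these classes through ordinary PVCs. A minimal sketch for the replicated default class (the PVC name, namespace, and size below are illustrative, not taken from the cluster):

```yaml
# Hypothetical PVC on the 2-replica Longhorn class
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data          # example name
  namespace: example-app  # example namespace
spec:
  accessModes:
    - ReadWriteOnce       # Longhorn block volumes are RWO
  storageClassName: harvester-longhorn-2replicas
  resources:
    requests:
      storage: 10Gi       # example size
```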
## Longhorn Backup Policy
Longhorn ships snapshots to MinIO (S3) on a recurring schedule; which jobs apply to a given volume is controlled by `longhorn-backup-labels.sh`, which labels every PVC at provisioning time.
| Label | Applied to | Schedule |
|---|---|---|
| `recurring-job-group.longhorn.io/default=enabled` | All PVCs | Daily snapshots |
| `recurring-job-group.longhorn.io/weekly=enabled` | Selected namespaces | Weekly off-site snapshot to MinIO |
| `recurring-job-group.longhorn.io/nosnapshots=enabled` | Cache / metrics PVCs | No snapshots (prometheus, redis, ...) |
Weekly backup namespaces: appdaemon, bootstrap, envuassu, esphome, frigate, home-assistant, openldap, and others.
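These group memberships end up as plain labels on the PVC object. A sketch of what a weekly-backed-up claim's metadata might look like (the PVC name is hypothetical; the namespace and label keys are from above):

```yaml
# Hypothetical PVC metadata after longhorn-backup-labels.sh has run
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: config            # example name
  namespace: home-assistant
  labels:
    recurring-job-group.longhorn.io/default: enabled  # daily snapshots
    recurring-job-group.longhorn.io/weekly: enabled   # weekly off-site to MinIO
```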
## MinIO — S3 Object Storage
MinIO runs outside the cluster on dedicated storage hardware, exposed on port 30000. Three consumers use it:
| Bucket | Consumer | Purpose |
|---|---|---|
| `gitlab-backups` | GitLab | Application + registry backups |
| `gitlab-runner-cache` | GitLab CI runners | Build artifact cache |
| `rancher-backups` | Rancher Backup Operator | Cluster resource backup |
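For ad-hoc inspection of these buckets, any S3-compatible client can point at the endpoint. A sketch using an rclone remote (the endpoint is from above; the remote name, scheme, and credentials are placeholders):

```ini
# ~/.config/rclone/rclone.conf — hypothetical remote for the MinIO endpoint
[homelab-minio]
type = s3
provider = Minio
endpoint = http://minio.home.tillo.ch:30000
access_key_id = REDACTED
secret_access_key = REDACTED
```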
## democratic-csi — External NFS/iSCSI
democratic-csi bridges Kubernetes persistent volumes to TrueNAS CORE on salt via the TrueNAS REST API. The driver config (including API credentials) is stored in Akeyless and fetched via an ExternalSecret at deploy time.
This allows large volumes (like the mirror's 200 Gi or the TV stack's 3.5 Ti) to be provisioned from ZFS pools with full snapshot and clone support — without running storage inside the cluster.
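Provisioning from salt looks the same as any other CSI-backed claim. A sketch of a large NFS volume like the mirror's (the 200 Gi size and class name are from above; the PVC name and namespace are illustrative):

```yaml
# Hypothetical claim for a large ZFS-backed NFS volume on salt
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mirror-data       # example name
  namespace: mirror       # example namespace
spec:
  accessModes:
    - ReadWriteMany       # NFS volumes can be mounted across nodes
  storageClassName: salt-nfs
  resources:
    requests:
      storage: 200Gi
```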
## PV Reclaim Policy
!!! warning "Retain policy is critical"

    Production PVCs use `reclaimPolicy: Retain`. If a PVC is deleted accidentally, the PV is not automatically removed and the data can be recovered. This is verified periodically by the Windmill `pv_reclaim_policy_analysis` flow.
One known exception: the Windmill postgres PV (`pvc-6e21d7c5`) was provisioned with `Delete` by default and had to be patched manually to `Retain`.
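Such drift can be corrected in place without recreating the volume. A sketch of the manual fix (the PV name is from above; the patch uses the standard `kubectl patch` JSON-merge form):

```shell
# Switch an existing PV from Delete to Retain
kubectl patch pv pvc-6e21d7c5 \
  -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
```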