GitOps with Fleet¶
Design: Git as the Source of Truth¶
Every resource in the cluster — Helm releases, raw manifests, RBAC, ingress rules, ExternalSecrets — is declared in the mdapi/fleet Git repository. No `kubectl apply` is ever run manually. This makes the cluster state fully auditable, reproducible, and diff-able.
Rancher Fleet watches the repo and continuously reconciles the cluster state to match.
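Fleet discovers the repo through a `GitRepo` custom resource on the management cluster. A minimal sketch — the resource name, namespace, and repo URL here are illustrative assumptions, not the actual values:

```yaml
# Hypothetical GitRepo resource -- name, namespace, and URL are assumptions.
apiVersion: fleet.cattle.io/v1alpha1
kind: GitRepo
metadata:
  name: fleet-main
  namespace: fleet-default        # Fleet's default workspace for downstream clusters
spec:
  repo: https://gitlab.example.com/mdapi/fleet.git
  branch: main
  paths:
    - "*"                         # scan every top-level directory for bundles
```

One `GitRepo` per tracked branch maps naturally onto the main/test/dev pipeline below.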
Pipeline¶
```mermaid
flowchart TD
    dev["git push\n(main / test / dev)"]
    subgraph gitlab["GitLab — mdapi/fleet"]
        main_br["main branch"]
        test_br["test branch"]
        dev_br["dev branch"]
    end
    subgraph rancher_mgmt["mdapi-rancher (management cluster)"]
        fleet_ctrl["Fleet Controller"]
        bd_prod["BundleDeployments\n(prod namespace)"]
        bd_test["BundleDeployments\n(test namespace)"]
    end
    prod["mdapi-prod\nHelm releases + raw manifests"]
    test_cl["mdapi-test (Rackspace)"]
    dev_cl["mdapi-dev (Rackspace)"]
    dev --> gitlab
    main_br --> fleet_ctrl
    test_br --> fleet_ctrl
    dev_br --> fleet_ctrl
    fleet_ctrl --> bd_prod --> prod
    fleet_ctrl --> bd_test --> test_cl
    fleet_ctrl --> dev_cl
```
Repository Structure¶
The repo is organized as one subdirectory per namespace. Each directory contains the Kubernetes manifests (raw YAML or Helm values) and a `fleet.yaml` or `fleet.yml` file that tells Fleet how to deploy them.

```text
fleet/
├── bootstrap/    # GitLab (Helm)
├── windmill/     # Windmill (Helm)
├── keycloak/     # Keycloak
├── joplin/       # Joplin + MCP server
├── mail/         # Full mail stack
├── tv/           # Media stack (16 services)
├── longhorn/     # Longhorn config
├── keel/         # Keel (Helm)
├── nameserver/   # BIND9
├── ...           # 35+ more namespaces
└── README.md
```
Helm History and ErrApplied¶
Fleet validates that the Helm release history is intact before each reconcile — specifically, it checks that the release at version `current − maxHistory` still exists in the Helm history secrets. When Kubernetes GC removes old secrets, Fleet enters `ErrApplied` and stops reconciling.

Prevention: every `fleet.yaml` sets `helm.maxHistory: 25`, giving a large enough window that GC never removes the oldest version Fleet needs.
```yaml
# fleet.yaml — standard pattern
defaultNamespace: my-namespace
helm:
  maxHistory: 25
  releaseName: my-release
  chart: my-chart
  repo: https://charts.example.com/
```
Recovery (if it still occurs): clear status.release on the BundleDeployment to force a fresh helm install, bypassing the history chain check entirely.
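One way to clear the stored release record, sketched with placeholder names — the actual BundleDeployment name and its per-cluster namespace must be looked up first, and `--subresource=status` requires kubectl ≥ 1.24:

```shell
# Placeholder names throughout -- substitute the real BundleDeployment and namespace.
# Clear status.release so Fleet performs a fresh `helm install` on the next sync.
kubectl patch bundledeployment <name> \
  -n <cluster-namespace> \
  --type=merge --subresource=status \
  -p '{"status":{"release":""}}'
```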
Keel — Automated Image Updates¶
Keel polls container registries every 4 hours and compares image digests. If a digest has changed, Keel patches the Deployment and triggers a rolling update. This keeps all workloads on the latest upstream releases without manual intervention.
```mermaid
flowchart LR
    reg["Container Registry\ndocker.io / ghcr.io\nregistry.mdapi.ch"]
    keel["Keel\ndigest poll @every 4h"]
    deploy["Deployment"]
    pod["Rolling update"]
    reg -->|"digest changed?"| keel -->|"patch image tag"| deploy --> pod
```
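For workloads handled by the annotations provider, the poll policy lives on the Deployment itself. A sketch using Keel's documented annotation keys (the Deployment name is a placeholder):

```yaml
# Deployment metadata sketch -- the name is a placeholder.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  annotations:
    keel.sh/policy: force              # update even when the tag is unchanged (digest-based)
    keel.sh/trigger: poll              # actively poll the registry
    keel.sh/pollSchedule: "@every 4h"
```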
**Helm provider requires explicit `pollSchedule`**

The global `polling.defaultSchedule` in Keel's values only applies to the Kubernetes annotations provider. For Helm-managed releases, `pollSchedule` must be declared explicitly inside the release's own values:
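A sketch of the per-release values block Keel's Helm provider reads — the image value paths are illustrative and depend on the chart:

```yaml
# values.yaml for a Helm-managed release -- image value paths are assumptions.
keel:
  policy: force
  trigger: poll
  pollSchedule: "@every 4h"            # must be set here; the global default is ignored
  images:
    - repository: image.repository     # dotted path into this chart's values
      tag: image.tag
```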
Non-Fleet Resources¶
A small number of resources are deployed with `kubectl apply` directly against the mdapi-rancher management cluster and are not managed by Fleet — mainly because they manage Fleet itself or cross cluster boundaries:

- `~/rancher-local/` — ovpn-admin UI, cert-manager cluster issuers
- These are committed to the `mdapi/rancher-local` GitLab repo for traceability, but not reconciled automatically