Supply Chain Security

Two complementary controls sit on the supply chain into the cluster: preventing secrets from leaking into git in the first place, and scanning every image for known vulnerabilities before and after it reaches a node. The first runs at commit time and in CI; the second runs at build time and again daily against the running pods.

The sibling page, Secret Management, covers how secrets are handled at runtime (Akeyless customer fragment, External Secrets Operator). This page covers how secrets are kept out of source control and how images are kept honest about what they contain.

Secret Leak Prevention

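Four layers: three preventive (L1 to L3) and one reactive (L4), each picking up what the others missed. The numbers are labels rather than execution order: the .gitignore baseline (L3) takes effect before the pre-commit hook (L1) ever sees the staged diff.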

flowchart LR
    dev["Developer edit"] --> hook["L1: pre-commit hook<br/>gitleaks<br/>~/.git-hooks/pre-commit"]
    hook -->|leak found| reject["commit rejected<br/>locally"]
    hook -->|clean| push["git push"]
    push --> ci["L2: CI .pre stage<br/>secret-scan job<br/>blocks pipeline"]
    ci -->|leak found| fail["pipeline fails<br/>before build"]
    ci -->|clean| build["build / scan / deploy"]
    gi["L3: .gitignore baseline<br/>.env* / kubeconfig* / keys"] -.->|never staged| hook
    audit["L4: reactive audit<br/>gitleaks historical scan<br/>+ rotation runbook"] -.->|periodic sweep| ci

L1: Pre-commit hook (local)

A single global hook covers every repo on the workstation. Setting core.hooksPath globally points Git's hook lookup at ~/.git-hooks for every repository; ~/.git-hooks/pre-commit then runs gitleaks (zricethezav/gitleaks) against the staged diff and exits non-zero on any finding, which makes Git abort the commit.

git config --global core.hooksPath ~/.git-hooks
# ~/.git-hooks/pre-commit runs gitleaks on the staged diff

This is the highest-impact layer: one install, every repo covered, zero per-repo configuration. A blocked commit never reaches the remote, so the secret never enters anyone's git history and there is no rotation to do.
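The hook body itself is not shown on this page. A minimal sketch of what ~/.git-hooks/pre-commit could look like, assuming gitleaks v8 is on PATH; the function name pre_commit is hypothetical, and newer v8 releases may prefer a different subcommand than protect, so check gitleaks --help for the installed version:

```shell
#!/bin/sh
# Sketch of ~/.git-hooks/pre-commit (assumes gitleaks v8 on PATH).
# pre_commit is a hypothetical helper name, not taken from the real hook.
pre_commit() {
  # `protect --staged` restricts the scan to what is about to be committed.
  if ! gitleaks protect --staged --redact --no-banner; then
    echo "pre-commit: possible secret in staged changes, commit rejected" >&2
    return 1
  fi
  return 0
}
```

In a real hook file the last line would call the function (pre_commit || exit 1), and the file has to be executable, since Git skips hooks that are not.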

L2: CI `.pre` stage

In each repo with a .gitlab-ci.yml, a secret-scan job runs in the .pre stage, so it gates every downstream job in the pipeline. The job runs the same zricethezav/gitleaks container, but gitleaks detect scans the commit history in the CI checkout rather than only a staged diff, so it catches secrets that bypassed L1 (e.g. commits from another workstation, or from before the global hook was installed). If the runner uses a shallow clone, the scan only reaches as far back as GIT_DEPTH.

secret-scan:
  stage: .pre
  image: zricethezav/gitleaks:latest
  tags: [mdapi]
  script:
    - gitleaks detect --no-banner --redact -v

The tags: [mdapi] line is load-bearing: the in-cluster gitlab-runner has run_untagged=false, so an untagged job is never picked up and stays pending indefinitely without any error.

L3: `.gitignore` baseline

A blanket .gitignore block applied to every repo covers the common shapes of credential files that operators do touch but should never commit: .env*, kubeconfig*, SSH private keys, PKCS12 bundles, .netrc. This catches the most common mode of accidental leak (operator drops a kubeconfig into a working dir for debugging, forgets, runs git add .) before L1 even sees the staged diff.
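A sketch of how such a baseline could be appended to a repo, under stated assumptions: only .env*, kubeconfig* and .netrc come from the description above, while the SSH key names and PKCS12 extensions are illustrative guesses, and write_gitignore_baseline is a hypothetical helper:

```shell
#!/bin/sh
# Appends a credential-file baseline to a .gitignore (default: ./.gitignore).
# The id_* and *.p12 / *.pfx patterns are assumptions, not the real baseline.
write_gitignore_baseline() {
  cat >> "${1:-.gitignore}" <<'EOF'
# credential-file baseline
.env*
kubeconfig*
id_rsa
id_ed25519
*.p12
*.pfx
.netrc
EOF
}
```

After applying it, git check-ignore -v kubeconfig.yaml shows which pattern matched. The baseline only affects untracked files; a credential file that is already tracked stays tracked until it is removed from the index.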

L4: Reactive audit

When a secret does slip through all three preventive layers (typically a credential that pre-dates the hook deployment, or one that was committed before it was moved into Akeyless), the recovery pattern is:

1. Rotate at source (the upstream service: Pushover, GitLab, etc.) so the leaked value becomes useless as soon as the leak is noticed.
2. Store the new value in Akeyless under /mdapi/<namespace>/<name>/<key>.
3. Refactor consumers to fetch via External Secrets Operator instead of hardcoding.
4. Patch live workloads in place (kubectl patch secret + restart) so the rotation propagates before the next reconcile.

A periodic gitleaks sweep across all local repos surfaces anything still hiding in history. Combined with the three preventive layers above, it makes net-new leaks rare and gives a documented response when one occurs.
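The periodic sweep could be sketched roughly as follows, assuming gitleaks v8 on PATH and that all local repos live under one root directory; find_repos and sweep_repos are hypothetical names:

```shell
#!/bin/sh
# Print the working directory of every git repo below the given root.
find_repos() {
  find "$1" -type d -name .git -prune -exec dirname {} \;
}

# Run a full-history gitleaks scan in each repo and list the ones that
# exit non-zero (which means findings, or a scan error worth a look).
sweep_repos() {
  find_repos "$1" | while IFS= read -r repo; do
    if ! gitleaks detect --source "$repo" --no-banner --redact \
        </dev/null >/dev/null 2>&1; then
      echo "findings: $repo"
    fi
  done
}
```

Called as sweep_repos "$HOME/src" (the path is an example), it prints one line per repo that needs attention. Worktrees and submodules, where .git is a file rather than a directory, are not picked up by this sketch.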

CVE Scanning

Container image CVE scanning is integrated into every custom image CI pipeline using Syft (SBOM generation) and Grype (vulnerability matching).

Pipeline integration

flowchart LR
    code["Source / upstream bump"] --> build["build<br/>buildkit rootless<br/>multi-arch"] --> scan["scan<br/>Syft → SBOM<br/>Grype → CVE match"] --> notify["notify<br/>Pushover priority 1<br/>(HIGH/CRITICAL only)"]
    build --> push["push :latest<br/>registry.mdapi.ch/mdapi/"]
    scan --> artifact["SBOM artifact<br/>SPDX JSON<br/>7-day retention"]

The scan stage runs after the image is built, but its result never blocks the pipeline (allow_failure: true). A HIGH or CRITICAL finding sends a Pushover notification for human review; the deployment still proceeds, because production systems cannot be held hostage by upstream vulnerabilities that may have no fix yet.
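What the scan stage might execute, as a minimal sketch: it assumes syft and grype are on PATH, and the function name scan_image and the default file name sbom.spdx.json are hypothetical. The Pushover notification step is left out because its implementation is not described here.

```shell
#!/bin/sh
# Generate an SBOM for an image, then match it against the Grype database.
scan_image() {
  local image="$1" sbom="${2:-sbom.spdx.json}"
  # Syft writes the SBOM in SPDX JSON to the given file.
  syft "$image" -o "spdx-json=${sbom}" || return 2
  # Grype exits non-zero when a HIGH or CRITICAL match is found.
  grype "sbom:${sbom}" --fail-on high
}
```

With allow_failure: true on the job, a non-zero exit from the grype step marks the job as failed with a warning while the rest of the pipeline continues.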

Every pipeline also carries a daily cache-buster (ARG CACHEBUST_DAY + apk/apt upgrade) and a 4-hour schedule trigger, so base-image security patches reach the registry within hours of being published upstream without any manual rebuild.

Covered images

Custom images live under registry.mdapi.ch/mdapi/ and each has its own CI pipeline running the scan stage. Representative subset:

| Image | Base | Purpose |
| --- | --- | --- |
| `gitlab-webservice-ee` | GitLab EE upstream | GitLab Rails (Puma) |
| `gitlab-sidekiq-ee` | GitLab EE upstream | GitLab background jobs |
| `keycloak` | `quay.io/keycloak/keycloak:26.x` | OIDC IdP |
| `chrony` | `debian:stable-slim` | Stratum-1 NTP with optional GPS |
| `nameserver` | `debian:stable-slim` | BIND9 + Webmin |
| `unbound` | `alpine:latest` + `bind-tools` | Split-horizon resolver (shell + `dig` for the exec readiness probe) |
| `certspotter` | `golang` | Certificate Transparency monitor |
| `autoconfig` | `nginx:alpine` | Mail-client auto-configuration |
| `opennic-tier2` | `debian:stable-slim` | BIND9 OpenNIC Tier-2 |
| `threadfin` | `debian:stable-slim` | IPTV proxy for Plex |
| `joplin-mcp` | `python:slim` | HTTP/SSE MCP wrapper for Joplin |
| `znc` | `debian:stable-slim` | IRC bouncer |

(Plus the static-site builders for the WordPress / Joomla properties, which inherit the same scan stage from the shared CI template.)

SBOM artifacts

Syft generates an SPDX JSON SBOM for each image build. This provides:

- A point-in-time snapshot of every package installed in the image.
- A queryable artifact for retroactive CVE analysis when new vulnerabilities are published against packages that scanned clean at build time.
- Compliance evidence for software supply-chain auditing (an SBOM with a known build provenance is the artifact regulators and customers ask for).

SBOMs are stored as GitLab CI artifacts for 7 days per build; the registry mirror itself keeps the images far longer, so an SBOM that has expired can be regenerated from the stored image.
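Retroactive analysis does not need the image itself: Grype can match a stored SBOM against its current database. A small sketch, assuming the SBOM artifacts have been downloaded into one directory with a .spdx.json suffix; rescan_sboms is a hypothetical helper:

```shell
#!/bin/sh
# Re-match every stored SBOM in a directory against today's Grype database
# and list the ones that now exit non-zero (findings, or a scan error).
rescan_sboms() {
  local dir="$1" sbom
  for sbom in "$dir"/*.spdx.json; do
    [ -e "$sbom" ] || continue   # glob matched nothing
    grype "sbom:${sbom}" --fail-on high >/dev/null 2>&1 \
      || echo "now affected: ${sbom}"
  done
}
```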

Runtime CVE scanning

A Windmill flow (f/security/pod_image_cve_scan) runs daily and scans the images of every currently running pod against the live Grype database. This catches two classes of finding the CI scan can't:

- Third-party images (from docker.io, ghcr.io, quay.io) that were never built in-cluster and so never went through the CI scan.
- Newly published CVEs affecting images that passed scan at build time but now match a freshly disclosed vulnerability.

Findings are sent via Pushover with the affected image and CVE IDs, scoped by namespace so the right operator gets the right notification.
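The Windmill flow's source is not reproduced here. The core idea, enumerating the images of running pods and scanning each one, might look like this in shell, assuming kubectl and grype are on PATH; running_images and scan_running are hypothetical names, and the Pushover and namespace-scoping steps are omitted:

```shell
#!/bin/sh
# Unique container images across all Running pods (init containers excluded).
running_images() {
  kubectl get pods --all-namespaces --field-selector=status.phase=Running \
    -o jsonpath='{range .items[*]}{range .spec.containers[*]}{.image}{"\n"}{end}{end}' \
    | sort -u
}

# Scan each image; report the ones with HIGH or CRITICAL matches.
scan_running() {
  local image failed=0
  for image in $(running_images); do
    if ! grype "$image" --fail-on high >/dev/null 2>&1; then
      echo "finding: ${image}"
      failed=1
    fi
  done
  return "$failed"
}
```

Deduplicating first keeps the scan proportional to the number of distinct images rather than the number of pods.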

Why two layers (build-time + runtime)

Build-time scanning gives a clean signal on the image you're about to ship; runtime scanning gives a clean signal on the image you're actually running, including everything the cluster pulled from upstream registries that you didn't build yourself. Neither alone is sufficient: a third-party image can introduce a CVE the build scan never saw, and a self-built image that scanned clean can match a CVE disclosed long after it was built and deployed. The combination converges on "everything running, every day".