External Traffic Flow & Redirects

External IP Routing

The ISP allocates a /28 (31.3.128.48/28). The PPP endpoint occupies .49; additional public IPs are configured as /32 aliases on pppoe-wan and DNAT'd into the cluster. The wansub interface (31.3.128.49/28) routes the rest of the /28 for hosts that need a real public address rather than a DNAT target.
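As a quick sanity check on the allocation, the following Python sketch (standard library only) enumerates the usable host addresses of the /28. The names `allocation`, `usable`, and `in_allocation` are illustrative and not taken from the router configuration.

```python
import ipaddress

# The ISP allocation described above.
allocation = ipaddress.ip_network("31.3.128.48/28")

# hosts() excludes the network (.48) and broadcast (.63) addresses.
usable = list(allocation.hosts())


def in_allocation(addr: str) -> bool:
    """Return True if addr is a usable host address inside the /28."""
    return ipaddress.ip_address(addr) in usable
```

Under these assumptions the usable range runs from 31.3.128.49 to 31.3.128.62, which covers every external address listed in the table below.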

For the full per-IP mapping see Hardware → External IP Map. Highlights:

| External | Protocol | Internal | Purpose |
|---|---|---|---|
| 31.3.128.50:80,443 | TCP/UDP | 192.168.1.191 (RKE2 ingress-nginx) | HTTPS for all *.mdapi.ch services |
| 31.3.128.53:443 | TCP (3443) / UDP-QUIC (4443) | 192.168.1.40 | cloud.envuassu.ch (Nextcloud AIO) |
| 31.3.128.53:3478 | TCP+UDP | 192.168.1.41 | Nextcloud Talk STUN/TURN |
| 31.3.128.55 | multi | .49 (mirror), .58:123 (NTP), .44:53 (OpenNIC) | Multi-service shared IP |
| 31.3.128.56:443 | TCP | 192.168.1.50:443 | Squid HTTPS proxy |
| 31.3.128.57:113,443 | TCP | 192.168.1.51 | ZNC IRC bouncer |
| 31.3.128.58:443 | TCP+UDP | 192.168.1.246:4443 | sslh on mbptillo (VPN/SSH/mosh demux) |
| 31.3.128.59:53 | UDP+TCP | 192.168.1.53:53 | BIND9 authoritative DNS |
| 31.3.128.62:32400 | TCP+UDP | 192.168.1.46:32400 | Plex Media Server |
| 31.3.128.62:1194 | UDP | 192.168.1.157:1194 | Firewalla VPN passthrough |

HTTPS Web Traffic

Direct path: external client → BPI-R4 DNAT → ingress-nginx at 192.168.1.191.

flowchart LR
    client["External Client\n*.mdapi.ch"] -->|"TCP :443"| bpir4["BPI-R4\nDNAT .50 → 192.168.1.191"]
    bpir4 --> ingress["ingress-nginx\n192.168.1.191\nModSecurity WAF"]
    ingress --> svc["K8s services"]

IPv6 → IPv4 (NAT64 via Jool)

The cluster runs IPv4-only, but every public service and every internal subnet has IPv6 reachability. Stateful NAT64 with Jool bridges the two: the entire v6 prefix 2a0d:d05:401:e964:ffff:ffff::/96 represents v4 destinations — 2a0d:d05:401:e964:ffff:ffff:c0a8:1bf is 192.168.1.191, and so on.
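The address mapping described above is plain bit arithmetic: the IPv4 address occupies the low 32 bits of the /96. A minimal Python sketch, assuming only the prefix given in the text (the function names are illustrative):

```python
import ipaddress

# NAT64 prefix from the text above (Jool's pool6).
NAT64_PREFIX = ipaddress.ip_network("2a0d:d05:401:e964:ffff:ffff::/96")


def v4_to_nat64(v4: str) -> ipaddress.IPv6Address:
    """Embed an IPv4 address in the low 32 bits of the /96 prefix."""
    return ipaddress.IPv6Address(
        int(NAT64_PREFIX.network_address) | int(ipaddress.IPv4Address(v4))
    )


def nat64_to_v4(v6: str) -> ipaddress.IPv4Address:
    """Recover the IPv4 destination from an address inside the prefix."""
    addr = ipaddress.IPv6Address(v6)
    if addr not in NAT64_PREFIX:
        raise ValueError(f"{addr} is outside {NAT64_PREFIX}")
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)
```

For example, `v4_to_nat64("192.168.1.191")` yields `2a0d:d05:401:e964:ffff:ffff:c0a8:1bf`, matching the address quoted above.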

Jool runs in its own network namespace (jool on bpi-r4) connected to the main netns by a veth pair (jool ↔ openwrt, both ends inside 192.168.164.0/24). The main netns has the NAT64 prefix routed into the namespace; Jool translates v6→v4 and emits v4 packets with a source from its pool4, which the main netns routes onward.

flowchart LR
    v6client["IPv6 client\n(external or LAN)"]
    subgraph bpi[BPI-R4]
        main["main netns\nroutes 2a0d:d05:401:e964:ffff:ffff::/96\nvia jool veth"]
        subgraph jool_ns["jool netns"]
            jool["Jool stateful NAT64\npool6 = 2a0d:d05:401:e964:ffff:ffff::/96\npool4 = 192.168.164.{2,3}"]
        end
    end
    cluster["IPv4 cluster\n192.168.1.0/24"]
    v6client -->|"to NAT64 prefix"| main
    main -->|"veth"| jool
    jool -->|"v4, src .2 or .3"| main
    main --> cluster

Source-distinguishing via pool4 marking

Jool's pool4 carries a per-entry mark, and a small nftables rule inside the jool netns sets the packet mark from the v6 source prefix before Jool translates the packet:

| Source v6 prefix | nft mark | Jool pool4 v4 source | Meaning |
|---|---|---|---|
| 2a0d:d05:401:e900::/56 (our delegated /56) + fdfd:dce5:df64::/48 (ULA) | 0x64 | 192.168.164.3 | Internal LAN IPv6 client |
| everything else | 0 (default) | 192.168.164.2 | External IPv6 client |
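The classification in the table above can be modelled in a few lines of Python. This is only a sketch of the decision logic; the real selection is done by the nftables rule and Jool's pool4, and the names `INTERNAL_PREFIXES` and `classify` are hypothetical.

```python
import ipaddress

# Prefixes, marks, and pool4 addresses copied from the table above.
INTERNAL_PREFIXES = [
    ipaddress.ip_network("2a0d:d05:401:e900::/56"),  # delegated /56
    ipaddress.ip_network("fdfd:dce5:df64::/48"),     # ULA
]
INTERNAL = (0x64, "192.168.164.3")
EXTERNAL = (0x0, "192.168.164.2")


def classify(v6_source: str) -> tuple[int, str]:
    """Return (mark, translated IPv4 source) for an IPv6 source address."""
    src = ipaddress.IPv6Address(v6_source)
    if any(src in net for net in INTERNAL_PREFIXES):
        return INTERNAL
    return EXTERNAL
```

A source inside either internal prefix maps to mark 0x64 and 192.168.164.3; anything else keeps the default mark and egresses as 192.168.164.2.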

Both pools carry TCP, UDP, and ICMP entries — every protocol has to be added explicitly with its own jool pool4 add … --tcp / --udp / --icmp line, and a missing entry silently drops that protocol's traffic (e.g. UDP hairpin from a LAN client to a service on the AAAA path).

Downstream this lets the WAF, audit, and rate-limiters treat the two egress IPs differently — see ModSecurity → Trusted-source false-positive alerting.

jool pool4 display defaults to TCP

The bare jool pool4 display only shows TCP rows. Pass --tcp, --udp, or --icmp to inspect a specific protocol; absence in the default view is not evidence the entry isn't configured.

Mark must be set inside the jool netns

skb_scrub_packet() deliberately zeroes skb->mark when a packet crosses a veth into a different namespace, as a security isolation measure. A mark set in the main-netns mangle_forward chain (e.g. via /etc/config/firewall and fw4) therefore never reaches Jool. The mark rule lives in the jool netns instead, where the source IPv6 address (which does survive the crossing) is enough to identify the egress class.

VPN / SSH / mosh

All three share port 443 on a separate external VIP (31.3.128.58), demultiplexed by sslh on mbptillo.

flowchart TD
    client["Client\nvpn.home.tillo.ch"]

    subgraph bpir4["BPI-R4 DNAT 31.3.128.58"]
        tcp["TCP :443 → mbptillo:4443"]
        udp["UDP :443 → mbptillo:4443"]
    end

    subgraph mbptillo["mbptillo 192.168.1.246"]
        sslh["sslh :4443\nprotocol demux"]
        socat["socat UDP4/UDP6-LISTEN:4443,fork"]
        mosh_srv["mosh-server :443\nauthbind"]
        openvpn["OpenVPN :9443\ntun10 10.8.10.0/24"]
        sshd["sshd :22"]
    end

    client --> tcp --> sslh
    client --> udp --> socat --> mosh_srv

    sslh -->|"TLS"| ingress["192.168.1.191"]
    sslh -->|"SSH"| sshd
    sslh -->|"OpenVPN"| openvpn

sslh listens on :4443, not :443 — another process occupies :443 on mbptillo.
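To illustrate how a demultiplexer can tell the three TCP protocols apart, here is a simplified Python sketch that inspects the first bytes a client sends. It is not sslh's actual probe code: real probes are more careful (for example, SSH clients that wait for the server banner are handled by a timeout), and the OpenVPN check below assumes a plain first client packet with opcode 7.

```python
def classify_first_bytes(data: bytes) -> str:
    """Guess the protocol from the first bytes a TCP client sends."""
    # SSH clients open with an identification string such as b"SSH-2.0-...".
    if data.startswith(b"SSH-"):
        return "ssh"
    # TLS records start with content type 0x16 (handshake), major version 3.
    if len(data) >= 2 and data[0] == 0x16 and data[1] == 0x03:
        return "tls"
    # OpenVPN over TCP: 2-byte length prefix, then an opcode byte whose
    # high 5 bits are 7 (P_CONTROL_HARD_RESET_CLIENT_V2) on the first packet.
    if len(data) >= 3 and data[2] >> 3 == 7:
        return "openvpn"
    return "unknown"


# Backends as drawn in the flowchart above.
BACKENDS = {
    "tls": "192.168.1.191",
    "ssh": "sshd :22",
    "openvpn": "OpenVPN :9443",
}
```

A result of `"unknown"` has no entry in `BACKENDS`; what sslh does in that case depends on its configuration and is not modelled here.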

socat LISTEN vs RECVFROM

socat UDP-LISTEN:4443,fork forks once per client session, preserving the source IP:port for the lifetime of the connection. UDP-RECVFROM:4443,fork forks per packet, creating a new ephemeral source port each time — mosh-server sees a new client every packet and the session breaks immediately.
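The difference can be pictured with a toy model. The Python sketch below opens no sockets; it only simulates which upstream source port the backend would see for each packet, and the port range starting at 40000 is an arbitrary assumption.

```python
import itertools


class Relay:
    """Toy model of a forking UDP relay; ports are simulated."""

    def __init__(self, per_session: bool):
        self.per_session = per_session
        self._next_port = itertools.count(40000)  # hypothetical ephemeral range
        self._sessions: dict[tuple[str, int], int] = {}

    def upstream_port(self, client: tuple[str, int]) -> int:
        """Return the source port the backend sees for one client packet."""
        if not self.per_session:
            # RECVFROM-style: each packet gets a new child and a new socket.
            return next(self._next_port)
        # LISTEN-style: one child, and one upstream socket, per client address.
        if client not in self._sessions:
            self._sessions[client] = next(self._next_port)
        return self._sessions[client]
```

With `per_session=True`, three packets from the same client all map to one upstream port; with `per_session=False`, they map to three different ports, which is what makes mosh-server treat every packet as a new client.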

GitLab Pages — Custom Domains

GitLab Pages shares the cluster-wide RKE2 nginx ingress (class nginx, VIP 192.168.1.191). The wildcard *.pages.mdapi.ch is covered by a chart-managed ingress. Custom domains (e.g. docs.mdapi.ch) require an explicit Ingress:

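spec:
  ingressClassName: nginx
  rules:
  - host: docs.mdapi.ch
    http:
      paths:
      - path: /              # assumed: route the whole host to Pages
        pathType: Prefix     # required by networking.k8s.io/v1
        backend:
          service:
            name: gitlab-gitlab-pages
            port:
              number: 8090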

The domain must also be verified by GitLab via a _gitlab-pages-verification-code.<domain> TXT record in DNS. The docs.mdapi.ch ingress + cert are reconciled by Fleet from https://gitlab.mdapi.ch/mdapi/fleet/-/tree/main/docs (public mirror).

TLS Certificate Automation

cert-manager issues certificates via DNS-01 (Let's Encrypt ACME v2) for the public domains — DNS-01 supports wildcards and works for services not reachable from the internet. Names under the internal-only home.tillo.ch zone are the exception: they validate via HTTP-01, since a DNS-01 _acme-challenge record for them would be shadowed by the in-cluster home.tillo.ch zone.

flowchart LR
    ing["Ingress\ncert-manager.io/issuer:\nletsencrypt-prod"]
    cm["cert-manager"]
    bind9["BIND9\n31.3.128.59:53\n(ns.mdapi.ch external)"]
    le["Let's Encrypt"]
    secret["TLS Secret"]

    ing --> cm
    cm -->|"nsupdate RFC 2136\nTSIG key: mdapi"| bind9
    bind9 --> le -->|"certificate"| cm --> secret --> ing

BIND9 at 192.168.1.53 accepts dynamic updates on the mdapi.ch zone (allow-update { key "mdapi"; } in named.conf). Updates land in a journal without DNSSEC signatures — acceptable because Let's Encrypt verifies with a non-validating resolver.
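For reference, the record that the dynamic update creates follows RFC 8555: the name is `_acme-challenge.<domain>` and the TXT value is the unpadded base64url SHA-256 digest of the key authorization. cert-manager computes this internally; the Python sketch below only shows the derivation, and the function names are illustrative.

```python
import base64
import hashlib


def challenge_record_name(domain: str) -> str:
    """DNS-01 record name; a wildcard is validated at its base domain."""
    base = domain[2:] if domain.startswith("*.") else domain
    return f"_acme-challenge.{base}"


def challenge_txt_value(token: str, account_thumbprint: str) -> str:
    """TXT value: base64url(SHA-256(key authorization)) without padding."""
    key_authorization = f"{token}.{account_thumbprint}"
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

`challenge_record_name("*.mdapi.ch")` returns `_acme-challenge.mdapi.ch`, which is why a single TSIG key scoped to the mdapi.ch zone is enough for wildcard issuance.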

Split-Horizon DNS

Internal clients reach *.mdapi.ch services without hairpinning back through the WAN. The split-horizon resolver is an in-cluster unbound deployment in the split-horizon namespace, reachable on the well-known LAN DNS VIP 192.168.1.1 — a single-address MetalLB L2 pool advertised on mgmt-br. External clients resolve the same names through public BIND9 and hit 31.3.128.50 instead.

unbound answers an explicit override set — public hostnames mapped to internal VIPs (e.g. gitlab.mdapi.ch → 192.168.1.191, plex.mdapi.ch → 192.168.1.46) — and forwards every other query, including the dynamic home.tillo.ch zone, to Technitium at 192.168.1.54. The override list is declared in the Fleet-managed unbound ConfigMap; each overridden zone is marked local-zone … transparent, so a name that is not in the explicit list still falls through to Technitium rather than returning NXDOMAIN.
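The override-then-forward behaviour can be sketched as a lookup with a fallback. This Python model is a simplification of unbound's transparent local-zone semantics (it ignores record types), and `OVERRIDES` lists only the two examples named above rather than the full ConfigMap.

```python
# Example overrides taken from the text; the real list lives in the ConfigMap.
OVERRIDES = {
    "gitlab.mdapi.ch": "192.168.1.191",
    "plex.mdapi.ch": "192.168.1.46",
}
FORWARDER = "192.168.1.54"  # Technitium


def resolve(name: str) -> tuple[str, str]:
    """Return ("local-data", ip) for an override, else ("forward", upstream)."""
    key = name.rstrip(".").lower()
    if key in OVERRIDES:
        return ("local-data", OVERRIDES[key])
    # Transparent zone: names without local data fall through to the forwarder.
    return ("forward", FORWARDER)
```

A name such as `unknown.mdapi.ch` is therefore forwarded rather than answered with NXDOMAIN, which is the property the transparent zone type is chosen for.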

unbound's access-control permits queries only from the internal LAN, DMZ, and cluster subnets, and the Service runs externalTrafficPolicy: Local so that the real client IP, not a node SNAT address, is what that check sees.

Lesson learned: avoid wildcards in BIND9 zones that share a name with cluster search domains

BIND9 used to publish *.home.tillo.ch IN A 31.3.128.50 as a catch-all so that external clients requesting unknown *.home.tillo.ch URLs landed on the public ingress. That wildcard quietly broke outbound HTTPS from K8s pods. kubelet copies the host's DHCP-supplied search home.tillo.ch into each pod's resolv.conf, and with ndots:5 the libc resolver tries <host>.home.tillo.ch before the literal name. The wildcard answered that query with our public IP, so outbound calls to e.g. project-akri.github.io connected to our own ingress and received the default ingress.local cert. Fix: don't publish wildcards over a domain that also exists as a search domain inside the cluster; use only explicit records.
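The lookup order that triggered the problem can be reproduced with a short Python sketch of glibc-style search-list expansion. It is a simplified model: a real pod's search list also contains the cluster-local domains ahead of the host's, and `candidate_names` is an illustrative name.

```python
def candidate_names(name: str, search: list[str], ndots: int = 1) -> list[str]:
    """Return the names a stub resolver tries, in order."""
    if name.endswith("."):
        # A trailing dot marks the name as absolute; the search list is skipped.
        return [name.rstrip(".")]
    expanded = [f"{name}.{domain}" for domain in search]
    if name.count(".") >= ndots:
        # Enough dots: try the literal name first, then the search list.
        return [name] + expanded
    # Too few dots: the search list is tried before the literal name.
    return expanded + [name]
```

With `ndots=5` and a search list of `["home.tillo.ch"]`, `project-akri.github.io` (two dots) is first tried as `project-akri.github.io.home.tillo.ch`, which the old wildcard matched. Writing the name with a trailing dot skips the search list entirely.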