Production-Grade MongoDB Replica Set on k3s — for Homelabs, Raspberry Pis, and Beyond

A practical, end-to-end blueprint with TLS via Let's Encrypt, ArgoCD GitOps, sealed secrets, and copy-paste-ready scripts. The guide I wish existed when I started.


Heads up — this is a long one. Grab a coffee. By the end you'll have a 3-node MongoDB replica set running on k3s, fronted by a real Let's Encrypt certificate, fully managed by ArgoCD, and reachable from DataGrip / Compass on your laptop with one paste-able URI. No private CA gymnastics. No "works on my cluster". Tested on a single Raspberry Pi 5 — designed to scale to a multi-node lab without rewriting anything.

I went looking for the blog on this and never found it. There are a hundred posts that say "just use Bitnami's Helm chart". There are zero that walk through the why behind every choice — why a StatefulSet, why three different services, why allowTLS and not requireTLS, why a per-pod ClusterIP for each replica, what replicaSetHorizons actually does, how to make external clients trust your cert without installing a custom CA. So I wrote the one I wished I'd found.

This post is for the homelab person who:

  • Already has k3s (or a basic kubeadm/RKE2 cluster) running on a Pi or a small NUC.

  • Wants a "close to production" MongoDB — replica set, TLS, secrets in git, GitOps-managed — without standing up a Kubernetes Operator just to run one DB.

  • Wants the same manifests to work unchanged on a 12-node Talos cluster later.

We're going to build it step by step, file by file, with every YAML and every shell helper included verbatim. Stick with me.

1. What we're building

A 3-member MongoDB 7 replica set named rs0, deployed as a single StatefulSet in the databases namespace. From outside the cluster, each replica is reachable at its own public hostname:

mongo-0.home.your-domain.tld:27017
mongo-1.home.your-domain.tld:27017
mongo-2.home.your-domain.tld:27017

…with a real Let's Encrypt certificate — meaning DataGrip, MongoDB Compass, the official drivers, and even curl trust it out of the box, no custom CA install required. From inside the cluster, your apps connect to a single ClusterIP and the driver auto-discovers the topology.

The whole thing is reconciled by ArgoCD from a git repo, so "deploying" really means "git push". Secrets are committed too — encrypted with Bitnami SealedSecrets so only the controller in your cluster can decrypt them. TLS certs are issued and renewed by cert-manager using a Cloudflare DNS-01 challenge.

Why each design choice

  • StatefulSet, not Deployment. Replica-set members must have stable names — mongodb-0, mongodb-1, mongodb-2 — because rs.conf().members[].host records them by name. Random pod names from a Deployment would break the RS the first time a pod restarted.

  • podManagementPolicy: Parallel. The default OrderedReady blocks pod-N until pod-(N-1) is Ready. But mongod isn't Ready until the RS is initiated, and the RS can't be initiated until all 3 pods exist. Classic chicken-and-egg. Parallel starts all three concurrently.

  • A postStart lifecycle hook on pod-0 does rs.initiate(...) once. It's idempotent — error code 23 ("already initialized") is treated as success.

  • Three services for three different audiences: the headless service for the StatefulSet's own pod DNS, a regular ClusterIP for in-cluster apps that just want "a Mongo", and three per-pod ClusterIPs for Traefik to route SNI-based external traffic to specific replicas.

  • TLS in allowTLS mode — accepts both encrypted and plain connections, but doesn't initiate TLS for outbound replica-set heartbeats. This matters because the LE cert only covers public hostnames, not internal cluster DNS, and we don't want intra-RS traffic failing TLS validation against names not in the SAN.

  • replicaSetHorizons so external clients see external hostnames in db.hello(), not internal cluster DNS they can't resolve.

If half of those bullets sound like Greek, don't worry — every single one is unpacked below.


2. Prerequisites

This post starts where most homelab posts end. If you don't already have these pieces, here are the canonical "install this once" guides — none take more than an afternoon.

You need | Why | Get started
---------|-----|------------
A Kubernetes cluster (k3s / kubeadm / RKE2 / Talos) | Where everything runs. Single-node Pi is fine. | k3s quick install · kubeadm · Talos
kubectl configured to point at it | The CLI you'll live in. | kubectl install
cert-manager | Issues and renews TLS certs declaratively. | cert-manager install
Traefik ingress controller | k3s ships it by default. We just add a TCP entrypoint. | k3s + Traefik · standalone Helm
Bitnami SealedSecrets controller | Encrypts your Secret YAML so it can live in git. | Sealed Secrets
ArgoCD (optional but recommended) | GitOps reconciler. Without it, just kubectl apply by hand. | Argo CD install
kubeseal CLI on your laptop | To encrypt secrets locally before committing. | kubeseal releases

I also keep these explainers bookmarked — Techno Tim's homelab content covers k3s + Traefik + cert-manager wiring better than any blog. If GitOps is new to you, the Argo CD ApplicationSet docs are short and excellent.

Choose your domain story

This is the most flexible part of the setup, and the part I see people get blocked on. There are three sane paths — pick whichever fits your situation. The MongoDB manifests change in only a couple of places between them.

Option A — You own a real domain (the path this post defaults to)

This is what I use, and what gives the cleanest experience: public Let's Encrypt cert, no CA install on clients, works from anywhere on or off the LAN.

  • A registered domain — .dev, .com, .io, .xyz, anything ACME-supported. Cloudflare is cheapest for .com-class TLDs and free for DNS hosting.

  • DNS managed by a provider cert-manager supports for DNS-01. The popular ones: Cloudflare, Route 53, DigitalOcean, Hetzner via webhook, Google Cloud DNS.

  • A wildcard A record *.home.your-domain.tld → <your-cluster-ip>. The IP can be a private LAN address like 192.168.0.108 — DNS-01 doesn't need the LE servers to reach you, only Cloudflare's DNS.

  • An API token for your provider (e.g. Cloudflare with Zone:DNS:Edit), stored as a Secret named cloudflare-api-token in the cert-manager namespace.

I use home.ijlalahmad.dev throughout this post — search & replace it with your own domain. Anywhere you see mongo-N.home.ijlalahmad.dev, mentally substitute mongo-N.home.your-domain.tld.

Why "private IP" works: with DNS-01 the ACME server proves you own the domain by querying a TXT record at your DNS provider — it never connects back to your cluster. So your LAN IP being unroutable from the internet is irrelevant. Same trick that makes wildcard certs trivial.
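To make the DNS-01 mechanics a bit more concrete, here's a minimal sketch of which TXT record name gets looked up for a given certificate name. The helper name acme_txt_name is my own invention; the _acme-challenge prefix and the dropped wildcard label follow the ACME spec (RFC 8555), and example.tld is a placeholder.

```shell
# Hypothetical helper: print the TXT record name the ACME server queries
# for a DNS-01 challenge. A leading "*." wildcard label is dropped first.
acme_txt_name() {
  local name="${1#\*.}"          # "*.home.example.tld" -> "home.example.tld"
  printf '_acme-challenge.%s\n' "$name"
}

acme_txt_name 'mongo-0.home.example.tld'
acme_txt_name '*.home.example.tld'
```

Notice that nothing in there involves your cluster's IP. The record lives at your DNS provider, which is why the A record can point wherever you like.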

Option B — No public domain, just .local / Pi-hole / /etc/hosts

Perfectly valid for a private homelab. You won't get a public LE cert (no DNS-01 target), but everything else works. You have two sub-options for TLS:

  1. Skip TLS entirely. Set tls.mode: disabled in mongod.conf, drop the cert-manager Certificate and the Traefik IngressRouteTCP, and expose the per-pod services with type: NodePort or via a simple TCP LoadBalancer. Connection string drops tls=true. Fine on a trusted home network.

  2. Self-signed CA via cert-manager. Issue a private root with cert-manager's selfSigned Issuer + a CA Certificate, sign your mongod cert from that, and install the ca.crt once on the laptops/tools that need to connect. Earlier drafts of this project used exactly that path — the cert chain is in the appendix.

For name resolution, point mongo-0.lab.local / mongo-1.lab.local / mongo-2.lab.local at your cluster IP via Pi-hole local DNS, your router's static DNS entries, or /etc/hosts on each client. The rest of the post applies unchanged — the SAN list in the certificate just becomes mongo-N.lab.local instead of public hostnames.
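If you go the /etc/hosts route, a tiny generator saves some typing. This is only a sketch: hosts_entries is a hypothetical helper, and the lab.local suffix and the IP are placeholders you'd swap for your own.

```shell
# Hypothetical helper: print /etc/hosts lines for the three replicas.
# Arguments: cluster IP, DNS suffix (defaults to lab.local).
hosts_entries() {
  local ip="$1" suffix="${2:-lab.local}" i
  for i in 0 1 2; do
    printf '%s\tmongo-%s.%s\n' "$ip" "$i" "$suffix"
  done
}

hosts_entries 192.168.0.108
```

Review the output first, then append it on each client with something like hosts_entries 192.168.0.108 | sudo tee -a /etc/hosts.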

Option C — Local-only with a public domain you don't want to expose

Buy a cheap domain anyway. Use DNS-only records in Cloudflare (the grey cloud, no proxy), pointing at your private LAN IP. You get the LE cert via DNS-01. The hostnames resolve from anywhere, but only to an RFC1918 address, so the cluster is reachable from your LAN and nowhere else. This is what I do: "public name, private network". Best of both worlds.


One Traefik tweak you must add

Mongo speaks raw TCP on 27017, not HTTP, so Traefik needs a dedicated TCP entrypoint for that port plus allowCrossNamespace on its Kubernetes CRD provider. In your Traefik Helm values:

ports:
  mongodb:
    port: 27017
    expose:
      default: true
    exposedPort: 27017
    protocol: TCP

providers:
  kubernetesCRD:
    allowCrossNamespace: true

The cross-namespace flag lets us put IngressRouteTCP in the traefik namespace while pointing at services in databases.


3. The folder layout

Here's the entire MongoDB folder. Flat, no subfolders — important because ArgoCD's path is non-recursive by default, and a nested services/ folder gets silently skipped.

k3s/databases/mongodb/
├── certificate.yaml         # cert-manager Certificate, LE via Cloudflare DNS-01
├── pv.yaml                  # 3 PersistentVolumes (hostPath, retained)
├── statefulset.yaml         # The 3-replica STS with postStart RS init
├── service.yaml             # ClusterIP "mongodb" for in-cluster apps
├── service-headless.yaml    # Headless "mongodb-headless" for STS pod DNS
├── service-per-pod.yaml     # 3 ClusterIPs (mongodb-0/1/2) as Traefik backends
├── ingressroute-tcp.yaml    # Traefik IngressRouteTCP, 3 HostSNI rules
├── secret.yaml              # Plain Secret (gitignored, dev only)
├── sealedsecret.yaml        # Encrypted version, the one ArgoCD applies
├── config/
│   └── mongod.conf          # Source of truth → auto-converted to ConfigMap
└── setup.sh                 # Local CLI tool

A note on layout philosophy. Databases live at k3s/databases/* (sibling of k3s/apps/*), not under apps/. Stateful infrastructure has a fundamentally different lifecycle from "throwaway" apps. You restart a Jellyfin pod casually; you do not casually restart a database that's mid-write. Keeping them in a separate root makes the distinction obvious and lets us layer DB-specific tooling on top.


4. Storage: PersistentVolumes that survive cluster wipes

For homelab use, hostPath is honest and simple. We pre-create one PV per pod, with persistentVolumeReclaimPolicy: Retain so a kubectl delete pvc doesn't nuke the bytes on disk.

pv.yaml

apiVersion: v1
kind: PersistentVolume
metadata:
  name: mongodb-pv-0
spec:
  capacity:
    storage: 20Gi
  accessModes: ["ReadWriteOnce"]
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  hostPath:
    path: /home/pi/db-data/mongodb-rs0/mongodb-0
    type: DirectoryOrCreate
  claimRef:
    namespace: databases
    name: data-mongodb-0
---
# (mongodb-pv-1 and mongodb-pv-2 are identical with /mongodb-1 and /mongodb-2 paths)
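Since the three PVs differ only by ordinal, you could generate them instead of copy-pasting. Here's a rough sketch; render_pvs is a hypothetical helper, and the base path and size are parameters so it isn't tied to my Pi's layout.

```shell
# Hypothetical generator for the three PVs. Arguments: host base path
# (default matches the manifest above) and capacity (default 20Gi).
render_pvs() {
  local base="${1:-/home/pi/db-data/mongodb-rs0}" size="${2:-20Gi}" i
  for i in 0 1 2; do
    cat <<EOF
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mongodb-pv-${i}
spec:
  capacity:
    storage: ${size}
  accessModes: ["ReadWriteOnce"]
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  hostPath:
    path: ${base}/mongodb-${i}
    type: DirectoryOrCreate
  claimRef:
    namespace: databases
    name: data-mongodb-${i}
EOF
  done
}

render_pvs
```

Redirect the output to pv.yaml and commit the result, so ArgoCD still sees a plain manifest.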

The trick that makes this rock-solid for a fresh-cluster bootstrap is the claimRef field. By spelling out exactly which PVC will claim the PV up front, we guarantee that when the StatefulSet's volumeClaimTemplates fires and creates data-mongodb-0, the binding is deterministic. No "PV stuck Available, PVC stuck Pending" dance. Coupled with storageClassName: "" (explicit empty string, not unset), the default storage class never tries to provision anything.

type: DirectoryOrCreate means the kubelet creates the host directory on first use if it's missing. Survives full cluster reinstalls — the next bootstrap re-binds to the same data on disk.

If you're on a multi-node cluster, swap hostPath for Longhorn, Ceph, or your storage class of choice. The rest of the manifests don't change.


5. Secrets: sealed and committed to git

We need three sensitive values:

  • MONGO_INITDB_ROOT_USERNAME — the bootstrap admin user.

  • MONGO_INITDB_ROOT_PASSWORD — its password.

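  • MONGO_REPLICA_SET_KEY_FILE: the shared keyfile used for inter-member auth. We generate 756 random bytes, which base64-encode to 1008 characters (MongoDB accepts keyfiles of 6 to 1024 characters). Without it, any pod could pretend to be a replica set member.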

Generate the keyfile:

openssl rand -base64 756 | tr -d '\n' > keyfile
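Why 756? Base64 turns every 3 bytes into 4 characters, and MongoDB documents a keyfile length of 6 to 1024 characters. A quick sketch of that arithmetic plus a sanity check (both function names are hypothetical):

```shell
# Base64 turns every 3 input bytes into 4 output characters.
base64_len() {
  echo $(( ($1 + 2) / 3 * 4 ))
}

# Hypothetical check: succeed only if the file holds 6 to 1024 bytes,
# the range MongoDB documents for keyfiles (the content is plain ASCII).
keyfile_len_ok() {
  local n
  n="$(wc -c < "$1" | tr -d ' ')"
  [ "$n" -ge 6 ] && [ "$n" -le 1024 ]
}

base64_len 756
```

So 756 bytes is about as large as you can go while staying under the limit once the newlines are stripped. Run keyfile_len_ok keyfile before sealing if you want a guard.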

Build the plain Secret locally (this file is gitignored):

kubectl create secret generic mongodb-secret \
  -n databases \
  --from-literal=MONGO_INITDB_ROOT_USERNAME=admin \
  --from-literal=MONGO_INITDB_ROOT_PASSWORD='your-strong-pw!' \
  --from-file=MONGO_REPLICA_SET_KEY_FILE=keyfile \
  --dry-run=client -o yaml > secret.yaml

Then seal it for git:

kubeseal --controller-namespace=kube-system --format yaml < secret.yaml > sealedsecret.yaml

sealedsecret.yaml looks like ciphertext to anyone reading the repo. Only the controller running in your cluster can decrypt it, because the values are encrypted with that controller's per-cluster key. Even if your repo is public, the secret is safe.

The sealed file goes in git, the plain secret.yaml does not. Add to .gitignore:

**/secret.yaml
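A .gitignore entry only protects files with that exact name. As a belt-and-braces sketch, here's a scan you could wire into a pre-commit hook; find_plain_secrets is a hypothetical helper, and it assumes the kind line sits at column 0, which is how kubectl writes it.

```shell
# Hypothetical guard: print every given file that declares a plain Secret.
# SealedSecret manifests pass because their kind line differs.
# Returns 0 if at least one plain Secret was found, 1 otherwise.
find_plain_secrets() {
  local f found=1
  for f in "$@"; do
    if grep -Eq '^kind:[[:space:]]*Secret[[:space:]]*$' "$f"; then
      echo "$f"
      found=0
    fi
  done
  return "$found"
}
```

In a hook you'd feed it the staged YAML files and abort the commit when it returns 0.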

6. TLS: Let's Encrypt via Cloudflare DNS-01

Skipping this section? Re-read the "Choose your domain story" box. If you went with Option B (no public domain), use cert-manager's selfSigned Issuer chain instead — same overall shape, just swap the issuerRef in certificate.yaml and distribute the resulting ca.crt to your clients. If you went with Option A or C, read on.

This was the big "lightbulb moment" of the project. Most tutorials hand you a self-signed CA, then tell clients to "just trust this ca.crt". That works — until it doesn't. JDBC drivers want the cert in a JKS truststore. Compass wants a PEM. Your colleague's Mac doesn't trust it. It's a constant friction tax.

A real LE cert eliminates all of that. Drivers, GUIs, and curl just trust it because the LE root is in every modern OS.

The ClusterIssuer (one-time setup)

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: you@example.com
    privateKeySecretRef:
      name: letsencrypt-prod-key
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-api-token
              key: api-token

Verify it once:

$ kubectl get clusterissuer
NAME               READY   AGE
letsencrypt-prod   True    10d

The MongoDB Certificate

certificate.yaml

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: mongodb-server
  namespace: databases
spec:
  secretName: mongodb-server-tls
  duration: 2160h        # 90 days (LE max)
  renewBefore: 360h      # renew 15 days before expiry
  privateKey:
    algorithm: RSA
    size: 2048
    rotationPolicy: Always
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  additionalOutputFormats:
    - type: CombinedPEM   # cert + key concatenated, what mongod expects
  commonName: mongo-0.home.ijlalahmad.dev
  dnsNames:
    - mongo-0.home.ijlalahmad.dev
    - mongo-1.home.ijlalahmad.dev
    - mongo-2.home.ijlalahmad.dev

A few things worth pointing out:

  • additionalOutputFormats: [CombinedPEM] — mongod's certificateKeyFile directive expects the cert and key in a single PEM file. cert-manager's default split-file output won't fly. This one-line opt-in produces tls-combined.pem in the secret alongside the standard tls.crt/tls.key.

  • Only public hostnames in the SAN. LE only issues for names under public domains it can validate, so it cannot sign internal cluster DNS like mongodb-0.mongodb-headless.databases.svc.cluster.local. So the cert covers external traffic only. Internal RS traffic stays plain TCP (more on that in the mongod.conf section).

  • 90-day duration, 15-day renewal window. cert-manager will silently re-issue and roll the secret in your cluster forever.
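The hour-based fields are easy to misread, so here's the arithmetic spelled out as a quick sketch. The variable names are mine; the values are the ones from certificate.yaml.

```shell
# Convert the Certificate's hour-based fields into days.
duration_h=2160     # spec.duration
renew_before_h=360  # spec.renewBefore

lifetime_days=$(( duration_h / 24 ))
renew_on_day=$(( (duration_h - renew_before_h) / 24 ))

echo "lifetime: ${lifetime_days} days, renewal attempted around day ${renew_on_day}"
```

That leaves a 15-day cushion for retries if a renewal attempt fails.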

Apply, then watch:

$ kubectl -n databases get certificate
NAME             READY   SECRET               AGE
mongodb-server   True    mongodb-server-tls   2m

When READY=True you have an LE-signed cert sitting in a Kubernetes Secret. cert-manager handles renewal automatically.


7. Configuration: a real mongod.conf you can read

Forget cramming a multi-line YAML string into a ConfigMap. Keep mongod.conf as an actual file in config/ and let your tooling auto-build the ConfigMap. (We'll see how in the scripts section.)

config/mongod.conf

storage:
  dbPath: /data/db
  wiredTiger:
    engineConfig:
      cacheSizeGB: 0.5     # Pi-friendly. Bump on bigger nodes.

net:
  bindIp: 0.0.0.0
  port: 27017
  tls:
    # allowTLS = mongod accepts both plain & TLS, but does NOT initiate TLS
    # on outbound connections (e.g. replica-set member-to-member).
    # That keeps internal RS traffic plain — necessary because the LE cert
    # only covers the public hostnames, not the cluster DNS names that
    # members use to dial each other.
    mode: allowTLS
    certificateKeyFile: /etc/mongo/tls/tls-combined.pem
    # mongod 7 requires CAFile even when not validating clients. Since we
    # set allowConnectionsWithoutCertificates: true, no client cert is
    # actually checked against this CA — we just need any valid bundle.
    # Use the image's OS CA bundle (Debian path, present in mongo:7).
    CAFile: /etc/ssl/certs/ca-certificates.crt
    allowConnectionsWithoutCertificates: true

replication:
  replSetName: rs0

security:
  keyFile: /etc/mongo/keyfile

Why allowTLS and not preferTLS or requireTLS?

This is the single most counter-intuitive line in the whole setup. The official docs nudge you toward requireTLS for production. Don't do it here. Here's the tradeoff matrix:

Mode       | Inbound plain | Inbound TLS | Outbound (RS heartbeats)
-----------|---------------|-------------|-------------------------
disabled   | ✅ accept     | ❌ reject   | plain
allowTLS   | ✅ accept     | ✅ accept   | plain
preferTLS  | ✅ accept     | ✅ accept   | TLS
requireTLS | ❌ reject     | ✅ accept   | TLS

For RS heartbeats, members dial each other using the names from rs.conf() — internal cluster DNS like mongodb-0.mongodb-headless.databases.svc.cluster.local. Those names are not in the LE cert's SAN (LE won't sign them). With preferTLS or requireTLS, mongod tries TLS for outbound, the cert fails hostname validation, heartbeats fail, the RS goes unhealthy, and you end up debugging at 2am.

allowTLS says "I'll accept TLS from clients, but I'll only initiate plain". Internal RS traffic stays plain (it's already inside the cluster network, so the security loss is small), external clients get full TLS through Traefik passthrough.

Why CAFile is set even though we don't validate clients

mongod 7 refuses to start with TLS enabled unless CAFile is configured (see SERVER-72839). Combined with allowConnectionsWithoutCertificates: true, no client cert is ever actually validated against this file — but the file must exist. We point at the image's built-in OS bundle. Cosmetic compliance, no security impact.


8. The StatefulSet — heart of the cluster

This is the biggest manifest, so we'll walk through it in chunks.

statefulset.yaml

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongodb
  namespace: databases
spec:
  serviceName: mongodb-headless     # Must match the headless Service name
  replicas: 3
  podManagementPolicy: Parallel     # Critical — see "why" below
  selector:
    matchLabels:
      app: mongodb

serviceName: mongodb-headless is what gives each pod its DNS name mongodb-N.mongodb-headless.databases.svc.cluster.local. podManagementPolicy: Parallel is the workaround for the chicken-and-egg problem we discussed.

The init container

initContainers:
  - name: keyfile-init
    image: busybox
    command:
      - sh
      - -c
      - |
        cp /tmp/keyfile /keyfile/keyfile && \
        chmod 400 /keyfile/keyfile && \
        cp /tls-src/tls-combined.pem /tls/tls-combined.pem && \
        chmod 400 /tls/tls-combined.pem && \
        chown 999:999 /keyfile/keyfile /tls/tls-combined.pem && \
        chown -R 999:999 /data/db
    volumeMounts:
      - name: keyfile-src
        mountPath: /tmp
      - name: keyfile
        mountPath: /keyfile
      - name: tls-src
        mountPath: /tls-src
      - name: tls
        mountPath: /tls
      - name: data
        mountPath: /data/db

Why an init container? Two reasons:

  1. mongod runs as UID 999 in the official image and refuses a keyfile that is readable by group or others, so the file has to be mode 400 and owned by that UID. A Secret volume struggles to deliver both at once: with defaultMode: 0400 the file belongs to root, and adding fsGroup to fix access typically makes it group-readable again. The init container side-steps the entire problem by copying secrets out of the read-only secret mount into an emptyDir, applying the right permissions, and chowning to UID 999.

  2. The init container also fixes ownership of the data directory on first boot.
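You can rehearse the copy-and-lock step on your laptop to see the resulting mode. A minimal sketch with a hypothetical install_locked helper; it leaves out the chown to 999 because that needs root.

```shell
# Hypothetical local rehearsal of the init container's copy step:
# copy a file, lock it to owner-read-only, then print the mode string.
install_locked() {
  local src="$1" dst="$2"
  cp "$src" "$dst" && chmod 400 "$dst" || return 1
  ls -l "$dst" | cut -c1-10
}
```

The output you want is -r--------. Anything with group or other bits set is what mongod rejects for the keyfile.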

The mongod container

containers:
  - name: mongodb
    image: mongo:7
    command:
      - mongod
      - --config
      - /etc/mongo/mongod.conf
    ports:
      - containerPort: 27017

Nothing exotic. We override the entrypoint to use our config file directly.

The replica-set initiator

This is the magic that makes the whole thing self-bootstrapping:

lifecycle:
  postStart:
    exec:
      command:
        - bash
        - -c
        - |
          # Only pod-0 bootstraps the replica set
          [[ "${HOSTNAME}" != "mongodb-0" ]] && exit 0
          # Wait up to 60s for mongod to accept connections
          for i in $(seq 1 30); do
            mongosh --quiet --eval "db.adminCommand('ping').ok" 2>/dev/null | grep -q '^1$' && break
            sleep 2
          done
          # Initiate replica set — error code 23 means already initialized, which is fine
          mongosh admin \
            -u "${MONGO_INITDB_ROOT_USERNAME}" \
            -p "${MONGO_INITDB_ROOT_PASSWORD}" \
            --quiet \
            --eval "
              try {
                rs.initiate({
                  _id: 'rs0',
                  members: [
                    { _id: 0, host: 'mongodb-0.mongodb-headless.databases.svc.cluster.local:27017' },
                    { _id: 1, host: 'mongodb-1.mongodb-headless.databases.svc.cluster.local:27017' },
                    { _id: 2, host: 'mongodb-2.mongodb-headless.databases.svc.cluster.local:27017' }
                  ]
                });
              } catch (e) {
                // mongosh throws on failure; code 23 is AlreadyInitialized
                if (e.code !== 23) quit(1);
              }
            " 2>/dev/null || true

It runs on every pod start, but the [[ "${HOSTNAME}" != "mongodb-0" ]] && exit 0 short-circuits everyone except pod-0. The code !== 23 check makes it idempotent: rs.initiate on an already-initialized replica set fails with error code 23 (AlreadyInitialized), which we treat as success.
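One thing the wait loop in the hook doesn't do is tell you when it timed out; it just carries on. If you'd rather know, a small retry helper along these lines could replace it. This is a sketch, and wait_for is a hypothetical name, not something from the mongo image.

```shell
# Hypothetical retry helper: run a command until it succeeds or the
# attempts run out. Returns non-zero on timeout instead of carrying on.
# Arguments: attempts, delay in seconds, then the command to run.
wait_for() {
  local attempts="$1" delay="$2" i
  shift 2
  for (( i = 1; i <= attempts; i++ )); do
    "$@" && return 0
    sleep "$delay"
  done
  return 1
}
```

With 30 attempts and a 2 second delay you get the same 60 second budget as the original loop, plus an exit status you can log or act on.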

You'll notice we're not advertising the external horizons here. We add those after the cluster is up:

kubectl -n databases exec -it mongodb-0 -- mongosh -u admin -p
> cfg = rs.conf()
> cfg.members[0].horizons = { external: "mongo-0.home.ijlalahmad.dev:27017" }
> cfg.members[1].horizons = { external: "mongo-1.home.ijlalahmad.dev:27017" }
> cfg.members[2].horizons = { external: "mongo-2.home.ijlalahmad.dev:27017" }
> rs.reconfig(cfg)

What horizons do: when a driver connects via mongo-0.home.ijlalahmad.dev, mongod replies to db.hello() with the external names of all members — names the external client can actually resolve. Without horizons, the client would be told "the other members are at mongodb-1.mongodb-headless.databases.svc.cluster.local" and immediately fail to connect to them.
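Typing those reconfig lines by hand gets old if you ever rebuild the cluster. Here's a sketch that prints the same mongosh script for any domain. horizons_script is a hypothetical helper, and it assumes member index N maps to pod N, which holds for the rs.initiate config above.

```shell
# Hypothetical generator: print a mongosh script that sets an "external"
# horizon on each of the three members. The domain is a parameter.
horizons_script() {
  local domain="$1" i
  echo 'cfg = rs.conf()'
  for i in 0 1 2; do
    printf 'cfg.members[%s].horizons = { external: "mongo-%s.%s:27017" }\n' "$i" "$i" "$domain"
  done
  echo 'rs.reconfig(cfg)'
}

horizons_script home.example.tld
```

Check the output, then paste it into the mongosh session on mongodb-0.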

The volume choreography

volumeMounts:
  - { name: data,    mountPath: /data/db }
  - { name: config,  mountPath: /etc/mongo/mongod.conf, subPath: mongod.conf }
  - { name: keyfile, mountPath: /etc/mongo/keyfile,     subPath: keyfile }
  - { name: tls,     mountPath: /etc/mongo/tls,         readOnly: true }

volumes:
  - name: config
    configMap: { name: mongodb-config }

  - name: keyfile-src
    secret:
      secretName: mongodb-secret
      items:
        - key: MONGO_REPLICA_SET_KEY_FILE
          path: keyfile

  - name: keyfile
    emptyDir: {}

  - name: tls-src
    secret:
      secretName: mongodb-server-tls

  - name: tls
    emptyDir: {}

volumeClaimTemplates:
  - metadata:
      name: data
      labels: { app: mongodb }
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: ""
      resources:
        requests: { storage: 20Gi }

Both the keyfile and the TLS material follow the same pattern: a *-src secret-mount the init container reads from, an emptyDir it copies into with the right permissions, and the main container mounts the emptyDir. Clean and works around every Kubernetes secret-permissions footgun.


9. Three services, three jobs

This is the question that confused me longest when I started: why do I need three services for one StatefulSet? The honest answer: each one solves a different connectivity problem.

9.1 The headless service — service-headless.yaml

apiVersion: v1
kind: Service
metadata:
  name: mongodb-headless
  namespace: databases
spec:
  clusterIP: None
  publishNotReadyAddresses: true
  selector:
    app: mongodb
  ports:
    - port: 27017
      name: mongodb

clusterIP: None makes this headless — there is no virtual IP. Its sole job is to give each pod a DNS A record mongodb-N.mongodb-headless.databases.svc.cluster.local. Without this, the StatefulSet pods have no stable DNS, and the replica set falls apart.

publishNotReadyAddresses: true is essential during bootstrap — it lets pods discover each other before they're marked Ready, breaking the chicken-and-egg.
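The naming pattern is mechanical, so here it is as a one-liner you can reuse in scripts. A sketch only: pod_fqdn is a hypothetical helper, and it assumes the default cluster.local cluster domain.

```shell
# Hypothetical helper: build the stable DNS name of one StatefulSet pod.
# Pattern: <statefulset>-<ordinal>.<headless-service>.<namespace>.svc.cluster.local
pod_fqdn() {
  local sts="$1" ordinal="$2" svc="$3" ns="$4"
  printf '%s-%s.%s.%s.svc.cluster.local\n' "$sts" "$ordinal" "$svc" "$ns"
}

for i in 0 1 2; do
  pod_fqdn mongodb "$i" mongodb-headless databases
done
```

Those three lines are exactly the host values that go into rs.initiate.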

9.2 The regular ClusterIP — service.yaml

apiVersion: v1
kind: Service
metadata:
  name: mongodb
  namespace: databases
spec:
  selector:
    app: mongodb
  ports:
    - port: 27017
      targetPort: 27017

This is what internal apps (n8n, your custom services, whatever) connect to. They use the URI mongodb://...@mongodb.databases.svc.cluster.local:27017/?replicaSet=rs0. The driver connects to whatever pod the Service forwards to, asks db.hello(), learns the topology, and starts talking to the primary directly.

9.3 Per-pod ClusterIPs — service-per-pod.yaml

apiVersion: v1
kind: Service
metadata:
  name: mongodb-0
  namespace: databases
spec:
  selector:
    statefulset.kubernetes.io/pod-name: mongodb-0
  ports:
    - port: 27017
      targetPort: 27017
---
# (mongodb-1 and mongodb-2 are identical with the matching pod-name)

This is the trick that makes external SNI routing work. Each Service selects exactly one pod via the statefulset.kubernetes.io/pod-name label that Kubernetes auto-applies. Now Traefik can map "incoming connection with SNI = mongo-0.home.ijlalahmad.dev" to "the Service that fronts only mongodb-0", giving us per-replica external addressability.


10. Traefik IngressRouteTCP with SNI passthrough

The MongoDB wire protocol is binary TCP, so Traefik's HTTP Ingress can't route it. We use IngressRouteTCP (Traefik's L4 CRD) with TLS passthrough:

ingressroute-tcp.yaml (lives in the traefik namespace)

apiVersion: traefik.io/v1alpha1
kind: IngressRouteTCP
metadata:
  name: mongodb-0
  namespace: traefik
spec:
  entryPoints:
    - mongodb       # The 27017 entrypoint we added to Traefik
  routes:
    - match: HostSNI(`mongo-0.home.ijlalahmad.dev`)
      services:
        - name: mongodb-0       # Per-pod Service from the previous section
          namespace: databases
          port: 27017
  tls:
    passthrough: true            # Traefik does NOT terminate TLS
---
# (mongodb-1 and mongodb-2 are identical, with their own SNI hostname and Service)
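As with the PVs, the three routes differ only by ordinal. A sketch of a generator, assuming the same names as the manifest above; render_routes is hypothetical and the domain is a parameter. Note the escaped backticks: HostSNI wants literal backticks, and an unquoted heredoc would otherwise treat them as command substitution.

```shell
# Hypothetical generator for the three IngressRouteTCP objects.
# Argument: the domain suffix that follows "mongo-N.".
render_routes() {
  local domain="$1" i
  for i in 0 1 2; do
    cat <<EOF
---
apiVersion: traefik.io/v1alpha1
kind: IngressRouteTCP
metadata:
  name: mongodb-${i}
  namespace: traefik
spec:
  entryPoints:
    - mongodb
  routes:
    - match: HostSNI(\`mongo-${i}.${domain}\`)
      services:
        - name: mongodb-${i}
          namespace: databases
          port: 27017
  tls:
    passthrough: true
EOF
  done
}

render_routes home.example.tld
```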

What "passthrough" actually means

Without passthrough, Traefik would terminate TLS itself and re-encrypt to the backend (or send plain). With passthrough: true, Traefik treats the connection as opaque bytes — it only peeks at the unencrypted SNI field of the TLS ClientHello to decide where to route, then forwards the encrypted stream untouched. mongod sees the TLS handshake directly, presents its cert, terminates the session.

This is perfect for our use case: Traefik doesn't need a copy of the cert (mongod has it), the TLS path is end-to-end client→mongod, and we can route different SNI hostnames to different backend pods.

Traffic flow

  1. DataGrip on your laptop connects to mongo-0.home.ijlalahmad.dev:27017.

  2. DNS resolves to your cluster's IP (192.168.0.108 in my case).

  3. Traefik on 27017 reads the SNI from the ClientHello — sees mongo-0.home.ijlalahmad.dev.

  4. Matches the first IngressRouteTCP rule, forwards bytes to the mongodb-0 Service in databases.

  5. The Service forwards to the mongodb-0 pod.

  6. mongod completes TLS handshake using the LE cert.

  7. Driver issues db.hello(). mongod replies with the horizons view: members are at mongo-0/1/2.home.ijlalahmad.dev.

  8. Driver opens TLS connections to the other two members the same way.


11. The shell helpers — a layered, reusable management library

You could manage everything with raw kubectl. After the third time typing kubectl -n databases logs mongodb-0 -f --tail=200, you won't want to.

I built a two-layer management library:

k3s/scripts/
├── _app-ctl.sh   ← Generic library — every app gets these commands free
└── _db-ctl.sh    ← DB-specific layer; sources _app-ctl.sh; adds DB commands

Each app's setup.sh is a tiny shim that sources the appropriate library and provides 3-4 hook functions for app-specific behavior.

Generic commands (from _app-ctl.sh)

Every app gets these for free:

deploy        Apply all manifests in dependency order
delete        Inverse of deploy
redeploy      delete + deploy
status        Pods, PVCs, services, recent events, resource usage
logs          Tail logs (last 200 lines, then follow)
restart       Roll the workload
shell         Exec a shell into the workload pod
port-forward  kubectl port-forward to localhost
seal          Re-seal the secret with kubeseal
argocd-sync   Trigger an ArgoCD sync
help          Show available commands

The deploy logic applies a known-order list (pv → pvc → certificate → configmap → secret → rbac → workload → service → ingress), then globs every other root-level YAML file to catch helpers like service-headless.yaml, ingressroute-tcp.yaml, etc. without you having to update the script.
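To give a feel for that two-pass ordering, here's a stripped-down sketch. It is not the real _app-ctl.sh: ordered_manifests and KNOWN_ORDER are hypothetical names, and it assumes each step maps to a file called <step>.yaml, which is a simplification.

```shell
# Hypothetical sketch of the ordering step. Prints the YAML files of a
# directory: known names first in dependency order, then everything else.
KNOWN_ORDER="pv pvc certificate configmap secret rbac workload service ingress"

ordered_manifests() {
  local dir="$1" name f seen=" "
  for name in $KNOWN_ORDER; do
    f="$dir/$name.yaml"
    if [ -f "$f" ]; then
      echo "$f"
      seen="$seen$f "
    fi
  done
  for f in "$dir"/*.yaml; do
    [ -f "$f" ] || continue
    case "$seen" in
      *" $f "*) ;;          # already printed in the ordered pass
      *) echo "$f" ;;
    esac
  done
}
```

A deploy command would then loop over that output and kubectl apply each file in turn. The glob pass is what lets new helper manifests get picked up without touching the script.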

DB-specific commands (from _db-ctl.sh)

connection-string [external|internal]
print-ca
detach-storage
attach-storage [n]
dump [file]
restore <file>

_db-ctl.sh in full (about 130 lines):

#!/usr/bin/env bash
# _db-ctl.sh — Shared database management library

DB_CTL_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "$DB_CTL_DIR/_app-ctl.sh"

cmd_connection_string() {
  local scope="${1:-external}"
  if ! declare -F _db_connection_string >/dev/null; then
    err "This DB's setup.sh has not defined _db_connection_string()"
    exit 1
  fi
  _db_connection_string "$scope"
}

cmd_dump() {
  local out="${1:-${APP}-$(date +%Y%m%d-%H%M%S).dump}"
  if ! declare -F _db_dump_cmd >/dev/null; then
    err "This DB's setup.sh has not defined _db_dump_cmd()"
    exit 1
  fi
  _db_dump_cmd "$out"
  ok "Dump complete: $out"
}

cmd_restore() {
  local in="${1:-}"
  [[ -z "$in" || ! -f "$in" ]] && { err "Usage: ./setup.sh restore <dump-file>"; exit 1; }
  if ! declare -F _db_restore_cmd >/dev/null; then
    err "This DB's setup.sh has not defined _db_restore_cmd()"
    exit 1
  fi
  warn "This may overwrite existing data."
  read -r -p "  Type 'yes' to confirm: " confirm
  [[ "$confirm" != "yes" ]] && echo "  Aborted." && return 0
  _db_restore_cmd "$in"
  ok "Restore complete"
}

cmd_detach_storage() {
  local kind; kind="$(_detect_workload_kind)"
  kubectl scale "$kind/$APP" -n "$NAMESPACE" --replicas=0
  kubectl rollout status "$kind/$APP" -n "$NAMESPACE" --timeout=120s || true
  ok "Detached. Reattach with: ./setup.sh attach-storage [replicas]"
}

cmd_attach_storage() {
  local replicas="${1:-1}"
  local kind; kind="$(_detect_workload_kind)"
  kubectl scale "$kind/$APP" -n "$NAMESPACE" --replicas="$replicas"
  kubectl rollout status "$kind/$APP" -n "$NAMESPACE" --timeout=300s
  ok "Reattached at ${replicas} replica(s)"
}

db_main() {
  case "${1:-help}" in
    connection-string) shift; cmd_connection_string "${1:-external}" ;;
    print-ca)          cmd_print_ca ;;
    detach-storage)    cmd_detach_storage ;;
    attach-storage)    shift; cmd_attach_storage "${1:-1}" ;;
    dump)              shift; cmd_dump "${1:-}" ;;
    restore)           shift; cmd_restore "${1:-}" ;;
    *)                 main "$@" ;;        # Fall through to _app-ctl.sh
  esac
}

MongoDB's setup.sh — only ~80 lines

#!/usr/bin/env bash
set -euo pipefail

APP="mongodb"
NAMESPACE="databases"
CONTAINER_PORT="27017"
HAS_PVC=true
HAS_SECRET=true
HAS_CONFIGMAP=true

DEPLOY_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

# Walk up to find k3s/scripts/_db-ctl.sh
_find_scripts() {
  local d="$1"
  while [[ "$d" != "/" ]]; do
    [[ -d "$d/scripts" && -f "$d/scripts/_db-ctl.sh" ]] && echo "$d/scripts" && return
    d="$(dirname "$d")"
  done
}
SCRIPTS_DIR="$(_find_scripts "$DEPLOY_DIR")"
source "$SCRIPTS_DIR/_db-ctl.sh"

# ─── MongoDB-specific hooks ──────────────────────────────────────────────────

_mongo_password() {
  kubectl get secret mongodb-secret -n "$NAMESPACE" \
    -o jsonpath='{.data.MONGO_INITDB_ROOT_PASSWORD}' | base64 -d | \
    sed 's/%/%25/g; s/!/%21/g; s/@/%40/g; s/:/%3A/g; s|/|%2F|g; s/?/%3F/g; s/#/%23/g'  # encode % first so later escapes are not double-encoded
}

_db_connection_string() {
  local scope="${1:-external}"
  local user pass
  user="$(kubectl get secret mongodb-secret -n "$NAMESPACE" -o jsonpath='{.data.MONGO_INITDB_ROOT_USERNAME}' | base64 -d)"
  pass="$(_mongo_password)"

  case "$scope" in
    external)
      echo "mongodb://${user}:${pass}@mongo-0.home.ijlalahmad.dev,mongo-1.home.ijlalahmad.dev,mongo-2.home.ijlalahmad.dev:27017/?replicaSet=rs0&tls=true&authSource=admin"
      info "TLS uses Let's Encrypt — no CA file needed."
      ;;
    internal)
      echo "mongodb://${user}:${pass}@mongodb.databases.svc.cluster.local:27017/?replicaSet=rs0&authSource=admin"
      ;;
  esac
}

_db_dump_cmd() {
  local out="$1"
  local pod; pod="$(_get_pod)"
  kubectl exec -n "$NAMESPACE" "$pod" -- bash -c \
    'mongodump --quiet --uri="mongodb://${MONGO_INITDB_ROOT_USERNAME}:${MONGO_INITDB_ROOT_PASSWORD}@localhost:27017/?authSource=admin" --archive --gzip' > "$out"
}

_db_restore_cmd() {
  local in="$1"
  local pod; pod="$(_get_pod)"
  kubectl exec -i -n "$NAMESPACE" "$pod" -- bash -c \
    'mongorestore --quiet --uri="mongodb://${MONGO_INITDB_ROOT_USERNAME}:${MONGO_INITDB_ROOT_PASSWORD}@localhost:27017/?authSource=admin" --archive --gzip --drop' < "$in"
}

db_main "$@"

That's it. To add Postgres or Redis later, you copy this file, change APP, and rewrite the three hook functions. Generic commands work unchanged.
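The reason the generic commands keep working is the declare -F check in _db-ctl.sh: the library owns the command and only looks up the hook by name at call time. Here is a minimal, self-contained sketch of that pattern. The run_hook wrapper and the Postgres hook body are hypothetical and exist only for illustration.

```bash
#!/usr/bin/env bash
# Sketch of the hook pattern: the shared library owns the command,
# each database's setup.sh supplies the hook. Names are illustrative.

run_hook() {
  local hook="$1"; shift
  # declare -F succeeds only if a function with this name is defined
  if ! declare -F "$hook" >/dev/null; then
    echo "hook $hook is not defined" >&2
    return 1
  fi
  "$hook" "$@"
}

# A hypothetical Postgres setup.sh would define its own version of the hook:
_db_dump_cmd() {
  echo "would dump postgres to $1"
}

run_hook _db_dump_cmd backup.dump                       # delegates to the hook above
run_hook _db_restore_cmd backup.dump || echo "restore hook missing"
```

Because the lookup happens when the command runs, a new database only has to define the functions before calling db_main.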


12. Wiring it into ArgoCD

A single ApplicationSet with a list generator produces one ArgoCD Application per list element, each pointing at one app directory:

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: homelab
  namespace: argocd
spec:
  goTemplate: true
  generators:
    - list:
        elements:
          - { name: mongodb,  namespace: databases, path: k3s/databases/mongodb }
          - { name: postgres, namespace: databases, path: k3s/databases/postgres }
          - { name: redis,    namespace: databases, path: k3s/databases/redis }
          # ... your other apps ...
  template:
    metadata:
      name: '{{ .name }}'
      namespace: argocd
    spec:
      project: default
      source:
        repoURL: https://github.com/yourname/Home-Server-Lab
        targetRevision: main
        path: '{{ .path }}'
      destination:
        server: https://kubernetes.default.svc
        namespace: '{{ .namespace }}'
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
        syncOptions:
          - CreateNamespace=true

The non-recursive trap. ArgoCD's directory source is not recursive by default. If you nest a services/ subfolder, ArgoCD silently skips it and you spend 45 minutes wondering why your IngressRouteTCP isn't applying. Either flatten (what we did) or add directory: { recurse: true } to the source.
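A quick way to catch this before committing is to list any manifests that sit below the root of an app directory. The helper below is a rough sketch; the name nested_manifests is made up for this example, and it only checks file extensions.

```bash
#!/usr/bin/env bash
# Sketch: list YAML files that live below the app directory's root level.
# With a non-recursive directory source these are the files that get skipped.
nested_manifests() {
  local app_dir="$1"
  find "$app_dir" -mindepth 2 -type f \( -name '*.yaml' -o -name '*.yml' \) | sort
}
```

Running nested_manifests k3s/databases/mongodb should print nothing for a flattened layout. Any path it does print needs either a move to the root level or recurse: true on the source.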

Once committed, ArgoCD polls git, sees the new manifests, applies them, and reconciles drift forever after. Your setup.sh and ArgoCD are two paths to the same end state — setup.sh is for the moments you want to redeploy without waiting.


13. Connection strings — the payoff

After all that work, here's what you actually paste. Replace home.your-domain.tld with whatever you used in the certificate's SAN list. If you went with Option B (no public domain) substitute your .local names; if you went with Option C, substitute your DNS-only names.

External (TLS via Let's Encrypt — for DataGrip, Compass, your laptop, external apps):

mongodb://admin:YOURPASS@mongo-0.home.your-domain.tld,mongo-1.home.your-domain.tld,mongo-2.home.your-domain.tld:27017/?replicaSet=rs0&tls=true&authSource=admin

No CA file. No truststore configuration. No "trust this self-signed cert" dialog. Just paste and connect. URL-encode any special characters in the password (! → %21, @ → %40, etc.).
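If you'd rather not encode the password by hand, a small pure-bash helper can do it. This is a sketch under one assumption: the password is plain ASCII. The function name urlencode is just a label for this example.

```bash
#!/usr/bin/env bash
# Sketch: percent-encode everything except RFC 3986 unreserved characters.
# Assumes single-byte (ASCII) input; multi-byte characters are not handled.
urlencode() {
  local s="$1" out="" c i
  local LC_ALL=C                       # keep character classes ASCII-only
  for (( i = 0; i < ${#s}; i++ )); do
    c="${s:i:1}"
    case "$c" in
      [a-zA-Z0-9._~-]) out+="$c" ;;                      # safe as-is
      *) printf -v c '%%%02X' "'$c"; out+="$c" ;;        # e.g. @ becomes %40
    esac
  done
  printf '%s\n' "$out"
}

urlencode 'p@ss!w:rd/1'   # prints p%40ss%21w%3Ard%2F1
```

Encoding everything outside the unreserved set is deliberately broad: it covers the characters that break a MongoDB URI (such as @, :, / and %) without needing a per-character list.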

Without TLS (Option B, no domain): drop tls=true and replace the comma-separated hostnames with your local DNS names, e.g. mongo-0.lab.local. The driver still discovers the replica set the same way.

Internal (in-cluster apps, plain TCP):

mongodb://admin:YOURPASS@mongodb.databases.svc.cluster.local:27017/?replicaSet=rs0&authSource=admin

You can quickly verify both:

# From your laptop (using local mongosh or docker)
docker run --rm mongo:7 mongosh "mongodb://admin:...@mongo-0...&tls=true&authSource=admin" \
  --quiet --eval 'db.hello().me'
# → mongo-0.home.ijlalahmad.dev:27017

# From inside the cluster
kubectl -n databases run test --rm -i --restart=Never --image=mongo:7 -- \
  mongosh "mongodb://admin:...@mongodb.databases.svc.cluster.local:27017/?replicaSet=rs0&authSource=admin" \
  --quiet --eval 'db.hello().primary'

If both return a hostname instead of an error, you've made it.


14. Operating it

Daily commands

cd k3s/databases/mongodb

./setup.sh status                   # pods, RS health, PVCs, events, resource usage
./setup.sh logs                     # tail mongodb-0
./setup.sh shell                    # bash inside mongodb-0
./setup.sh restart                  # rolling restart
./setup.sh redeploy                 # full delete + apply
./setup.sh connection-string external
./setup.sh dump backup-$(date +%F).gz
./setup.sh restore backup.gz
./setup.sh detach-storage           # scale to 0 for fsck/PV work
./setup.sh attach-storage 3

Syncing from a production cluster

The biggest "but does it actually work" question for a homelab DB is whether you can pull production data down. Yes — two patterns.

Pattern A — dump locally, restore via setup.sh:

# Dump prod into a local archive
docker run --rm -v "$PWD":/d mongo:7 \
  mongodump --uri="$PROD_URI" --archive=/d/prod.gz --gzip

cd k3s/databases/mongodb
./setup.sh restore /full/path/prod.gz

Pattern B — stream prod → k3s without touching disk:

# PASS_ENC must hold the URL-encoded admin password of the k3s cluster
docker run --rm mongo:7 mongodump --uri="$PROD_URI" --archive --gzip \
  | kubectl -n databases exec -i mongodb-0 -- mongorestore \
      --uri="mongodb://admin:${PASS_ENC}@localhost:27017/?authSource=admin" \
      --archive --gzip --drop

Mongo replicates the writes from the primary to the secondaries automatically. You'll briefly see oplog catch-up activity in the logs, then the cluster settles.

For per-project user isolation (e.g. kai_user with readWrite@kai_db), create users via mongosh after restore:

use admin
db.createUser({
  user: "kai_user",
  pwd: "...",
  roles: [{ role: "readWrite", db: "kai_db" }]
})

Backups

For real backups, schedule a CronJob:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: mongodb-backup
  namespace: databases
spec:
  schedule: "0 3 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: dump
              image: mongo:7
              command:
                - sh
                - -c
                - |
                  mongodump \
                    --uri="mongodb://admin:$PASS@mongodb:27017/?authSource=admin" \
                    --archive=/backups/mongo-$(date +%Y%m%d).gz --gzip
              env:
                - name: PASS
                  valueFrom:
                    secretKeyRef:
                      name: mongodb-secret
                      key: MONGO_INITDB_ROOT_PASSWORD
              volumeMounts:
                - name: backups
                  mountPath: /backups
          volumes:
            - name: backups
              persistentVolumeClaim:
                claimName: mongodb-backups

Pair with rclone to push to S3, B2, or your NAS.
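Since the CronJob writes one dated archive per day, the backup volume will eventually fill up unless old files are removed after the upload. Below is a rough retention sketch. The function name old_backups and the keep-count approach are assumptions, not part of the original setup.

```bash
#!/usr/bin/env bash
# Sketch: print all but the newest N dated archives in a directory.
# Relies on the mongo-YYYYMMDD.gz naming from the CronJob, which sorts by date.
old_backups() {
  local dir="$1" keep="$2"
  local files=() f
  for f in "$dir"/mongo-*.gz; do       # glob results come back sorted by name
    [[ -f "$f" ]] && files+=("$f")
  done
  local count=${#files[@]}
  (( count > keep )) || return 0       # nothing to prune yet
  printf '%s\n' "${files[@]:0:count-keep}"
}
```

One possible flow, assuming an rclone remote you have already configured (called remote here purely as a placeholder): run rclone copy /backups remote:mongo-backups first, then delete whatever old_backups /backups 7 prints.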


15. What's missing — and what you'd do next

This setup is "homelab-grade production". For a real production rollout you'd extend it with:

Sharding

A replica set is enough for tens of GB. Beyond that — or beyond a single machine's RAM — you'd shard. The skeleton:

  • One config server replica set (CSRS), typically with three members, holding the cluster's config metadata.

  • N shard replica sets — each a 3-member RS.

  • Two or more mongos routers — the query proxies your apps connect to.

In Kubernetes terms: each becomes its own StatefulSet with its own headless service. The current manifests are the template; you'd parametrize replSetName and dbPath, and share the same keyfile across all of them. The Bitnami chart automates this if you don't want the YAML maintenance.

Monitoring

Add a mongodb-exporter sidecar to the StatefulSet, scrape with Prometheus, dashboard with Grafana. The MongoDB Community Operator's exporter manifests are a fine starting point.

Resource limits

The current manifest doesn't set requests/limits — fine for a Pi where you're the only tenant. On a multi-tenant cluster, pin them. WiredTiger cache + ~1GB headroom is a safe baseline.
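To put numbers on that baseline: MongoDB's documented default for the WiredTiger cache is the larger of 50% of (RAM minus 1 GiB) and 256 MiB. The arithmetic can be sketched as below; both function names are invented for this example, and all values are in MiB.

```bash
#!/usr/bin/env bash
# Sketch: default WiredTiger cache = max(50% of (RAM - 1 GiB), 256 MiB).
wt_default_cache_mib() {
  local ram_mib="$1"
  local half=$(( (ram_mib - 1024) / 2 ))
  if (( half > 256 )); then echo "$half"; else echo 256; fi
}

# The "cache + ~1GB headroom" baseline from the text, for a chosen cache size.
suggested_limit_mib() {
  local cache_mib="$1"
  echo $(( cache_mib + 1024 ))
}

wt_default_cache_mib 8192    # 8 GiB of RAM: prints 3584
suggested_limit_mib 1024     # 1 GiB cache:  prints 2048
```

If you pin the cache explicitly (storage.wiredTiger.engineConfig.cacheSizeGB in mongod.conf), the second function gives a starting point for the container memory limit. Treat it as a first guess to tune, not a guarantee.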

True multi-zone HA

Add topologySpreadConstraints so the three pods land on three different nodes/zones. On a Pi homelab with one node, this is moot. On a multi-node cluster, it's essential.

PodDisruptionBudget

A two-of-three minimum so cluster maintenance can drain at most one mongod at a time:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mongodb
  namespace: databases
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: mongodb
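The value 2 is not arbitrary: a replica set can only elect a primary while a strict majority of its voting members is up. A one-line sketch of that arithmetic (the function name majority is illustrative):

```bash
#!/usr/bin/env bash
# Sketch: strict majority of n voting members, using integer division.
majority() { echo $(( $1 / 2 + 1 )); }

majority 3   # prints 2
majority 5   # prints 3
```

For a three-member set, minAvailable: 2 therefore matches the point below which the set loses its primary.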

Per-project users + RBAC

The current setup has one root admin. For multi-tenant use, create role-scoped users per app, and never let apps use the root creds.

Velero / volume snapshots

Logical dumps are great for portability; volume-level snapshots are great for fast recovery. Velero with CSI snapshots gets you both.


16. Closing thoughts

The reason I wrote this is simple: I lost a weekend trying to find the seam between "spin up a Helm chart" tutorials and "operating MongoDB in production" books. Nobody explains the why of the small choices — the SNI passthrough, the allowTLS mode, the per-pod services, the horizons, the chicken-and-egg of Parallel pod management. So you end up with a Bitnami chart you don't understand or a hand-rolled mess that breaks the first time you cycle a pod.

This setup runs on a single Raspberry Pi 5 in my apartment. The same manifests run unchanged on a 6-node Talos cluster I tested them on. It restarts cleanly, survives full cluster reinstalls (PVs are retained, the keyfile and admin creds are sealed in git), and presents a real LE cert that DataGrip just trusts. The shell helpers turn ten-finger kubectl gymnastics into ./setup.sh status. ArgoCD makes the source of truth your git repo.

If you're walking the same path — running real databases in your homelab, learning Kubernetes by doing — I hope this saves you the weekend it cost me.

The full repository with every file shown here, plus the Postgres/Redis/MySQL stubs and the rest of the homelab apps, lives at: https://github.com/yourname/Home-Server-Lab.

Hit me up in the comments if you get stuck. And if you do extend this to sharding or write the monitoring sidecar version — please write the next post in this series. The next homelabber will thank you.

Happy hacking.


Appendix A — .gitignore essentials

# Never commit plaintext secrets
**/secret.yaml

# Local helper output
**/*.dump
**/*.gz

# Cluster state
.kubeconfig

Appendix B — Quick troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| Pods stuck Pending forever | PV claimRef doesn't match PVC name | PVC names are data-mongodb-{0,1,2}. Match exactly. |
| rs.initiate exit code 23 | Already initialized | Treat as success (the script already does). |
| External clients see internal cluster DNS | replicaSetHorizons not configured | Run rs.reconfig() after first boot. |
| TLS handshake fails externally | SNI mismatch / Traefik missing TCP entrypoint | Check Helm values for ports.mongodb. |
| "InvalidOptions: TLS without specifying chain of trust" | mongod 7 wants CAFile even for server-only TLS | Point CAFile at /etc/ssl/certs/ca-certificates.crt. |
| Heartbeat: "SSL peer certificate validation failed" | preferTLS/requireTLS + LE cert (no internal SAN) | Use mode: allowTLS instead. |
| ArgoCD applies most files but not the ones in services/ | Non-recursive directory source | Flatten the folder or set directory.recurse: true. |