I didn’t cause this one. I just kept finding it.
Internal apps – reporting jobs, admin dashboards, ML feature pulls, finance batches – pointed at the primary database. Same connection string as the checkout path. Same connection pool. Same row locks. The first time I saw it I assumed someone had missed a config step. Then I saw it at the next employer. Then at the one after that.
That’s the visible symptom. But the database is just one layer of a bigger pattern. “Internal API” gets treated as a label that lives in a URL prefix or a Swagger tag. The runtime path – the gateway it enters through, the pods that serve it, the database it reads from – is the same path the public consumer takes. The label is the only thing that isn’t shared.
This post is about how to fix that end to end. Not just the gateway. The whole stack.
    PUBLIC TRAFFIC                             INTERNAL TRAFFIC
    ──────────────                             ────────────────
Browser / public client                  Dashboard / batch / ops tool
           │                                           │
           ▼                                           ▼
 ┌───────────────────┐                     ┌────────────────────────┐
 │ Public gateway    │                     │ Internal gateway       │
 │ public DNS, OIDC  │                     │ cluster DNS only       │
 │ user rate limits  │                     │ batch rate limits      │
 └─────────┬─────────┘                     └───────────┬────────────┘
           │                                           │
           ▼               same image                  ▼
 ┌───────────────────┐   different pods    ┌────────────────────────┐
 │ api-public        │  ─ ─ ─ ─ ─ ─ ─ ─ ▶  │ api-internal           │
 │ Deployment        │                     │ Deployment             │
 │ 6 replicas        │                     │ 2 replicas             │
 └─────────┬─────────┘                     └───────────┬────────────┘
           │                                           │
           ▼                                           ▼
 ┌───────────────────┐                     ┌────────────────────────┐
 │ Primary DB        │                     │ Read replica           │
 │ rw credentials    │                     │ ro credentials only    │
 └───────────────────┘                     └────────────────────────┘

Each layer is independent. Same image, separate runtime.
Shared APIs (catalog, user lookup) live below this picture –
both columns call into them through their own gateway and pods.
It’s Not Just the Gateway
The entry-point version of this story is well-rehearsed: an internal route – /internal/reindex, /admin/cache-flush, pick your favorite – ends up reachable from the public Azure APIM (or Kong, or AWS API Gateway). I’ve seen it more than once. The leak vectors are unglamorous: a product scope set to the wrong API group, a default-allow policy nobody overrode, a “subscription required” toggle left off, a route published to the public collection by accident. The auth on the leaked route is usually “nobody knows the URL.” That isn’t an auth model.
The standard advice is: stand up a second gateway instance for internal traffic, with its own auth issuer, its own rate limits, and a DNS name that doesn’t resolve outside the cluster or VPC. That advice is correct. It’s also incomplete.
Because even when the gateway is right, the pods behind it are shared with the consumer-facing API. The database behind those pods is shared. The cache is shared. You’ve moved the leak risk off the perimeter and onto the runtime. The internal call still travels through the same fabric as the checkout call, and a noisy internal client still affects consumer latency.
Treat the gateway as the first hop, not the only one.
Same Image, Different Runtime
In Kubernetes the cheapest separation costs about twenty lines of YAML. One container image, one Deployment template, two Service objects pointing at it – one for public traffic, one for internal. At this stage both Services select the same pods; what differs is the in-cluster DNS name. An internal client calls api-internal.<namespace>.svc.cluster.local; the public ingress hits api-public.<namespace>.svc.cluster.local.
That alone is honest enough to start, but it doesn’t isolate compute. The pods are still shared. The version that actually pays off has two Deployment objects sharing the same image:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-public
spec:
  replicas: 6
  selector:
    matchLabels: { app: api, tier: public }
  template:
    metadata:
      labels: { app: api, tier: public }
    spec:
      containers:
        - name: api
          image: registry.internal/api:1.42.0
          resources:
            requests: { cpu: 500m, memory: 512Mi }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-internal
spec:
  replicas: 2
  selector:
    matchLabels: { app: api, tier: internal }
  template:
    metadata:
      labels: { app: api, tier: internal }
    spec:
      containers:
        - name: api
          image: registry.internal/api:1.42.0
          resources:
            requests: { cpu: 200m, memory: 256Mi }
---
apiVersion: v1
kind: Service
metadata: { name: api-public }
spec:
  selector: { app: api, tier: public }
  ports: [{ port: 80, targetPort: 8080 }]
---
apiVersion: v1
kind: Service
metadata: { name: api-internal }
spec:
  selector: { app: api, tier: internal }
  ports: [{ port: 80, targetPort: 8080 }]
Same image, same code, two separate pod pools with their own resource budgets and replica counts. A nightly export that pegs CPU on the internal pods can’t evict checkout traffic from the public pods. The runtime path is the actual boundary.
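One caveat on the resource budgets: requests reserve capacity, they don't cap it. If both pools can be scheduled onto the same node, it may be worth adding limits to the internal tier so a runaway batch gets throttled instead of borrowing headroom. A fragment for the api-internal container, with limit values that are purely illustrative assumptions:

```yaml
# Fragment only: replaces the resources block of the api-internal container.
# The limit values are assumptions; size them from your own measurements.
resources:
  requests: { cpu: 200m, memory: 256Mi }
  limits:
    cpu: "1"        # CPU use above this is throttled
    memory: 512Mi   # exceeding this gets the container OOM-killed
```

Whether limits or separate node pools fit better depends on how bursty the internal jobs are; treat this as one option, not a requirement.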
If you can’t justify two Deployments today, start with one Deployment plus two Services. The upgrade is mechanical: copy the Deployment, change the labels, point one Service at each. Doing this on day one is much cheaper than doing it later under load.
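As a rough sketch of that starting point, assuming both Services select the same pods until the split happens (the Deployment name `api` and the single `app: api` label are assumptions; the image and ports are carried over from the example above):

```yaml
# Sketch only: one shared pod pool behind two Service names.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 6
  selector:
    matchLabels: { app: api }
  template:
    metadata:
      labels: { app: api }
    spec:
      containers:
        - name: api
          image: registry.internal/api:1.42.0
          resources:
            requests: { cpu: 500m, memory: 512Mi }
---
apiVersion: v1
kind: Service
metadata: { name: api-public }
spec:
  selector: { app: api }   # same pods as api-internal, for now
  ports: [{ port: 80, targetPort: 8080 }]
---
apiVersion: v1
kind: Service
metadata: { name: api-internal }
spec:
  selector: { app: api }   # becomes { app: api, tier: internal } after the split
  ports: [{ port: 80, targetPort: 8080 }]
```

Internal callers depend on the api-internal name from day one, so splitting the pods later changes a selector, not every client's configuration.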
Internal Reads Belong on the Replica
This is where the opening hook lands. The primary database is for writes and for latency-sensitive consumer reads. Internal workloads are almost never either of those. Reporting jobs, dashboards, exports, ops queries, ML feature pulls – they’re read-heavy and tolerant of replication lag measured in seconds. They belong on a read replica.
The mistake I keep seeing isn’t ignorance. The engineers building internal apps know read replicas exist. They just inherit the same connection string the rest of the codebase uses, and “the rest of the codebase” defaults to the primary. The fix isn’t a code review note. It’s a credential boundary.
# Public app secret
DATABASE_URL=postgres://app_rw:***@db-primary.internal:5432/app
# Internal app secret
DATABASE_READ_URL=postgres://app_ro:***@db-replica.internal:5432/app
# No DATABASE_URL. No credentials for the primary.
The internal app’s secret doesn’t contain primary credentials at all, so it has no way to authenticate against the primary. If a reporting job tries to take a row lock on a production table, it gets an error back, not an outage: the replica is read-only and rejects the statement. The constraint enforces itself.
This costs almost nothing – a separate user in Postgres, a separate Kubernetes Secret, and a config flag in the app. Compared to debugging a checkout slowdown caused by a nightly batch, it is the most leveraged thirty minutes of work in this whole post.
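In Kubernetes terms, the boundary could look something like the sketch below. The Secret name `api-internal-db` is an assumption, the `***` is a placeholder as in the example above, and in practice the real value would come from your secret manager, not a committed manifest.

```yaml
# Sketch only: Secret name is hypothetical; *** is a placeholder.
apiVersion: v1
kind: Secret
metadata:
  name: api-internal-db
type: Opaque
stringData:
  # Read replica only. There is deliberately no DATABASE_URL key here.
  DATABASE_READ_URL: postgres://app_ro:***@db-replica.internal:5432/app
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-internal
spec:
  replicas: 2
  selector:
    matchLabels: { app: api, tier: internal }
  template:
    metadata:
      labels: { app: api, tier: internal }
    spec:
      containers:
        - name: api
          image: registry.internal/api:1.42.0
          envFrom:
            - secretRef:
                name: api-internal-db   # the only DB credentials these pods see
          resources:
            requests: { cpu: 200m, memory: 256Mi }
```

The public Deployment would mount its own Secret the same way. Keeping the two Secrets as separate objects means access to each can be granted separately.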
When the API Really Is Shared
Some APIs are genuinely consumed by both public and internal callers. Product catalog. User lookup. Pricing. You can’t maintain two separate implementations of those – the data is the same and the logic is the same.
The patterns for that case are well known. None of them is the “right answer”; it depends on what you can afford to refactor:
- Bulkhead. Same image, two Deployments by tier, two Services. Identical code, isolated capacity. Cheapest to adopt. Doesn’t address shared downstream contention (DB, cache).
- Facade split. Extract the shared logic into a library or a lower-level service. Build a thin public facade and a thin internal facade on top, each owning its own gateway, auth, and rate limits. Cleanest long-term separation; highest refactor cost.
- Async decoupling. Internal callers don’t hit the sync path at all. They publish to a queue; a separate consumer pool handles the request and writes results back. Works when internal callers tolerate a delay (most can).
- Mesh caller-class. Single Deployment, but service mesh policies enforce per-caller rate limits, priority, and circuit breakers based on caller identity. Cheapest YAML-wise. Isolates failure modes, not compute.
The point isn’t to pick one. The point is that “shared API” doesn’t mean “shared everything.” Pick the version that matches the constraint that actually bites you – is it capacity, deployment topology, latency tolerance, or ops surface area? Different teams converge on different answers. None of them converge on “one Deployment with no isolation.”
The Internal Share Grows
Here is the argument that makes this whole post worth doing on day one rather than day five hundred.
Internal apps almost always get written second. The consumer-facing service ships first. Then someone needs a dashboard. Then exports for finance. Then an ops tool for support. Then a reporting pipeline. Then a feature store rebuild. Each of these starts at a fraction of a percent of traffic, which is why each one is built against the existing stack: “it’s just a small read.”
Six months later there are eight of them. Twelve months later the nightly batches alone outweigh peak consumer load. By the time the internal share dominates – and on a bad day it does, by an order of magnitude – the topology you sketched up front is the topology you’ve shipped. If everything is fused, you don’t have a tuning problem. You have a refactor.
This is why the cheapest version of the separation – one Deployment, two Services, a replica-only credential for the internal app – is worth doing before there’s a second internal app to justify it. The upgrade from “two Services” to “two Deployments with different resources and replica counts” is a five-minute YAML change. The upgrade from “one fused stack serving everything” to “two stacks” is a quarter.
Do the cheap version while it’s still cheap.
The Honest Caveats
This isn’t a free architecture. Three things worth being honest about:
Replication lag is real. If an internal endpoint reads back its own write – an ops tool that updates a record and immediately re-fetches it – the replica can return stale data. Either route those specific endpoints to the primary, or design the workflow around eventual consistency. Don’t pretend the lag doesn’t exist.
Two stacks mean two sets of ops toil. Two Deployments, two Services, two sets of dashboards, two health checks, two rollout policies. For a small internal surface area today, one Deployment plus two Services plus tight gateway scoping is enough. The argument in the previous section is that “today” is misleading – pick the cheap version now so growth doesn’t force a refactor. But don’t jump straight to four-tier separation when you have one internal job.
Network boundary is not a trust boundary. A leaked credential or a compromised CI runner inside the cluster still reaches the internal gateway. The whole point of “internal-only DNS” is to reduce blast radius from external attackers – it doesn’t reduce blast radius from internal compromise. Auth on the internal API still matters. Don’t let the word “internal” do work that authentication should be doing.
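Narrowing who can reach the internal pods from inside the cluster complements authentication; it doesn't replace it. A minimal sketch, assuming the internal gateway pods carry a `role: internal-gateway` label (a hypothetical label) and run in the same namespace as the API:

```yaml
# Sketch only: label names are assumptions. This needs a network plugin that
# enforces NetworkPolicy; without one, the object has no effect.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-internal-ingress
spec:
  podSelector:
    matchLabels: { app: api, tier: internal }
  policyTypes: [Ingress]
  ingress:
    - from:
        - podSelector:
            matchLabels: { role: internal-gateway }   # hypothetical label
      ports:
        - protocol: TCP
          port: 8080
```

This shrinks the set of pods that can open a connection to api-internal. A compromised gateway or a stolen token still gets through, which is why the auth check stays.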
Before You Call It Internal
Five questions. If the answer to any of them is no, you have a coupled API with an internal-sounding URL, not an internal API.
- Does it route through a gateway that only serves internal traffic?
- Does its DNS name resolve only inside the cluster or VPC?
- Does it run on a separate Service – ideally separate pods – from consumer traffic?
- Does it hit a read replica by default, with no credentials for the primary?
- If it died right now, would public consumers carry on unaffected? If not, it isn’t internal. It’s coupled.