The Anti-Pattern
A disturbingly common pattern in internal APIs:
GET /items?id=1,2 → 200 OK (2 results)
GET /items?id=1,2 → 200 OK (1 result -- one ID didn't exist)
GET /items?id=1,2 → 200 OK (0 results)
GET /items?id=999 → 200 OK (empty body)
GET /items?id=abc → 200 OK (validation error buried in body)
Every response is 200 OK. The only way to know what actually happened is to parse the body, inspect it, and hope the shape tells you something. This is not REST. This is “HTTP as a transport layer for mystery payloads.”
The Production Incident
Endpoint: GET /xx?id=1,2
What happened:
- Client requested two records by passing id=1,2.
- Server found only one record (ID 2 existed, ID 1 did not).
- Server returned 200 OK with a response body containing one record.
- Client code assumed that a 200 meant “all requested records returned successfully.”
- Client attempted to map the response to two objects – parse error in production.
Root cause: The status code lied. 200 OK means “the request has succeeded” (RFC 9110). The client had every right to trust it. The API gave no signal that the response was incomplete.
What should have happened: The server should have returned a status code that communicates “I couldn’t fully satisfy your request” – forcing the client to handle the partial result explicitly.
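The failure mode is easy to reproduce. The sketch below is illustrative only: the function name map_two_records and the payload shape are assumptions, not the actual client code from the incident.

```python
import json


def map_two_records(status: int, body: str) -> tuple:
    """Hypothetical naive client: treats 200 as 'both requested records are present'."""
    if status != 200:
        raise RuntimeError(f"request failed with status {status}")
    records = json.loads(body)
    # Unpacking assumes exactly two records; a partial body raises ValueError here.
    first, second = records
    return first, second


# The server found only ID 2 but still answered 200 OK.
partial_body = json.dumps([{"id": 2, "name": "Widget B"}])

try:
    map_two_records(200, partial_body)
except ValueError as exc:
    print(f"parse error: {exc}")
```

The status check passes, so the error surfaces one step later, in mapping code that was never written to handle a short list.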
Why This Matters
HTTP status codes are not decoration. They are a contract consumed by:
| Consumer | What it does with status codes |
|---|---|
| Client application code | Branches on 2xx/4xx/5xx to decide success vs. error handling |
| Retry logic / circuit breakers | Retries on 503, backs off on 429, never retries on 400 |
| Load balancers | Marks backends unhealthy based on 5xx rates |
| API gateways | Routes, rate-limits, and caches based on status codes |
| CDN / caching layers | Caches 200 responses by default; does not cache 500s unless explicitly told to |
| Monitoring / alerting | Fires alerts when 5xx rate exceeds threshold |
| Logging / observability | Dashboards aggregate by status code to show error rates |
When you return 200 for everything, every one of these systems is blind. Your monitoring shows 0% error rate while production is on fire.
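To make that concrete, here is a minimal sketch of the decision a generic retry layer might make. The function name retry_decision and the returned labels are assumptions for illustration; real proxies and client libraries have their own rules, but they share one property: they look only at the status code.

```python
def retry_decision(status: int) -> str:
    """Hypothetical policy: decide what to do using nothing but the status code."""
    if 200 <= status < 300:
        return "success"
    if status == 429:
        return "back_off"      # rate limited: wait before trying again
    if status == 503:
        return "retry"         # temporarily unavailable: safe to retry
    if 400 <= status < 500:
        return "fail"          # client error: retrying the same request will not help
    if 500 <= status < 600:
        return "server_error"
    return "unknown"


# A validation error wrapped in 200 OK is indistinguishable from success here.
print(retry_decision(200))
```

Nothing in this function can see a validation error buried in a 200 body, which is exactly the blindness described above.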
Correct Status Codes for Common Scenarios
Single-Resource Endpoints
| Scenario | Correct Code | Meaning |
|---|---|---|
| Resource found | 200 OK | Here is the resource |
| Resource not found | 404 Not Found | That ID does not exist |
| Resource created | 201 Created | Resource created; Location header points to it |
| Resource deleted | 204 No Content | Deleted successfully; no body |
| Invalid input | 400 Bad Request | Malformed request (bad syntax, missing required fields) |
| Validation failure | 422 Unprocessable Content (formerly Unprocessable Entity) | Syntactically valid but semantically wrong |
| Unauthorized | 401 Unauthorized | No valid credentials provided |
| Forbidden | 403 Forbidden | Authenticated but not authorized |
| Server error | 500 Internal Server Error | Something broke on our side |
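A minimal sketch of how a single-resource handler might map outcomes to the codes in the table. The ITEMS store and the get_item function are hypothetical, and the handler returns a (status, body) pair instead of using any particular web framework.

```python
from http import HTTPStatus

# Hypothetical in-memory store standing in for a database.
ITEMS = {2: {"id": 2, "name": "Widget B"}}


def get_item(raw_id: str):
    """Return (status, body) for a single-resource lookup."""
    try:
        item_id = int(raw_id)
    except ValueError:
        # Malformed input, e.g. id=abc
        return HTTPStatus.BAD_REQUEST, {"error": "invalid_id"}
    item = ITEMS.get(item_id)
    if item is None:
        return HTTPStatus.NOT_FOUND, {"error": "not_found", "missing_ids": [item_id]}
    return HTTPStatus.OK, item
```

With this shape, GET /items?id=abc and GET /items?id=999 from the opening examples become 400 and 404 instead of two more flavors of 200.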
Multi-Resource / Batch Endpoints (The Hard Part)
This is where GET /xx?id=1,2 lives. You asked for multiple things. What if some succeed and some don’t?
| Scenario | Option A | Option B |
|---|---|---|
| All found | 200 OK with all records | 200 OK |
| Some found, some missing | 200 OK with partial results + explicit metadata | 207 Multi-Status with per-item status |
| None found | 404 Not Found | 200 OK with empty array + warning |
| Some found, some errored | 207 Multi-Status | 200 OK with error details per item |
Recommended Patterns for Multi-ID Endpoints
Pattern 1: Strict – Fail the Whole Request
If any ID is not found, return an error. Simple, safe, forces the client to deal with it.
GET /items?id=1,2
// ID 1 not found:
HTTP/1.1 404 Not Found
{
"error": "not_found",
"message": "The following IDs were not found: [1]",
"missing_ids": [1]
}
Pros: Impossible to silently lose data. Client must handle it. Cons: One missing record blocks the entire request. Can be frustrating for best-effort use cases.
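Server side, the strict pattern might look like the following sketch. ITEMS and get_items_strict are hypothetical names, and the 200 body shape (a data array) is an assumption; only the 404 body mirrors the example above.

```python
# Hypothetical in-memory store standing in for a database.
ITEMS = {2: {"id": 2, "name": "Widget B"}}


def get_items_strict(ids):
    """Return (status, body); fail the whole request if any ID is missing."""
    missing = [i for i in ids if i not in ITEMS]
    if missing:
        return 404, {
            "error": "not_found",
            "message": f"The following IDs were not found: {missing}",
            "missing_ids": missing,
        }
    return 200, {"data": [ITEMS[i] for i in ids]}
```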
Pattern 2: Lenient – Return What You Have, Signal What’s Missing
Return available records with 200, but include metadata so the client knows the response is partial.
GET /items?id=1,2
HTTP/1.1 200 OK
{
"requested_ids": [1, 2],
"returned_count": 1,
"missing_ids": [1],
"data": [
{ "id": 2, "name": "Widget B" }
]
}
The key: requested_ids, returned_count, and missing_ids make it impossible for the client to silently ignore the gap.
Pros: Partial data is usable. Missing items are explicit. Cons: Lazy clients may still ignore the metadata (but that’s their bug, not yours).
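A sketch of both sides of the lenient pattern, under the same assumptions as the JSON above. ITEMS, get_items_lenient, and is_complete are hypothetical names.

```python
# Hypothetical in-memory store standing in for a database.
ITEMS = {2: {"id": 2, "name": "Widget B"}}


def get_items_lenient(ids):
    """Server side: return what exists, and say explicitly what does not."""
    found = [ITEMS[i] for i in ids if i in ITEMS]
    missing = [i for i in ids if i not in ITEMS]
    return 200, {
        "requested_ids": list(ids),
        "returned_count": len(found),
        "missing_ids": missing,
        "data": found,
    }


def is_complete(body) -> bool:
    """Client side: a 200 is complete only if the metadata says so."""
    return body["returned_count"] == len(body["requested_ids"]) and not body["missing_ids"]
```

A client that calls is_complete before mapping the data array would have turned the production parse error into an ordinary handled branch.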
Pattern 3: Multi-Status (207) – Per-Item Status
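Best for batch operations where each item can independently succeed or fail. The 207 status code comes from WebDAV (RFC 4918); other batch APIs, such as Microsoft Graph JSON batching, carry a per-item status in the response body in the same spirit.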
GET /items?id=1,2
HTTP/1.1 207 Multi-Status
{
"results": [
{ "id": 1, "status": 404, "error": "not_found" },
{ "id": 2, "status": 200, "data": { "id": 2, "name": "Widget B" } }
]
}
Pros: Maximum clarity. Each item has its own status code. Cons: More complex response structure. 207 is less universally understood.
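One possible shape for this pattern, as a sketch. ITEMS, get_items_multi_status, and split_results are hypothetical names, and answering 207 even when every item succeeds is a simplifying assumption so that the body shape never changes.

```python
# Hypothetical in-memory store standing in for a database.
ITEMS = {2: {"id": 2, "name": "Widget B"}}


def get_items_multi_status(ids):
    """Server side: one result entry per requested ID, each with its own status."""
    results = []
    for i in ids:
        if i in ITEMS:
            results.append({"id": i, "status": 200, "data": ITEMS[i]})
        else:
            results.append({"id": i, "status": 404, "error": "not_found"})
    return 207, {"results": results}


def split_results(body):
    """Client side: separate usable records from the IDs that failed."""
    found = [r["data"] for r in body["results"] if r["status"] == 200]
    failed_ids = [r["id"] for r in body["results"] if r["status"] != 200]
    return found, failed_ids
```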
Pattern 4: 206 Partial Content
206 is traditionally used for range requests (byte ranges in file downloads), but some APIs repurpose it to signal “I’m returning less than you asked for.”
GET /items?id=1,2
HTTP/1.1 206 Partial Content
{
"data": [
{ "id": 2, "name": "Widget B" }
],
"missing_ids": [1]
}
Pros: Status code itself signals incompleteness – hard to ignore. Cons: Pedants will argue 206 is only for byte-range requests (they’re technically right per RFC 9110).
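The practical appeal is that a client can branch on the status code before it looks at the body. A minimal client-side sketch, assuming the body shape shown above; handle_items_response is a hypothetical name.

```python
def handle_items_response(status: int, body: dict):
    """Return (records, missing_ids), deciding completeness from the status code."""
    if status == 200:
        # Full success: nothing is missing by contract.
        return body["data"], []
    if status == 206:
        # Partial result: the status alone tells us to expect gaps.
        return body["data"], body.get("missing_ids", [])
    raise RuntimeError(f"unexpected status {status}")
```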
What Would Have Prevented the Production Incident
Any of these would have caught the problem:
| Approach | How it helps |
|---|---|
| Pattern 1 (404) | Client gets a 404, error handling kicks in, no parse error |
| Pattern 2 (metadata) | Client checks returned_count !== requested_ids.length, handles gracefully |
| Pattern 3 (207) | Client sees per-item status, knows ID 1 was missing |
| Client-side validation | Client compares response array length to request array length (defensive coding – but the API should not rely on this) |
The worst option is what actually happened: 200 OK with a silently incomplete body and no metadata.
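For the client-side validation row, a defensive check might look like this sketch. check_all_returned is a hypothetical helper; it compares IDs, not just array lengths, so the error names what is missing.

```python
def check_all_returned(requested_ids, records):
    """Raise if any requested ID is absent from the returned records."""
    returned = {r["id"] for r in records}
    missing = [i for i in requested_ids if i not in returned]
    if missing:
        raise LookupError(f"response is missing requested IDs: {missing}")
    return records
```

This is a safety net for the client, not a fix for the API: it only helps the clients that remember to call it.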
How Mature APIs Handle This
Stripe
Returns 404 if a single resource is not found. Batch endpoints return arrays with individual error objects.
Google APIs
Use standard HTTP codes. Batch requests return 207-style responses with per-item status.
Microsoft Graph
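JSON batching returns an individual HTTP status code for each request in the batch, so one item can fail while the others succeed. The outer status only reflects whether the batch itself could be processed, so clients must inspect the per-item codes.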
GitHub API
Returns 404 for missing resources. Multi-item endpoints return arrays and document when results may be partial (e.g., paginated with Link headers).
The Argument to Your Team
“But we always parse the body anyway, so what does the status code matter?”
You are not the only consumer. Load balancers, monitoring, caches, API gateways, and future clients all use status codes. They don’t parse your body.
200 OK means the contract was fulfilled. If you return 200 with nothing in the response to say otherwise, you are saying “everything you asked for is here and correct.” If that’s a lie, you have broken the HTTP contract.
Silent failures are the most expensive bugs. The production incident happened because the failure was invisible. A proper status code would have made it loud and immediate.
Defensive client code is not a substitute. Yes, clients should validate responses. But relying on clients to compensate for a lying API is designing for failure.
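Major public APIs make partial failure explicit. Stripe, Google, AWS, GitHub, and Microsoft differ in the details, and some do answer 200 to a batch with mixed results, but they report per-item errors or unprocessed items in the response instead of silently dropping them. There’s a reason for that.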
Quick Reference
200 OK → Request fully succeeded. Response contains what was asked for.
201 Created → Resource created. Location header included.
204 No Content → Success, but nothing to return (e.g., DELETE).
206 Partial Content → Only part of the resource is returned (range requests).
207 Multi-Status → Batch response; each item has its own status.
400 Bad Request → Malformed request syntax.
401 Unauthorized → Missing or invalid authentication.
403 Forbidden → Authenticated but not permitted.
404 Not Found → Resource does not exist.
409 Conflict → Request conflicts with current state (e.g., duplicate).
422 Unprocessable → Valid syntax but invalid semantics.
429 Too Many Req. → Rate limited. Retry-After header included.
500 Server Error → Something broke on the server side.
502 Bad Gateway → Upstream service returned invalid response.
503 Unavailable → Server temporarily unavailable (maintenance, overload).
504 Gateway Timeout → Upstream service did not respond in time.