Stop Returning 200 OK for Everything

The Anti-Pattern

A disturbingly common pattern in internal APIs:

GET /items?id=1,2    → 200 OK  (2 results)
GET /items?id=1,2    → 200 OK  (1 result -- one ID didn't exist)
GET /items?id=1,2    → 200 OK  (0 results)
GET /items?id=999    → 200 OK  (empty body)
GET /items?id=abc    → 200 OK  (validation error buried in body)

Every response is 200 OK. The only way to know what actually happened is to parse the body, inspect it, and hope the shape tells you something. This is not REST. This is “HTTP as a transport layer for mystery payloads.”

The Production Incident

Endpoint: GET /xx?id=1,2

What happened:

Client requested two records by passing id=1,2.
Server found only one record (ID 2 existed, ID 1 did not).
Server returned 200 OK with a response body containing one record.
Client code assumed that a 200 meant “all requested records returned successfully.”
Client attempted to map the response to two objects – parse error in production.

Root cause: The status code lied. 200 OK means “the request has succeeded” (RFC 9110). The client had every right to trust it. The API gave no signal that the response was incomplete.

What should have happened: The server should have returned a status code that communicates “I couldn’t fully satisfy your request” – forcing the client to handle the partial result explicitly.
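The failure is easy to reproduce. The following is a minimal sketch, assuming a hypothetical client helper `parse_pair` that treats any 200 as "both records are present"; the helper name and payload shape are illustrative, not taken from the real client.

```python
import json

def parse_pair(status: int, body: str) -> tuple[dict, dict]:
    """Hypothetical client helper: trusts 200 to mean both records came back."""
    if status != 200:
        raise RuntimeError(f"request failed with status {status}")
    # Unpacking raises ValueError unless the array holds exactly two items.
    first, second = json.loads(body)
    return first, second

# The server found only ID 2 but still answered 200 OK.
partial_body = json.dumps([{"id": 2, "name": "Widget B"}])

try:
    parse_pair(200, partial_body)
except ValueError as exc:
    print(f"parse error: {exc}")
```

The status check passes, so the error surfaces one step later, inside the mapping code, where nothing is prepared to handle it.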

Why This Matters

HTTP status codes are not decoration. They are a contract consumed by:

Consumer	What it does with status codes
Client application code	Branches on 2xx/4xx/5xx to decide success vs. error handling
Retry logic / circuit breakers	Retries on 503, backs off on 429, never retries on 400
Load balancers	Marks backends unhealthy based on 5xx rates
API gateways	Routes, rate-limits, and caches based on status codes
CDN / caching layers	Caches 200 responses by default; does not cache 500s unless explicitly told to
Monitoring / alerting	Fires alerts when 5xx rate exceeds threshold
Logging / observability	Dashboards aggregate by status code to show error rates

When you return 200 for everything, every one of these systems is blind. Your monitoring shows 0% error rate while production is on fire.
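To make the blindness concrete, here is a minimal sketch of two typical consumers: a retry classifier and a 5xx error-rate metric. The function names and the exact set of retryable codes are illustrative assumptions, not taken from any particular library.

```python
# Illustrative assumption: the codes this client treats as temporary failures.
RETRYABLE = {429, 502, 503, 504}

def should_retry(status: int) -> bool:
    """Retry only on codes that signal a temporary condition."""
    return status in RETRYABLE

def error_rate(statuses: list[int]) -> float:
    """Share of responses in the 5xx range, as a dashboard might compute it."""
    if not statuses:
        return 0.0
    return sum(1 for s in statuses if 500 <= s <= 599) / len(statuses)

# An honest API: two of four calls failed, and the metric shows it.
honest = [200, 503, 200, 500]
# The same traffic from an API that wraps every failure in 200 OK.
always_200 = [200, 200, 200, 200]

print(error_rate(honest))      # 0.5
print(error_rate(always_200))  # 0.0
print(should_retry(503), should_retry(200))  # True False
```

Neither function ever looks at a response body, which is the point: once every failure is reported as 200, there is nothing left for them to react to.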

Correct Status Codes for Common Scenarios

Single-Resource Endpoints

Scenario	Correct Code	Meaning
Resource found	`200 OK`	Here is the resource
Resource not found	`404 Not Found`	That ID does not exist
Resource created	`201 Created`	Resource created; `Location` header points to it
Resource deleted	`204 No Content`	Deleted successfully; no body
Invalid input	`400 Bad Request`	Malformed request (bad syntax, missing required fields)
Validation failure	`422 Unprocessable Content`	Syntactically valid but semantically wrong (named “Unprocessable Entity” before RFC 9110)
Unauthenticated	`401 Unauthorized`	No valid credentials provided
Forbidden	`403 Forbidden`	Authenticated but not authorized
Server error	`500 Internal Server Error`	Something broke on our side
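
As a rough illustration of the first rows of this table, the sketch below maps three outcomes of a single-resource lookup to their status codes. The `ITEMS` store, the `get_item` function, and the error body fields are hypothetical; a real handler would sit behind a web framework.

```python
from http import HTTPStatus

# Hypothetical in-memory store standing in for a database.
ITEMS = {2: {"id": 2, "name": "Widget B"}}

def get_item(raw_id: str) -> tuple[int, dict]:
    """Return (status, body) for a single-resource lookup."""
    try:
        item_id = int(raw_id)
    except ValueError:
        # Malformed input: the ID is not an integer at all.
        return HTTPStatus.BAD_REQUEST, {
            "error": "invalid_id",
            "message": f"'{raw_id}' is not an integer",
        }
    item = ITEMS.get(item_id)
    if item is None:
        # Well-formed ID, but nothing is stored under it.
        return HTTPStatus.NOT_FOUND, {"error": "not_found", "missing_ids": [item_id]}
    return HTTPStatus.OK, item

for raw in ("2", "999", "abc"):
    status, body = get_item(raw)
    print(raw, int(status), body)
```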

Multi-Resource / Batch Endpoints (The Hard Part)

This is where GET /xx?id=1,2 lives. You asked for multiple things. What if some succeed and some don’t?

Scenario	Option A	Option B
All found	`200 OK` with all records	`200 OK`
Some found, some missing	`200 OK` with partial results + explicit metadata	`207 Multi-Status` with per-item status
None found	`404 Not Found`	`200 OK` with empty array + warning
Some found, some errored	`207 Multi-Status`	`200 OK` with error details per item

Recommended Patterns for Multi-ID Endpoints

Pattern 1: Strict – Fail the Whole Request

If any ID is not found, return an error. Simple, safe, forces the client to deal with it.

GET /items?id=1,2

// ID 1 not found:
HTTP/1.1 404 Not Found
{
  "error": "not_found",
  "message": "The following IDs were not found: [1]",
  "missing_ids": [1]
}

Pros: Impossible to silently lose data. Client must handle it.
Cons: One missing record blocks the entire request. Can be frustrating for best-effort use cases.
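
On the server side, the strict pattern might look like the sketch below. `ITEMS` and `get_items_strict` are hypothetical names, and the in-memory dictionary stands in for whatever storage the real endpoint uses.

```python
# Hypothetical in-memory store: only ID 2 exists.
ITEMS = {2: {"id": 2, "name": "Widget B"}}

def get_items_strict(ids: list[int]) -> tuple[int, dict]:
    """Return (status, body); fail the whole request if any ID is missing."""
    missing = [i for i in ids if i not in ITEMS]
    if missing:
        return 404, {
            "error": "not_found",
            "message": f"The following IDs were not found: {missing}",
            "missing_ids": missing,
        }
    return 200, {"data": [ITEMS[i] for i in ids]}

print(get_items_strict([1, 2]))  # 404 with missing_ids [1]
print(get_items_strict([2]))     # 200 with the one requested record
```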

Pattern 2: Lenient – Return What You Have, Signal What’s Missing

Return available records with 200, but include metadata so the client knows the response is partial.

GET /items?id=1,2

HTTP/1.1 200 OK
{
  "requested_ids": [1, 2],
  "returned_count": 1,
  "missing_ids": [1],
  "data": [
    { "id": 2, "name": "Widget B" }
  ]
}

The key: requested_ids, returned_count, and missing_ids put the gap in the response itself, so any client that reads the metadata cannot miss it.

Pros: Partial data is usable. Missing items are explicit.
Cons: Lazy clients may still ignore the metadata (but that’s their bug, not yours).
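
A minimal sketch of how a server could assemble this envelope, assuming a hypothetical `ITEMS` store and a hypothetical `get_items_lenient` function:

```python
# Hypothetical in-memory store: only ID 2 exists.
ITEMS = {2: {"id": 2, "name": "Widget B"}}

def get_items_lenient(ids: list[int]) -> tuple[int, dict]:
    """Return (status, body); always 200, with the gap spelled out in metadata."""
    found = [ITEMS[i] for i in ids if i in ITEMS]
    missing = [i for i in ids if i not in ITEMS]
    return 200, {
        "requested_ids": ids,
        "returned_count": len(found),
        "missing_ids": missing,
        "data": found,
    }

status, body = get_items_lenient([1, 2])
print(status, body["returned_count"], body["missing_ids"])  # 200 1 [1]
```

Building `missing_ids` from the request rather than from the result set means the metadata is present even when nothing was found.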

Pattern 3: Multi-Status (207) – Per-Item Status

Best for batch operations where each item can independently succeed or fail. 207 is defined by WebDAV (RFC 4918), and the same per-item idea appears in batch APIs such as Microsoft Graph’s JSON batching.

GET /items?id=1,2

HTTP/1.1 207 Multi-Status
{
  "results": [
    { "id": 1, "status": 404, "error": "not_found" },
    { "id": 2, "status": 200, "data": { "id": 2, "name": "Widget B" } }
  ]
}

Pros: Maximum clarity. Each item has its own status code.
Cons: More complex response structure. 207 is less universally understood.
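
The sketch below shows one way to produce and consume this shape. All names (`ITEMS`, `get_items_multi_status`, `split_results`) are hypothetical, and for simplicity the server side always answers 207; returning 200 when every item succeeded is an equally reasonable choice.

```python
# Hypothetical in-memory store: only ID 2 exists.
ITEMS = {2: {"id": 2, "name": "Widget B"}}

def get_items_multi_status(ids: list[int]) -> tuple[int, dict]:
    """Server side: one result entry per requested ID, each with its own status."""
    results = []
    for item_id in ids:
        item = ITEMS.get(item_id)
        if item is None:
            results.append({"id": item_id, "status": 404, "error": "not_found"})
        else:
            results.append({"id": item_id, "status": 200, "data": item})
    return 207, {"results": results}

def split_results(body: dict) -> tuple[list[dict], dict[int, int]]:
    """Client side: separate usable records from failed IDs and their statuses."""
    found = [r["data"] for r in body["results"] if r["status"] == 200]
    failed = {r["id"]: r["status"] for r in body["results"] if r["status"] != 200}
    return found, failed

status, body = get_items_multi_status([1, 2])
found, failed = split_results(body)
print(status, found, failed)  # 207 [{'id': 2, 'name': 'Widget B'}] {1: 404}
```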

Pattern 4: 206 Partial Content

206 is traditionally used for range requests (byte ranges in file downloads), but some APIs repurpose it to signal “I’m returning less than you asked for.”

GET /items?id=1,2

HTTP/1.1 206 Partial Content
{
  "data": [
    { "id": 2, "name": "Widget B" }
  ],
  "missing_ids": [1]
}

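Pros: Status code itself signals incompleteness – hard to ignore.
Cons: RFC 9110 defines 206 as the response to a request that carries a Range header. Clients and intermediaries that did not send one may mishandle it, so repurposing 206 carries real interoperability risk.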

What Would Have Prevented the Production Incident

Any of these would have caught the problem:

Approach	How it helps
Pattern 1 (404)	Client gets a 404, error handling kicks in, no parse error
Pattern 2 (metadata)	Client checks `returned_count !== requested_ids.length`, handles gracefully
Pattern 3 (207)	Client sees per-item status, knows ID 1 was missing
Client-side validation	Client compares response array length to request array length (defensive coding – but the API should not rely on this)

The worst option is what actually happened: 200 OK with a silently incomplete body and no metadata.
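
For the client-side validation row, a defensive check might look like the following sketch. `IncompleteResponse` and `require_complete` are hypothetical names, and the check assumes every record carries an `id` field. It compares IDs rather than array lengths so that the error can name exactly what is missing.

```python
class IncompleteResponse(Exception):
    """Raised when a response lacks records for some requested IDs."""

    def __init__(self, missing_ids: list[int]):
        super().__init__(f"response is missing IDs: {missing_ids}")
        self.missing_ids = missing_ids

def require_complete(requested_ids: list[int], records: list[dict]) -> list[dict]:
    """Return records unchanged, or raise if any requested ID is absent."""
    returned = {record["id"] for record in records}
    missing = [i for i in requested_ids if i not in returned]
    if missing:
        raise IncompleteResponse(missing)
    return records

try:
    require_complete([1, 2], [{"id": 2, "name": "Widget B"}])
except IncompleteResponse as exc:
    print(exc, exc.missing_ids)
```

This turns the silent gap into an explicit, typed error at the boundary, but it remains a safety net for the client, not a replacement for an honest status code.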

How Mature APIs Handle This

Stripe

Returns 404 with a structured error object when a requested resource does not exist. List endpoints return a list object with an explicit has_more flag, so truncation is never silent.

Google APIs

Uses standard HTTP codes. Batch requests return a multipart response in which each part carries its own HTTP status, similar in spirit to 207.

Microsoft Graph

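Its JSON batching endpoint returns a per-request HTTP status code for every item in the batch. The outer response is typically 200 even when individual requests fail, so clients are expected to inspect each item’s status.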

GitHub API

Returns 404 for missing resources. Multi-item endpoints return arrays and document when results may be partial (e.g., paginated with Link headers).

The Argument to Your Team

“But we always parse the body anyway, so what does the status code matter?”

You are not the only consumer. Load balancers, monitoring, caches, API gateways, and future clients all use status codes. They don’t parse your body.
200 OK means the contract was fulfilled. If you return 200, you are saying “everything you asked for is here and correct.” If that’s a lie, you have broken the HTTP contract.
Silent failures are the most expensive bugs. The production incident happened because the failure was invisible. A proper status code would have made it loud and immediate.
Defensive client code is not a substitute. Yes, clients should validate responses. But relying on clients to compensate for a lying API is designing for failure.
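Every major API in the industry makes partial failure explicit. Stripe, Google, AWS, GitHub, and Microsoft either return a non-200 status or carry per-item status and unprocessed-item lists in the body. None of them returns a bare 200 with a silently truncated result. There’s a reason for that.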

Quick Reference

200 OK             → Request fully succeeded. Response contains what was asked for.
201 Created        → Resource created. Location header included.
204 No Content     → Success, but nothing to return (e.g., DELETE).
206 Partial Content→ Only part of the resource is returned (range requests).
207 Multi-Status   → Batch response; each item has its own status.

400 Bad Request    → Malformed request syntax.
401 Unauthorized   → Missing or invalid authentication.
403 Forbidden      → Authenticated but not permitted.
404 Not Found      → Resource does not exist.
409 Conflict       → Request conflicts with current state (e.g., duplicate).
422 Unprocessable  → Valid syntax but invalid semantics.
429 Too Many Req.  → Rate limited. Honor the Retry-After header if present.

500 Server Error   → Something broke on the server side.
502 Bad Gateway    → Upstream service returned invalid response.
503 Unavailable    → Server temporarily unavailable (maintenance, overload).
504 Gateway Timeout→ Upstream service did not respond in time.
