The Anti-Pattern

A disturbingly common pattern in internal APIs:

GET /items?id=1,2    → 200 OK  (2 results)
GET /items?id=1,2    → 200 OK  (1 result -- one ID didn't exist)
GET /items?id=1,2    → 200 OK  (0 results)
GET /items?id=999    → 200 OK  (empty body)
GET /items?id=abc    → 200 OK  (validation error buried in body)

Every response is 200 OK. The only way to know what actually happened is to parse the body, inspect it, and hope the shape tells you something. This is not REST. This is “HTTP as a transport layer for mystery payloads.”

The Production Incident

Endpoint: GET /xx?id=1,2

What happened:

  1. Client requested two records by passing id=1,2.
  2. Server found only one record (ID 2 existed, ID 1 did not).
  3. Server returned 200 OK with a response body containing one record.
  4. Client code assumed that a 200 meant “all requested records returned successfully.”
  5. Client attempted to map the response to two objects – parse error in production.

Root cause: The status code lied. 200 OK means “the request has succeeded” (RFC 9110). The client had every right to trust it. The API gave no signal that the response was incomplete.

What should have happened: The server should have returned a status code that communicates “I couldn’t fully satisfy your request” – forcing the client to handle the partial result explicitly.
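
The failure in step 5 can be reproduced in a few lines. This is a minimal sketch, not the actual client code: the function name fetch_pair_naive, the bare-array body shape, and the record fields are assumptions made for illustration.

```python
import json
from typing import Tuple


def fetch_pair_naive(status: int, body: str) -> Tuple[dict, dict]:
    """Hypothetical client: treats 200 as 'both requested records are present'."""
    if status != 200:
        raise RuntimeError(f"request failed with status {status}")
    # Unpacking into two names raises ValueError unless the body holds exactly two records.
    first, second = json.loads(body)
    return first, second


# Assumed response for id=1,2 when only ID 2 exists: 200 OK with one record.
partial_body = json.dumps([{"id": 2, "name": "Widget B"}])

try:
    fetch_pair_naive(200, partial_body)
    outcome = "ok"
except ValueError as exc:
    # The status check passed; the failure only surfaces while mapping the body.
    outcome = f"parse error: {exc}"

print(outcome)
```

The status check passes, so nothing in the client's error-handling path runs; the problem appears only when the body is mapped.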

Why This Matters

HTTP status codes are not decoration. They are a contract consumed by:

Consumer                        What it does with status codes
Client application code         Branches on 2xx/4xx/5xx to decide success vs. error handling
Retry logic / circuit breakers  Retries on 503, backs off on 429, never retries on 400
Load balancers                  Marks backends unhealthy based on 5xx rates
API gateways                    Routes, rate-limits, and caches based on status codes
CDN / caching layers            Caches 200 responses; does not cache 500s by default
Monitoring / alerting           Fires alerts when 5xx rate exceeds threshold
Logging / observability         Dashboards aggregate by status code to show error rates

When you return 200 for everything, every one of these systems is blind. Your monitoring shows 0% error rate while production is on fire.
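
Two of the consumers from the table can be sketched to show what "blind" means in practice. Both functions are illustrative assumptions: retry_decision follows only the rules named in the table (503, 429, 400), the delay defaults are invented, and error_rate is a simplified stand-in for whatever a real monitoring system computes.

```python
from typing import List, Optional, Tuple


def retry_decision(status: int, retry_after: Optional[float] = None) -> Tuple[bool, float]:
    """Return (should_retry, delay_seconds) using only the status code."""
    if status == 503:
        # Temporarily unavailable: retry soon (the delay is an assumed default).
        return True, 0.5
    if status == 429:
        # Rate limited: back off, honoring Retry-After when the server sent one.
        delay = retry_after if retry_after is not None else 5.0
        return True, delay
    # 400 and every other code: no retry. A failure reported as 200 lands here too.
    return False, 0.0


def error_rate(statuses: List[int]) -> float:
    """Fraction of responses with a 5xx status, as an alert rule might compute it."""
    if not statuses:
        return 0.0
    return sum(1 for s in statuses if s >= 500) / len(statuses)


# Ten failing requests that the server reported as 200: monitoring sees nothing.
print(error_rate([200] * 10))            # 0.0
print(error_rate([200, 200, 500, 503]))  # 0.5
```

Neither function ever looks at a response body, which is the point: once a failure is reported as 200, no amount of detail in the payload reaches these systems.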

Correct Status Codes for Common Scenarios

Single-Resource Endpoints

Scenario            Correct Code               Meaning
Resource found      200 OK                     Here is the resource
Resource not found  404 Not Found              That ID does not exist
Resource created    201 Created                Resource created; Location header points to it
Resource deleted    204 No Content             Deleted successfully; no body
Invalid input       400 Bad Request            Malformed request (bad syntax, missing required fields)
Validation failure  422 Unprocessable Content  Syntactically valid but semantically wrong (named Unprocessable Entity before RFC 9110)
Unauthorized        401 Unauthorized           No valid credentials provided
Forbidden           403 Forbidden              Authenticated but not authorized
Server error        500 Internal Server Error  Something broke on our side
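
A single-resource lookup that follows this table could look roughly like the sketch below. The in-memory ITEMS store, the function name get_item, the error body fields, and the "id must be positive" rule are all assumptions chosen to show the 400 / 422 / 404 / 200 split; they are not taken from any real API.

```python
from typing import Dict, Tuple

# Hypothetical in-memory store standing in for a database.
ITEMS: Dict[int, dict] = {2: {"id": 2, "name": "Widget B"}}


def get_item(raw_id: str) -> Tuple[int, dict]:
    """Return (status_code, body) for a single-resource lookup."""
    try:
        item_id = int(raw_id)
    except ValueError:
        # Not parseable as an integer at all: malformed request.
        return 400, {"error": "bad_request", "message": "id must be an integer"}
    if item_id <= 0:
        # Parses fine but violates an (assumed) business rule.
        return 422, {"error": "invalid_id", "message": "id must be positive"}
    item = ITEMS.get(item_id)
    if item is None:
        return 404, {"error": "not_found", "message": f"item {item_id} does not exist"}
    return 200, item


for raw in ("2", "999", "abc", "-5"):
    print(raw, get_item(raw)[0])
```

Compared with the anti-pattern at the top, id=999 and id=abc now produce 404 and 400 instead of 200.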

Multi-Resource / Batch Endpoints (The Hard Part)

This is where GET /xx?id=1,2 lives. You asked for multiple things. What if some succeed and some don’t?

Scenario                  Option A                                         Option B
All found                 200 OK with all records                          200 OK
Some found, some missing  200 OK with partial results + explicit metadata  207 Multi-Status with per-item status
None found                404 Not Found                                    200 OK with empty array + warning
Some found, some errored  207 Multi-Status                                 200 OK with error details per item

Pattern 1: Strict – Fail the Whole Request

If any ID is not found, return an error. Simple, safe, forces the client to deal with it.

GET /items?id=1,2

// ID 1 not found:
HTTP/1.1 404 Not Found
{
  "error": "not_found",
  "message": "The following IDs were not found: [1]",
  "missing_ids": [1]
}

Pros: Impossible to silently lose data. Client must handle it.
Cons: One missing record blocks the entire request. Can be frustrating for best-effort use cases.
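
On the server side, the strict pattern is a short check before building the response. A minimal sketch, assuming a hypothetical in-memory ITEMS store and a {"data": [...]} success body (the text only shows the error case):

```python
from typing import Dict, List, Tuple

# Hypothetical store: only ID 2 exists, as in the incident.
ITEMS: Dict[int, dict] = {2: {"id": 2, "name": "Widget B"}}


def get_items_strict(ids: List[int]) -> Tuple[int, dict]:
    """Return (status_code, body); any missing ID fails the whole request."""
    missing = [i for i in ids if i not in ITEMS]
    if missing:
        return 404, {
            "error": "not_found",
            "message": f"The following IDs were not found: {missing}",
            "missing_ids": missing,
        }
    # Assumed success shape; not specified in the text above.
    return 200, {"data": [ITEMS[i] for i in ids]}


status, body = get_items_strict([1, 2])
print(status, body["missing_ids"])
```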

Pattern 2: Lenient – Return What You Have, Signal What’s Missing

Return available records with 200, but include metadata so the client knows the response is partial.

GET /items?id=1,2

HTTP/1.1 200 OK
{
  "requested_ids": [1, 2],
  "returned_count": 1,
  "missing_ids": [1],
  "data": [
    { "id": 2, "name": "Widget B" }
  ]
}

The key: requested_ids, returned_count, and missing_ids state the gap explicitly, so a client can detect a partial response with one comparison instead of inferring it from the shape of the data.

Pros: Partial data is usable. Missing items are explicit.
Cons: Lazy clients may still ignore the metadata (but that’s their bug, not yours).
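
A sketch of both halves of the lenient pattern: the server building the envelope shown above, and the one-line check a client would run against it. The ITEMS store and both function names are assumptions for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical store: only ID 2 exists.
ITEMS: Dict[int, dict] = {2: {"id": 2, "name": "Widget B"}}


def get_items_lenient(ids: List[int]) -> Tuple[int, dict]:
    """Return 200 with whatever was found, plus metadata describing the gap."""
    found = [ITEMS[i] for i in ids if i in ITEMS]
    missing = [i for i in ids if i not in ITEMS]
    return 200, {
        "requested_ids": list(ids),
        "returned_count": len(found),
        "missing_ids": missing,
        "data": found,
    }


def is_complete(body: dict) -> bool:
    """Client-side check: did the server return everything that was requested?"""
    return body["returned_count"] == len(body["requested_ids"]) and not body["missing_ids"]


status, body = get_items_lenient([1, 2])
print(status, is_complete(body), body["missing_ids"])
```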

Pattern 3: Multi-Status (207) – Per-Item Status

Best for batch operations where each item can independently succeed or fail. 207 Multi-Status originates in WebDAV (RFC 4918) and is also used by Microsoft Graph API and others.

GET /items?id=1,2

HTTP/1.1 207 Multi-Status
{
  "results": [
    { "id": 1, "status": 404, "error": "not_found" },
    { "id": 2, "status": 200, "data": { "id": 2, "name": "Widget B" } }
  ]
}

Pros: Maximum clarity. Each item has its own status code.
Cons: More complex response structure. 207 is less universally understood.
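
The per-item structure is straightforward to produce and to consume. In the sketch below, get_items_multi_status builds the body shown above and split_results is what a client might do with it; the names and the ITEMS store are hypothetical, and always answering 207 (even when every item succeeds) is a simplification rather than a recommendation.

```python
from typing import Dict, List, Tuple

# Hypothetical store: only ID 2 exists.
ITEMS: Dict[int, dict] = {2: {"id": 2, "name": "Widget B"}}


def get_items_multi_status(ids: List[int]) -> Tuple[int, dict]:
    """Return 207 with one result entry per requested ID."""
    results = []
    for item_id in ids:
        item = ITEMS.get(item_id)
        if item is None:
            results.append({"id": item_id, "status": 404, "error": "not_found"})
        else:
            results.append({"id": item_id, "status": 200, "data": item})
    return 207, {"results": results}


def split_results(body: dict) -> Tuple[List[dict], Dict[int, int]]:
    """Client side: separate usable records from per-item failures."""
    ok = [r["data"] for r in body["results"] if r["status"] == 200]
    failed = {r["id"]: r["status"] for r in body["results"] if r["status"] != 200}
    return ok, failed


status, body = get_items_multi_status([1, 2])
records, failures = split_results(body)
print(status, records, failures)
```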

Pattern 4: 206 Partial Content

206 is traditionally used for range requests (byte ranges in file downloads), but some APIs repurpose it to signal “I’m returning less than you asked for.”

GET /items?id=1,2

HTTP/1.1 206 Partial Content
{
  "data": [
    { "id": 2, "name": "Widget B" }
  ],
  "missing_ids": [1]
}

Pros: Status code itself signals incompleteness – hard to ignore.
Cons: Pedants will argue 206 is only for range requests, i.e. responses to a request carrying a Range header (they’re technically right per RFC 9110).
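
If an API does adopt this convention, a client can branch on the status code before touching the payload. A minimal sketch, assuming the 200 response uses the same {"data": [...]} envelope as the 206 example above; read_items is a hypothetical helper name.

```python
from typing import List, Tuple


def read_items(status: int, body: dict) -> Tuple[List[dict], List[int]]:
    """Return (records, missing_ids); the status code decides whether gaps are possible."""
    if status == 200:
        # Complete by contract: nothing can be missing.
        return body["data"], []
    if status == 206:
        # Partial by contract: surface the gap to the caller.
        return body["data"], body.get("missing_ids", [])
    raise RuntimeError(f"unexpected status {status}")


records, missing = read_items(206, {"data": [{"id": 2, "name": "Widget B"}], "missing_ids": [1]})
print(records, missing)
```

Because the return value always carries a missing_ids list, the caller has to at least receive the gap, even if it then chooses to ignore it.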

What Would Have Prevented the Production Incident

Any of these would have caught the problem:

Approach                How it helps
Pattern 1 (404)         Client gets a 404, error handling kicks in, no parse error
Pattern 2 (metadata)    Client checks returned_count !== requested_ids.length, handles gracefully
Pattern 3 (207)         Client sees per-item status, knows ID 1 was missing
Client-side validation  Client compares response array length to request array length (defensive coding – but the API should not rely on this)

The worst option is what actually happened: 200 OK with a silently incomplete body and no metadata.
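
For the client-side validation row, a defensive check might look like the sketch below. It compares IDs rather than array lengths, which is slightly stricter than what the table describes; IncompleteResponse and require_all are hypothetical names, and records are assumed to carry an "id" field.

```python
from typing import List


class IncompleteResponse(Exception):
    """Raised when the response lacks one or more requested records."""

    def __init__(self, missing_ids: List[int]) -> None:
        super().__init__(f"missing IDs in response: {missing_ids}")
        self.missing_ids = missing_ids


def require_all(requested_ids: List[int], records: List[dict]) -> List[dict]:
    """Return records unchanged, or raise if any requested ID is absent."""
    returned = {record["id"] for record in records}
    missing = [i for i in requested_ids if i not in returned]
    if missing:
        raise IncompleteResponse(missing)
    return records


try:
    require_all([1, 2], [{"id": 2, "name": "Widget B"}])
except IncompleteResponse as exc:
    print(exc)
```

This turns the silent gap into an explicit exception before any mapping code runs, but it only protects clients that remember to call it.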

How Mature APIs Handle This

Stripe

Returns 404 if a single resource is not found. Batch endpoints return arrays with individual error objects.

Google APIs

Uses standard HTTP codes. Batch requests return 207-style responses with per-item status.

Microsoft Graph

Explicitly uses 207 Multi-Status for batch operations with per-item HTTP status codes.

GitHub API

Returns 404 for missing resources. Multi-item endpoints return arrays and document when results may be partial (e.g., paginated with Link headers).

The Argument to Your Team

“But we always parse the body anyway, so what does the status code matter?”

  1. You are not the only consumer. Load balancers, monitoring, caches, API gateways, and future clients all use status codes. They don’t parse your body.

  2. 200 OK means the contract was fulfilled. If you return 200, you are saying “everything you asked for is here and correct.” If that’s a lie, you have broken the HTTP contract.

  3. Silent failures are the most expensive bugs. The production incident happened because the failure was invisible. A proper status code would have made it loud and immediate.

  4. Defensive client code is not a substitute. Yes, clients should validate responses. But relying on clients to compensate for a lying API is designing for failure.

  5. Major public APIs make partial failure explicit. Stripe, Google, AWS, GitHub, Microsoft: whether through the status code or through per-item results in the body, none of them leaves the client to discover a partial failure by accident. There’s a reason for that.

Quick Reference

200 OK              Request fully succeeded. Response contains what was asked for.
201 Created         Resource created. Location header included.
204 No Content      Success, but nothing to return (e.g., DELETE).
206 Partial Content Only part of the resource is returned (range requests).
207 Multi-Status    Batch response; each item has its own status.

400 Bad Request     Malformed request syntax.
401 Unauthorized    Missing or invalid authentication.
403 Forbidden       Authenticated but not permitted.
404 Not Found       Resource does not exist.
409 Conflict        Request conflicts with current state (e.g., duplicate).
422 Unprocessable   Valid syntax but invalid semantics.
429 Too Many Req.   Rate limited. Retry-After header may be included.

500 Server Error    Something broke on the server side.
502 Bad Gateway     Upstream service returned invalid response.
503 Unavailable     Server temporarily unavailable (maintenance, overload).
504 Gateway Timeout Upstream service did not respond in time.