HTTP Status Codes: Protocol Semantics and Architecture Patterns

This article covers advanced HTTP status code protocol semantics including cache behavior, retry policies, and connection lifecycle signals. It explores architecture patterns for error propagation, gRPC-to-HTTP mapping, and troubleshooting strategies for distributed systems. Includes RFC 9110, RFC 9111, and RFC 9457 references.

HTTP status codes are three-digit response codes defined in RFC 9110 that encode the result of a request’s processing. At the protocol level, they are machine-readable signals that intermediaries and clients use to determine caching behavior, retry eligibility, and connection lifecycle without inspecting the response body.

How HTTP Status Codes Work at the Protocol Level

The Role of Status Codes in HTTP Semantics

Every HTTP response carries exactly one status code; in HTTP/1.1 it appears in the status line, the first line of the response. Codes are divided into five classes by the first digit, each defining a fundamental category of response semantics. The last two digits identify a specific code within the class but have no categorization role of their own.

HTTP/1.1 200 OK\r\n
         │││ └── Reason phrase (informational, no semantic role)
         │└┴── Specific code within class
         └── Class (2 = success)

The reason phrase (“OK”, “Not Found”) has no protocol-level function. RFC 9110 says clients SHOULD ignore it, and HTTP/2 and HTTP/3 do not transmit one at all. Only the numeric code is authoritative.
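As a minimal sketch of the rule above, the following Python parses an HTTP/1.1 status line and derives the class from the first digit only. The names `parse_status_line` and `status_class` are hypothetical helpers for illustration, not part of any library.

```python
from typing import NamedTuple

CLASS_NAMES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


class StatusLine(NamedTuple):
    version: str
    code: int
    reason: str


def parse_status_line(line: str) -> StatusLine:
    """Split an HTTP/1.1 status line into version, numeric code, and reason phrase."""
    version, code, *rest = line.rstrip("\r\n").split(" ", 2)
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"invalid status code: {code!r}")
    # The reason phrase is kept only for display; it may be empty or missing.
    return StatusLine(version, int(code), rest[0] if rest else "")


def status_class(code: int) -> str:
    """Return the class name determined by the first digit of the code."""
    if not 100 <= code <= 599:
        raise ValueError(f"status code out of range: {code}")
    return CLASS_NAMES[code // 100]
```

Decision logic should branch on `code` (or `status_class(code)`) and never on `reason`, since servers are free to send any reason text.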

Status Codes and the HTTP Cache Key

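A status code affects caching in two ways: through explicit freshness information such as the Cache-Control header, and through whether the code is heuristically cacheable, meaning a cache may store the response even when no explicit freshness information is present. RFC 9111 defines the storage rules, and RFC 9110 lists the heuristically cacheable codes (200, 203, 204, 206, 300, 301, 308, 404, 405, 410, 414, and 501). Default behavior for commonly seen codes: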

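Status Code | Cacheable by Default | Typical Max-Age
200, 203, 204, 206 | Yes (unless Cache-Control forbids) | Varies by resource
301 | Yes | Varies
302 | No (only with explicit Cache-Control or Expires; behavior varies by implementation) | Varies
304 | Not stored on its own; refreshes the stored response it validates | Validated with origin
400, 403 | No (explicit opt-in via Cache-Control) | N/A
404 | Yes (heuristically cacheable) | Varies
410 | Yes (gone, can be cached for a long time) | Long-lived
429 | No | N/A
500, 502, 503, 504 | No (not cached unless Cache-Control opts in) | N/A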
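The default-cacheability rule can be sketched as a small decision function. This is a simplified illustration for a shared cache: it uses the heuristically cacheable set from RFC 9110 and a handful of Cache-Control directives, and it deliberately ignores Expires, Authorization, Vary, and range-request support. The function name `may_store` is hypothetical.

```python
# Status codes that RFC 9110 defines as heuristically cacheable.
HEURISTICALLY_CACHEABLE = frozenset(
    {200, 203, 204, 206, 300, 301, 308, 404, 405, 410, 414, 501}
)


def may_store(status: int, cache_control: str = "") -> bool:
    """Simplified check: may a shared cache store a response with this status?"""
    directives = {
        part.strip().split("=", 1)[0].lower()
        for part in cache_control.split(",")
        if part.strip()
    }
    if "no-store" in directives or "private" in directives:
        return False  # explicitly forbidden for a shared cache
    if directives & {"max-age", "s-maxage", "public"}:
        return True  # explicit opt-in works for any final status code
    return status in HEURISTICALLY_CACHEABLE
```

Under these assumptions a bare 404 may be stored, a bare 302 or 503 may not, and either becomes storable once the origin sends an explicit `max-age`.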

Status Code Extensibility

RFC 9110 reserves the entire 1xx-5xx range but does not restrict registration of new codes within those classes. The IANA HTTP Status Code Registry manages the official set. Custom codes follow semantic rules:

  • 1xx codes are informational (final response not yet available)
  • 2xx codes indicate success
  • 3xx codes require client action to complete the request
  • 4xx codes indicate client-side error
  • 5xx codes indicate server-side error
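Because the class carries the semantics, RFC 9110 requires a client that does not recognize a code to treat it as the x00 code of its class. A minimal sketch of that fallback follows; it assumes that "recognized" means "present in Python's `http.HTTPStatus` enum", which is a stand-in for whatever set of codes a real client understands. The function name `effective_status` is hypothetical.

```python
from http import HTTPStatus


def effective_status(code: int) -> int:
    """Return the code itself if recognized, else the x00 code of its class."""
    if not 100 <= code <= 599:
        raise ValueError(f"status code out of range: {code}")
    try:
        HTTPStatus(code)  # raises ValueError for codes missing from the enum
        return code
    except ValueError:
        return (code // 100) * 100
```

With this fallback, a non-standard 499 is handled like 400 and a non-standard 599 like 500, which is why a custom code must respect the semantics of its class.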

Status Codes as Protocol Signals

Connection Lifecycle Signals

  • 101 Switching Protocols: Upgrades connection from HTTP to a different protocol (WebSocket, HTTP/2). After this response, the HTTP connection is replaced.
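  • 408 Request Timeout: The server did not receive a complete request within the time it was prepared to wait, and it typically closes the connection. The client may repeat the request on a new connection.
  • 421 Misdirected Request: The request was routed to a server that is unable or unwilling to produce an authoritative response for the target URI, which can happen with HTTP/2 connection reuse. The client may retry the request over a different connection.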

Retry Semantics

Code | Retry Behavior | Key Header | Retry Safety
408 | Safe to retry immediately | None | Idempotent methods only
429 | Retry after delay | Retry-After | Safe
503 | Retry after delay | Retry-After | Safe
502 | May retry | None | Idempotent methods only
504 | May retry | None | Idempotent methods only

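Clients should not automatically retry 400, 401, 403, 404, or 405 with an unchanged request. These codes indicate a client-side condition that repetition alone will not change; a 401, for example, is worth repeating only after the client obtains valid credentials.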
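The retry rules above can be sketched as a single decision function. This is an illustrative policy, not a standard API: `retry_delay` and `parse_retry_after` are hypothetical names, and the one-second default delay is an assumption.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"})
RETRY_IF_IDEMPOTENT = frozenset({408, 502, 504})
RETRY_AFTER_DELAY = frozenset({429, 503})


def parse_retry_after(value: str, now: Optional[datetime] = None) -> Optional[float]:
    """Parse Retry-After given as delay-seconds or as an HTTP-date."""
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # unparseable header: caller falls back to a default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def retry_delay(
    method: str,
    status: int,
    retry_after: Optional[str] = None,
    default_delay: float = 1.0,
) -> Optional[float]:
    """Return seconds to wait before retrying, or None for no automatic retry."""
    if status in RETRY_AFTER_DELAY:
        delay = parse_retry_after(retry_after) if retry_after else None
        return delay if delay is not None else default_delay
    if status in RETRY_IF_IDEMPOTENT and method.upper() in IDEMPOTENT_METHODS:
        return 0.0 if status == 408 else default_delay
    return None  # 400, 401, 403, 404, 405, and everything else
```

A production client would normally add jitter and an attempt limit on top of this, so that many clients receiving the same Retry-After value do not retry in lockstep.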

Advanced Troubleshooting Architecture

Tracing Status Codes Through the Request Path

In distributed systems, a status code can be generated or rewritten at any layer between the client and the application. Understanding which layer generated the code is essential:

Client Request
  ↓
Edge/CDN Layer (generates 403 for WAF, 502/504 for origin issues)
  ↓
Load Balancer (generates 503 for no healthy upstream)
  ↓
API Gateway (maps downstream errors, generates 502)
  ↓
Application (generates 200/4xx/5xx)
  ↓
Upstream Dependencies (generates errors that propagate)

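The status code the client sees is the one emitted by the layer closest to the client that generated or rewrote the response, which is not necessarily the layer where the failure began. To find the root cause, inspect logs at each layer.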

Error Propagation Patterns

Pattern | Client Code | Internal Behavior
Direct propagation | 502 | Upstream returned 5xx; gateway passes it as 502
Timeout aggregation | 504 | Upstream did not respond within timeout
Circuit breaker | 503 | Upstream failure rate exceeded threshold; requests blocked without attempting
Fallback degradation | 200 (with degraded data) | Upstream failed; service returned stale or partial data
Cascade failure | 503 (entire service) | Upstream failure exhausted resources in downstream service
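The first four patterns can be expressed as one mapping from "what happened upstream" to "what the client sees". The sketch below is illustrative only; the function name, its parameters, and the precedence between the branches are assumptions, not a prescribed gateway behavior.

```python
from typing import Optional


def client_facing_status(
    upstream_status: Optional[int],
    timed_out: bool = False,
    circuit_open: bool = False,
    fallback_available: bool = False,
) -> int:
    """Map the upstream outcome to the status code a gateway returns."""
    upstream_failed = (
        circuit_open
        or timed_out
        or upstream_status is None  # connection failure, no response at all
        or upstream_status >= 500
    )
    if not upstream_failed:
        return upstream_status  # 2xx/3xx/4xx pass through unchanged
    if fallback_available:
        return 200  # fallback degradation: serve stale or partial data
    if circuit_open:
        return 503  # circuit breaker: blocked without attempting the call
    if timed_out:
        return 504  # timeout aggregation
    return 502  # direct propagation of an upstream 5xx or connection failure
```

Whatever mapping is chosen, logging the original upstream outcome next to the returned code keeps the translation traceable.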

Status Code Architecture Patterns

The Uniform Error Schema

Error responses should follow a consistent schema so clients parse them uniformly:

{
  "error": {
    "code": "VALIDATION_ERROR",
    "status": 422,
    "id": "req_abc123",
    "detail": "Field 'email' must be a valid email address",
    "source": {
      "pointer": "/data/attributes/email"
    }
  }
}

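This pattern is loosely modeled on RFC 9457 (Problem Details for HTTP APIs). It keeps the HTTP status code as the coarse machine-readable signal while the body carries the human-readable and machine-actionable detail. RFC 9457 itself defines a flat JSON object with the members type, title, status, detail, and instance, served as application/problem+json; fields such as code, id, and source in the example above are API-specific additions.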
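For comparison, here is a minimal sketch of a response builder that emits the RFC 9457 members directly (type, title, status, detail, instance) with the application/problem+json media type. The helper name `problem_response`, the example type URI, and the `pointer` extension member are hypothetical.

```python
import json
from typing import Any, Dict, Optional, Tuple


def problem_response(
    status: int,
    title: str,
    detail: str,
    type_uri: str = "about:blank",
    instance: Optional[str] = None,
    **extensions: Any,
) -> Tuple[int, Dict[str, str], str]:
    """Build (status, headers, body) for an RFC 9457 problem details response."""
    # Extension members go in first so the standard members always win.
    body: Dict[str, Any] = dict(extensions)
    body.update({"type": type_uri, "title": title, "status": status, "detail": detail})
    if instance is not None:
        body["instance"] = instance
    headers = {"Content-Type": "application/problem+json"}
    return status, headers, json.dumps(body)


status, headers, body = problem_response(
    422,
    "Validation error",
    "Field 'email' must be a valid email address",
    type_uri="https://example.com/problems/validation-error",  # hypothetical URI
    pointer="/data/attributes/email",  # hypothetical extension member
)
```

The `status` member in the body duplicates the HTTP status code for convenience; clients should still treat the status code on the response itself as authoritative.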

Status Code Mapping Between Protocols

gRPC to HTTP mapping:

gRPC Code | HTTP Status Code
OK | 200
CANCELLED | 499 (custom)
UNKNOWN | 500
INVALID_ARGUMENT | 400
DEADLINE_EXCEEDED | 504
NOT_FOUND | 404
ALREADY_EXISTS | 409
PERMISSION_DENIED | 403
UNAUTHENTICATED | 401
RESOURCE_EXHAUSTED | 429
FAILED_PRECONDITION | 400
ABORTED | 409
OUT_OF_RANGE | 400
UNIMPLEMENTED | 501
INTERNAL | 500
UNAVAILABLE | 503
DATA_LOSS | 500
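In code, the mapping above is a plain lookup table. The sketch below keys on the gRPC code names as strings so it stays independent of any gRPC library; the function name `http_status_for` and the choice of 500 for unrecognized names are assumptions.

```python
GRPC_TO_HTTP = {
    "OK": 200,
    "CANCELLED": 499,  # non-standard "client closed request" code
    "UNKNOWN": 500,
    "INVALID_ARGUMENT": 400,
    "DEADLINE_EXCEEDED": 504,
    "NOT_FOUND": 404,
    "ALREADY_EXISTS": 409,
    "PERMISSION_DENIED": 403,
    "UNAUTHENTICATED": 401,
    "RESOURCE_EXHAUSTED": 429,
    "FAILED_PRECONDITION": 400,
    "ABORTED": 409,
    "OUT_OF_RANGE": 400,
    "UNIMPLEMENTED": 501,
    "INTERNAL": 500,
    "UNAVAILABLE": 503,
    "DATA_LOSS": 500,
}


def http_status_for(grpc_code: str) -> int:
    """Translate a gRPC status code name to an HTTP status code."""
    # Unrecognized names fall back to 500, mirroring the UNKNOWN mapping.
    return GRPC_TO_HTTP.get(grpc_code.upper(), 500)
```

The mapping is lossy in the HTTP direction: several gRPC codes collapse onto 400, 409, and 500, so the original gRPC code is worth carrying in the error body.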

Metrics and Measurement

  • Status code error budget: Maximum permitted 5xx rate over a rolling window (typical SLO: 99.9% availability = 0.1% 5xx budget)
  • Mean time to acknowledge (MTTA): Time from 5xx alert to engineer acknowledgment (target: <15 minutes for P1)
  • Mean time to resolve (MTTR): Time from alert to error rate normalization (target: <60 minutes for P1)

Industry benchmarks:

  • 99.9% availability allows 8.76 hours of downtime per year (Google SRE)
  • 99.99% availability allows 52.56 minutes of downtime per year (Google SRE)
  • Top causes of 5xx errors: deployment bugs (40%), dependency failures (30%), configuration changes (20%), capacity exhaustion (10%) (Catchpoint, 2025)
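The downtime figures follow directly from the availability percentage. The short sketch below reproduces the arithmetic with `Decimal` to avoid floating-point rounding; the function names are hypothetical, and a 365-day year is assumed.

```python
from decimal import Decimal


def allowed_downtime_minutes(availability_pct: str, period_days: int = 365) -> Decimal:
    """Minutes of downtime permitted by an availability target over a period."""
    unavailable = (Decimal(100) - Decimal(availability_pct)) / Decimal(100)
    return unavailable * period_days * 24 * 60


def error_budget_requests(total_requests: int, slo_pct: str) -> int:
    """Number of 5xx responses an SLO tolerates out of total_requests."""
    return int(total_requests * (Decimal(100) - Decimal(slo_pct)) / Decimal(100))


three_nines_hours = allowed_downtime_minutes("99.9") / 60  # 8.76 hours per year
four_nines_minutes = allowed_downtime_minutes("99.99")     # 52.56 minutes per year
budget = error_budget_requests(1_000_000, "99.9")          # 1000 failed requests
```

The same function gives the budget for a shorter rolling window by changing `period_days`, for example 30 for a monthly SLO.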

Common Mistakes and Fixes

Mistake: Returning 200 OK for all responses and putting error information only in the body
Fix: Always return the correct status code class. Proxies, CDNs, and monitoring tools read the status code, not the body. Incorrect status codes break caching, alerting, and retry logic.

Mistake: Using 403 when 401 is correct
Fix: 401 indicates the client is not authenticated. 403 indicates the client is authenticated but lacks authorization. Using 403 for both prevents clients from knowing whether they need to re-authenticate or request different permissions.

Mistake: Not implementing the Retry-After header for 429 and 503
Fix: Include Retry-After in seconds or HTTP-date format. Without it, clients default to implementation-specific retry delays, often causing retry storms.

Mistake: Caching 4xx responses unintentionally
Fix: Most 4xx responses are not cached by default, but RFC 9110 makes 404, 405, 410, and 414 heuristically cacheable, so a cache may store them without any opt-in. Set Cache-Control explicitly on error responses: no-store where caching is unwanted, or an explicit max-age where it is desired (e.g., 410 Gone).

Mistake: Ignoring the difference between 502 and 503 in monitoring
Fix: 502 indicates that a gateway or proxy received an invalid response from an upstream server. 503 indicates that the responding server is temporarily unable to handle the request, typically because of overload or maintenance. Each triggers a different incident response path.

Advanced Use Cases

Microservices Error Propagation

In a microservices architecture, a single 500 from an upstream service can cascade. Implement the following pattern:

  1. All errors received from upstream dependencies are logged with correlation IDs
  2. The upstream error is mapped to a 502 for the client
  3. Circuit breakers prevent cascading failures by returning 503 immediately when error thresholds are exceeded
  4. Fallback responses (200 with stale data) are preferred over 502 for read endpoints when acceptable
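Steps 2 and 3 of the pattern can be sketched with a small circuit breaker. This is a simplified illustration: it counts consecutive failures rather than a failure rate, and the class name, thresholds, and the `send` callable (which returns the upstream status code, or `None` on a connection failure) are all assumptions.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 5,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def allow_request(self) -> bool:
        if self.opened_at is None:
            return True  # closed: traffic flows normally
        # Open: after the reset timeout, let a trial request through.
        return self.clock() - self.opened_at >= self.reset_timeout

    def record(self, upstream_status: Optional[int]) -> None:
        if upstream_status is None or upstream_status >= 500:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
        else:
            self.failures = 0
            self.opened_at = None  # success closes the circuit


def call_upstream(breaker: CircuitBreaker, send: Callable[[], Optional[int]]) -> int:
    """Return the status code to send to the client."""
    if not breaker.allow_request():
        return 503  # fail fast without touching the upstream
    status = send()
    breaker.record(status)
    return 502 if status is None or status >= 500 else status
```

Returning 503 while the circuit is open, ideally with a Retry-After header, tells clients the condition is temporary and keeps load off the failing dependency.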

GraphQL Error Handling

GraphQL deviates from standard HTTP status code usage. A single query can partially succeed and partially fail:

{
  "data": {
    "user": null,
    "posts": [...]
  },
  "errors": [
    {
      "message": "User not found",
      "locations": [{"line": 2, "column": 3}],
      "path": ["user"],
      "extensions": {
        "code": "NOT_FOUND",
        "status": 404
      }
    }
  ]
}

GraphQL endpoints typically return 200 even when errors occur, with error details in the errors array. Some implementations return 400 for request-level issues. This is a deliberate design choice that sacrifices HTTP semantics for partial-response capability.
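Because the HTTP status is usually 200 either way, a GraphQL client has to inspect the body to tell success from partial or total failure. The sketch below shows one way to classify a response; the function name and the three outcome labels are hypothetical, and the `extensions.code` field is an API-specific convention rather than a guaranteed member.

```python
import json
from typing import Any, Dict, List, Tuple


def classify_graphql_response(body: str) -> Tuple[str, List[str]]:
    """Return ("ok" | "partial" | "failed", list of error codes) for a response body."""
    payload: Dict[str, Any] = json.loads(body)
    errors = payload.get("errors") or []
    data = payload.get("data")
    codes = [str((e.get("extensions") or {}).get("code", "UNKNOWN")) for e in errors]
    if not errors:
        return "ok", codes
    # Partial success: at least one top-level field still resolved to a value.
    has_data = isinstance(data, dict) and any(v is not None for v in data.values())
    return ("partial" if has_data else "failed"), codes
```

Monitoring that relies only on HTTP status codes will not see these failures, so GraphQL services typically need the outcome and error codes exported as separate metrics.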

Webhook Delivery Status

Webhook delivery uses status codes differently. The delivering system expects 2xx to acknowledge receipt. Any non-2xx triggers retries with exponential backoff:

  • 200: Delivered successfully
  • 202: Received, processing asynchronously
  • 400-499: Client misconfiguration — webhook may be disabled after repeated failures
  • 500-599: Server error — will retry with backoff
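From the sender's side, the rules above reduce to a decision per delivery attempt. The following sketch is one possible policy; the function name, the base delay, the one-hour cap, and the threshold of five consecutive 4xx responses before disabling are all assumed values.

```python
from typing import Optional, Tuple


def delivery_decision(
    status: int,
    attempt: int,
    consecutive_4xx: int = 0,
    disable_after: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 3600.0,
) -> Tuple[str, Optional[float]]:
    """Decide what to do after delivery attempt number `attempt` (1-based)."""
    if 200 <= status <= 299:
        return "delivered", None
    if 400 <= status <= 499 and consecutive_4xx + 1 >= disable_after:
        return "disable", None  # repeated client errors: stop sending
    # Exponential backoff: base, 2x, 4x, ... capped at max_delay.
    return "retry", min(max_delay, base_delay * 2 ** (attempt - 1))
```

Receivers can rely on this by acknowledging with 202 as soon as the payload is durably queued, rather than holding the connection open while processing.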

Frequently Asked Questions

What is the relationship between HTTP status codes and TCP/IP? HTTP status codes are an application-layer concept. They have no direct relationship with TCP/IP. A successful TCP connection does not guarantee a successful HTTP response, and a TCP reset does not produce an HTTP status code; the client sees a transport error instead (some browser APIs, such as XMLHttpRequest, report this as status 0).

Can I define custom HTTP status codes? Yes, any code in the 1xx-5xx range can be used, but it must follow the semantic rules of its class. Custom codes should be registered with IANA for public APIs. For internal APIs, any code within the correct class works, but clients may not handle non-standard codes correctly.

How do HTTP/2 and HTTP/3 affect status codes? HTTP/2 and HTTP/3 use the same status codes as HTTP/1.1, carried in the :status pseudo-header field without a reason phrase; the one exception is 101 Switching Protocols, which neither version supports. The difference is in connection semantics: HTTP/2 multiplexes multiple streams over a single connection, so a 408 on one stream does not affect others. HTTP/3 runs over QUIC instead of TCP, so there is no TCP connection to reset or re-establish.

Why does my GraphQL API return 200 for failed queries? GraphQL endpoints return 200 by convention because the HTTP transport itself succeeded. Errors are reported in the response body’s errors array. This allows partial success: one field may fail while others return data.

What happens if a CDN receives a non-standard status code? The CDN treats it based on its class. A non-standard 4xx will be handled as a client error, not cached by default, and passed through to the client. A non-standard 2xx will be cached according to Cache-Control. The specific code value matters less than the class for CDN behavior.

How should I handle status codes in a BFF (Backend for Frontend) pattern? The BFF should aggregate downstream status codes and return a single, coherent status to the frontend. Map multiple downstream errors to the most specific applicable code. Always log the original downstream codes alongside the BFF’s response for debugging.
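One possible aggregation rule for a BFF is sketched below. The policy is an assumption chosen for illustration, not a standard: all successes yield 200, any server-side or transport failure yields 502, a single shared 4xx is passed through as the most specific code, and mixed 4xx codes collapse to 400. The function name is hypothetical.

```python
from typing import Iterable, Optional


def aggregate_status(downstream: Iterable[Optional[int]]) -> int:
    """Collapse several downstream status codes (None = no response) into one."""
    codes = list(downstream)
    failures = [c for c in codes if c is None or c >= 400]
    if not failures:
        return 200
    if any(c is None or c >= 500 for c in failures):
        return 502  # the BFF acted as a gateway to a failing dependency
    distinct = set(failures)
    # One shared 4xx is the most specific answer; mixed 4xx collapse to 400.
    return distinct.pop() if len(distinct) == 1 else 400
```

Whichever rule is used, the original per-service codes should be logged with the request's correlation ID so the aggregated code can be traced back.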

What is the correct status code for rate limiting by user vs by IP? 429 Too Many Requests is correct for both. The response body should differentiate between user-level and IP-level limits. Use headers like X-RateLimit-Limit and X-RateLimit-Remaining for scoping information.

How do status codes interact with HTTP pipelining? HTTP pipelining (HTTP/1.1) allows multiple requests without waiting for responses. Status codes for pipelined responses MUST be returned in order. A failed response does not invalidate subsequent responses in the pipeline. HTTP/2 and HTTP/3 eliminated the need for pipelining with multiplexing.

Should I return 404 or 403 for undisclosed resources? Return 404. A 403 confirms the resource exists but access is denied. A 404 reveals nothing about the resource’s existence. For security-sensitive resources (user IDs, file paths), 404 prevents information disclosure.

What is the correct approach for status code versioning in APIs? Status code semantics should not change between API versions. If a response transitions from 200 to 201 or 200 to 202, document it as a breaking change. Status codes are part of the API contract. Use a new endpoint version if status code behavior must change.

How This Applies in Practice

Advanced HTTP status code management treats them as first-class protocol signals rather than error labels. Each code carries specific semantics about cacheability, retry safety, and connection lifecycle that intermediaries and clients depend on.

Organizations running microservices at scale implement standardized error schemas (RFC 9457), circuit breakers with appropriate status code output, and correlation ID propagation across all layers. Status codes become inputs to automated incident response: a 502 spike triggers dependency health checks, while a 503 spike triggers capacity auto-scaling.

How to Implement on Azion

Azion provides protocol-level control over status code handling across the entire request path:

  1. Custom Error Responses: Configure per-status-code response behaviors at the edge, including custom headers (Retry-After, Cache-Control) and body
  2. Functions: Intercept and modify status codes programmatically using JavaScript or Rust at the edge before they reach the client
  3. Real-Time Metrics: Monitor status code distributions per application, per edge node, and per origin with sub-second granularity
  4. Data Streaming: Export full request metadata including status codes, timings, and upstream status for advanced analytics
  5. Intelligent Cache: Configure cache policies based on status code classes to prevent caching of error responses

Learn more in the Azion Documentation.


Sources:

  • IETF. “HTTP Semantics.” RFC 9110. 2022.

  • IETF. “HTTP Caching.” RFC 9111. 2022.

  • IETF. “Problem Details for HTTP APIs.” RFC 9457. 2023.

  • IANA. “HTTP Status Code Registry.” 2026.

  • Google SRE. “Service Level Objectives.” 2023.

  • Catchpoint. “Root Cause Analysis of 5xx Errors.” 2025.

     
