Core Caching Fundamentals & HTTP Lifecycle
HTTP caching is governed by RFC 9111, which defines a deterministic state machine for storing and reusing responses across every layer of a distributed delivery stack. This reference covers the full mechanics: the normative standards, the terminology that every other topic in this space depends on, how requests route through browser, edge, and origin caches, the freshness and validation models that decide what gets served, and the diagnostic workflows that reveal what is actually happening in production. It is aimed at backend engineers, DevOps practitioners, and CDN engineers who need a precise, RFC-grounded understanding rather than a surface-level overview. For the complete step-by-step traversal of a single request, see The Complete HTTP Request Lifecycle; for the multi-tier architecture that separates browser and edge storage, see Understanding HTTP Cache Hierarchy.
RFC 9111 & Normative Caching Semantics
RFC 9111 (published June 2022, obsoleting RFC 7234) is the governing specification for HTTP caching. It defines storage eligibility, freshness calculation, conditional validation, and the interaction order of all Cache-Control directives. HTTP/2 and HTTP/3 do not change these semantics — transport framing is irrelevant to caching logic.
Core normative rules
- Caches must not store a response carrying no-store (RFC 9111 §5.2.2.5).
- Caches must not serve a stale response unless stale-while-revalidate or stale-if-error explicitly grants an extension window, or the origin is unreachable and the cache is explicitly configured to serve stale under error (§4.2.4).
- Shared caches must not store a response marked private; such a response is intended for a single user and may only be kept by that user's private cache (§5.2.2.7).
- Cache-Control overrides Expires and Pragma in all compliant implementations (§5.3, §5.4).
Header precedence decision table
| Competing rules | Winner | RFC reference |
|---|---|---|
| Cache-Control: max-age vs Expires | max-age wins | §5.3 |
| s-maxage vs max-age (shared cache) | s-maxage wins | §5.2.2.10 |
| no-store vs any freshness directive | no-store wins; nothing is stored | §5.2.2.5 |
| no-cache vs max-age | no-cache overrides; revalidation required | §5.2.2.4 |
| Vary: * | Response is uncacheable by any shared cache | §4.1 |
| private on shared cache | Must not store or serve to others | §5.2.2.7 |
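The precedence rules in the table can be sketched as a single function. This is a minimal illustration, not a complete RFC 9111 implementation: the names parse_cache_control and freshness_lifetime are hypothetical, header names are assumed to be lowercase dictionary keys, and heuristic freshness is deliberately left out.

```python
from email.utils import parsedate_to_datetime


def parse_cache_control(value):
    """Split a Cache-Control value into a {directive: argument-or-None} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def freshness_lifetime(headers, shared_cache):
    """Return the freshness lifetime in seconds, or None if this cache must not store.

    headers: dict keyed by lowercase header name (an assumption of this sketch).
    """
    cc = parse_cache_control(headers.get("cache-control", ""))
    if "no-store" in cc:
        return None                      # no-store beats every freshness directive
    if shared_cache and "private" in cc:
        return None                      # shared caches must not store private responses
    if "no-cache" in cc:
        return 0                         # storable, but revalidate before every use
    if shared_cache and cc.get("s-maxage") is not None:
        return int(cc["s-maxage"])       # s-maxage beats max-age in shared caches
    if cc.get("max-age") is not None:
        return int(cc["max-age"])        # max-age beats Expires
    if "expires" in headers and "date" in headers:
        expires = parsedate_to_datetime(headers["expires"])
        date = parsedate_to_datetime(headers["date"])
        return max(0, int((expires - date).total_seconds()))
    return 0                             # simplification: no heuristic freshness
```

Calling it twice with the same headers, once with shared_cache=True and once with shared_cache=False, shows how one response can carry different lifetimes for an edge node and a browser.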
Concept Map & Terminology
The terms below appear across every topic on this site. Each links forward to the section that covers it in depth.
Freshness — The period during which a cached response can be served without contacting the origin. Freshness lifetime is calculated from max-age, s-maxage, or the legacy Expires header. Covered in depth in Freshness vs Validation Models Explained.
Validation — The conditional request/response exchange that confirms whether a stale cached entry is still current. Uses ETag/If-None-Match (strong, byte-level identity) or Last-Modified/If-Modified-Since (timestamp). A 304 Not Modified response reuses the cached body with a refreshed TTL. Also covered in Freshness vs Validation Models Explained.
max-age — Freshness lifetime directive, in seconds, applied to all caches (browser and shared). See Cache-Control Directives for the full directive taxonomy.
s-maxage — Shared-cache freshness override. Browsers ignore it; CDNs and reverse proxies use it instead of max-age. Covered in Mastering max-age and s-maxage Directives.
no-cache — Permits storage but mandates revalidation before every use. Frequently misread as “do not cache” — it means the opposite. Covered in no-cache vs no-store: When to Use Each.
no-store — Prohibits any storage at any layer. The response must not be written to disk or memory by any cache. Also covered in no-cache vs no-store: When to Use Each.
private — Restricts storage to the end-user’s browser; shared caches (CDNs, proxies) must bypass. Covered in Public vs Private Cache Scope.
stale-while-revalidate — Extension directive allowing a stale response to be served immediately while a background revalidation request refreshes the stored entry. Eliminates the latency spike at TTL expiry.
Vary — Response header that extends the cache key to include request header dimensions (e.g. Accept-Encoding, Accept-Language). Each unique combination of Vary-listed headers generates a separate cache entry. Misuse fragments cache storage. Covered in Mapping Vary Headers to Edge Routing.
Age — Response header set by shared caches indicating how many seconds the response has been stored. Age: 0 means origin-fresh; Age: 3600 means an edge node has held the copy for one hour.
Cache hit / miss / bypass — The three fundamental outcome states for every cache lookup. Covered in Cache Hit, Miss, and Bypass Mechanics.
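To make the Vary entry above concrete, the following sketch builds a cache key from the URL plus the request headers that Vary names. The function name cache_key and the whitespace-only normalization are assumptions for illustration; real caches typically select among stored variants with their own normalization rules.

```python
def cache_key(method, url, request_headers, vary):
    """Build a cache key from the URL plus the request headers named in Vary.

    request_headers: dict keyed by lowercase header name (an assumption of this sketch).
    vary: the Vary value of the stored response.
    Returns None for "Vary: *", which never matches a stored response.
    """
    names = [n.strip().lower() for n in vary.split(",") if n.strip()]
    if "*" in names:
        return None
    # Sort so that the order of names inside Vary does not change the key.
    varied = tuple(
        (n, " ".join(request_headers.get(n, "").split())) for n in sorted(names)
    )
    return (method.upper(), url, varied)
```

Two requests that differ only in a header not listed in Vary map to the same key, while a different Accept-Encoding value under Vary: Accept-Encoding produces a separate entry.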
Architecture Overview
The diagram below maps the three-tier caching topology — browser, CDN edge, and origin — and the decision points at each layer. Every request traverses this flow from left to right; responses propagate back and populate storage at each layer according to the directives they carry.
Every cache layer evaluates requests independently. No layer has visibility into what another layer is storing — each must be configured correctly in isolation through the directives the origin sets on each response.
Cross-cutting Patterns
The patterns below cover the four most common production configurations. Each shows the exact Cache-Control header, which layer stores the response, and why.
Pattern 1 — Immutable static assets with content-hashed URLs
Cache-Control: public, max-age=31536000, immutable
Use this for CSS, JS, and image files served with a content hash in the URL (e.g. /app.a3f8c1d.js). The immutable directive tells the browser not to send a revalidation request even if the user reloads, because the URL itself is a versioned fingerprint. The public directive permits CDN storage. When the content changes, the URL changes — the old URL’s cached copy is never invalidated, it simply expires naturally or is evicted.
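A build step can produce such fingerprinted URLs along the following lines. This is a sketch under assumptions: hashed_name is a hypothetical helper, and the choice of SHA-256 truncated to 8 hex characters is illustrative rather than something the pattern prescribes.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(path, content, digest_len=8):
    """Insert a short content digest before the extension: app.js -> app.<digest>.js."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))
```

Because the digest is derived from the file bytes, any change to the content yields a new URL, which is what makes the one-year max-age safe.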
Pattern 2 — CDN edge TTL decoupled from browser TTL
Cache-Control: public, max-age=60, s-maxage=86400
The browser re-checks the resource every 60 seconds. The CDN edge holds its copy for 24 hours and collapses all browser requests within that window into one upstream request. This pattern works for content that changes infrequently but where browser staleness of more than a minute is unacceptable. Shared caches read s-maxage and ignore max-age; browsers do the reverse. See Mastering max-age and s-maxage Directives for tier-specific TTL design.
Pattern 3 — API responses requiring per-request revalidation
Cache-Control: no-cache
ETag: "d8e8fca2dc0f896fd7cb4cb0031ba249"
Last-Modified: Sun, 21 Jun 2026 14:30:00 GMT
no-cache permits storage (both browser and CDN) but requires a conditional request before every use. If the ETag matches, the origin returns 304 Not Modified — no payload transfer, just a refreshed TTL. This pattern is ideal for API endpoints that are read-heavy but change unpredictably. For the full validation flow see Freshness vs Validation Models Explained.
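On the origin side, the 304 decision can be sketched as below. The helpers make_etag and respond are hypothetical, the ETag is derived from a truncated SHA-256 of the body purely for illustration, and the naive comma split assumes entity tags that contain no commas.

```python
import hashlib


def make_etag(body):
    """Derive a quoted ETag from the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:32] + '"'


def respond(body, if_none_match=None):
    """Return (status, headers, payload) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    headers = {"Cache-Control": "no-cache", "ETag": etag}
    if if_none_match is not None:
        candidates = [t.strip() for t in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so a W/ prefix is ignored.
        candidates = [t[2:] if t.startswith("W/") else t for t in candidates]
        if "*" in candidates or etag in candidates:
            return 304, headers, b""     # validator matches: no payload
    return 200, headers, body            # first request or changed content
```

A client that echoes the ETag from a previous 200 receives a 304 with an empty payload until the body changes.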
Pattern 4 — Authenticated responses locked to the browser
Cache-Control: private, no-cache
Vary: Cookie, Authorization
private prevents shared caches from storing user-specific data. no-cache ensures the browser revalidates on each use. The Vary on Cookie and Authorization further fragments the cache key so that no cross-user contamination is possible even if a misconfigured CDN attempts to store the response. See Public vs Private Cache Scope for the full scope model.
Diagnostic & Debugging Reference
Inspect which cache layer served the response
# Full header dump — reveals Age, Cache-Control, ETag, Vary, X-Cache
curl -sI https://example.com/resource
# Force revalidation — bypasses stored caches and confirms origin response headers
curl -sI -H "Cache-Control: no-cache" https://example.com/resource
# Bypass CDN entirely by connecting directly to the origin IP
curl -sI --resolve example.com:443:203.0.113.42 https://example.com/resource
Headers to inspect on every response
| Header | What it reveals |
|---|---|
| Age | Seconds the response has been in a shared cache. Age: 0 = origin-fresh. |
| Cache-Control | The directives actually sent by the origin (check these match your config). |
| ETag | Validator for conditional requests. Absent means the origin does not support strong validation. |
| Vary | Cache key dimensions beyond the URL. Unexpected values cause cache fragmentation. |
| X-Cache / CF-Cache-Status | CDN hit/miss/stale/bypass state (vendor-specific header names). |
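When checking many URLs, the fields in the table above can be pulled out of a saved curl -sI dump programmatically. The sketch below assumes a single response block (no redirect chain); parse_header_dump and summarize are hypothetical helper names.

```python
def parse_header_dump(raw):
    """Parse `curl -sI` output into (status_line, {lowercase-name: value})."""
    lines = [ln for ln in raw.replace("\r\n", "\n").split("\n") if ln.strip()]
    status_line, headers = lines[0], {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return status_line, headers


def summarize(headers):
    """Pick out the diagnostic fields; headers that are absent map to None."""
    wanted = ["age", "cache-control", "etag", "vary", "x-cache", "cf-cache-status"]
    return {name: headers.get(name) for name in wanted}
```

A None value for age or for both vendor status headers is a prompt to check whether the response passed through a shared cache at all.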
DevTools workflow
- Open the Network tab and make sure Disable cache is unchecked, so that you observe real caching behavior.
- Click a resource. In the Headers tab, locate Cache-Control in the Response Headers section.
- The Size column shows (memory cache) or (disk cache) for browser hits; a byte count indicates a network fetch.
- The Timing waterfall shows TTFB and whether a Cache Read phase appears.
- Reload with Shift+Reload (hard reload) to force revalidation. This sends Cache-Control: no-cache from the browser and bypasses the browser cache, but not necessarily CDN caches.
Common Mistakes & RFC Violations
| Anti-pattern | RFC rule violated | Fix |
|---|---|---|
| Sending no-store on publicly cacheable assets | §5.2.2.5: wastes CDN capacity and increases origin load | Use public, max-age=<ttl> for resources safe to cache |
| Missing Vary: Accept-Encoding when serving gzip and brotli variants | §4.1: shared caches may serve the wrong encoding to clients | Always include Vary: Accept-Encoding when compressing at the origin |
| Vary: User-Agent on all responses | §4.1: creates thousands of separate cache entries | Resolve mobile/desktop at the edge via path or header normalization, not Vary |
| Using Expires without Cache-Control | §5.3: Expires is ignored by many modern implementations | Always set Cache-Control; use Expires only as a fallback |
| Setting no-cache but omitting ETag or Last-Modified | §4.3: revalidation will always result in a full 200 OK | Add validators so the origin can respond with 304 Not Modified |
| Serving private responses through a CDN without Vary: Cookie | §5.2.2.7: risks cross-user data leakage | Set private and configure the CDN to bypass storage for authenticated sessions |
| stale-while-revalidate window larger than max-age on rapidly mutating data | No direct RFC rule; a correctness concern | Keep stale-while-revalidate short (< 10% of max-age) for frequently updated content |
Frequently Asked Questions
What is the difference between no-cache and no-store?
no-store prohibits any storage of the response at any cache layer — nothing is written to disk or memory. no-cache permits storage but requires revalidation with the origin before the stored copy can be served to any subsequent request. Use no-store for responses that must never persist (account pages, payment confirmations); use no-cache for content that can be stored and served without payload re-transfer when the ETag or Last-Modified matches.
How does s-maxage differ from max-age?
max-age sets the freshness lifetime for all caches, including browsers. s-maxage overrides max-age specifically for shared caches — CDNs and reverse proxies — while leaving the browser TTL unchanged. When both are present, shared caches use s-maxage and browsers use max-age. If s-maxage is absent, shared caches fall back to max-age. See Mastering max-age and s-maxage Directives for full tier-specific design guidance.
When does a cache send a conditional request?
A cache sends a conditional request when a stored response has exceeded its freshness lifetime (max-age or s-maxage) and the stored entry includes a validator (ETag or Last-Modified). The cache forwards If-None-Match with the stored ETag value or If-Modified-Since with the stored Last-Modified date. The origin responds with 304 Not Modified (reuse the stored body, reset the freshness clock) or 200 OK (new content replaces the stored entry). Without validators, the cache cannot send a conditional request and must fetch the full response.
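The cache side of that exchange can be sketched in two small functions. Both names are hypothetical, stored entries are modeled as plain dicts with lowercase header names, and the 304 merge is simplified to "new header values overwrite old ones".

```python
def conditional_headers(stored_headers):
    """Return the validator headers a cache adds when revalidating a stored entry."""
    out = {}
    if "etag" in stored_headers:
        out["If-None-Match"] = stored_headers["etag"]
    if "last-modified" in stored_headers:
        out["If-Modified-Since"] = stored_headers["last-modified"]
    return out   # empty dict: no validator, so only a full fetch is possible


def apply_revalidation(stored, status, response):
    """Merge a 304 into the stored entry, or replace the entry on any other status."""
    if status == 304:
        merged = dict(stored["headers"])
        merged.update(response["headers"])   # refreshed Date, Cache-Control, and so on
        return {"headers": merged, "body": stored["body"]}
    return response
```

After a 304 the stored body is untouched and only the metadata moves forward, which is why validators make revalidation cheap.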
What does the Age header tell you?
Age is set by shared caches and indicates how many seconds the response has been stored. Age: 0 means the response was fetched directly from origin for this request. Age: 3600 means an edge node has held this entry for one hour. Comparing Age against the applicable lifetime (max-age in a browser, s-maxage in a shared cache when present) tells you approximately how much freshness lifetime remains. A missing Age header on a response that passed through a CDN is a diagnostic signal: either the CDN bypassed its cache or the CDN does not add Age.
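The Age comparison is simple arithmetic, sketched here for header values copied from a response. remaining_freshness is a hypothetical helper; it reads only max-age and s-maxage and ignores Expires and every other directive.

```python
import re


def remaining_freshness(cache_control, age, shared_cache=False):
    """Seconds of freshness left; negative means stale, None means no lifetime found."""

    def directive(name):
        pattern = rf'(?:^|[,\s]){name}\s*=\s*"?(\d+)'
        m = re.search(pattern, cache_control, re.IGNORECASE)
        return int(m.group(1)) if m else None

    lifetime = directive("s-maxage") if shared_cache else None
    if lifetime is None:
        lifetime = directive("max-age")
    if lifetime is None:
        return None
    return lifetime - int(age)
```

For example, max-age=3600 with Age: 600 leaves 3000 seconds, while the same Age measured against a 60-second browser lifetime is already negative.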
Can a CDN cache a response that has a Set-Cookie header?
Not by default — most CDNs treat Set-Cookie as a signal that the response is user-specific and refuse to cache it. To cache such responses at the edge, you must configure the CDN to strip or ignore Set-Cookie (via edge rules, not Cache-Control), confirm the response is not genuinely user-specific, and ensure the cache key does not include the Cookie request header. Never cache personalized, session-bound, or authenticated responses in a shared cache.
What is stale-while-revalidate and when should you use it?
stale-while-revalidate=<seconds> extends the usable window of a stale entry by the specified duration. During that window, the cache serves the stale copy immediately (zero-latency response to the client) while issuing a background revalidation request to refresh the stored entry. It eliminates the visible latency spike that occurs at TTL expiry. Use it for content that tolerates brief staleness — news feeds, product listings, aggregated statistics. Avoid it for data where correctness is time-sensitive, and keep the window short relative to max-age to bound the maximum staleness exposure.
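The three phases of an entry's life under this directive can be written as a small decision function. This is a sketch with a hypothetical name and return labels; it takes the age and both windows as plain integers in seconds.

```python
def swr_decision(age, max_age, stale_while_revalidate):
    """Classify a stored entry by its current age, in seconds."""
    if age < max_age:
        return "fresh"                        # serve from cache, no origin contact
    if age < max_age + stale_while_revalidate:
        return "serve-stale-and-revalidate"   # serve now, refresh in the background
    return "revalidate-first"                 # window exhausted: block on the origin
```

With max-age=60 and stale-while-revalidate=30, an entry aged 75 seconds is served stale while a refresh runs, and an entry aged 90 seconds or more waits on the origin.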
Related
- Cache Hit, Miss, and Bypass Mechanics explains the three outcome states for every cache lookup and how they interrelate across network layers.
- Understanding HTTP Cache Hierarchy details how browser, CDN, and origin storage work independently and how to configure each tier without creating cross-layer desynchronization.
- Cache-Control Directives & Header Combinations covers the full directive taxonomy, precedence rules, and how to combine directives safely for each resource class.
- CDN Architecture & Edge Routing Strategies maps how CDN edge nodes generate cache keys, apply Vary routing, and expose per-PoP cache states through observability headers.