CDN Architecture & Edge Routing Strategies
This reference covers how CDN edge infrastructure evaluates Cache-Control directives, builds cache keys, routes requests via Anycast, and propagates invalidations across distributed PoPs. It is aimed at backend developers and CDN engineers who need to reason about shared-cache behavior at the network layer — not just in the browser. The four main areas explored in depth through the sections below are how CDN cache keys are generated, how Vary headers drive edge routing decisions, origin shielding and request collapsing, and tag-based cache invalidation.
RFC and Specification Anchor
CDN behavior at the edge is governed by two IETF standards:
- RFC 9111 (HTTP Caching) defines the semantics of Cache-Control directives, freshness calculation, validation, and the obligations of shared caches. Section 5.2 (directive definitions) and Section 3 (storage rules, including the shared-cache restrictions in 3.5) are the most CDN-relevant.
- RFC 9110 (HTTP Semantics) defines the Vary header (Section 12.5.5), conditional request headers (If-None-Match, If-Modified-Since), and the status codes (304 Not Modified) that underpin revalidation.
RFC 9111 §4.2 establishes the freshness calculation that every compliant cache must implement. In simplified form:
freshness_lifetime = s-maxage (if present, shared caches only)
                     else max-age (if present)
                     else Expires - Date (if present)
                     else heuristic (typically 10% of the time since Last-Modified)
current_age = age_value + resident_time
is_fresh = (freshness_lifetime > current_age)
RFC 9111 §5.2.2.10 specifies that s-maxage overrides max-age for shared caches only. A browser receiving the same response uses max-age because s-maxage is defined to apply exclusively to shared caches (CDNs, reverse proxies).
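The selection order above can be illustrated with a minimal Python sketch. The function names (`parse_cache_control`, `freshness_lifetime`, `is_fresh`) are hypothetical and not part of any CDN API. The parser is deliberately simplified: it splits on commas and would mishandle quoted values that contain commas. The Expires and heuristic fallbacks are left to the caller.

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Parse a Cache-Control value into {directive: value or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def freshness_lifetime(cache_control: str, shared: bool) -> Optional[int]:
    """Return the explicit freshness lifetime in seconds, or None if unset."""
    directives = parse_cache_control(cache_control)
    # s-maxage applies only to shared caches (CDNs, reverse proxies).
    if shared and directives.get("s-maxage") is not None:
        return int(directives["s-maxage"])
    if directives.get("max-age") is not None:
        return int(directives["max-age"])
    # No explicit lifetime: the caller falls back to Expires or a heuristic.
    return None


def is_fresh(lifetime: int, age_value: int, resident_time: int) -> bool:
    """Simplified freshness check: lifetime must exceed the current age."""
    current_age = age_value + resident_time
    return lifetime > current_age


header = "public, max-age=60, s-maxage=86400"
edge_ttl = freshness_lifetime(header, shared=True)      # CDN view
browser_ttl = freshness_lifetime(header, shared=False)  # browser view
```

With this header the shared-cache view resolves to 86400 seconds and the private-cache view to 60 seconds, which is the split the paragraph above describes.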
Directive precedence table (shared cache):
| Priority | Directive or header | Effect |
|---|---|---|
| 1 | no-store | No storage at any tier; overrides all other directives |
| 2 | s-maxage | Sets edge TTL; overrides max-age for shared caches |
| 3 | max-age | Sets TTL when s-maxage is absent |
| 4 | Expires | Legacy fallback; ignored when max-age or s-maxage present |
| 5 | Heuristic freshness | Applied when no explicit TTL is set |
RFC 9111 §5.2.3 also requires caches to ignore unrecognized directives rather than return an error. A typo in a directive name (no-stor instead of no-store) is therefore dropped silently, and if no other explicit lifetime is present the response falls back to heuristic caching with no observable error. Always verify directive names with a response header audit.
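A header audit of this kind can be sketched in a few lines of Python. The allow-list below is an assumption made for illustration: it covers the response directives discussed in this reference, not the full IANA registry. `unknown_directives` is a hypothetical helper name.

```python
from typing import List

# Assumed allow-list: response directives discussed in this reference.
KNOWN_RESPONSE_DIRECTIVES = {
    "max-age", "s-maxage", "no-cache", "no-store", "no-transform",
    "must-revalidate", "proxy-revalidate", "must-understand",
    "public", "private", "immutable",
    "stale-while-revalidate", "stale-if-error",
}


def unknown_directives(cache_control: str) -> List[str]:
    """Return directive names a cache would silently ignore as unrecognized."""
    unknown = []
    for part in cache_control.split(","):
        # Keep only the directive name; drop any "=value" suffix.
        name = part.strip().partition("=")[0].strip().lower()
        if name and name not in KNOWN_RESPONSE_DIRECTIVES:
            unknown.append(name)
    return unknown


suspect = unknown_directives("no-stor, max-age=0")
```

Running a check like this against captured response headers in CI is one way to surface typos that a cache would otherwise swallow.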
Concept Map & Terminology
These are the foundational terms used across all sections of this reference. Understanding each one precisely is required before tuning CDN behavior.
Point of Presence (PoP): A geographically distributed data center where a CDN terminates client connections and maintains its edge cache. Each PoP holds independent cache state — a hit at one PoP does not populate any other PoP’s cache.
Cache key: The string identifier used to look up a stored response. At minimum it is the request URI. CDNs extend it by incorporating selected request headers and query parameters. See how CDN cache keys are generated for the full normalization pipeline.
s-maxage: The RFC 9111 directive that sets the freshness lifetime specifically for shared caches. Browsers ignore it; CDN edge nodes honor it in preference to max-age. Setting both allows independent TTL control per cache tier.
Vary header: A response header that tells caches which request header dimensions must match before serving a stored response. Vary: Accept-Encoding means the cache must store separate entries for gzip and brotli variants. Misconfigured Vary is the primary cause of cache key fragmentation. See mapping Vary headers to edge routing.
Origin shield (mid-tier cache): An optional dedicated cache tier that sits between edge PoPs and the origin server. All PoPs route misses to the shield rather than directly to origin. This collapses N PoP misses into a single shield-to-origin request. See origin shielding and request collapsing.
Request collapsing (coalescing): When multiple concurrent requests arrive for the same uncached resource, the CDN forwards only one upstream and holds the others. When the upstream response arrives, it fans out to all waiting clients. Prevents thundering-herd origin saturation.
Surrogate key (cache tag): A vendor-specific metadata tag attached to cached objects at response time. Tags enable bulk invalidation of logically related objects by tag value rather than by URL. See tag-based cache invalidation patterns.
stale-while-revalidate: An extension directive defined in RFC 5861 that allows a CDN to serve a stale response immediately while asynchronously triggering a background revalidation request to origin. Eliminates the latency spike of synchronous revalidation at TTL expiry.
Age header: The number of seconds since the cached response was generated or revalidated at the origin. The CDN increments this value as the response ages in cache. A response with Age: 0 was just fetched from origin; a high Age value means the cached copy is old.
Anycast routing: A network routing strategy where multiple CDN PoPs advertise the same IP prefix via BGP. The client’s ISP routes the connection to the topologically nearest PoP. Contrast with unicast, where a single server holds the IP.
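The request collapsing behavior defined above can be sketched with Python threads. This is an illustrative model, not how any particular CDN implements it; `RequestCollapser` and the `fetch_upstream` callable are hypothetical names. The first caller for a key becomes the leader and performs the upstream fetch. Concurrent callers for the same key wait on the leader's result.

```python
import threading
from concurrent.futures import Future
from typing import Callable, Dict


class RequestCollapser:
    """Collapse concurrent fetches for the same key into one upstream call."""

    def __init__(self, fetch_upstream: Callable[[str], bytes]) -> None:
        self._fetch_upstream = fetch_upstream
        self._in_flight: Dict[str, Future] = {}
        self._lock = threading.Lock()

    def get(self, key: str) -> bytes:
        with self._lock:
            future = self._in_flight.get(key)
            leader = future is None
            if leader:
                # First request for this key: register an in-flight marker.
                future = Future()
                self._in_flight[key] = future
        if leader:
            try:
                future.set_result(self._fetch_upstream(key))
            except Exception as exc:
                # Waiters see the same failure as the leader.
                future.set_exception(exc)
            finally:
                with self._lock:
                    del self._in_flight[key]
        # Followers block here until the leader publishes a result.
        return future.result()
```

A production edge would also write the response into its cache before releasing waiters, so that requests arriving after the fan-out are served as hits rather than triggering a new upstream fetch.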
Architecture Overview
The diagram below shows the full request flow through CDN edge tiers, from client through Anycast routing, edge PoP, optional shield node, to origin — and the cache decision logic at each stage.
Cache-Control Directives at the Edge
The table below maps each CDN-relevant directive to its precise edge-node behavior under RFC 9111. Directives that treat shared and private caches differently (private, proxy-revalidate) are included for contrast.
| Directive | Edge behavior | Browser behavior |
|---|---|---|
| s-maxage=N | Sets edge TTL to N seconds; overrides max-age | Ignored |
| max-age=N | Used as TTL only when s-maxage absent | Sets browser TTL |
| no-store | Must not cache; every request bypasses cache | Must not cache |
| no-cache | May store; must revalidate before serving | Must revalidate |
| public | Explicitly authorizes shared-cache storage | No browser effect |
| private | Must not store in shared cache | Browser may cache |
| stale-while-revalidate=N | Serve stale immediately; revalidate async in background | Same |
| stale-if-error=N | Serve stale for N seconds on 5xx from origin | Same |
| must-revalidate | Must not serve stale under any circumstances | Same |
| proxy-revalidate | Like must-revalidate but shared-cache-only | Ignored |
| immutable | Content never changes; skip revalidation during TTL | Same |
CDN split TTL pattern — different TTLs for edge and browser:
Cache-Control: public, max-age=60, s-maxage=86400
The browser caches for 60 seconds; the CDN edge caches for 24 hours. This is the standard approach for content that can be purged at the CDN without requiring client-side cache invalidation.
Static asset with background refresh:
Cache-Control: public, max-age=3600, stale-while-revalidate=86400, stale-if-error=604800
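After the 1-hour fresh period ends, the edge keeps serving immediately without waiting for origin for a further 24 hours (the stale-while-revalidate window), refreshing in the background. On a 5xx from origin, it serves stale for up to 7 days after the response became stale.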
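The window arithmetic for this header can be checked with a small Python sketch. It assumes the RFC 5861 reading that both stale windows are measured from the moment the response becomes stale. `edge_state` and its return labels are hypothetical names chosen for illustration.

```python
def edge_state(age: int, fresh_for: int, swr: int = 0, sie: int = 0,
               origin_error: bool = False) -> str:
    """Classify how an edge may treat a stored response of the given age."""
    if age < fresh_for:
        return "fresh"
    if age < fresh_for + swr:
        # Serve stale now, revalidate in the background.
        return "stale-while-revalidate"
    if origin_error and age < fresh_for + sie:
        # Origin is failing: stale content is still allowed.
        return "stale-if-error"
    return "fetch-from-origin"


# max-age=3600, stale-while-revalidate=86400, stale-if-error=604800
LIMITS = dict(fresh_for=3600, swr=86400, sie=604800)
at_30_min = edge_state(1800, **LIMITS)
at_2_hours = edge_state(7200, **LIMITS)
at_2_days_error = edge_state(172800, origin_error=True, **LIMITS)
```

For the header above, a response aged 30 minutes is fresh, one aged 2 hours is served stale while revalidating, and one aged 2 days is served only if the origin is returning errors.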
Authenticated API response — prevent CDN caching:
Cache-Control: private, no-store
Vary: Authorization
private prevents shared-cache storage. no-store adds belt-and-suspenders prevention. Vary: Authorization ensures that even if a shared cache ignores private, it will not serve one user’s response to another.
Immutable versioned asset:
Cache-Control: public, max-age=31536000, immutable
A URL containing a content hash (/app.a3f9b2.js) will never change. immutable tells the browser not to issue conditional revalidation requests during the TTL. The edge may keep the object for up to one year (31536000 seconds); deploy a new URL when content changes.
Vary Header and Cache Key Dimensionality
The Vary header is the primary control surface for cache key dimensionality. Each value listed in Vary creates a new dimension in the cache key — the edge stores separate entries for requests that differ along that dimension.
Vary: Accept-Encoding creates entries for gzip, br, identity, and every combination a client might send. Most CDNs normalize compression encoding internally and exclude Accept-Encoding from the cache key, but this behavior is vendor-specific. Check with curl -H "Accept-Encoding: zstd" -sI https://example.com/asset.js | grep -i vary.
Vary: Accept-Language creates one entry per language the origin returns distinct content for. For a site with 20 locales, a single resource could generate 20 cache entries. Structure content negotiation via URL (/en/, /fr/) rather than headers to avoid this fragmentation.
Vary: Cookie or Vary: Authorization effectively disables shared caching — the cookie or token value becomes part of the cache key, and every unique user gets a distinct cache miss. Use private or no-store explicitly instead.
The detailed mechanism and reduction strategies are covered in mapping Vary headers to edge routing.
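To make the dimensionality concrete, the following Python sketch builds a cache key from the URL plus the request headers named in Vary. It is a simplified model, not any vendor's actual key format. `cache_key` and `normalize_accept_encoding` are hypothetical names, and the normalization ignores q-values for brevity.

```python
from typing import Dict, Iterable
from urllib.parse import urlsplit


def normalize_accept_encoding(value: str) -> str:
    """Collapse arbitrary Accept-Encoding values to br, gzip, or identity."""
    offered = {token.split(";")[0].strip().lower() for token in value.split(",")}
    for preferred in ("br", "gzip"):
        if preferred in offered:
            return preferred
    return "identity"


def cache_key(url: str, request_headers: Dict[str, str],
              vary: Iterable[str]) -> str:
    """Build a lookup key from host, path, query, and each Vary dimension."""
    parts = urlsplit(url)
    key = [parts.netloc.lower(), parts.path or "/", parts.query]
    lowered = {k.lower(): v for k, v in request_headers.items()}
    for name in sorted(h.strip().lower() for h in vary):
        value = lowered.get(name, "")
        if name == "accept-encoding":
            # Normalizing keeps the number of stored variants bounded.
            value = normalize_accept_encoding(value)
        key.append(f"{name}={value}")
    return "|".join(key)


chrome_key = cache_key("https://example.com/asset.js",
                       {"Accept-Encoding": "gzip, deflate, br"},
                       ["Accept-Encoding"])
other_key = cache_key("https://EXAMPLE.com/asset.js",
                      {"accept-encoding": "br;q=1.0, gzip;q=0.8"},
                      ["Accept-Encoding"])
```

Both requests resolve to the same key because the encoding dimension is normalized before lookup. Without normalization, each distinct Accept-Encoding string would produce its own entry.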
Cross-Cutting Production Patterns
Pattern 1: CDN-Browser Split TTL
Set a long edge TTL and a short browser TTL. Purge the CDN when content changes; let browsers refresh naturally from the CDN within their short TTL.
Cache-Control: public, max-age=300, s-maxage=604800
Surrogate-Key: product-42 category-electronics
The Surrogate-Key header (Fastly) or Cache-Tag header (Cloudflare) attaches metadata for targeted purges. When product 42 updates, issue a tag-based purge for product-42 rather than a URL purge.
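The mechanics of a tag-based purge can be modeled with an in-memory index from tag to URLs. This Python sketch is a toy model of the idea, not the Fastly or Cloudflare purge API. `TaggedCache` and its methods are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, Optional, Set


class TaggedCache:
    """Toy cache that indexes stored objects by surrogate key."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}
        self._urls_by_tag: Dict[str, Set[str]] = defaultdict(set)
        self._tags_by_url: Dict[str, Set[str]] = {}

    def store(self, url: str, body: bytes, surrogate_key: str) -> None:
        self.evict(url)  # drop any previous entry and its tag links
        tags = set(surrogate_key.split())  # header value is space-separated
        self._objects[url] = body
        self._tags_by_url[url] = tags
        for tag in tags:
            self._urls_by_tag[tag].add(url)

    def evict(self, url: str) -> None:
        self._objects.pop(url, None)
        for tag in self._tags_by_url.pop(url, set()):
            self._urls_by_tag[tag].discard(url)

    def purge_tag(self, tag: str) -> int:
        """Evict every object carrying the tag; return how many were removed."""
        urls = list(self._urls_by_tag.get(tag, ()))
        for url in urls:
            self.evict(url)
        return len(urls)

    def get(self, url: str) -> Optional[bytes]:
        return self._objects.get(url)


cache = TaggedCache()
cache.store("/p/42", b"a", "product-42 category-electronics")
cache.store("/p/43", b"b", "product-43 category-electronics")
removed = cache.purge_tag("product-42")
```

Purging product-42 removes only the one product page, while a purge of category-electronics would remove every object tagged with that category.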
Pattern 2: API Response Caching
Short TTL with explicit revalidation. The ETag header enables conditional 304 responses that avoid full body retransmission.
Cache-Control: public, s-maxage=60, stale-while-revalidate=30
ETag: "v2-a9f3"
Vary: Accept
Vary: Accept ensures JSON and XML representations are cached separately. The 30-second stale-while-revalidate window prevents the latency spike at TTL boundary — the edge serves the stale response while asynchronously fetching a fresh copy.
Pattern 3: Authenticated Session Pages
Prevent shared-cache storage entirely. Use no-store for pages containing session-specific data.
Cache-Control: private, no-store
Pragma: no-cache
Pragma: no-cache is a legacy HTTP/1.0 header. RFC 9111 §5.4 deprecates it and notes that its meaning in responses was never specified, so it is not a reliable substitute for Cache-Control. Include it only if you need backward compatibility with HTTP/1.0 proxies.
Pattern 4: Stale-While-Revalidate for High-Traffic APIs
For endpoints that tolerate brief staleness, use stale-while-revalidate to completely eliminate the latency impact of TTL expiry.
Cache-Control: public, s-maxage=120, stale-while-revalidate=600, stale-if-error=3600
The edge serves stale for up to 10 minutes while revalidating in the background. On origin errors, it falls back to stale content for 1 hour. Combined with request collapsing, this pattern allows a single origin replica to serve millions of edge requests.
Diagnostic and Debugging Reference
Inspect Cache-Control and Age headers
curl -sI https://example.com/resource \
| grep -iE '^(cache-control|age|vary|etag|x-cache[^:]*|cf-cache-status):'
Age: 0 means the response was just fetched from origin (cache MISS or first population). A non-zero Age confirms cache HIT. Age larger than the s-maxage value indicates the CDN is serving stale content — check stale-while-revalidate configuration.
Verify conditional revalidation
curl -sI -H 'If-None-Match: "abc123"' https://example.com/resource
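A 304 Not Modified response confirms that the supplied validator matches the current representation. The CDN may answer from its own stored copy, so a 304 alone does not prove the request reached origin; check the cache-status header to see which tier responded. A 200 with a new ETag means the resource changed. A 200 with no ETag means the origin does not support ETag validation; add it at origin.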
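The decision a cache or origin makes for If-None-Match can be sketched in Python. The sketch assumes the weak comparison that RFC 9110 specifies for If-None-Match. `revalidation_status` and `strip_weak` are hypothetical names, and the comma split is a simplification that would break on entity tags containing commas.

```python
def strip_weak(etag: str) -> str:
    """Drop the W/ prefix so weak and strong tags compare equal."""
    etag = etag.strip()
    return etag[2:] if etag.startswith("W/") else etag


def revalidation_status(if_none_match: str, stored_etag: str) -> int:
    """Return 304 when If-None-Match matches the stored ETag, else 200."""
    if if_none_match.strip() == "*":
        # "*" matches any existing representation.
        return 304
    candidates = [strip_weak(tag) for tag in if_none_match.split(",")]
    return 304 if strip_weak(stored_etag) in candidates else 200


match = revalidation_status('"abc123"', '"abc123"')
changed = revalidation_status('"abc123"', '"v2-a9f3"')
```

A matching validator yields 304 with no body. Any mismatch yields 200 with the full representation and its new ETag.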
Check CDN-specific cache status
| Header | CDN | Values and meanings |
|---|---|---|
| CF-Cache-Status | Cloudflare | HIT, MISS, EXPIRED, STALE, BYPASS, DYNAMIC, REVALIDATED |
| X-Cache | CloudFront / Varnish | Hit from cloudfront, Miss from cloudfront, HIT, MISS |
| X-Cache-Status | Nginx proxy | HIT, MISS, EXPIRED, UPDATING, STALE, BYPASS |
| Fastly-Debug-Digest | Fastly | Request fingerprint for cache-key debugging |
| Surrogate-Key | Fastly (response) | Tags attached to this object |
Test Vary-based cache fragmentation
curl -sI -H "Accept-Language: en-US" https://example.com/page | grep -i 'age\|x-cache'
curl -sI -H "Accept-Language: fr-FR" https://example.com/page | grep -i 'age\|x-cache'
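Run the first command twice so the en-US entry is warm, then run the second. If the fr-FR request returns Age: 0 (a MISS) even though en-US is already cached, the CDN stored separate entries and Vary: Accept-Language is creating fragmentation. If the fr-FR request returns a non-zero Age, the CDN is ignoring the language dimension (usually intentional).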
Verify purge propagation
After a purge, confirm Age: 0 on the previously cached resource:
curl -sI https://example.com/resource | grep -i 'age\|x-cache'
Age: 0 with X-Cache: MISS confirms the purge succeeded and the resource was re-fetched from origin. Propagation across all PoPs is not instantaneous and varies by vendor; allow roughly 5 to 30 seconds before concluding that a purge failed.
DevTools cache inspection
In the Chrome DevTools Network tab, enable the "Size" column. Entries showing (memory cache) or (disk cache) were served from the browser cache. Entries showing a byte size were fetched over the network, from either the CDN or origin. The Timing waterfall shows TTFB; a low TTFB (under 20ms) on a network-fetched resource suggests a CDN HIT at a nearby PoP, but confirm it with the cache-status header.
Common Mistakes and RFC Violations
| Anti-pattern | RFC rule violated | Fix |
|---|---|---|
| no-store, max-age=3600 | RFC 9111 §5.2.2.5: no-store prohibits storage; max-age is meaningless | Remove max-age when using no-store |
| Cache-Control: no-cache without ETag or Last-Modified | Revalidation requires a validator; without one, origin must return 200 every time | Add ETag generation at origin |
| Vary: * | RFC 9111 §4.1: a stored response whose Vary contains * always fails to match, so it cannot be reused without revalidation | Use specific header names or remove Vary |
| Vary: Cookie on public pages | Creates a unique cache key per user; hit rate collapses to near zero | Remove Vary: Cookie; use private instead if content is user-specific |
| s-maxage without public on non-credentialed responses | RFC 9111 §5.2.2.10 implies s-maxage authorizes shared caching, but some CDNs require public explicitly | Add public alongside s-maxage |
| Pragma: no-cache as sole cache-prevention directive | RFC 9111 §5.4 deprecates Pragma and notes its meaning in responses was never specified, so caches are not required to honor it | Use Cache-Control: no-cache or no-store |
| Setting max-age=0 instead of no-cache | max-age=0 makes content immediately stale but caches may still serve it without revalidation under some conditions | Use no-cache for explicit revalidation requirement |
| Long max-age without versioned URLs | Content changes cannot reach users within the TTL; no invalidation path | Use content-hashed URLs for immutable assets; use short TTL + CDN purge for mutable resources |
FAQ
Does s-maxage apply to all CDNs the same way?
RFC 9111 defines s-maxage as binding on any shared cache, but vendor CDNs often layer their own override mechanisms on top — Cloudflare Page Rules, Fastly VCL, and Varnish TTL headers can each supersede the header value. Always verify with a cache-status header rather than assuming the header is respected literally.
Why do different PoPs return different Age header values for the same URL?
Each CDN Point of Presence maintains independent cache state. A cache miss at one PoP populates that node’s storage independently of other PoPs, so Age values diverge based on when each PoP first stored the response. Origin shielding reduces this variance by funneling all PoP misses through a single shield node.
What is the difference between a cache purge and a cache invalidation?
A purge removes the cached entry immediately and forces the next request to fetch from origin. An invalidation (soft purge) marks the entry stale — the CDN continues serving it under stale-while-revalidate semantics until the next background revalidation updates it. Purge is disruptive but immediate; invalidation is softer but requires revalidation infrastructure.
Can Vary: Accept-Encoding cause cache fragmentation on a CDN?
Yes. If the CDN does not normalize Accept-Encoding before lookup, a single resource may be stored as separate entries for "gzip", "br", "gzip, br", and the uncompressed form. Most CDNs handle compression normalization internally and strip Vary: Accept-Encoding from the cache key; verify this in your CDN documentation before relying on it.
How does request collapsing interact with Cache-Control: no-cache?
When Cache-Control: no-cache is present, RFC 9111 requires that caches revalidate before serving. Request collapsing still applies: concurrent revalidation requests for the same resource can be collapsed into a single conditional request to origin. The CDN sends one If-None-Match or If-Modified-Since request and fans the 304 response out to all waiting clients.
Should I use s-maxage or a CDN-specific TTL header?
s-maxage is portable across all RFC-compliant shared caches and is the standard approach. CDN-targeted headers (CDN-Cache-Control, which is standardized in RFC 9213 and supported by Cloudflare, and Fastly's Surrogate-Control) are useful when you need to set different edge and browser TTLs without exposing CDN directives to end clients. Use the standard header first; add targeted or vendor headers only when you need the override capability.
Related
- The mechanics of exactly how edge nodes construct lookup identifiers — including query string normalization, header inclusion, and key collision risks — are covered in How CDN Cache Keys Are Generated.
- When deploying across multiple CDN vendors, understanding how no-cache differs from no-store prevents misconfigurations that silently bypass intended caching at the edge.
- The interaction between s-maxage and max-age when setting split browser/edge TTLs is covered in detail at Mastering max-age and s-maxage Directives.
- Understanding how freshness is calculated and when revalidation triggers at the HTTP layer provides the foundation for CDN TTL decisions: Freshness vs Validation Models Explained.