CDN Architecture & Edge Routing Strategies

This reference covers how CDN edge infrastructure evaluates Cache-Control directives, builds cache keys, routes requests via Anycast, and propagates invalidations across distributed PoPs. It is aimed at backend developers and CDN engineers who need to reason about shared-cache behavior at the network layer — not just in the browser. The four main areas explored in depth through the sections below are how CDN cache keys are generated, how Vary headers drive edge routing decisions, origin shielding and request collapsing, and tag-based cache invalidation.


RFC and Specification Anchor

CDN behavior at the edge is governed by two IETF standards:

  • RFC 9111 (HTTP Caching) defines the semantics of Cache-Control directives, freshness calculation, validation, and the obligations of shared caches. Sections 5.2 (directive definitions), 4.2 (freshness), and 3.5 (storing responses to authenticated requests) are the most CDN-relevant.
  • RFC 9110 (HTTP Semantics) defines the Vary header (Section 12.5.5), conditional request headers (If-None-Match, If-Modified-Since), and the status codes (304 Not Modified) that underpin revalidation.

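RFC 9111 §4.2 establishes the freshness calculation that every compliant cache must implement:

freshness_lifetime = s-maxage (if present, shared caches only)
                   else max-age (if present)
                   else Expires - Date (if Expires is present)
                   else heuristic (commonly 10% of the time since Last-Modified)

current_age = corrected_initial_age + resident_time

is_fresh = (freshness_lifetime > current_age)

corrected_initial_age is derived from the received Age header and the response delay (RFC 9111 §4.2.3); resident_time is how long the response has been held in this cache.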

RFC 9111 §5.2.2.10 specifies that s-maxage overrides max-age for shared caches only. A browser receiving the same response uses max-age because s-maxage is defined to apply exclusively to shared caches (CDNs, reverse proxies).
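
The selection of a freshness lifetime can be sketched in a few lines of Python. This is a minimal illustration, not a complete RFC 9111 implementation: the function names are hypothetical, and Expires and heuristic freshness are deliberately left out.

```python
from typing import Optional


def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control value into {directive: argument-or-None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def freshness_lifetime(cache_control: str, shared: bool) -> Optional[int]:
    """Return the explicit freshness lifetime in seconds, or None if absent.

    Assumption: Expires and heuristic freshness are out of scope here.
    """
    d = parse_cache_control(cache_control)
    if shared and d.get("s-maxage") is not None:
        return int(d["s-maxage"])          # shared caches prefer s-maxage
    if d.get("max-age") is not None:
        return int(d["max-age"])
    return None


def is_fresh(cache_control: str, current_age: int, shared: bool) -> bool:
    """Apply is_fresh = (freshness_lifetime > current_age)."""
    lifetime = freshness_lifetime(cache_control, shared)
    return lifetime is not None and lifetime > current_age
```

Calling is_fresh("public, max-age=60, s-maxage=86400", 120, shared=True) treats the response as fresh at the edge, while the same call with shared=False treats it as stale for a browser.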

Directive precedence table (shared cache):

| Priority | Directive or header | Effect |
| --- | --- | --- |
| 1 | no-store | No storage at any tier; overrides all other directives |
| 2 | s-maxage | Sets edge TTL; overrides max-age for shared caches |
| 3 | max-age | Sets TTL when s-maxage is absent |
| 4 | Expires | Legacy fallback; ignored when max-age or s-maxage present |
| 5 | Heuristic freshness | Applied when no explicit TTL is set |

RFC 9111 §5.2 also requires caches to ignore unrecognized directives rather than return an error. A typo in a directive name (no-stor instead of no-store) is therefore dropped silently, and the response falls back to whatever other directives or heuristics apply, with no observable error. Always verify directive names with a response header audit.
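
A header audit for misspelled directives could look like the sketch below. The set of known names is an assumption chosen for illustration; legitimate extension directives outside it would also be flagged, so treat the output as a prompt for review rather than a verdict.

```python
# Assumed allow-list of response directives; extend it for your own stack.
KNOWN_RESPONSE_DIRECTIVES = {
    "max-age", "s-maxage", "no-cache", "no-store", "no-transform",
    "public", "private", "must-revalidate", "proxy-revalidate",
    "must-understand", "immutable", "stale-while-revalidate", "stale-if-error",
}


def unknown_directives(cache_control: str) -> list:
    """Return directive names not in the known set, in header order."""
    unknown = []
    for part in cache_control.split(","):
        # Drop any "=value" argument and compare case-insensitively.
        name = part.strip().partition("=")[0].strip().lower()
        if name and name not in KNOWN_RESPONSE_DIRECTIVES:
            unknown.append(name)
    return unknown
```

Running unknown_directives("no-stor, max-age=0") reports the misspelled no-stor, which a cache would otherwise ignore without complaint.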


Concept Map & Terminology

These are the foundational terms used across all sections of this reference. Understanding each one precisely is required before tuning CDN behavior.

Point of Presence (PoP): A geographically distributed data center where a CDN terminates client connections and maintains its edge cache. Each PoP holds independent cache state — a hit at one PoP does not populate any other PoP’s cache.

Cache key: The string identifier used to look up a stored response. At minimum it is the request URI. CDNs extend it by incorporating selected request headers and query parameters. See how CDN cache keys are generated for the full normalization pipeline.

s-maxage: The RFC 9111 directive that sets the freshness lifetime specifically for shared caches. Browsers ignore it; CDN edge nodes honor it in preference to max-age. Setting both allows independent TTL control per cache tier.

Vary header: A response header that tells caches which request header dimensions must match before serving a stored response. Vary: Accept-Encoding means the cache must store separate entries for gzip and brotli variants. Misconfigured Vary is the primary cause of cache key fragmentation. See mapping Vary headers to edge routing.

Origin shield (mid-tier cache): An optional dedicated cache tier that sits between edge PoPs and the origin server. All PoPs route misses to the shield rather than directly to origin. This collapses N PoP misses into a single shield-to-origin request. See origin shielding and request collapsing.

Request collapsing (coalescing): When multiple concurrent requests arrive for the same uncached resource, the CDN forwards only one upstream and holds the others. When the upstream response arrives, it fans out to all waiting clients. Prevents thundering-herd origin saturation.

Surrogate key (cache tag): A vendor-specific metadata tag attached to cached objects at response time. Tags enable bulk invalidation of logically related objects by tag value rather than by URL. See tag-based cache invalidation patterns.

stale-while-revalidate: An extension directive defined in RFC 5861 that allows a CDN to serve a stale response immediately while asynchronously triggering a background revalidation request to origin. It removes the latency spike of synchronous revalidation at TTL expiry.

Age header: The number of seconds since the cached response was generated or revalidated at the origin. The CDN increments this value as the response ages in cache. A response with Age: 0 was just fetched from origin; a high Age value means the cached copy is old.

Anycast routing: A network routing strategy where multiple CDN PoPs advertise the same IP prefix via BGP. The client’s ISP routes the connection to the topologically nearest PoP. Contrast with unicast, where a single server holds the IP.


Architecture Overview

The diagram below shows the full request flow through CDN edge tiers, from client through Anycast routing, edge PoP, optional shield node, to origin — and the cache decision logic at each stage.

[Figure: CDN Edge Request Flow and Cache Decision Architecture. Sequence diagram showing how a client request travels through Anycast DNS to the nearest CDN PoP, where a cache hit is served immediately, a miss escalates to the origin shield, and a shield miss escalates to origin. Cache-Control headers and Age values are shown at each transition point. Participants: Client, Anycast / DNS, Edge PoP, Shield Node, Origin. An edge hit returns 200 with Age: N and X-Cache: HIT; a full miss is stored at the PoP and returned as 200 with Age: 0 and X-Cache: MISS. Concurrent misses are queued behind one upstream fetch (request collapsing).]
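
The edge, shield, and origin escalation can be modelled with a small in-memory sketch. The Tier class, the PoP names, and the origin function are hypothetical, and freshness is ignored entirely so that only the lookup order is shown.

```python
from typing import Callable, Dict, Tuple


class Tier:
    """One cache tier (edge PoP or shield) with a simple in-memory store."""

    def __init__(self, name: str, upstream: Callable[[str], Tuple[str, str]]):
        self.name = name
        self.upstream = upstream          # called only on a miss
        self.store: Dict[str, str] = {}

    def get(self, url: str) -> Tuple[str, str]:
        """Return (body, served_by), where served_by names the answering tier."""
        if url in self.store:
            return self.store[url], self.name
        body, served_by = self.upstream(url)
        self.store[url] = body            # populate this tier on the way back
        return body, served_by


origin_hits = []


def origin(url: str) -> Tuple[str, str]:
    """Stand-in origin server that records every request it receives."""
    origin_hits.append(url)
    return f"body of {url}", "origin"


shield = Tier("shield", origin)
edge_fra = Tier("edge-fra", shield.get)   # two PoPs share one shield
edge_nyc = Tier("edge-nyc", shield.get)
```

In this model a first request through edge_fra reaches origin, while a later request for the same URL through edge_nyc is answered by the shield, so origin sees the URL only once.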

Cache-Control Directives at the Edge

The table below maps each CDN-relevant directive to its edge-node behavior under RFC 9111. Directives that treat shared and private caches differently (private, proxy-revalidate) are included for contrast.

| Directive | Edge behavior | Browser behavior |
| --- | --- | --- |
| s-maxage=N | Sets edge TTL to N seconds; overrides max-age | Ignored |
| max-age=N | Used as TTL only when s-maxage absent | Sets browser TTL |
| no-store | Must not cache; every request bypasses cache | Must not cache |
| no-cache | May store; must revalidate before serving | Must revalidate |
| public | Explicitly authorizes shared-cache storage | No browser effect |
| private | Must not store in shared cache | Browser may cache |
| stale-while-revalidate=N | Serve stale for up to N seconds past expiry; revalidate async in background | Same where supported |
| stale-if-error=N | Serve stale for up to N seconds past expiry on 5xx from origin | Same where supported |
| must-revalidate | Once stale, must revalidate before reuse; returns 504 if origin is unreachable | Same |
| proxy-revalidate | Like must-revalidate but shared-cache-only | Ignored |
| immutable | Content never changes; skip revalidation during TTL | Same |

CDN split TTL pattern — different TTLs for edge and browser:

Cache-Control: public, max-age=60, s-maxage=86400

The browser caches for 60 seconds; the CDN edge caches for 24 hours. This is the standard approach for content that can be purged at the CDN without requiring client-side cache invalidation.

Static asset with background refresh:

Cache-Control: public, max-age=3600, stale-while-revalidate=86400, stale-if-error=604800

Once the 1-hour freshness lifetime expires, the edge keeps serving the stored copy immediately for up to 24 more hours (stale-while-revalidate=86400) while it revalidates in the background. On a 5xx from origin, it serves stale for up to 7 days past expiry.

Authenticated API response — prevent CDN caching:

Cache-Control: private, no-store
Vary: Authorization

private prevents shared-cache storage. no-store adds belt-and-suspenders prevention. Vary: Authorization ensures that even if a shared cache ignores private, it will not serve one user’s response to another.

Immutable versioned asset:

Cache-Control: public, max-age=31536000, immutable

A URL containing a content hash (/app.a3f9b2.js) will never change. immutable tells the browser not to issue conditional revalidation requests during the TTL. The edge may keep the object for up to one year (max-age=31536000), subject to eviction; deploy a new URL when content changes.


Vary Header and Cache Key Dimensionality

The Vary header is the primary control surface for cache key dimensionality. Each value listed in Vary creates a new dimension in the cache key — the edge stores separate entries for requests that differ along that dimension.

Vary: Accept-Encoding creates entries for gzip, br, identity, and every combination a client might send. Most CDNs normalize compression encoding internally and exclude Accept-Encoding from the cache key, but this behavior is vendor-specific. Check with curl -H "Accept-Encoding: zstd" -sI https://example.com/asset.js | grep -i vary.

Vary: Accept-Language creates one entry per language the origin returns distinct content for. For a site with 20 locales, a single resource could generate 20 cache entries. Structure content negotiation via URL (/en/, /fr/) rather than headers to avoid this fragmentation.

Vary: Cookie or Vary: Authorization effectively disables shared caching — the cookie or token value becomes part of the cache key, and every unique user gets a distinct cache miss. Use private or no-store explicitly instead.

The detailed mechanism and reduction strategies are covered in mapping Vary headers to edge routing.
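
How Vary adds dimensions to a cache key can be sketched as follows. This is an illustrative model rather than any vendor's algorithm: the key layout, the function names, and the br-then-gzip preference are assumptions, and q-values (including q=0) are ignored for brevity.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode


def normalize_encoding(accept_encoding: str) -> str:
    """Collapse arbitrary Accept-Encoding values into br, gzip, or identity."""
    offered = {token.split(";")[0].strip().lower()
               for token in accept_encoding.split(",")}
    for preferred in ("br", "gzip"):
        if preferred in offered:
            return preferred
    return "identity"


def cache_key(url: str, request_headers: dict, vary: str) -> str:
    """Build a key from the URL plus each request header named in Vary."""
    parts = urlsplit(url)
    query = urlencode(sorted(parse_qsl(parts.query)))  # order-insensitive
    key = [parts.netloc.lower(), parts.path, query]
    headers = {k.lower(): v for k, v in request_headers.items()}
    for name in sorted(h.strip().lower() for h in vary.split(",") if h.strip()):
        value = headers.get(name, "")
        if name == "accept-encoding":
            value = normalize_encoding(value)  # avoid one entry per raw value
        key.append(f"{name}={value}")
    return "|".join(key)
```

With normalization, "gzip, deflate, br" and "br;q=1.0, gzip;q=0.8" map to the same key, whereas two different Accept-Language values under Vary: Accept-Language produce two distinct keys.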


Cross-Cutting Production Patterns

Pattern 1: CDN-Browser Split TTL

Set a long edge TTL and a short browser TTL. Purge the CDN when content changes; let browsers refresh naturally from the CDN within their short TTL.

Cache-Control: public, max-age=300, s-maxage=604800
Surrogate-Key: product-42 category-electronics

The Surrogate-Key header (Fastly) or Cache-Tag header (Cloudflare) attaches metadata for targeted purges. When product 42 updates, issue a tag-based purge for product-42 rather than a URL purge.
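
Conceptually, a tag-based purge relies on a reverse index from tag to cached objects. The sketch below models that index in memory; the TaggedCache class is hypothetical and does not reflect how Fastly or Cloudflare implement it internally.

```python
from collections import defaultdict


class TaggedCache:
    """In-memory cache with a reverse index from tag to cached URLs."""

    def __init__(self):
        self.objects = {}                  # url -> body
        self.by_tag = defaultdict(set)     # tag -> {url, ...}

    def store(self, url: str, body: str, surrogate_key: str) -> None:
        self.objects[url] = body
        for tag in surrogate_key.split():  # tags are space-separated
            self.by_tag[tag].add(url)

    def purge_tag(self, tag: str) -> int:
        """Remove every object carrying the tag; return how many were removed."""
        purged = 0
        for url in self.by_tag.pop(tag, set()):
            # An object may already be gone via another tag's purge.
            if self.objects.pop(url, None) is not None:
                purged += 1
        return purged
```

Storing /products/42 with "product-42 category-electronics" and then calling purge_tag("product-42") removes that one object and leaves other category-electronics entries in place.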

Pattern 2: API Response Caching

Short TTL with explicit revalidation. The ETag header enables conditional 304 responses that avoid full body retransmission.

Cache-Control: public, s-maxage=60, stale-while-revalidate=30
ETag: "v2-a9f3"
Vary: Accept

Vary: Accept ensures JSON and XML representations are cached separately. The 30-second stale-while-revalidate window prevents the latency spike at TTL boundary — the edge serves the stale response while asynchronously fetching a fresh copy.
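
On the origin side, the conditional exchange might be handled as in this sketch. The hash-derived ETag and the function names are assumptions, and the comma split of If-None-Match is naive (it would mis-handle entity tags that themselves contain commas).

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a strong ETag from the body (the hash choice is an assumption)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: str = "") -> tuple:
    """Return (status, headers, body), honoring If-None-Match."""
    etag = make_etag(body)
    headers = {
        "ETag": etag,
        "Cache-Control": "public, s-maxage=60, stale-while-revalidate=30",
    }
    candidates = [t.strip() for t in if_none_match.split(",") if t.strip()]
    # If-None-Match uses weak comparison, so drop any W/ prefix first.
    candidates = [t[2:] if t.startswith("W/") else t for t in candidates]
    if "*" in candidates or etag in candidates:
        return 304, headers, b""          # validator matched: no body sent
    return 200, headers, body
```

A first call returns 200 with the body and an ETag; replaying that ETag in if_none_match returns 304 with an empty body until the content changes.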

Pattern 3: Authenticated Session Pages

Prevent shared-cache storage entirely. Use no-store for pages containing session-specific data.

Cache-Control: private, no-store
Pragma: no-cache

Pragma: no-cache is a legacy HTTP/1.0 request header. RFC 9111 §5.4 deprecates it and defines no meaning for it in responses, so compliant caches are not required to act on it; Cache-Control carries the actual instruction here. Include it only if you need backward compatibility with HTTP/1.0 proxies.

Pattern 4: Stale-While-Revalidate for High-Traffic APIs

For endpoints that tolerate brief staleness, use stale-while-revalidate to completely eliminate the latency impact of TTL expiry.

Cache-Control: public, s-maxage=120, stale-while-revalidate=600, stale-if-error=3600

The edge serves stale for up to 10 minutes while revalidating in the background. On origin errors, it falls back to stale content for 1 hour. Combined with request collapsing, this pattern allows a single origin replica to serve millions of edge requests.
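
Request collapsing itself can be sketched with a lock and one shared future per key. The CollapsingFetcher class and its collapsed counter are hypothetical names for illustration; a production implementation would also need timeouts and storage of the fetched response.

```python
import threading
from concurrent.futures import Future


class CollapsingFetcher:
    """Forward one upstream fetch per key; concurrent callers share its result."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._lock = threading.Lock()
        self._inflight = {}       # key -> Future for the fetch in progress
        self.collapsed = 0        # requests that joined an in-flight fetch

    def get(self, key):
        with self._lock:
            future = self._inflight.get(key)
            leader = future is None
            if leader:
                future = self._inflight[key] = Future()
            else:
                self.collapsed += 1
        if leader:
            try:
                future.set_result(self._fetch(key))
            except Exception as exc:      # propagate failure to every waiter
                future.set_exception(exc)
            finally:
                with self._lock:
                    del self._inflight[key]
        return future.result()            # followers block here until done
```

The first caller for a key becomes the leader and performs the upstream fetch; callers arriving while it is in flight wait on the same future and receive the same result or the same exception.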


Diagnostic and Debugging Reference

Inspect Cache-Control and Age headers

curl -sI https://example.com/resource \
  | grep -iE '^(cache-control|age|vary|etag|x-cache|cf-cache-status):'

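Age: 0 means the response was just fetched from origin (cache MISS or first population). A non-zero Age means the response came out of a cache, either this edge node or an upstream tier such as a shield; confirm an edge HIT with the cache-status header. An Age larger than the s-maxage value indicates the CDN is serving stale content; check the stale-while-revalidate configuration.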
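
The same reading of Age against the TTL can be automated over captured curl -sI output. This is a rough sketch with hypothetical function names; it considers only s-maxage and max-age and treats a missing Age header as 0.

```python
import re


def parse_headers(raw: str) -> dict:
    """Parse `curl -sI` output into a lowercase-keyed dict (last value wins)."""
    headers = {}
    for line in raw.splitlines():
        name, sep, value = line.partition(":")
        if sep and not line.startswith("HTTP/"):
            headers[name.strip().lower()] = value.strip()
    return headers


def edge_freshness(raw: str) -> str:
    """Compare Age with the shared-cache TTL and report the likely state."""
    headers = parse_headers(raw)
    age = int(headers.get("age", "0"))
    cc = headers.get("cache-control", "")
    # Prefer s-maxage, as a shared cache would, then fall back to max-age.
    match = (re.search(r"s-maxage=(\d+)", cc, re.I)
             or re.search(r"max-age=(\d+)", cc, re.I))
    if match is None:
        return "no explicit TTL"
    ttl = int(match.group(1))
    if age == 0:
        return "just fetched"
    return "fresh" if age < ttl else "stale"
```

Feeding it a response with s-maxage=86400 and Age: 90000 reports "stale", which is the cue to check the stale-while-revalidate configuration.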

Verify conditional revalidation

curl -sI -H 'If-None-Match: "abc123"' https://example.com/resource

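A 304 Not Modified response confirms that the supplied validator matches the current representation. The CDN may answer from its own fresh copy without contacting origin, so a 304 alone does not prove the request reached origin. A 200 with a different ETag means the supplied validator did not match, typically because the resource changed. A 200 with no ETag means the origin does not support ETag validation; add it at origin.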

Check CDN-specific cache status

| Header | CDN | Values and meanings |
| --- | --- | --- |
| CF-Cache-Status | Cloudflare | HIT, MISS, EXPIRED, STALE, BYPASS, DYNAMIC, REVALIDATED |
| X-Cache | CloudFront / Varnish | Hit from cloudfront, Miss from cloudfront, HIT, MISS |
| X-Cache-Status | Nginx proxy | HIT, MISS, EXPIRED, UPDATING, STALE, BYPASS |
| Fastly-Debug-Digest | Fastly | Request fingerprint for cache-key debugging |
| Surrogate-Key | Fastly (response) | Tags attached to this object |

Test Vary-based cache fragmentation

curl -sI -H "Accept-Language: en-US" https://example.com/page | grep -i 'age\|x-cache'
curl -sI -H "Accept-Language: fr-FR" https://example.com/page | grep -i 'age\|x-cache'

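Warm the en-US entry first by running the first command twice, then send the fr-FR request. If fr-FR returns Age: 0 (or a MISS) while the repeated en-US request returned a non-zero Age, the CDN stored separate entries and Vary: Accept-Language is fragmenting the cache. If fr-FR returns a non-zero Age on its first request, the CDN is ignoring the language dimension (usually intentional).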

Verify purge propagation

After a purge, confirm Age: 0 on the previously cached resource:

curl -sI https://example.com/resource | grep -i 'age\|x-cache'

Age: 0 with X-Cache: MISS confirms the purge succeeded and the resource was re-fetched from origin. Propagation time varies by vendor; allow 5–30 seconds for the purge to reach all PoPs before re-testing.

DevTools cache inspection

In Chrome DevTools, Network tab: enable the “Size” column. Entries showing (memory cache) or (disk cache) were served from the browser cache; entries showing a byte size were fetched over the network from the CDN or origin. The Timing waterfall shows TTFB: a low TTFB (under 20ms) on a network-fetched resource suggests a CDN HIT at a nearby PoP.


Common Mistakes and RFC Violations

| Anti-pattern | RFC rule violated | Fix |
| --- | --- | --- |
| no-store, max-age=3600 | RFC 9111 §5.2.2.5: no-store prohibits storage; max-age is meaningless | Remove max-age when using no-store |
| Cache-Control: no-cache without ETag or Last-Modified | Revalidation requires a validator; without one, origin must return 200 every time | Add ETag generation at origin |
| Vary: * | RFC 9111 §4.1: a stored response with Vary: * never matches, so it cannot be reused | Use specific header names or remove Vary |
| Vary: Cookie on public pages | Creates a unique cache key per user; hit rate collapses to near zero | Remove Vary: Cookie; use private instead if content is user-specific |
| s-maxage without public on non-credentialed responses | RFC 9111 §5.2.2.10 implies s-maxage authorizes shared caching, but some CDNs require public explicitly | Add public alongside s-maxage |
| Pragma: no-cache as sole cache-prevention directive | HTTP/1.1 caches are not required to honor Pragma; RFC 9111 §5.4 deprecates it | Use Cache-Control: no-cache or no-store |
| Setting max-age=0 instead of no-cache | max-age=0 makes content immediately stale but caches may still serve it without revalidation under some conditions | Use no-cache for explicit revalidation requirement |
| Long max-age without versioned URLs | Content changes cannot reach users within the TTL; no invalidation path | Use content-hashed URLs for immutable assets; use short TTL + CDN purge for mutable resources |
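
Several of these anti-patterns are mechanical enough to lint for. The checker below is a minimal sketch covering a subset of the rows; the function name and the warning strings are made up for illustration.

```python
def lint_response(headers: dict) -> list:
    """Return warnings for a few caching anti-patterns in response headers."""
    h = {k.lower(): v for k, v in headers.items()}
    directives = {p.strip().partition("=")[0].lower()
                  for p in h.get("cache-control", "").split(",") if p.strip()}
    vary = {v.strip().lower() for v in h.get("vary", "").split(",") if v.strip()}
    warnings = []
    if "no-store" in directives and directives & {"max-age", "s-maxage"}:
        warnings.append("no-store combined with a TTL directive")
    if "no-cache" in directives and not (h.keys() & {"etag", "last-modified"}):
        warnings.append("no-cache without a validator")
    if "*" in vary:
        warnings.append("Vary: * prevents reuse")
    if "cookie" in vary and not directives & {"private", "no-store"}:
        warnings.append("Vary: Cookie on a shared-cacheable response")
    if "pragma" in h and "cache-control" not in h:
        warnings.append("Pragma without Cache-Control")
    return warnings
```

Passing {"Cache-Control": "no-store, max-age=3600"} yields the first warning, while a clean split-TTL header such as "public, max-age=300, s-maxage=604800" yields an empty list.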

FAQ

Does s-maxage apply to all CDNs the same way?

RFC 9111 defines s-maxage as binding on any shared cache, but vendor CDNs often layer their own override mechanisms on top — Cloudflare Page Rules, Fastly VCL, and Varnish TTL headers can each supersede the header value. Always verify with a cache-status header rather than assuming the header is respected literally.

Why do different PoPs return different Age header values for the same URL?

Each CDN Point of Presence maintains independent cache state. A cache miss at one PoP populates that node’s storage independently of other PoPs, so Age values diverge based on when each PoP first stored the response. Origin shielding reduces this variance by funneling all PoP misses through a single shield node.

What is the difference between a cache purge and a cache invalidation?

A purge removes the cached entry immediately and forces the next request to fetch from origin. An invalidation (soft purge) marks the entry stale — the CDN continues serving it under stale-while-revalidate semantics until the next background revalidation updates it. Purge is disruptive but immediate; invalidation is softer but requires revalidation infrastructure.

Can Vary: Accept-Encoding cause cache fragmentation on a CDN?

Yes. If the CDN does not normalize Accept-Encoding before lookup, a single resource may be stored as separate entries for "gzip", "br", the combined value "gzip, br", and the uncompressed form. Most CDNs normalize compression internally so that Accept-Encoding does not fragment the cache key; verify this in your CDN documentation before relying on it.

How does request collapsing interact with Cache-Control: no-cache?

When Cache-Control: no-cache is present, RFC 9111 requires that caches revalidate before serving. Request collapsing still applies: concurrent revalidation requests for the same resource can be collapsed into a single conditional request to origin. The CDN sends one If-None-Match or If-Modified-Since request and, on a 304, serves the revalidated stored response to all waiting clients.

Should I use s-maxage or a CDN-specific TTL header?

s-maxage is portable across all RFC-compliant shared caches and is the standard approach. CDN-targeted headers (CDN-Cache-Control, defined in RFC 9213 and supported by Cloudflare among others; Fastly’s Surrogate-Control) are useful when you need to set different edge and browser TTLs without exposing CDN directives to end clients. Use the standard header first; add targeted headers only when you need the override capability.