Origin Shielding and Request Collapsing
TL;DR: Place a dedicated shield node between your edge PoPs and origin; enable request collapsing on the shield so that concurrent cache misses share a single upstream fetch rather than triggering parallel origin connections.
Cache-Control: public, s-maxage=86400, stale-while-revalidate=3600
Mechanism and RFC Alignment
Origin shielding introduces a dedicated mid-tier cache between distributed edge PoPs and the origin server. Every edge PoP that misses its local cache forwards the request to the shield node, not directly to origin. From the origin's perspective there is a single client, the shield, regardless of how many PoPs are serving traffic.
Request collapsing — also called request coalescing or request serialisation — intercepts concurrent identical requests at the shield. When multiple edge misses arrive simultaneously for the same cache key, the shield holds subsequent requests and satisfies them from a single upstream fetch once the response arrives.
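The hold-and-fan-out behaviour described above can be sketched in a few lines of Python. This is a toy model, not any CDN's implementation: `CollapsingCache`, `run_demo`, and the `slow_origin` callable are hypothetical names, and the cache never expires entries.

```python
import threading
import time


class CollapsingCache:
    """Toy shield: concurrent misses for one key share a single upstream fetch."""

    def __init__(self, fetch):
        self._fetch = fetch          # hypothetical callable: key -> response body
        self._lock = threading.Lock()
        self._store = {}             # key -> cached body
        self._in_flight = {}         # key -> Event set when the leader finishes

    def get(self, key):
        with self._lock:
            if key in self._store:
                return self._store[key]           # cache hit
            event = self._in_flight.get(key)
            leader = event is None
            if leader:
                event = threading.Event()
                self._in_flight[key] = event      # first miss becomes the leader
        if leader:
            try:
                body = self._fetch(key)           # the only upstream fetch
                with self._lock:
                    self._store[key] = body
            finally:
                with self._lock:
                    del self._in_flight[key]
                event.set()                       # release waiting followers
            return body
        event.wait()                              # follower: held until the leader is done
        return self.get(key)                      # now a hit, or a retry if the leader failed


def run_demo(clients=8):
    """Fire `clients` concurrent requests and count how many reach the origin."""
    origin_calls = []

    def slow_origin(key):
        origin_calls.append(key)
        time.sleep(0.05)                          # simulate origin latency
        return f"body-for-{key}"

    cache = CollapsingCache(slow_origin)
    results = []
    threads = [
        threading.Thread(target=lambda: results.append(cache.get("/asset.js")))
        for _ in range(clients)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(origin_calls), results
```

Calling `run_demo()` returns the number of origin fetches alongside the bodies delivered to each client; in this model every client receives the same body from one fetch.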
RFC 9111 §4.2 defines the freshness model that governs both layers. Section 4.2.1 specifies that max-age sets the freshness lifetime for both shared and private caches, while s-maxage (§5.2.2.10) overrides max-age for shared caches only, which includes CDN edge nodes and shield nodes. RFC 9111 §3 further specifies that no cache may store a response carrying no-store, and that a shared cache must not store a response carrying private. This is the normative basis for why those directives defeat shielding entirely.
Together, these mechanisms eliminate the thundering herd problem: when a cached entry expires, a burst of concurrent cache misses no longer triggers parallel origin connections. Without shielding, simultaneous edge misses from dozens of PoPs each independently fetch from origin, exhausting connection pools and database resources at exactly the moment traffic peaks.
Scope and Precedence
s-maxage targets shared caches and overrides max-age for every CDN node and shield tier. Browsers ignore s-maxage and fall back to max-age. This separation allows long shield and CDN TTLs while keeping browser TTLs short for independent invalidation control.
| Directive | Browser cache | Edge PoP | Shield node | Origin |
|---|---|---|---|---|
| max-age=3600 | 1 h TTL | 1 h TTL (unless overridden) | 1 h TTL (unless overridden) | ignored |
| s-maxage=86400 | ignored | 24 h TTL | 24 h TTL | ignored |
| private | cached | must bypass | must bypass | generates |
| no-store | must not store | must not store | must not store | generates |
| stale-while-revalidate=3600 | browser-specific | serve stale 1 h | serve stale 1 h | generates fresh |
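The precedence rules above can be expressed as a small function. This is a simplified sketch: `parse_cache_control` and `freshness_lifetime` are hypothetical helpers, heuristic freshness and the Expires header are ignored, and the field-name form of private is not handled.

```python
def parse_cache_control(value):
    """Parse a Cache-Control value into {directive: argument or None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def freshness_lifetime(cache_control, shared):
    """Return the freshness lifetime in seconds, or None if the response
    must not be stored by this kind of cache or no lifetime is given."""
    d = parse_cache_control(cache_control)
    if "no-store" in d or (shared and "private" in d):
        return None                      # not storable at this tier
    if shared and d.get("s-maxage") is not None:
        return int(d["s-maxage"])        # shared caches prefer s-maxage
    if d.get("max-age") is not None:
        return int(d["max-age"])         # browsers, and shared caches without s-maxage
    return None
```

For example, `freshness_lifetime("public, s-maxage=86400, max-age=3600", shared=True)` yields the shield TTL, while `shared=False` yields the browser TTL.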
Request collapsing depends on deterministic cache key generation. If keys diverge because of unnormalised query parameters, leaked cookies, or overly broad Vary declarations, coalescing fails — each variant is treated as a distinct request with its own upstream fetch. Key normalisation requirements are covered in How CDN Cache Keys Are Generated.
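To illustrate what deterministic key generation means in practice, here is a minimal sketch of query-string normalisation. The `IGNORED_PARAMS` set and the `cache_key` function are assumptions made for the example; real CDNs expose their own key configuration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical list of parameters that never change the response body.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "fbclid", "gclid"}


def cache_key(url):
    """Build a normalised cache key: lowercase host, sorted query, tracking params dropped."""
    parts = urlsplit(url)
    params = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k.lower() not in IGNORED_PARAMS
    ]
    params.sort()                                   # parameter order must not matter
    query = urlencode(params)
    path = parts.path or "/"
    key = f"{parts.scheme.lower()}://{parts.netloc.lower()}{path}"
    return f"{key}?{query}" if query else key
```

Two URLs that differ only in parameter order or tracking parameters now map to one key, so their concurrent misses can be collapsed.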
Implementation Patterns
Static assets with long shield TTL
Cache-Control: public, s-maxage=86400, stale-while-revalidate=3600
The shield stores the asset for 24 hours. After the initial fill, requests for it generate no origin traffic during that window unless the object is purged or evicted. After s-maxage expires, stale-while-revalidate=3600 allows the shield to serve the previous version for up to 1 hour while a single background request refreshes the object. This eliminates the miss window at the TTL boundary that would otherwise cause a burst of concurrent origin fetches.
API responses with short TTL and error tolerance
Cache-Control: public, s-maxage=60, stale-while-revalidate=30, stale-if-error=300
Appropriate for API responses that change frequently. The shield caches for 60 seconds, serves stale for 30 seconds during background refresh, and falls back to the last cached response for up to 5 minutes if origin returns a 5xx error.
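The three windows in this header can be modelled as a function of the stored response's age. This is a rough sketch under stated assumptions: `serving_state` is a hypothetical helper, the state names are invented for the example, and boundary handling is simplified.

```python
def serving_state(age, s_maxage, swr=0, sie=0, origin_failed=False):
    """Classify how a shield may answer a request, given the stored response's age.

    age, s_maxage, swr (stale-while-revalidate) and sie (stale-if-error)
    are all in seconds.
    """
    if age < s_maxage:
        return "fresh"                         # serve from cache, no origin contact
    staleness = age - s_maxage
    if origin_failed:
        # Origin returned an error: fall back to the stored copy if still allowed
        return "stale-if-error" if staleness <= sie else "error"
    if staleness <= swr:
        return "stale-while-revalidate"        # serve stale, refresh in background
    return "revalidate"                        # must wait for origin
```

With `s_maxage=60, swr=30, sie=300`, an age of 75 seconds falls in the stale-while-revalidate window, while an age of 200 seconds is only servable if origin is failing.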
CDN-browser TTL split
Cache-Control: public, s-maxage=604800, max-age=300
The shield and edge hold the asset for 7 days, so origin sees at most the initial fetch for each object in that window, barring purges and evictions. Browsers cache for only 5 minutes, allowing faster client-side invalidation. This pattern suits immutable versioned assets where you control the filename on deploy.
Authenticated responses — shield bypass required
Cache-Control: private, no-store
Responses scoped to a specific user must not be stored at the shield. private and no-store both instruct shared caches to bypass storage per RFC 9111. Attempting to cache user-specific responses at the shield risks serving one user’s data to another.
Conditional revalidation at the shield
ETag: "v1-abc123"
Cache-Control: public, s-maxage=3600, must-revalidate
must-revalidate requires the shield to contact origin when the stored response becomes stale, rather than serving stale content. Combine with ETag or Last-Modified so the shield can issue conditional requests and receive efficient 304 Not Modified responses instead of full 200 OK retransfers.
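The revalidation exchange can be sketched as follows. Everything here is illustrative: `revalidate` and `fake_origin` are hypothetical names, and the `fetch` callable stands in for a real HTTP client that returns a status, headers, and body.

```python
def revalidate(stored, fetch):
    """Revalidate a stale stored response with conditional request headers.

    stored: dict with optional 'etag' / 'last_modified' and a 'body'.
    fetch:  hypothetical callable(headers) -> (status, headers, body).
    """
    conditional = {}
    if stored.get("etag"):
        conditional["If-None-Match"] = stored["etag"]
    if stored.get("last_modified"):
        conditional["If-Modified-Since"] = stored["last_modified"]

    status, headers, body = fetch(conditional)
    if status == 304:
        # Origin confirmed the stored body: keep it and reset its age
        return {**stored, "age": 0}
    if status == 200:
        # Representation changed: replace the stored response
        return {
            "etag": headers.get("ETag"),
            "last_modified": headers.get("Last-Modified"),
            "body": body,
            "age": 0,
        }
    raise RuntimeError(f"unexpected status {status}")


def fake_origin(current_etag, body):
    """Stand-in origin that answers 304 when the validator still matches."""
    def fetch(headers):
        if headers.get("If-None-Match") == current_etag:
            return 304, {"ETag": current_etag}, b""
        return 200, {"ETag": current_etag}, body
    return fetch
```

When the ETag still matches, the shield keeps its stored body and only the small 304 crosses the wire; when it differs, the full body is transferred and stored.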
Server and CDN Configuration
Nginx acting as a shield node
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=shield_cache:64m
                 max_size=10g inactive=7d use_temp_path=off;

server {
    listen 80;

    location / {
        proxy_cache shield_cache;
        proxy_cache_key "$scheme$host$request_uri";
        proxy_cache_use_stale updating error timeout http_500 http_502 http_503;
        proxy_cache_background_update on;   # refresh stale entries in the background
        proxy_cache_lock on;                # enables request collapsing
        proxy_cache_lock_timeout 15s;
        proxy_cache_lock_age 5s;
        # Revalidate expired entries with If-None-Match / If-Modified-Since.
        # With caching enabled, Nginx does not pass the client's own
        # conditional headers upstream; it issues its own.
        proxy_cache_revalidate on;
        proxy_pass http://origin_upstream;  # upstream block defined elsewhere
    }
}
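proxy_cache_lock on implements request collapsing: concurrent requests for the same uncached resource wait for the first upstream response rather than all fetching simultaneously. proxy_cache_lock_timeout caps how long a waiting request will wait; when it expires, the request is passed to origin as a fallback, but its response is not cached. proxy_cache_lock_age allows one more request through to origin if the request that holds the lock has not completed within that time.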
Apache with mod_cache
CacheEnable disk /
CacheRoot /var/cache/apache2
CacheLock on
CacheLockPath /tmp/apache-cache-lock
CacheLockMaxAge 5
CacheHeader on
CacheDefaultExpire 3600
CacheIgnoreNoLastMod On
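CacheLock on enables Apache's thundering-herd protection, which is weaker than Nginx's lock. While a stale entry is being refreshed, other requests for it are served the stale copy. On the initial fill of a new entry, the lock only prevents duplicate cache writes; it does not hold back concurrent requests from reaching the backend. CacheLockMaxAge sets the maximum age, in seconds, of a lock; an older lock is ignored and the next request may re-establish it.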
Cloudflare
Cloudflare enables request collapsing by default for cacheable requests — no explicit configuration is required. To control the shield-level TTL separately from the browser TTL, use s-maxage:
Cache-Control: public, s-maxage=86400, max-age=3600
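In Cloudflare, the "Tiered Cache" feature designates an upper-tier PoP as the shield node. Plan availability differs by topology and has changed over time (Smart Tiered Cache Topology has been offered on all plans, while custom topologies are an Enterprise feature), so confirm against Cloudflare's current documentation. Enable it in the dashboard under Caching > Tiered Cache or via the API: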
curl -X PATCH "https://api.cloudflare.com/client/v4/zones/{zone_id}/cache/tiered_cache_smart_topology_enable" \
-H "Authorization: Bearer {token}" \
-H "Content-Type: application/json" \
-d '{"value":"on"}'
Fastly
Fastly enables request collapsing by default for cacheable objects. To bypass collapsing for a specific request in VCL:
sub vcl_recv {
  # Bypass collapsing for real-time data endpoints.
  # req.hash_ignore_busy skips waiting on an in-flight fetch for the same object;
  # req.hash_always_miss only forces a miss and does not by itself disable collapsing.
  if (req.url ~ "^/api/realtime/") {
    set req.hash_ignore_busy = true;
  }
}
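Fastly already derives the object TTL from Surrogate-Control or s-maxage by default, so no VCL is needed for the common case. If you do override the TTL in vcl_fetch, the logic looks roughly like this. Treat it as a sketch: the string-to-duration conversion differs between Varnish and Fastly VCL dialects, so check your platform's function reference before deploying it.
sub vcl_fetch {
  # Override TTL for the shield/CDN layer without affecting browser cache
  if (beresp.http.Cache-Control ~ "s-maxage") {
    set beresp.ttl = std.integer(regsuball(beresp.http.Cache-Control,
      ".*s-maxage=([0-9]+).*", "\1"), 0)s;
  }
}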
Interaction with Related Directives
stale-while-revalidate is the most important companion directive for shielded deployments. Without it, every TTL expiry creates a brief window in which requests arriving at the shield are treated as misses and must wait on origin until the first fresh response returns. With stale-while-revalidate, the shield serves the previous version immediately and refreshes in the background, keeping hit rates close to 100% even during active refresh cycles.
Vary headers expand the cache namespace and directly affect collapsing scope. Each distinct Vary variant occupies its own cache key, so collapsing only groups requests that share the exact same normalised key. Overly broad Vary declarations — particularly Vary: User-Agent — fragment the cache so severely that few requests ever share a key, effectively disabling the collapsing benefit. The interaction between Vary and edge routing is explored in Mapping Vary Headers to Edge Routing.
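The fragmentation effect is easy to see with a toy key builder. This is a simplified sketch: `normalise_accept_encoding` and `variant_key` are hypothetical helpers, request header names are assumed to be lowercase, and quality values such as `q=0` are ignored.

```python
def normalise_accept_encoding(value):
    """Collapse arbitrary Accept-Encoding values into one of three buckets."""
    tokens = {t.split(";")[0].strip().lower() for t in (value or "").split(",")}
    if "br" in tokens:
        return "br"
    if "gzip" in tokens:
        return "gzip"
    return "identity"


def variant_key(url, vary, request_headers, normalise=True):
    """Build a cache key from the URL plus every header named in Vary."""
    parts = [url]
    for name in (h.strip().lower() for h in vary.split(",")):
        value = request_headers.get(name, "")
        if normalise and name == "accept-encoding":
            value = normalise_accept_encoding(value)
        parts.append(f"{name}={value}")
    return "|".join(parts)


# Three browsers, three spellings of roughly the same capability
REQUESTS = [
    {"accept-encoding": "gzip, deflate, br"},
    {"accept-encoding": "br;q=1.0, gzip;q=0.8"},
    {"accept-encoding": "gzip, br"},
]
raw_keys = {variant_key("/app.js", "Accept-Encoding", r, normalise=False) for r in REQUESTS}
normalised_keys = {variant_key("/app.js", "Accept-Encoding", r) for r in REQUESTS}
```

Without normalisation each spelling becomes its own key and none of the three requests can be collapsed; with normalisation they share one key.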
s-maxage and max-age create the CDN-browser TTL split that makes shielding useful. Long s-maxage values keep the shield populated between deployments; short max-age values let browsers re-validate independently without being constrained by the longer shield TTL. See Mastering max-age and s-maxage Directives for the complete precedence model.
no-cache requires the shield to revalidate with origin on every request before serving a stored response. While the object is still stored at the shield, it cannot be served without a round-trip to origin — this preserves correctness for frequently changing resources while still allowing conditional requests and 304 responses to save bandwidth. Compare this with no-store, which prohibits storage entirely and forces full origin fetches on every request.
Verification Workflow
Step 1: Confirm the shield is caching
curl -sI https://example.com/asset.js | grep -iE 'age|x-cache|cf-cache-status|x-served-by'
A response with Age: 0 and X-Cache: MISS on the first request is expected. On the second request, Age should be greater than 0 and X-Cache should show HIT. If Age is always 0, the response is bypassing cache — check for private, no-store, or Set-Cookie headers that prevent shield storage.
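The same checks can be scripted against captured response headers. This is a minimal sketch; `cache_verdict` is a hypothetical helper, the verdict strings are invented for the example, and the status header names vary by CDN.

```python
def cache_verdict(headers):
    """Classify one response from its headers (a dict of name -> value)."""
    h = {k.lower(): v for k, v in headers.items()}
    cache_control = h.get("cache-control", "").lower()

    # Anything here prevents shield storage on most CDNs
    blockers = [d for d in ("no-store", "private") if d in cache_control]
    if "set-cookie" in h:
        blockers.append("set-cookie")
    if blockers:
        return "uncacheable: " + ", ".join(blockers)

    status = (h.get("x-cache") or h.get("cf-cache-status") or "").upper()
    age = int(h.get("age") or 0)
    if "HIT" in status or age > 0:
        return "served from cache"
    return "miss or bypass"
```

Feed it the parsed output of two consecutive curl requests: the first should typically report a miss and the second should report a cache hit.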
Step 2: Confirm request collapsing is active
Send two concurrent requests during a cache miss window (purge the cache first if necessary):
# Send two requests in parallel
curl -sI https://example.com/asset.js &
curl -sI https://example.com/asset.js &
wait
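Both responses should return the same content. Ideally the second request's CDN status header shows HIT or a coalesced indicator rather than a second MISS. A second MISS is ambiguous: the two requests may have landed on different edge nodes, the CDN may report collapsed waiters as MISS, or collapsing may be disabled. The origin access log is the reliable signal: collapsing is working if origin records one request for the two concurrent client requests.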
Step 3: Verify stale-while-revalidate at the shield
After the s-maxage TTL expires (advance your test clock or use a very short TTL), send a request. The response Age header should exceed the s-maxage value — this confirms the shield is serving the stale object rather than waiting for a fresh upstream fetch. Immediately after, a background refresh should populate a new object with Age: 0.
Step 4: Inspect DevTools
In the Chrome DevTools Network tab, check the Size column for the asset request: a (disk cache) or (memory cache) label confirms browser-layer caching. Then click the request and open the Headers and Timing panels. A fast TTFB with a non-zero Age response header confirms edge or shield layer caching is active.
Step 5: Check origin request rate
Monitor your origin’s access logs or APM during a traffic spike. With shielding active, origin request rate should remain flat even as edge traffic scales. A proportional increase in origin requests during traffic growth indicates shielding is not working — investigate private/no-store responses or cache key fragmentation via How to Debug CDN Cache Key Mismatches.
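One way to turn this check into a number is an offload ratio computed per time interval. This is a rough sketch: `offload_ratio` and `shielding_looks_healthy` are hypothetical helpers, and the 0.9 threshold is an arbitrary assumption that should be tuned to your traffic.

```python
def offload_ratio(edge_requests, origin_requests):
    """Fraction of edge requests that never reached origin."""
    if edge_requests == 0:
        return 0.0
    return 1 - origin_requests / edge_requests


def shielding_looks_healthy(samples, min_offload=0.9):
    """samples: list of (edge_requests, origin_requests) tuples, one per interval.

    Returns True when every interval with traffic meets the offload threshold,
    meaning origin load stayed flat relative to edge load.
    """
    return all(
        offload_ratio(edge, origin) >= min_offload
        for edge, origin in samples
        if edge > 0
    )
```

A healthy spike looks like `[(1000, 10), (50000, 12)]`: edge traffic grows fifty-fold while origin requests barely move. A sample such as `(50000, 20000)` would indicate that most requests are passing through the shield uncached.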
Failure Modes and Gotchas
- Set-Cookie in origin responses disables caching at the shield. Most CDNs refuse to cache any response that includes a Set-Cookie header unless explicitly configured to do so. Move session cookies to a separate authenticated domain and strip them from cacheable asset responses.
- Stripping If-None-Match breaks conditional revalidation. If your shield or reverse proxy neither forwards conditional request headers nor issues its own conditional requests to origin, every revalidation fetches a full 200 OK response instead of a lightweight 304 Not Modified. Make sure If-None-Match and If-Modified-Since revalidation is enabled in the upstream proxy configuration (in Nginx, proxy_cache_revalidate on).
- Broad Vary headers neutralise collapsing. Each Vary variant is a separate key. Vary: Accept-Encoding with uncontrolled encoding negotiation creates multiple entries per URL. Normalise Accept-Encoding at the shield before cache key construction to prevent fragmentation.
- must-revalidate does not mean no-cache. must-revalidate only activates when the stored response is stale; it has no effect on fresh responses. Engineers sometimes mistake it for a directive that forces per-request validation. A fresh response is served directly without any origin contact even with must-revalidate set.
- Shield node locality affects revalidation latency. If the shield node is geographically close to origin but far from the majority of edge PoPs, background revalidation is fast but cache miss latency for PoPs is high. Choose the shield location with PoP-to-shield round-trip time in mind, not shield-to-origin time alone.
- Collapsing timeout causes duplicate origin fetches. In Nginx, if the first upstream request takes longer than proxy_cache_lock_timeout, waiting requests send their own upstream fetches. During origin slowdowns, this can cause the very thundering herd problem the lock was intended to prevent. Tune proxy_cache_lock_timeout to exceed your P99 origin response time.
- CloudFront does not collapse across regions by default. AWS CloudFront collapses concurrent misses that arrive at the same cache node, but requests from different regions can still reach origin in parallel. If cross-region collapsing is required, enable CloudFront Origin Shield, an opt-in feature billed separately, or deploy an intermediary Varnish or Nginx shield between CloudFront and origin.
- Surrogate-Key / Cache-Tag invalidation must preserve active collapsing queues. Triggering a tag-based purge while the shield is mid-revalidation can cause a brief period where waiting collapsed requests receive stale or incomplete responses. Use the CDN vendor's atomic purge API rather than key-delete-and-replace patterns.
FAQ
Does origin shielding work with Cache-Control: private?
No. RFC 9111 requires shared caches, including shield nodes, to honour private by refusing to store the response. A private response passes through the shield without being cached, so every subsequent request for it is forwarded to origin.
What is the difference between request collapsing and request queuing?
Request collapsing holds concurrent cache-miss requests for the same key until the first upstream response arrives, then fans it out to all waiting clients from a single fetch. Request queuing serialises all requests to origin regardless of cache state. Collapsing is origin-protective; queuing is rate-limiting.
Why does stale-while-revalidate matter at the shield layer?
When s-maxage expires, stale-while-revalidate allows the shield to serve the previous response immediately while a single background request refreshes the object. This keeps hit rates high during refresh cycles and eliminates the brief miss window that otherwise causes a thundering herd at expiry.
Can I use origin shielding with Vary headers?
Yes, but each distinct Vary variant is a separate cache key. Request collapsing only applies within a single key, so broad Vary declarations fragment the cache into many keys and reduce the probability that concurrent requests share the same key — partly defeating the collapsing benefit.
How do I verify request collapsing is working?
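Send two concurrent requests and inspect the Age header and CDN status header (X-Cache, CF-Cache-Status) in both responses. The second should return HIT or a collapsed-status indicator, not a second MISS. If both show MISS with Age: 0, check the origin access log: two origin requests mean your cache key is not deterministic, the requests hit different cache nodes, or collapsing is disabled, whereas a single origin request means the CDN simply reports collapsed waiters as MISS.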
Related
- Understand how CDN nodes build lookup keys, and why key fragmentation defeats collapsing, in How CDN Cache Keys Are Generated.
- The full precedence model for s-maxage versus max-age across browser and shared-cache tiers is in Mastering max-age and s-maxage Directives.
- Step-by-step guidance for isolating a specific cache key mismatch at the CDN layer is in How to Debug CDN Cache Key Mismatches.
- Tag-based invalidation patterns that work alongside shielded deployments are covered in Tag-Based Cache Invalidation Patterns.