Origin Shielding and Request Collapsing

TL;DR: Place a dedicated shield node between your edge PoPs and origin; enable request collapsing on the shield so that concurrent cache misses share a single upstream fetch rather than triggering parallel origin connections.

Cache-Control: public, s-maxage=86400, stale-while-revalidate=3600

Figure: Origin shielding and request collapsing topology. Browser clients send requests to multiple edge PoPs, which forward cache misses to a single shield node. The shield collapses concurrent misses into one upstream fetch, so the origin server receives a single request.

Mechanism and RFC Alignment

Origin shielding introduces a dedicated mid-tier cache between distributed edge PoPs and the origin server. Every edge PoP that misses its local cache forwards the request to the shield node, not directly to origin. From the origin’s perspective, only one client (the shield) exists, regardless of how many PoPs are serving traffic.

Request collapsing — also called request coalescing or request serialisation — intercepts concurrent identical requests at the shield. When multiple edge misses arrive simultaneously for the same cache key, the shield holds subsequent requests and satisfies them from a single upstream fetch once the response arrives.
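The holding behaviour described above can be sketched in a few lines of Python. This is a toy model rather than any CDN's actual implementation: the class name CollapsingCache, the fetch_upstream callable, and the slow_origin stand-in are all hypothetical, and real shields also handle TTLs, timeouts, and streaming.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class CollapsingCache:
    """Toy shield cache: concurrent misses on one key share a single upstream fetch."""

    def __init__(self, fetch_upstream):
        self._fetch_upstream = fetch_upstream  # hypothetical callable: key -> body
        self._store = {}                       # key -> cached body
        self._in_flight = {}                   # key -> Event set when the fetch ends
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            if key in self._store:
                return self._store[key]        # cache hit
            done = self._in_flight.get(key)
            leader = done is None
            if leader:                         # first miss: this caller fetches
                done = threading.Event()
                self._in_flight[key] = done
        if leader:
            try:
                body = self._fetch_upstream(key)
                with self._lock:
                    self._store[key] = body
            finally:
                with self._lock:
                    del self._in_flight[key]
                done.set()                     # wake every waiting follower
            return body
        done.wait()                            # follower: wait for the leader
        return self.get(key)                   # now a hit, or a retry if the fetch failed


calls = []

def slow_origin(key):
    """Stand-in for origin: records each fetch and takes 200 ms to answer."""
    calls.append(key)
    time.sleep(0.2)
    return f"body of {key}"

cache = CollapsingCache(slow_origin)
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(cache.get, ["/asset.js"] * 8))
```

Eight concurrent callers ask for the same key, but slow_origin is invoked once; the other seven wait on the in-flight fetch and are then served from the store.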

RFC 9111 §4 defines the freshness evaluation model that governs both layers. Section 4.2.1 specifies how the freshness lifetime is calculated: a shared cache, such as a CDN edge node or shield node, uses s-maxage (§5.2.2.10) when it is present and otherwise falls back to max-age, which applies to shared and private caches alike. RFC 9111 §3 further specifies that a shared cache must not store a response carrying private, and that no cache may store a response carrying no-store; this is the normative basis for why those directives defeat shielding entirely.
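As an illustration of these rules, the following Python sketch derives the freshness lifetime and storability for a shared versus a private cache. The function names are hypothetical, and the parser is deliberately simplified: it splits on commas and does not handle quoted directive arguments that contain commas.

```python
def parse_cache_control(value):
    """Split a Cache-Control value into {directive: argument or None} (simplified)."""
    directives = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"') or None
    return directives


def may_store(value, shared):
    """no-store forbids storage everywhere; private forbids it in shared caches."""
    d = parse_cache_control(value)
    if "no-store" in d:
        return False
    if shared and "private" in d:
        return False
    return True


def freshness_lifetime(value, shared):
    """Freshness lifetime in seconds, or None when neither directive is present."""
    d = parse_cache_control(value)
    if shared and d.get("s-maxage") is not None:
        return int(d["s-maxage"])          # shared caches prefer s-maxage
    if d.get("max-age") is not None:
        return int(d["max-age"])
    return None


header = "public, s-maxage=86400, max-age=300"
shield_ttl = freshness_lifetime(header, shared=True)
browser_ttl = freshness_lifetime(header, shared=False)
```

With this header the shield computes a 24-hour lifetime while a browser computes 5 minutes, which is the split the later sections rely on.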

Together, these mechanisms eliminate the thundering herd problem: when a cached entry expires, a burst of concurrent cache misses no longer triggers parallel origin connections. Without shielding, simultaneous edge misses from dozens of PoPs each independently fetch from origin, exhausting connection pools and database resources at exactly the moment traffic peaks.

Scope and Precedence

s-maxage targets shared caches and overrides max-age for every CDN node and shield tier. Browsers ignore s-maxage and fall back to max-age. This separation allows long shield and CDN TTLs while keeping browser TTLs short for independent invalidation control.

| Directive | Browser cache | Edge PoP | Shield node | Origin |
| --- | --- | --- | --- | --- |
| max-age=3600 | 1 h TTL | 1 h TTL (unless overridden by s-maxage) | 1 h TTL (unless overridden by s-maxage) | ignored |
| s-maxage=86400 | ignored | 24 h TTL | 24 h TTL | ignored |
| private | may store | must not store | must not store | generates |
| no-store | must not store | must not store | must not store | generates |
| stale-while-revalidate=3600 | browser-specific | serve stale 1 h | serve stale 1 h | generates fresh |

Request collapsing depends on deterministic cache key generation. If keys diverge because of unnormalised query parameters, leaked cookies, or overly broad Vary declarations, coalescing fails — each variant is treated as a distinct request with its own upstream fetch. Key normalisation requirements are covered in How CDN Cache Keys Are Generated.
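A minimal Python sketch of key normalisation follows. The TRACKING_PARAMS list and the cache_key function are illustrative assumptions, not any CDN's built-in behaviour; production keys usually also account for port, method, and selected request headers.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Assumed list of parameters that never change the response body
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "fbclid", "gclid"}


def cache_key(url):
    """Build a deterministic key: lowercase host, drop tracking params, sort the rest."""
    parts = urlsplit(url)
    params = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in TRACKING_PARAMS
    ]
    params.sort()                                  # parameter order no longer matters
    host = parts.hostname or ""                    # hostname is already lowercased
    key = f"{parts.scheme}://{host}{parts.path or '/'}"
    query = urlencode(params)
    return f"{key}?{query}" if query else key


key_a = cache_key("https://Example.com/asset.js?b=2&utm_source=mail&a=1")
key_b = cache_key("https://example.com/asset.js?a=1&b=2")
```

Both URLs map to the same key, so concurrent misses for them can be collapsed into one upstream fetch.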

Implementation Patterns

Static assets with long shield TTL

Cache-Control: public, s-maxage=86400, stale-while-revalidate=3600

The shield stores the asset for 24 hours. Once the object is cached, hits at the shield generate no origin traffic until s-maxage expires, barring eviction or a purge. After s-maxage expires, stale-while-revalidate=3600 allows the shield to serve the previous version for up to 1 hour while a single background request refreshes the object. This eliminates the miss window at the TTL boundary that would otherwise cause a burst of concurrent origin fetches.

API responses with short TTL and error tolerance

Cache-Control: public, s-maxage=60, stale-while-revalidate=30, stale-if-error=300

Appropriate for API responses that change frequently. The shield caches for 60 seconds, serves stale for 30 seconds during background refresh, and falls back to the last cached response for up to 5 minutes if origin returns a 5xx error.
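The windows defined by s-maxage, stale-while-revalidate, and stale-if-error can be expressed as a small decision function. This is a sketch under simplifying assumptions: serving_decision and its return labels are hypothetical, and a real cache would also consider must-revalidate, request directives, and clock skew.

```python
def serving_decision(age, s_maxage, swr=0, sie=0, origin_failed=False):
    """Decide how a shared cache may answer, given the stored response's age in seconds."""
    if age < s_maxage:
        return "fresh"                           # serve from cache, no origin contact
    staleness = age - s_maxage
    if origin_failed:
        # Origin returned an error: fall back to the stored copy if still allowed
        return "stale-if-error" if staleness <= sie else "error"
    if staleness <= swr:
        return "stale-while-revalidate"          # serve stale, refresh in background
    return "fetch"                               # must wait for a fresh origin response


# Values from: s-maxage=60, stale-while-revalidate=30, stale-if-error=300
decisions = [
    serving_decision(30, 60, 30, 300),
    serving_decision(75, 60, 30, 300),
    serving_decision(120, 60, 30, 300),
    serving_decision(120, 60, 30, 300, origin_failed=True),
    serving_decision(400, 60, 30, 300, origin_failed=True),
]
```

The five sample calls walk through the lifetime of one object: fresh at 30 s, served stale at 75 s, a blocking fetch at 120 s, a stale fallback when that fetch fails, and an error once the object is more than 300 s past expiry.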

CDN-browser TTL split

Cache-Control: public, s-maxage=604800, max-age=300

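The shield and edge hold the asset for up to 7 days, so origin traffic is limited to the initial fill and any refetch after eviction or purge. Browsers cache for only 5 minutes, allowing faster client-side invalidation. This pattern suits assets served from stable URLs that you purge at the CDN on deploy; versioned filenames, which change on every deploy, can usually carry a long max-age as well.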

Authenticated responses — shield bypass required

Cache-Control: private, no-store

Responses scoped to a specific user must not be stored at the shield. private and no-store both instruct shared caches to bypass storage per RFC 9111. Attempting to cache user-specific responses at the shield risks serving one user’s data to another.

Conditional revalidation at the shield

ETag: "v1-abc123"
Cache-Control: public, s-maxage=3600, must-revalidate

must-revalidate requires the shield to contact origin when the stored response becomes stale, rather than serving stale content. Combine with ETag or Last-Modified so the shield can issue conditional requests and receive efficient 304 Not Modified responses instead of full 200 OK retransfers.
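The revalidation exchange can be sketched as follows. StoredResponse, revalidate, and fake_origin are hypothetical names used only for illustration; the origin is modelled as a plain callable rather than a real HTTP server.

```python
from dataclasses import dataclass


@dataclass
class StoredResponse:
    body: bytes
    etag: str
    age: int  # seconds since the response was fetched or last validated


def revalidate(stored, origin):
    """Send a conditional request; reuse the stored body when origin answers 304."""
    status, headers, body = origin({"If-None-Match": stored.etag})
    if status == 304:
        # Not Modified: keep the stored body and reset its age
        return StoredResponse(stored.body, stored.etag, age=0)
    # 200 OK: replace the stored object with the new representation
    return StoredResponse(body, headers.get("ETag", ""), age=0)


def fake_origin(current_etag, current_body):
    """Stand-in origin that compares If-None-Match with its current ETag."""
    def handle(request_headers):
        if request_headers.get("If-None-Match") == current_etag:
            return 304, {"ETag": current_etag}, b""
        return 200, {"ETag": current_etag}, current_body
    return handle


stale = StoredResponse(b"old body", '"v1-abc123"', age=4000)
unchanged = revalidate(stale, fake_origin('"v1-abc123"', b"old body"))
changed = revalidate(stale, fake_origin('"v2-def456"', b"new body"))
```

When the ETag still matches, only headers cross the wire and the stored body is reused; when it differs, the shield stores the full replacement.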

Server and CDN Configuration

Nginx acting as a shield node

proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=shield_cache:64m
                 max_size=10g inactive=7d use_temp_path=off;

server {
    listen 80;

    location / {
        proxy_cache            shield_cache;
        proxy_cache_key        "$scheme$host$request_uri";
        proxy_cache_use_stale  updating error timeout http_500 http_502 http_503;
        proxy_cache_lock       on;          # enables request collapsing
        proxy_cache_lock_timeout 15s;
        proxy_cache_lock_age   5s;
        proxy_pass             http://origin_upstream;

        # Revalidate expired entries with If-None-Match / If-Modified-Since
        # built from the cached response, so origin can answer 304
        proxy_cache_revalidate on;
        # Refresh expired entries in the background while serving the stale copy
        proxy_cache_background_update on;
    }
}

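proxy_cache_lock on implements request collapsing: concurrent requests for the same uncached resource wait for the first upstream response rather than all fetching simultaneously. proxy_cache_lock_timeout caps how long a waiting request will wait before sending its own upstream fetch as a fallback; the response to such a fallback request is not cached. proxy_cache_lock_age sets how long the request that is populating the entry may run before Nginx allows one more request through to the upstream.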

Apache with mod_cache

CacheEnable disk /
CacheRoot /var/cache/apache2
CacheLock on
CacheLockPath /tmp/apache-cache-lock
CacheLockMaxAge 5
CacheHeader on
CacheDefaultExpire 3600
CacheIgnoreNoLastMod On

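CacheLock on enables Apache’s thundering-herd lock, which is the closest equivalent but is weaker than true request collapsing. While a stale entry is being refreshed, other requests for it are served the stale copy. While a new entry is being cached for the first time, other requests still pass through to the backend but do not attempt to cache the response. CacheLockMaxAge sets the maximum age of a lock in seconds, after which the lock is treated as stale and ignored.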

Cloudflare

Cloudflare enables request collapsing by default for cacheable requests — no explicit configuration is required. To control the shield-level TTL separately from the browser TTL, use s-maxage:

Cache-Control: public, s-maxage=86400, max-age=3600

In Cloudflare, the “Tiered Cache” feature designates an upper-tier PoP as the shield node. Plan availability differs between tiered cache topologies, so check Cloudflare’s current documentation for your plan. Enable it in the dashboard under Caching > Tiered Cache or via the API:

curl -X PATCH "https://api.cloudflare.com/client/v4/zones/{zone_id}/cache/tiered_cache_smart_topology_enable" \
  -H "Authorization: Bearer {token}" \
  -H "Content-Type: application/json" \
  -d '{"value":"on"}'

Fastly

Fastly enables request collapsing by default for cacheable objects. To bypass collapsing for a specific request in VCL:

sub vcl_recv {
    # Bypass collapsing for real-time data endpoints:
    # do not wait on an in-flight fetch for the same object
    if (req.url ~ "^/api/realtime/") {
        set req.hash_ignore_busy = true;
    }
}

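Fastly derives the cache TTL from Surrogate-Control: max-age first, then Cache-Control: s-maxage, then Cache-Control: max-age, so an s-maxage sent by origin already sets the shield and edge TTL without custom VCL. To apply a fallback shield TTL only when origin sends no freshness information:

sub vcl_fetch {
    if (beresp.http.Expires || beresp.http.Surrogate-Control ~ "max-age" ||
        beresp.http.Cache-Control ~ "(s-maxage|max-age)") {
        # Origin supplied freshness information: keep the TTL derived from it
    } else {
        # No freshness information: apply a default shield/CDN TTL
        set beresp.ttl = 3600s;
    }
}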

stale-while-revalidate is the most important companion directive for shielded deployments. Without it, every TTL expiry opens a brief window in which the shield forwards misses to origin until the first fresh response returns. With stale-while-revalidate, the shield serves the previous version immediately and refreshes in the background, so clients continue to receive cache hits during active refresh cycles.

Vary headers expand the cache namespace and directly affect collapsing scope. Each distinct Vary variant occupies its own cache key, so collapsing only groups requests that share the exact same normalised key. Overly broad Vary declarations — particularly Vary: User-Agent — fragment the cache so severely that few requests ever share a key, effectively disabling the collapsing benefit. The interaction between Vary and edge routing is explored in Mapping Vary Headers to Edge Routing.

s-maxage and max-age create the CDN-browser TTL split that makes shielding useful. Long s-maxage values keep the shield populated between deployments; short max-age values let browsers re-validate independently without being constrained by the longer shield TTL. See Mastering max-age and s-maxage Directives for the complete precedence model.

no-cache requires the shield to revalidate with origin on every request before serving a stored response. While the object is still stored at the shield, it cannot be served without a round-trip to origin — this preserves correctness for frequently changing resources while still allowing conditional requests and 304 responses to save bandwidth. Compare this with no-store, which prohibits storage entirely and forces full origin fetches on every request.

Verification Workflow

Step 1: Confirm the shield is caching

curl -sI https://example.com/asset.js | grep -iE 'age|x-cache|cf-cache-status|x-served-by'

A response with Age: 0 and X-Cache: MISS on the first request is expected. On the second request, Age should be greater than 0 and X-Cache should show HIT. If Age is always 0, the response is bypassing cache — check for private, no-store, or Set-Cookie headers that prevent shield storage.
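When Age stays at 0, the response headers usually show why. The following Python sketch lists the blockers named above from a header dictionary; storage_blockers is a hypothetical helper, and the header values are made-up samples rather than output from a real CDN.

```python
def storage_blockers(headers):
    """Return the reasons a shared cache would refuse to store this response."""
    h = {name.lower(): value for name, value in headers.items()}
    directives = {
        part.strip().split("=")[0].lower()
        for part in h.get("cache-control", "").split(",")
    }
    blockers = [d for d in ("private", "no-store") if d in directives]
    if "set-cookie" in h:
        blockers.append("set-cookie")        # most CDNs skip storage by default
    return blockers


blocked = storage_blockers({
    "Cache-Control": "private, max-age=60",
    "Set-Cookie": "sid=abc123; Path=/",
})
clean = storage_blockers({"Cache-Control": "public, s-maxage=86400"})
```

An empty list means none of the three common blockers is present, in which case cache key fragmentation is the next thing to check.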

Step 2: Confirm request collapsing is active

Send two concurrent requests during a cache miss window (purge the cache first if necessary):

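# Send two GET requests in parallel and save each response's headers separately
curl -s -o /dev/null -D first.headers  https://example.com/asset.js &
curl -s -o /dev/null -D second.headers https://example.com/asset.js &
wait
grep -iE '^(age|x-cache|cf-cache-status):' first.headers second.headers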

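Both responses should return the same content, and origin should log a single request for the pair. The second request’s CDN status header typically shows HIT or a coalesced indicator rather than a second MISS, although some CDNs report MISS for every request that waited on the shared fetch. Two MISS responses backed by two origin requests mean either that the requests were routed to different cache nodes or that collapsing is disabled.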

Step 3: Verify stale-while-revalidate at the shield

After the s-maxage TTL expires (advance your test clock or use a very short TTL), send a request. The response Age header should exceed the s-maxage value — this confirms the shield is serving the stale object rather than waiting for a fresh upstream fetch. Immediately after, a background refresh should populate a new object with Age: 0.

Step 4: Inspect DevTools

In the Chrome DevTools Network tab, check the Size column for the asset request. A (disk cache) or (memory cache) label confirms browser-layer caching. For requests that reach the network, click the request and inspect the Headers and Timing panels: a fast TTFB with a non-zero Age response header confirms that edge or shield layer caching is active.

Step 5: Check origin request rate

Monitor your origin’s access logs or APM during a traffic spike. With shielding active, origin request rate should remain flat even as edge traffic scales. A proportional increase in origin requests during traffic growth indicates shielding is not working — investigate private/no-store responses or cache key fragmentation via How to Debug CDN Cache Key Mismatches.
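One way to quantify "flat" is the origin offload ratio: the fraction of edge requests that never reached origin. The sketch below uses made-up per-minute counts; origin_offload and the samples list are illustrative assumptions, and the counts would come from your own CDN and origin logs.

```python
def origin_offload(edge_requests, origin_requests):
    """Fraction of edge requests absorbed by the cache tiers (0.0 to 1.0)."""
    if edge_requests == 0:
        return 0.0
    return 1.0 - origin_requests / edge_requests


# Hypothetical per-minute samples: (edge requests, origin requests)
samples = [(10_000, 200), (50_000, 210), (50_000, 9_000)]
ratios = [round(origin_offload(edge, origin), 3) for edge, origin in samples]
```

In the second sample edge traffic grows fivefold while origin traffic barely moves, which is the healthy pattern. The third sample shows origin requests scaling with edge traffic, the signal that shielding or collapsing is not taking effect.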

Failure Modes and Gotchas

  1. Set-Cookie in origin responses disables caching at the shield. Most CDNs refuse to cache any response that includes a Set-Cookie header unless explicitly configured to do so. Move session cookies to a separate authenticated domain and strip them from cacheable asset responses.

  2. Stripping If-None-Match breaks conditional revalidation. If your shield or reverse proxy strips conditional request headers before forwarding to origin, every revalidation fetches a full 200 OK response instead of a lightweight 304 Not Modified. Always preserve If-None-Match and If-Modified-Since in upstream proxy configuration.

  3. Broad Vary headers neutralise collapsing. Each Vary variant is a separate key. Vary: Accept-Encoding with uncontrolled encoding negotiation creates multiple entries per URL. Normalise Accept-Encoding at the shield before cache key construction to prevent fragmentation.

  4. must-revalidate does not mean no-cache. must-revalidate only activates when the stored response is stale — it has no effect on fresh responses. Engineers sometimes mistake it for a directive that forces per-request validation. A fresh response is served directly without any origin contact even with must-revalidate set.

  5. Shield node locality affects revalidation latency. If the shield node is geographically close to origin but far from the majority of edge PoPs, background revalidation is fast but cache miss latency for PoPs is high. If it is close to the PoPs but far from origin, every shield miss pays a long round trip to origin. CDN vendors commonly recommend a shield near origin; measure both PoP-to-shield and shield-to-origin round-trip times before choosing a location.

  6. Collapsing timeout causes duplicate origin fetches. In Nginx, if the first upstream request takes longer than proxy_cache_lock_timeout, waiting requests send their own upstream fetches. During origin slowdowns, this can cause the thundering herd problem it was intended to prevent. Tune proxy_cache_lock_timeout to exceed your P99 origin response time.

  7. CloudFront does not collapse across regions by default. AWS CloudFront collapses concurrent misses that arrive at the same cache, and its regional edge caches add a mid-tier, but misses from different regions can still reach origin in parallel. Enable CloudFront Origin Shield to add a single centralised caching layer, or place an intermediary Varnish or Nginx shield between CloudFront and origin if cross-region collapsing is required.

  8. Surrogate-Key / Cache-Tag invalidation must preserve active collapsing queues. Triggering a tag-based purge while the shield is mid-revalidation can cause a brief period where waiting collapsed requests receive stale or incomplete responses. Use the CDN vendor’s atomic purge API rather than key-delete-and-replace patterns.
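The Accept-Encoding normalisation recommended in failure mode 3 can be sketched as below. The function name and the choice of three buckets (br, gzip, identity) are assumptions for illustration; the sketch ignores the * wildcard and any encodings outside those buckets.

```python
def normalise_accept_encoding(value):
    """Collapse a raw Accept-Encoding header into one of: br, gzip, identity."""
    accepted = set()
    for item in value.split(","):
        token, _, params = item.strip().partition(";")
        params = params.strip().lower()
        quality = 1.0
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0                 # unparseable weight: treat as refused
        if quality > 0:
            accepted.add(token.strip().lower())
    for preferred in ("br", "gzip"):          # fixed preference order for the cache key
        if preferred in accepted:
            return preferred
    return "identity"


variants = {
    normalise_accept_encoding(raw)
    for raw in ("gzip, deflate, br", "br, gzip", "gzip;q=1.0, br;q=0.9, zstd")
}
```

Three different raw header values collapse to the single variant br, so requests from those clients share one cache key and can be collapsed together.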

FAQ

Does origin shielding work with Cache-Control: private?

No. RFC 9111 requires shared caches, including shield nodes, to honour private by refusing to store the response. A private response passes through the shield without being cached, so every subsequent request is forwarded to origin.

What is the difference between request collapsing and request queuing?

Request collapsing holds concurrent cache-miss requests for the same key until the first upstream response arrives, then fans it out to all waiting clients from a single fetch. Request queuing serialises all requests to origin regardless of cache state. Collapsing is origin-protective; queuing is rate-limiting.

Why does stale-while-revalidate matter at the shield layer?

When s-maxage expires, stale-while-revalidate allows the shield to serve the previous response immediately while a single background request refreshes the object. This keeps hit rates high during refresh cycles and eliminates the brief miss window that otherwise causes a thundering herd at expiry.

Can I use origin shielding with Vary headers?

Yes, but each distinct Vary variant is a separate cache key. Request collapsing only applies within a single key, so broad Vary declarations fragment the cache into many keys and reduce the probability that concurrent requests share the same key — partly defeating the collapsing benefit.

How do I verify request collapsing is working?

Send two concurrent requests and inspect the Age header and CDN status header (X-Cache, CF-Cache-Status) in both responses, then check how many requests origin logged. The second response should return HIT or a collapsed-status indicator, and origin should log one request. If both show MISS with Age: 0 and origin logs two requests, the requests reached different cache nodes, your cache key is not deterministic, or collapsing is disabled.

