Micro Caching stores dynamic content for extremely short time windows — typically between 1 and 10 seconds. This technique reduces pressure on origin servers during traffic spikes while maintaining data freshness for frequently changing content.
What is Micro Caching?
Micro Caching is a caching technique that stores dynamic content for very short time windows — typically between 1 and 10 seconds.
This window is short enough to ensure data remains fresh, but long enough to absorb bursts of simultaneous requests before they reach the origin server.
Micro Caching isn’t the same as traditional caching.
Traditional caching stores static content for minutes, hours, or days. Micro Caching operates on a scale of seconds and is designed specifically for data that changes frequently, but not every millisecond.
Difference between caching strategies
It’s important to understand the distinction between Micro Caching and other caching strategies:
| Strategy | Question it answers | Focus |
|---|---|---|
| Micro Caching | How long to cache? | Short TTL for dynamic data |
| Tiered Caching | How many layers to cache? | Hierarchy between points of presence and origin |
| Granular Caching | What to cache and with what rule? | Segmentation by headers and cookies |
| Selective Caching | Which criteria to use? | Cache key optimization |
All four are complementary — not mutually exclusive.
How Micro Caching works
When a request arrives at a point of presence in the distributed infrastructure, the system checks if a cached response exists and is still valid:
- Cache HIT: If the content is cached and TTL hasn’t expired, the response is delivered immediately without contacting the origin.
- Cache MISS: If the content isn’t cached or TTL has expired, the request goes to the origin, and the response is cached for the configured TTL.
With Micro Caching, even a 3-second TTL can absorb hundreds or thousands of identical requests during a traffic spike, preventing origin overload.
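The HIT/MISS flow above can be sketched in a few lines. This is a minimal, single-process illustration (not a production cache); the `MicroCache` class and `origin` function are hypothetical names chosen for the example:

```python
import time

class MicroCache:
    """Minimal micro-cache sketch: reuses a response for a few seconds."""
    def __init__(self, ttl_seconds=3.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (response, expires_at)

    def get_or_fetch(self, key, fetch_origin):
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and entry[1] > now:   # cache HIT: TTL still valid
            return entry[0]
        response = fetch_origin(key)               # cache MISS: go to the origin
        self._store[key] = (response, now + self.ttl)
        return response

origin_calls = []
def origin(key):
    origin_calls.append(key)                       # track how often the origin is hit
    return f"response-for-{key}"

cache = MicroCache(ttl_seconds=3.0)
cache.get_or_fetch("/api/shipping-options", origin)
cache.get_or_fetch("/api/shipping-options", origin)  # served from cache
print(len(origin_calls))  # → 1 (only the first request reached the origin)
```

Every identical request within the 3-second window is answered from memory; only the first one pays the origin round trip.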
Request arrives at point of presence
  ↓
Check cache
  ↓
Cache HIT (TTL valid) → Immediate response
Cache MISS → Request origin → Cache response (TTL: 5s) → Response

What can and cannot be cached with Micro Caching
The correct separation between cacheable and non-cacheable data is the foundation of a safe Micro Caching strategy.
Eligibility table and recommended TTL
| Resource | Cacheable? | Recommended TTL | Justification |
|---|---|---|---|
| Shipping calculation by ZIP code range | ✅ Yes | 5 to 10 seconds | Changes little in short intervals, but is heavily consulted during spikes |
| Promotion previews and eligibility | ✅ Yes | 2 to 5 seconds | Campaign rules are stable during flash sale execution |
| Stock level for display | ✅ Yes | 3 to 5 seconds | Provides stable response while central system processes reservations |
| Feature flags and session configurations | ✅ Yes | 5 to 30 seconds | Low variation, high query volume |
| Catalog and product fragments | ✅ Yes | 10 to 60 seconds | Read data with controlled versioning |
| Cart summary | ⚠️ With care | 1 to 2 seconds with key per session | Requires immediate invalidation when state changes |
| Payment methods list | ✅ Yes | 30 to 60 seconds | Rarely changes during an active session |
| Payment authorization | ❌ No | — | Always transactional — never cache |
| Order finalization | ❌ No | — | Unrepeatable write operation |
| Any state-altering operation | ❌ No | — | No cache without idempotency controls |
Practical rule: if data is read-only and can be shared between multiple users without risk of session contamination, it can probably be accelerated with Micro Caching.
Zero TTL: when time is the smallest possible
Zero TTL is an extreme case of Micro Caching where storage time is reduced to the functional minimum — often fractions of a second or the time needed to resolve a burst of simultaneous requests.
Zero TTL isn’t the same as cache bypass.
Bypass ignores cache completely and sends each request directly to the origin. Zero TTL still uses cache as a coordination mechanism — it stores the response long enough to prevent identical simultaneous requests from overloading the backend.
When to use Zero TTL
- Data that changes every few seconds but has concentrated query spikes
- High-concurrency endpoints where any positive TTL represents stale data risk
- Situations where Request Coalescing needs additional support for extreme spikes
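The coordination role of Zero TTL can be illustrated with a request-coalescing sketch: concurrent identical requests share one in-flight origin fetch, and nothing is stored afterwards. This is a simplified single-process model under assumed names (`CoalescingFetcher`, `slow_origin`), not a real edge implementation:

```python
import threading
import time

class CoalescingFetcher:
    """Zero-TTL sketch: identical concurrent requests share one origin fetch."""
    def __init__(self, fetch_origin):
        self._fetch = fetch_origin
        self._lock = threading.Lock()
        self._inflight = {}  # key -> (Event, result holder)

    def get(self, key):
        with self._lock:
            pending = self._inflight.get(key)
            if pending is None:
                event, holder = threading.Event(), {}
                self._inflight[key] = (event, holder)
                leader = True                   # first caller fetches for everyone
            else:
                event, holder = pending
                leader = False                  # followers wait on the leader
        if leader:
            holder["value"] = self._fetch(key)  # only the leader hits the origin
            with self._lock:
                del self._inflight[key]         # result is not kept: zero TTL
            event.set()
        else:
            event.wait()
        return holder["value"]

origin_calls = []
def slow_origin(key):
    origin_calls.append(key)
    time.sleep(0.2)                             # simulate origin latency
    return f"data:{key}"

fetcher = CoalescingFetcher(slow_origin)
threads = [threading.Thread(target=fetcher.get, args=("/api/stock",)) for _ in range(10)]
for t in threads: t.start()
for t in threads: t.join()
print(len(origin_calls))  # → 1 (ten simultaneous requests, one origin fetch)
```

The burst is absorbed even though no response survives beyond the in-flight request, which is exactly the distinction between Zero TTL and plain bypass.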
Micro Caching and Granular Caching
If Micro Caching defines how long to cache, Granular Caching defines what to cache — and with what segmentation criteria.
Using Advanced Cache Keys, you can store different versions of the same resource based on specific HTTP request information:
Segmentation by cookie
Allows caching cart summary or session fragments in isolation, ensuring one customer never receives another’s data.
Segmentation by header
Serves different API versions based on Device-Type, Accept-Language, or User-Segment — without duplicating logic in the application.
Segmentation by query string
Essential for recommendation APIs, search filters, and campaign parameters that generate dynamic URLs.
How to combine Micro Caching and Granular Caching
The combination works like this:
- Micro Caching defines the time window — how many seconds data can be reused
- Granular Caching defines the key — which combination of headers, cookies, or query strings uniquely identifies this data
- Selective bypass ensures the boundary — critical operations like payment bypass cache completely
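A granular cache key is just a deterministic function of the request attributes you choose to vary on. The sketch below, using hypothetical names (`build_cache_key` and its parameters), shows the idea for the shipping example:

```python
def build_cache_key(path, headers, query, vary_headers=(), vary_params=()):
    """Granular cache key sketch: path plus selected headers and query params."""
    parts = [path]
    for h in vary_headers:
        parts.append(f"{h}={headers.get(h, '')}")   # header-based segmentation
    for p in vary_params:
        parts.append(f"{p}={query.get(p, '')}")     # query-string segmentation
    return "|".join(parts)

key = build_cache_key(
    "/api/shipping-options",
    headers={"Device-Type": "mobile", "Accept-Language": "pt-BR"},
    query={"zip": "01310"},
    vary_headers=("Device-Type",),   # only Device-Type affects the key
    vary_params=("zip",),
)
print(key)  # → /api/shipping-options|Device-Type=mobile|zip=01310
```

Note that `Accept-Language` is ignored here by choice: every attribute added to the key multiplies cache variants, so vary only on what actually changes the response.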
Shipping request
  ↓
[Granular Cache Key: ZIP prefix + Device-Type]
  ↓
[Micro Cache TTL: 5 seconds]
  ↓
Cache HIT → immediate response
Cache MISS → origin → populates cache → response

Selective bypass: protecting critical operations
It’s not enough to know what to cache. You need to ensure operations that shouldn’t be cached never go through cache — regardless of general configuration.
Selective bypass applies an explicit rule per endpoint or operation type:
| Operation | Correct behavior |
|---|---|
| GET /api/shipping-options | Micro Caching with 5s TTL |
| GET /api/promotions/eligibility | Micro Caching with 3s TTL |
| POST /api/cart/update | Total bypass — always to origin |
| POST /api/payment/authorize | Total bypass — always to origin |
| POST /api/order/confirm | Total bypass — always to origin |
The boundary between what’s accelerated and what’s transactional must be explicit in configuration — not assumed.
Stale-While-Revalidate and Key-based Purge
Two techniques complement Micro Caching in practice:
Stale-while-revalidate While TTL expires and data is updated in the background, the user receives the cached version without perceiving latency. During traffic spikes, this preserves availability even when the origin starts to degrade.
Key-based purge When cart state changes or a price is updated, invalidation must be surgical — removing only the affected key, without impacting the rest of the application.
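In HTTP terms, stale-while-revalidate has a standard expression in the Cache-Control header (RFC 5861); a micro-caching response could combine both directives like this (values are illustrative):

```
Cache-Control: max-age=5, stale-while-revalidate=30
```

Here the response is fresh for 5 seconds, and for up to 30 more seconds a stale copy may be served while the cache refetches it in the background.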
Programmable cache: from static configuration to contextual logic
The advancement of Micro Caching lies in going beyond static TTL rules. With programmable cache, you can define by code how each request should be handled directly in distributed infrastructure.
This means teams can:
- Test new cache strategies without rebuilding the infrastructure layer
- Adjust TTL per user segment in real-time
- Pre-warm caches before major campaigns
- Automate invalidations based on stock or promotion events
- Integrate cache rules directly into application workflows via APIs
During a flash sale, for example, Micro Caching rules can be automatically activated for high-concurrency endpoints — and deactivated when the spike passes — without manual intervention.
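The contextual logic described above can be sketched as a TTL function evaluated per request. This is a hypothetical illustration, not the API of any specific platform; the flag and segment names are invented for the example:

```python
FLASH_SALE_ACTIVE = True  # hypothetical flag, e.g. toggled by a campaign event

def ttl_for(path, user_segment):
    """Contextual-TTL sketch: programmable cache picks the TTL per request."""
    if path.startswith("/api/payment") or path.startswith("/api/order"):
        return 0                         # transactional: never cached
    if FLASH_SALE_ACTIVE and path == "/api/promotions/eligibility":
        return 3                         # spike mode: short TTL absorbs the burst
    if user_segment == "vip":
        return 1                         # fresher data for sensitive segments
    return 10                            # default window

print(ttl_for("/api/promotions/eligibility", "regular"))  # → 3
print(ttl_for("/api/payment/authorize", "regular"))       # → 0
```

Because the rule is code rather than a fixed configuration, flipping `FLASH_SALE_ACTIVE` back to `False` is all it takes to return the endpoint to its default window when the spike passes.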
Real example: Marisa
Marisa is a concrete example of how intelligent cache transforms the relationship between performance and cost in enterprise e-commerce.
With Azion’s distributed infrastructure, Marisa began delivering more than 85% of its data directly from distributed layers, without querying the origin.
The result was:
| Metric | Result |
|---|---|
| Data delivered in distributed infrastructure | 85% |
| Bandwidth savings per day | 4.3 TB |
| Stability during high-demand periods | ✅ Maintained |
| Faster pages with lower costs | ✅ Confirmed |
Saving 4.3 TB of bandwidth per day isn’t just an infrastructure metric. It’s the difference between a system that handles the spike and one that silently degrades.
FAQ
What is Micro Caching?
It’s the technique of storing dynamic content for very short time windows — typically between 1 and 10 seconds — to reduce pressure on the origin without compromising data freshness.
What’s the difference between Micro Caching and Zero TTL?
Micro Caching uses TTL of seconds to balance performance and freshness. Zero TTL reduces this time to the functional minimum — enough to coordinate simultaneous requests without delivering stale data.
Can Micro Caching compromise transactional data?
No, if the boundary between cacheable data and transactional operations is explicit. Payment authorization, order finalization, and any state-altering operation should never be cached.
How to define the correct TTL for each resource?
The ideal TTL depends on two factors: the data update frequency and the risk of displaying a slightly outdated version. Shipping by ZIP code tolerates 5 to 10 seconds. Cart summary requires key per session and immediate invalidation when state changes.
What’s the difference between Micro Caching and Tiered Caching?
Micro Caching defines how long to cache — short TTL for dynamic data. Tiered Caching defines how many layers to distribute cache — hierarchy between points of presence and origin. They’re complementary strategies.
When to use Granular Caching with Micro Caching?
Whenever the same resource needs different versions per user segment, device, or location. Granular Caching defines the key; Micro Caching defines the time window.
Does programmable cache replace static TTL configuration?
It doesn’t replace — it expands. Static configuration defines default behavior. Programmable cache adds contextual logic: different TTL per segment, automatic invalidation by business event, pre-warming before campaigns.
Conclusion
Micro Caching solves the challenge of caching dynamic content with precision: TTL windows short enough to maintain data freshness, long enough to absorb traffic spikes before they overload the origin.
With Granular Caching as a complement and selective bypass as a safety boundary, it’s possible to accelerate most of the transactional flow without compromising the integrity of any critical operation.
Next steps
Check out Azion’s Cache solution and see how it implements Open Caching principles to ensure performance, resilience, and architectural freedom for global operations.
Want to implement Micro Caching safely?