Micro Caching stores dynamic content for extremely short time windows — typically between 1 and 10 seconds. This technique reduces pressure on origin servers during traffic spikes while maintaining data freshness for frequently changing content.
What is Micro Caching?
Micro Caching is a caching technique that stores dynamic content for very short time windows — typically between 1 and 10 seconds.
This window is short enough to ensure data remains fresh, but long enough to absorb bursts of simultaneous requests before they reach the origin server.
Micro Caching isn’t the same as traditional caching.
Traditional caching stores static content for minutes, hours, or days. Micro Caching operates on a scale of seconds and is designed specifically for data that changes frequently, but not every millisecond.
Difference between caching strategies
It’s important to understand the distinction between Micro Caching and other caching strategies:
| Strategy | Question it answers | Focus |
|---|---|---|
| Micro Caching | How long to cache? | Short TTL for dynamic data |
| Tiered Caching | How many layers to cache? | Hierarchy between points of presence and origin |
| Granular Caching | What to cache and with what rule? | Segmentation by headers and cookies |
| Selective Caching | Which criteria to use? | Cache key optimization |
All four are complementary — not mutually exclusive.
How Micro Caching works
When a request arrives at a point of presence in the distributed infrastructure, the system checks if a cached response exists and is still valid:
- Cache HIT: If the content is cached and TTL hasn’t expired, the response is delivered immediately without contacting the origin.
- Cache MISS: If the content isn’t cached or TTL has expired, the request goes to the origin, and the response is cached for the configured TTL.
With Micro Caching, even a 3-second TTL can absorb hundreds or thousands of identical requests during a traffic spike, preventing origin overload.
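The HIT/MISS flow above can be sketched in a few lines. This is a minimal, single-process illustration (not a production cache); the `MicroCache` class and `origin` function are hypothetical names chosen for the example:

```python
import time

class MicroCache:
    """Minimal micro-cache sketch: reuses a response for a few seconds."""
    def __init__(self, ttl_seconds=3.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (response, expires_at)

    def get_or_fetch(self, key, fetch_origin):
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and entry[1] > now:   # cache HIT: TTL still valid
            return entry[0]
        response = fetch_origin(key)               # cache MISS: go to the origin
        self._store[key] = (response, now + self.ttl)
        return response

origin_calls = []
def origin(key):
    origin_calls.append(key)                       # track how often the origin is hit
    return f"response-for-{key}"

cache = MicroCache(ttl_seconds=3.0)
cache.get_or_fetch("/api/shipping-options", origin)
cache.get_or_fetch("/api/shipping-options", origin)  # served from cache
print(len(origin_calls))  # → 1 (only the first request reached the origin)
```

Every identical request within the 3-second window is answered from memory; only the first one pays the origin round trip.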
Request arrives at point of presence
  ↓
Check cache
  ↓
Cache HIT (TTL valid) → Immediate response
Cache MISS → Request origin → Cache response (TTL: 5s) → Response

What can and cannot be cached with Micro Caching
The correct separation between cacheable and non-cacheable data is the foundation of a safe Micro Caching strategy.
Eligibility table and recommended TTL
| Resource | Cacheable? | Recommended TTL | Justification |
|---|---|---|---|
| Shipping calculation by ZIP code range | ✅ Yes | 5 to 10 seconds | Changes little in short intervals, but is heavily consulted during spikes |
| Promotion previews and eligibility | ✅ Yes | 2 to 5 seconds | Campaign rules are stable during flash sale execution |
| Stock level for display | ✅ Yes | 3 to 5 seconds | Provides stable response while central system processes reservations |
| Feature flags and session configurations | ✅ Yes | 5 to 30 seconds | Low variation, high query volume |
| Catalog and product fragments | ✅ Yes | 10 to 60 seconds | Read data with controlled versioning |
| Cart summary | ⚠️ With care | 1 to 2 seconds with key per session | Requires immediate invalidation when state changes |
| Payment methods list | ✅ Yes | 30 to 60 seconds | Rarely changes during an active session |
| Payment authorization | ❌ No | — | Always transactional — never cache |
| Order finalization | ❌ No | — | Unrepeatable write operation |
| Any state-altering operation | ❌ No | — | No cache without idempotency controls |
Practical rule: if data is read-only and can be shared between multiple users without risk of session contamination, it can probably be accelerated with Micro Caching.
Zero TTL: when time is the smallest possible
Zero TTL is an extreme case of Micro Caching where storage time is reduced to the functional minimum — often fractions of a second or the time needed to resolve a burst of simultaneous requests.
Zero TTL isn’t the same as cache bypass.
Bypass ignores cache completely and sends each request directly to the origin. Zero TTL still uses cache as a coordination mechanism — it stores the response long enough to prevent identical simultaneous requests from overloading the backend.
When to use Zero TTL
- Data that changes every few seconds but has concentrated query spikes
- High-concurrency endpoints where any positive TTL represents stale data risk
- Situations where Request Coalescing needs additional support for extreme spikes
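The coordination role of Zero TTL can be illustrated with a request-coalescing sketch: concurrent identical requests share one in-flight origin fetch, and nothing is stored afterwards. This is a simplified single-process model under assumed names (`CoalescingFetcher`, `slow_origin`), not a real edge implementation:

```python
import threading
import time

class CoalescingFetcher:
    """Zero-TTL sketch: identical concurrent requests share one origin fetch."""
    def __init__(self, fetch_origin):
        self._fetch = fetch_origin
        self._lock = threading.Lock()
        self._inflight = {}  # key -> (Event, result holder)

    def get(self, key):
        with self._lock:
            pending = self._inflight.get(key)
            if pending is None:
                event, holder = threading.Event(), {}
                self._inflight[key] = (event, holder)
                leader = True                   # first caller fetches for everyone
            else:
                event, holder = pending
                leader = False                  # followers wait on the leader
        if leader:
            holder["value"] = self._fetch(key)  # only the leader hits the origin
            with self._lock:
                del self._inflight[key]         # result is not kept: zero TTL
            event.set()
        else:
            event.wait()
        return holder["value"]

origin_calls = []
def slow_origin(key):
    origin_calls.append(key)
    time.sleep(0.2)                             # simulate origin latency
    return f"data:{key}"

fetcher = CoalescingFetcher(slow_origin)
threads = [threading.Thread(target=fetcher.get, args=("/api/stock",)) for _ in range(10)]
for t in threads: t.start()
for t in threads: t.join()
print(len(origin_calls))  # → 1 (ten simultaneous requests, one origin fetch)
```

The burst is absorbed even though no response survives beyond the in-flight request, which is exactly the distinction between Zero TTL and plain bypass.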
Micro Caching and Granular Caching
If Micro Caching defines how long to cache, Granular Caching defines what to cache — and with what segmentation criteria.
Using Advanced Cache Keys, you can store different versions of the same resource based on specific HTTP request information:
Segmentation by cookie
Allows caching cart summary or session fragments in isolation, ensuring one customer never receives another’s data.
Segmentation by header
Serves different API versions based on Device-Type, Accept-Language, or User-Segment — without duplicating logic in the application.
Segmentation by query string
Essential for recommendation APIs, search filters, and campaign parameters that generate dynamic URLs.
How to combine Micro Caching and Granular Caching
The combination works like this:
- Micro Caching defines the time window — how many seconds data can be reused
- Granular Caching defines the key — which combination of headers, cookies, or query strings uniquely identifies this data
- Selective bypass ensures the boundary — critical operations like payment bypass cache completely
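A granular cache key is just a deterministic function of the request attributes you choose to vary on. The sketch below, using hypothetical names (`build_cache_key` and its parameters), shows the idea for the shipping example:

```python
def build_cache_key(path, headers, query, vary_headers=(), vary_params=()):
    """Granular cache key sketch: path plus selected headers and query params."""
    parts = [path]
    for h in vary_headers:
        parts.append(f"{h}={headers.get(h, '')}")   # header-based segmentation
    for p in vary_params:
        parts.append(f"{p}={query.get(p, '')}")     # query-string segmentation
    return "|".join(parts)

key = build_cache_key(
    "/api/shipping-options",
    headers={"Device-Type": "mobile", "Accept-Language": "pt-BR"},
    query={"zip": "01310"},
    vary_headers=("Device-Type",),   # only Device-Type affects the key
    vary_params=("zip",),
)
print(key)  # → /api/shipping-options|Device-Type=mobile|zip=01310
```

Note that `Accept-Language` is ignored here by choice: every attribute added to the key multiplies cache variants, so vary only on what actually changes the response.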
Shipping request
  ↓
[Granular Cache Key: ZIP prefix + Device-Type]
  ↓
[Micro Cache TTL: 5 seconds]
  ↓
Cache HIT → immediate response
Cache MISS → origin → populates cache → response

Selective bypass: protecting critical operations
It’s not enough to know what to cache. You need to ensure operations that shouldn’t be cached never go through cache — regardless of general configuration.
Selective bypass applies an explicit rule per endpoint or operation type:
| Operation | Correct behavior |
|---|---|
| GET /api/shipping-options | Micro Caching with 5s TTL |
| GET /api/promotions/eligibility | Micro Caching with 3s TTL |
| POST /api/cart/update | Total bypass — always to origin |
| POST /api/payment/authorize | Total bypass — always to origin |
| POST /api/order/confirm | Total bypass — always to origin |
The boundary between what’s accelerated and what’s transactional must be explicit in configuration — not assumed.
Stale-While-Revalidate and Key-based Purge
Two techniques complement Micro Caching in practice:
Stale-while-revalidate While TTL expires and data is updated in the background, the user receives the cached version without perceiving latency. During traffic spikes, this preserves availability even when the origin starts to degrade.
Key-based purge When cart state changes or a price is updated, invalidation must be surgical — removing only the affected key, without impacting the rest of the application.
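In HTTP terms, stale-while-revalidate has a standard expression in the Cache-Control header (RFC 5861); a micro-caching response could combine both directives like this (values are illustrative):

```
Cache-Control: max-age=5, stale-while-revalidate=30
```

Here the response is fresh for 5 seconds, and for up to 30 more seconds a stale copy may be served while the cache refetches it in the background.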
Programmable cache: from static configuration to contextual logic
The advancement of Micro Caching lies in going beyond static TTL rules. With programmable cache, you can define by code how each request should be handled directly in distributed infrastructure.
This means teams can:
- Test new cache strategies without rebuilding the infrastructure layer
- Adjust TTL per user segment in real-time
- Pre-warm caches before major campaigns
- Automate invalidations based on stock or promotion events
- Integrate cache rules directly into application workflows via APIs
During a flash sale, for example, Micro Caching rules can be automatically activated for high-concurrency endpoints — and deactivated when the spike passes — without manual intervention.
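The contextual logic described above can be sketched as a TTL function evaluated per request. This is a hypothetical illustration, not the API of any specific platform; the flag and segment names are invented for the example:

```python
FLASH_SALE_ACTIVE = True  # hypothetical flag, e.g. toggled by a campaign event

def ttl_for(path, user_segment):
    """Contextual-TTL sketch: programmable cache picks the TTL per request."""
    if path.startswith("/api/payment") or path.startswith("/api/order"):
        return 0                         # transactional: never cached
    if FLASH_SALE_ACTIVE and path == "/api/promotions/eligibility":
        return 3                         # spike mode: short TTL absorbs the burst
    if user_segment == "vip":
        return 1                         # fresher data for sensitive segments
    return 10                            # default window

print(ttl_for("/api/promotions/eligibility", "regular"))  # → 3
print(ttl_for("/api/payment/authorize", "regular"))       # → 0
```

Because the rule is code rather than a fixed configuration, flipping `FLASH_SALE_ACTIVE` back to `False` is all it takes to return the endpoint to its default window when the spike passes.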
Real example: Marisa
Marisa is a concrete example of how intelligent cache transforms the relationship between performance and cost in enterprise e-commerce.
With Azion’s distributed infrastructure, Marisa began delivering more than 85% of its data directly from distributed layers, without querying the origin.
The result was:
| Metric | Result |
|---|---|
| Data delivered in distributed infrastructure | 85% |
| Bandwidth savings per day | 4.3 TB |
| Stability during high-demand periods | ✅ Maintained |
| Faster pages with lower costs | ✅ Confirmed |
Saving 4.3 TB of bandwidth per day isn’t just an infrastructure metric. It’s the difference between a system that handles the spike and one that silently degrades.
FAQ
What is Micro Caching?
It’s the technique of storing dynamic content for very short time windows — typically between 1 and 10 seconds — to reduce pressure on the origin without compromising data freshness.
What’s the difference between Micro Caching and Zero TTL?
Micro Caching uses TTL of seconds to balance performance and freshness. Zero TTL reduces this time to the functional minimum — enough to coordinate simultaneous requests without delivering stale data.
Can Micro Caching compromise transactional data?
No, if the boundary between cacheable data and transactional operations is explicit. Payment authorization, order finalization, and any state-altering operation should never be cached.
How to define the correct TTL for each resource?
The ideal TTL depends on two factors: the data update frequency and the risk of displaying a slightly outdated version. Shipping by ZIP code tolerates 5 to 10 seconds. Cart summary requires key per session and immediate invalidation when state changes.
What’s the difference between Micro Caching and Tiered Caching?
Micro Caching defines how long to cache — short TTL for dynamic data. Tiered Caching defines how many layers to distribute cache — hierarchy between points of presence and origin. They’re complementary strategies.
When to use Granular Caching with Micro Caching?
Whenever the same resource needs different versions per user segment, device, or location. Granular Caching defines the key; Micro Caching defines the time window.
Does programmable cache replace static TTL configuration?
It doesn’t replace — it expands. Static configuration defines default behavior. Programmable cache adds contextual logic: different TTL per segment, automatic invalidation by business event, pre-warming before campaigns.
Conclusion
Micro Caching solves the challenge of caching dynamic content with precision: TTL windows short enough to maintain data freshness, long enough to absorb traffic spikes before they overload the origin.
With Granular Caching as a complement and selective bypass as a safety boundary, it’s possible to accelerate most of the transactional flow without compromising the integrity of any critical operation.
Next steps
Check out Azion’s Cache solution and see how it implements Open Caching principles to ensure performance, resilience, and architectural freedom for global operations.
Want to implement Micro Caching safely?