Architecture · 7 min · Feb 2026

A Three-Layer Caching Strategy That Actually Scales

In-memory, distributed, and HTTP caching each solve a different problem. Layer them deliberately.

On a CRM platform, I built a three-layer cache that cut latency dramatically. Layer one is a short-lived in-process cache for hot, request-scoped data. Layer two is Redis for state shared across instances. Layer three is HTTP/CDN caching for truly static responses.
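To make the layering concrete, here is a minimal sketch of a read path that checks the in-process layer first, then a shared store, then the real source. Every name in it (LayeredCache, SharedCache, staticCacheHeaders, the TTL defaults) is hypothetical and chosen for illustration; SharedCache stands in for whatever Redis client you use, and this is not the code from the CRM platform.

```typescript
// Hypothetical interface standing in for a Redis client (layer two).
interface SharedCache {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

type LocalEntry = { value: string; expiresAt: number };

class LayeredCache {
  private local = new Map<string, LocalEntry>(); // layer one: per process
  private shared: SharedCache;
  private localTtlMs: number;
  private sharedTtlSeconds: number;

  // TTL defaults are assumptions for the sketch, not recommendations.
  constructor(shared: SharedCache, localTtlMs = 5000, sharedTtlSeconds = 60) {
    this.shared = shared;
    this.localTtlMs = localTtlMs;
    this.sharedTtlSeconds = sharedTtlSeconds;
  }

  async get(key: string, load: () => Promise<string>): Promise<string> {
    // Layer one: in-process, no network hop.
    const hit = this.local.get(key);
    if (hit && hit.expiresAt > Date.now()) return hit.value;

    // Layer two: shared across instances.
    let value = await this.shared.get(key);
    if (value === null) {
      // Miss in both layers: load from the source and fill layer two.
      value = await load();
      await this.shared.set(key, value, this.sharedTtlSeconds);
    }

    // Fill layer one on the way back.
    this.local.set(key, { value, expiresAt: Date.now() + this.localTtlMs });
    return value;
  }
}

// Layer three: response headers that let browsers and CDNs cache
// responses that never change for a given URL.
function staticCacheHeaders(maxAgeSeconds: number): Record<string, string> {
  return { "Cache-Control": `public, max-age=${maxAgeSeconds}, immutable` };
}
```

The in-process TTL is deliberately much shorter than the shared one, since a local entry cannot be busted from another instance.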

The hard part isn't reading from the cache; it's invalidation. I key entries carefully, prefer short TTLs with explicit busting on writes, and treat the cache as an optimization: the system must stay correct without it.
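A minimal sketch of what that can look like, assuming a hypothetical key scheme (namespace, schema version, entity, tenant, id) and a hypothetical ContactStore whose "database" is just a Map. The point is the shape: write the source of truth first, bust the key afterwards, and treat any cache failure as a miss.

```typescript
// Hypothetical key scheme. Bumping "v1" orphans every old entry at once.
function contactKey(tenantId: string, contactId: string): string {
  return `crm:v1:contact:${tenantId}:${contactId}`;
}

// Hypothetical synchronous cache interface, kept small for the sketch.
interface KeyValueCache {
  get(key: string): string | undefined;
  set(key: string, value: string, ttlMs: number): void;
  delete(key: string): void;
}

class ContactStore {
  private db: Map<string, string>; // stand-in for the source of truth
  private cache: KeyValueCache;
  private ttlMs: number;

  constructor(db: Map<string, string>, cache: KeyValueCache, ttlMs = 30000) {
    this.db = db;
    this.cache = cache;
    this.ttlMs = ttlMs;
  }

  read(tenantId: string, contactId: string): string | undefined {
    const key = contactKey(tenantId, contactId);
    try {
      const cached = this.cache.get(key);
      if (cached !== undefined) return cached;
    } catch {
      // A failing cache is treated as a miss, never as a request error.
    }
    // For brevity the stand-in db uses the same string as its key.
    const value = this.db.get(key);
    if (value !== undefined) {
      try {
        this.cache.set(key, value, this.ttlMs);
      } catch {
        // Skip caching; the read is still correct.
      }
    }
    return value;
  }

  write(tenantId: string, contactId: string, value: string): void {
    const key = contactKey(tenantId, contactId);
    this.db.set(key, value); // source of truth first
    try {
      this.cache.delete(key); // explicit bust on write
    } catch {
      // If the bust fails, the short TTL bounds how long stale data lives.
    }
  }
}
```

Deleting on write rather than updating the cached value keeps the write path simple: the next read repopulates the entry from the source of truth.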

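Measure before and after. A cache that improves p50 but wrecks p99 because of stampedes is a net loss. Single-flight (coalescing concurrent misses for the same key into one load) and jittered TTLs (so entries don't all expire at the same instant) are your friends.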
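Both techniques fit in a few lines. The sketch below is one possible shape, with hypothetical names (SingleFlight, jitteredTtl) and an assumed default spread of 10%; it is an illustration of the idea rather than a specific library's API.

```typescript
// Single-flight: concurrent callers asking for the same key share one
// in-flight promise instead of each hitting the backend.
class SingleFlight<T> {
  private inFlight = new Map<string, Promise<T>>();

  run(key: string, load: () => Promise<T>): Promise<T> {
    const existing = this.inFlight.get(key);
    if (existing) return existing;

    // Remove the entry once the load settles (success or failure) so the
    // next miss triggers a fresh load rather than replaying an old result.
    const pending = load().finally(() => {
      this.inFlight.delete(key);
    });
    this.inFlight.set(key, pending);
    return pending;
  }
}

// Jittered TTL: spread expirations over [base * (1 - spread), base * (1 + spread))
// so entries written together do not all expire together.
// `random` is injectable only to make the function testable.
function jitteredTtl(
  baseSeconds: number,
  spread = 0.1,
  random: () => number = Math.random,
): number {
  const factor = 1 + (random() * 2 - 1) * spread;
  return Math.max(1, Math.round(baseSeconds * factor));
}
```

Note that this single-flight only coalesces requests within one process. Across instances, the shared layer still sees up to one load per instance on a cold key, which is usually an acceptable bound.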

Vivek Jalondhara

Full Stack Software Engineer
