A Three-Layer Caching Strategy That Actually Scales
In-memory, distributed, and HTTP caching each solve a different problem. Layer them deliberately.
On a CRM platform I built a three-layer cache that cut latency dramatically. Layer one is a short-lived in-process cache for hot, request-scoped data. Layer two is Redis for shared state across instances. Layer three is HTTP/CDN caching for truly static responses.
The hard part isn't reading from cache — it's invalidation. I key entries carefully, prefer short TTLs with explicit busting on writes, and treat cache as an optimization that the system must remain correct without.
Measure before and after. A cache that improves p50 but wrecks p99 because of stampedes is a net loss. Single-flight and jittered TTLs are your friends.
Vivek Jalondhara
Full Stack Software Engineer