The Real Reason Cache Invalidation is so Hard

Did you cache that?

I could use some clarity here

“There are only two hard things in Computer Science: cache invalidation and naming things.” — Phil Karlton

Yes, cache invalidation is hard, but do you know what’s harder? Figuring out what people mean when they say the word cache. And that, coincidentally or not, falls squarely under the naming things part of Phil Karlton’s famous quote.

IT people have a tendency to slap words together and expect others to unscramble whatever they had in mind when pronouncing them. I don’t know about you, but my mind-reading skills only go so far. For that reason, whenever people bring up caching in a brainstorming or discovery session, my immediate reaction is to ask them an inquisitive—and often angry—“What do you mean by caching?”

- I mean caching… (At this moment, they tend to give me a condescending smile, as if I’m crazy or just stupid.)
- Noted. But what kind of caching?
- Content caching…
- I could list at least three types of “content caching” off the top of my head. (I do the air quotation marks, like Dr. Evil from Austin Powers.)
- Caching…

You get the picture. Cache and caching are some of the most ambiguous words in the IT world, and that’s because we can have multiple types of caching working in parallel and affecting content independently. Taking that—and the caricaturesque communication (in)abilities of our IT fellows—into account, I decided to put together a simplified list of definitions for what coworkers et al. probably mean when they say those cursed words. As you’ll see, the list is non-exhaustive and contains web-related caching techniques exclusively. If you work with network caching and are looking for information on things such as overload bypass, dynamic client bypass and similar stuff, I wish you the best of luck…

Common types of web caches

If you do web development—front or back end, or both—caching can mean a bunch of different things:

  • browser, or client-side, caching. This type of caching consists of telling your end-user’s browser, or any other web client, to temporarily keep a local copy of a web page so that, for a while, it doesn’t need to bother making an HTTP request to fetch that page from your web server. As a web developer, you do this by making your website, API etc. send some specific HTTP response headers back to the browser. The browser then saves that page in the user’s file system. The format depends on the web client. Firefox, for instance, keeps an SQLite file somewhere under its cache directory (~/.cache/mozilla/firefox on my computer);

  • datastore, or key-value, caching. This one consists of getting a combination of values from the database, like MySQL or MongoDB, and storing it temporarily in RAM (Redis, Memcache etc.). The next time your back end needs to read that combination of values, or any single value inside it, it grabs it straight from RAM;

  • full server-side caching. It works more or less like this: build content on the fly—make Java/Perl/etc. chew some stuff behind the scenes and spit out an entire HTML page as a result—and store that content as a text file on the server’s file system or in a CDN (Content Delivery Network). The next time a client requests that resource, all your web application needs to do is get the final HTML from the file system or CDN;

  • partial server-side caching. Slightly similar to the previous item: build content on the fly—make Java/Perl/etc. chew some stuff behind the scenes and spit out an HTML fragment as a result—and store that content as a text file in a web accelerator that supports ESI (Edge Side Includes), like Varnish. The next time the client requests that resource, Varnish won’t even need to talk to the back end—it will just assemble the pieces it already has.
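To make the last item more concrete, here’s a minimal sketch of what an ESI engine does when it assembles a page from cached fragments. The fragment paths and contents are made up for illustration; a real Varnish setup would fetch missing fragments from the back end instead of returning an empty string:

```python
import re

# Hypothetical fragment cache standing in for Varnish's internal storage.
FRAGMENT_CACHE = {
    "/fragments/header": "<header>Site header</header>",
    "/fragments/footer": "<footer>Site footer</footer>",
}

ESI_TAG = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')

def assemble(template: str) -> str:
    """Replace each <esi:include> tag with the cached fragment it points to."""
    return ESI_TAG.sub(lambda m: FRAGMENT_CACHE.get(m.group(1), ""), template)

page = ('<html><esi:include src="/fragments/header"/>'
        '<p>Body built on the fly</p>'
        '<esi:include src="/fragments/footer"/></html>')
print(assemble(page))
```

The point is that only the dynamic parts of the page ever hit your application; the header and footer are stitched in at the edge.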

This is not the whole story, of course—I just listed the most obvious use cases. Web client caching, for example, is not restricted to HTML files; browsers can cache most of what goes inside an HTTP response (a CSS file, a JPEG image etc.)
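As a rough sketch of what “sending some specific HTTP response headers” looks like in practice, the function below builds the handful of headers a back end would attach to a response to make a browser cache it. The max-age and ETag values are illustrative, not a recommendation:

```python
from email.utils import formatdate

def cache_headers(max_age_seconds: int, etag: str) -> dict:
    """Build HTTP response headers asking the client to cache a resource."""
    return {
        # "public" also allows shared caches (CDNs, proxies) to keep a copy.
        "Cache-Control": f"public, max-age={max_age_seconds}",
        # The ETag lets the browser revalidate cheaply with If-None-Match.
        "ETag": etag,
        # Date header in RFC-compliant GMT format so the client can age it.
        "Date": formatdate(usegmt=True),
    }

headers = cache_headers(3600, '"v1-abc123"')
print(headers["Cache-Control"])  # → public, max-age=3600
```

Any web framework will have its own shorthand for this, but under the hood it all comes down to these few header lines.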

Also, key-value data stores are not limited to values taken from a database; you can store pretty much anything in Redis and Memcache, from HTML fragments (multi-line strings) to sequences of binary numbers representing images. Should you do it, though? I’m not too sure about that. If you want to cache an entire HTML page, why not just save it as a text file (.html) and send it to a CDN? Same for binary files, like images. It’s simpler and sometimes even faster to use a CDN in those cases.

It was common just a few years ago to stumble upon codebases where all CSS and JavaScript files were stored in MySQL and cached in Memcache. Just look around. Pick a random website you suspect was built around the 2000s (those are easy to spot); open the source code in your browser and you’ll probably see an abundance of scripts (parse-stylesheets.php?timestamp=1049173200) whose sole purpose is to load the damn stylesheets from a datastore. Today, with the ubiquity of the cloud and services such as AWS CloudFront and Akamai, you need a compelling reason not to use a CDN.

Finally, don’t forget that multiple caching techniques can be—and often are—used together; we, as end-users, just don’t realize it. It’s indeed very common for web architects to use three or even four types of caching in a project: an ESI engine to cache HTML fragments; a key-value datastore for dynamic content; HTTP response headers for web client storage; and a CDN for static files.

/img/2018/09/2018-09-11-the-real-reason-cache-invalidation-is-so-hard/4-layered-caching.jpg

Example of a 4-layered cache architecture.

So, if you ever say the word cache in front of a colleague, make sure you point out which kind you’re referring to. Especially if that colleague is me…

Conclusion

There’s only one hard thing in Computer Science: human communication. The most complex part of cache invalidation is figuring out what the heck people mean by the word cache. Once you get that sorted out, the rest is not that complicated; the tools are out there, and they’re pretty good.