

A Guide to Caching and Compression for High Performance Web Applications

Understanding HTTP's cache headers and compression capabilities is a prerequisite for building high-performance web applications.


Over the past several years, web applications have evolved from collections of simple HTML pages into highly scalable, interactive rich applications built using a variety of technologies. Designing and developing these applications is complex. In addition, decision makers are increasingly seeking to build even richer interactive capabilities into such applications while still maintaining or improving their performance. But high performance comes at a cost. To build web applications that deliver a solid end user experience, developers need to address the potential performance bottlenecks.

This article focuses on caching—an imperative for delivering high performance applications—and also briefly touches on compression. Several companies produce and sell specialized compression and performance products. This article seeks to simply describe the things that developers can build into their applications at both the client and server levels before seeking specialized products to solve performance problems.
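Compression, at its simplest, needs no specialized product at all: a server can gzip response bodies for any client that advertises support in its Accept-Encoding request header. The sketch below illustrates the idea; the function name and return shape are illustrative, not from any particular framework:

```python
import gzip

def compress_body(body: bytes, accept_encoding: str) -> tuple[bytes, dict]:
    """Gzip a response body when the client's Accept-Encoding header
    advertises gzip support; otherwise return the body unchanged."""
    # Vary tells shared caches to store separate copies per encoding.
    headers = {"Vary": "Accept-Encoding"}
    if "gzip" in accept_encoding.lower():
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    headers["Content-Length"] = str(len(body))
    return body, headers

# A gzip-capable client gets a compressed body and the matching header.
body, headers = compress_body(b"<html>hello</html>", "gzip, deflate")
```

Note the `Vary: Accept-Encoding` header: without it, a shared cache could serve a gzipped copy to a client that cannot decompress it.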

Performance Bottlenecks

The performance bottlenecks are primarily high latency, congestion, and server load. Caching can't address all three problems, but with careful design considerations, caching can improve performance. You can cache content at both the server and the client levels. It turns out that, on average, downloading HTML requires only 10 to 20 percent of the total end user response time; the other 80 to 90 percent is spent downloading all the other components in the page. Such components typically include images such as company logos, which don't change from page to page, yet must be downloaded each time a user requests a page containing that logo. Caching the logo can avoid several roundtrips to the server.

Figure 1. Caching on the Internet: The diagram shows a typical request along with the opportunities for retrieving cached information.

Simply put, a cache is temporary storage. It replicates data on a different computer or in a different location than the original data source. With the right configuration, access to cached data is faster than access to the original data. Using cached data also reduces server load and bandwidth consumption, resulting in enhanced performance from an end user's point of view.

Figure 1 provides a quick overview of how the Internet works, and where caching comes into play.


As you can see from Figure 1, it's both possible and beneficial to cache data on both the client and server. Figure 2 shows a different view of the three cache locations, which are:

  1. Client browser cache: The browser caches web objects and can respond to repeat requests without having to request data from the Internet.
  2. Server-side forward proxy cache: Although there are variants, these caches typically sit inside the end user's firewall and can respond to requests without needing to contact the original sources.
    Figure 2. Cache Configurations: This diagram shows the three typical cache locations.
  3. Server-side reverse proxy cache: Also known as a gateway or surrogate cache, these cache servers operate on behalf of a customer's origin server. A content delivery network (CDN) is essentially a collection of such reverse proxy caches.

You can cache any object that might be requested more than once. However, there's always a danger that a cached copy of an object might become stale—in other words, no longer accurately reflect the original data. Therefore, you control caching for all cacheable objects using two basic parameters: freshness and validation.

Both freshness and validation can be determined using a combination of HTTP request and response headers:

  • Freshness determines whether an object can be served directly from the cache. You control it using the Expires and Cache-Control: max-age headers.
  • Validation determines whether a cached object that may be stale is still usable. You control it using the Last-Modified and If-Modified-Since headers.
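A server sets the freshness headers on its responses and answers validation queries from clients. As a minimal sketch, the two helper functions below (illustrative names, Python standard library only) build the freshness headers and evaluate an If-Modified-Since validation request:

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime

def freshness_headers(max_age_seconds: int, last_modified: datetime) -> dict:
    """Build the freshness (Expires, Cache-Control: max-age) and
    validation (Last-Modified) response headers for an object."""
    now = datetime.now(timezone.utc)
    expires = now + timedelta(seconds=max_age_seconds)
    return {
        "Cache-Control": f"max-age={max_age_seconds}",
        # usegmt=True emits the HTTP-date format, e.g. "Mon, 01 Jan 2024 00:00:00 GMT"
        "Expires": format_datetime(expires, usegmt=True),
        "Last-Modified": format_datetime(last_modified, usegmt=True),
    }

def is_not_modified(if_modified_since: str, last_modified: datetime) -> bool:
    """Return True when the client's cached copy is still valid, i.e.
    the server can answer 304 Not Modified instead of resending the object."""
    return parsedate_to_datetime(if_modified_since) >= last_modified
```

When `is_not_modified` returns True, the server sends a 304 status with no body, which is exactly the roundtrip saving that validation provides.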

Designing Highly Cacheable Web Applications

Figure 3. Determining Cacheability: The figure provides guidelines for determining whether an object should be cacheable.

Enterprise web applications have both dynamic and static components. With proper design, they can be architected to deliver the static components from cache and the dynamic components from an origin server. However, the first step is determining what to cache. Figure 3 provides guidelines that can help you determine whether an object is cacheable or dynamic (non-cacheable).

After application architects have differentiated between the cacheable and non-cacheable objects, developers should seek to maximize cache hits while simultaneously avoiding caching the dynamic objects. Here are some best practices:

  • Use the Cache-Control and Expires headers.
  • Use the Last-Modified header.
  • Check that the web server supports If-Modified-Since (conditional GET) requests.
  • Investigate the feasibility of using a forward proxy cache for a small site, or leveraging professional help from a CDN company for large-scale enterprise sites.
  • Consider using datacenters or co-locations depending on the scalability of the web site.
  • Do-it-yourself coding is usually time- and effort-intensive. Depending on the scale of the web site, you may want to consider using open source caching mechanisms such as Squid on your proxy servers.
  • Definitely leverage a mix of caching mechanisms for file downloads.
  • Ensure that no user- or input-dependent dynamic transactions get cached. Creating a cache map of different objects can help to segregate cacheable from non-cacheable objects.
  • Be wary of Content Management Systems (CMSs) that completely ignore the cache headers.
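Several of these practices can be seen together in a single request handler. The sketch below (the static-object table and function name are hypothetical) serves a static object with freshness and validation headers, answers 304 on a validation hit, and marks anything dynamic as non-cacheable:

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

# Hypothetical table of static, cacheable objects: path -> (body, last modified).
STATIC_OBJECTS = {
    "/logo.png": (b"fake-png-bytes", datetime(2024, 1, 1, tzinfo=timezone.utc)),
}

def handle_get(path: str, request_headers: dict) -> tuple[int, dict, bytes]:
    """Return (status, response headers, body) for a GET request,
    applying the caching best practices for static vs. dynamic objects."""
    entry = STATIC_OBJECTS.get(path)
    if entry is None:
        # Dynamic, user-dependent output must never be cached.
        return 200, {"Cache-Control": "no-store"}, b"<dynamic page>"
    body, last_modified = entry
    ims = request_headers.get("If-Modified-Since")
    if ims and parsedate_to_datetime(ims) >= last_modified:
        # Validation hit: the client's copy is current; send no body.
        return 304, {}, b""
    headers = {
        "Cache-Control": "public, max-age=86400",  # fresh for one day
        "Last-Modified": format_datetime(last_modified, usegmt=True),
    }
    return 200, headers, body
```

The `public` directive allows shared (proxy) caches to store the object as well, while `no-store` on the dynamic branch keeps user-dependent transactions out of every cache along the path.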
