Using Headers for Caching

One of my main goals while redoing my website is to optimize my content, both on the frontend and backend. The frontend is something that I've always considered to be the most important in terms of a fast user experience - I'm not crunching enough data on the backend to worry about long processing delays for users. However, depending on the connection, I've noticed delays upwards of five seconds in between pages, which is completely unacceptable.

There are a few things that a web developer can do to speed up the frontend. Reducing the number of unique elements and shrinking their size are two of the most obvious. If the user only has to download two small stylesheets instead of five, your site will appear faster. What I wanted to do was take this a step further: repeat visitors should not have to download any stylesheets, scripts, or images again. The only HTTP request that returns a full response will be for the XHTML content. This can be handled with the caching mechanisms built into HTTP.

There are several places a website's files can be cached so that users get them quickly, including the user's own computer and a proxy server. Which one is used doesn't matter too much, as long as the headers passed between the browser and the web server make it clear that a cached version is available and still good. These headers can be set through .htaccess or PHP, as long as they are sent before any web content. Below is a list of the headers that I plan to use on my site, with a short description of each one.

  1. Cache-Control: max-age={lifetime in seconds}, public
    tells the browser (and, because of "public", any shared cache) how long it may keep the file
  2. Content-Type: {MIME type}
    tells the browser what kind of file it is and how to interpret it
  3. ETag: {some hash, maybe md5}
    a caching tool that attaches a unique id to each version of a file
  4. Expires: {HTTP date in GMT, RFC 1123 format}
    an older header giving the date until which the browser may keep the file; max-age wins when both are present
  5. Last-Modified: {HTTP date in GMT, RFC 1123 format}
    when the file last changed, so the browser can ask whether it has been modified since the last view
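
To make the list above concrete, here is a minimal sketch of building that set of headers for one file. It is written in Python rather than PHP only to keep it self-contained; the function name build_cache_headers and its parameters are hypothetical, and hashing the whole body with MD5 is just one assumed way to produce an ETag.

```python
import hashlib
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def build_cache_headers(body: bytes, mime_type: str, last_modified: datetime,
                        max_age: int, now: datetime) -> dict[str, str]:
    """Return the five caching headers for one response body.

    Hypothetical helper: both datetimes must be timezone-aware UTC values,
    because HTTP dates are always expressed in GMT.
    """
    # Assumption: the ETag is an MD5 hex digest of the body.
    # HTTP requires the value to be wrapped in double quotes.
    etag = '"' + hashlib.md5(body).hexdigest() + '"'

    # Expires is an absolute date; max-age is the same lifetime in seconds.
    expires = now + timedelta(seconds=max_age)

    return {
        "Cache-Control": f"max-age={max_age}, public",
        "Content-Type": mime_type,
        "ETag": etag,
        "Expires": format_datetime(expires, usegmt=True),
        "Last-Modified": format_datetime(last_modified, usegmt=True),
    }


if __name__ == "__main__":
    headers = build_cache_headers(
        body=b"body { margin: 0; }",
        mime_type="text/css",
        last_modified=datetime(2014, 12, 25, 12, 0, 0, tzinfo=timezone.utc),
        max_age=3600,
        now=datetime.now(timezone.utc),
    )
    for name, value in headers.items():
        print(f"{name}: {value}")
```

In PHP, each entry of the returned mapping would correspond to one header() call made before any output.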

As with any front-end programming, you need to state some things more than once for all browsers to recognize them. This set of headers, from what my testing has shown, does a pretty good job of covering the bases. Some of the headers describe the file (ETag, Content-Type, Last-Modified) so the browser can tell whether the cached version is the same as the requested version, while others (Expires, Cache-Control) say how long the cached version is good for.

Since headers describe the requested page, they must be sent before any content. With an MVC model, it's quite easy to put this set of headers in a static class that is called before any content is output. You can either send them with the header() function in PHP or set them in .htaccess. Another important note is that once a file is cached, there is a chance that any changes you make on the site will not be visible to the user until the cached copy expires - you can either take your chances with Last-Modified/ETag or just rename the file.
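
One way to handle the renaming is to put a short hash of the file's content into its name, so the name changes exactly when the content does. The sketch below is only an illustration of that idea, again in Python; the helper versioned_name and the eight-character digest length are assumptions of mine, not something my site does.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path: str, content: bytes, digest_length: int = 8) -> str:
    """Insert a short content hash before the file extension.

    Hypothetical helper: "css/site.css" becomes "css/site.<hash>.css",
    so a changed file gets a new URL that no cache has seen yet.
    """
    p = PurePosixPath(path)
    digest = hashlib.md5(content).hexdigest()[:digest_length]
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


if __name__ == "__main__":
    print(versioned_name("css/site.css", b"body { margin: 0; }"))
    print(versioned_name("css/site.css", b"body { margin: 1em; }"))
```

The pages that link to the file then need to reference the new name, which is the price of never having to wait for a cached copy to expire.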

A more advanced system would have the server check whether the user already has the current version and, if so, send a 304 Not Modified status with no body, telling the browser to use its cached file. I prefer to give the user's browser the choice. This works better for debugging and cuts down on the pre-output processing. I've used these headers to start caching images (yes, I'm routing my images through a PHP file), stylesheets, and JavaScript, and have already seen a noticeable difference in page speed testers and in general browsing on my redeveloped sites.
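
For reference, the server-side check I decided against could look roughly like the sketch below. It is in Python rather than PHP, the function is_not_modified is hypothetical, and it assumes the request headers arrive in a plain dictionary with canonical names. A browser that holds a cached copy sends back the ETag in If-None-Match and the Last-Modified date in If-Modified-Since; if either still matches, the server can answer 304 Not Modified instead of sending the file.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_not_modified(request_headers: dict[str, str], etag: str,
                    last_modified: datetime) -> bool:
    """Return True when a 304 Not Modified response would be appropriate.

    Hypothetical helper: `etag` is the current quoted ETag of the file and
    `last_modified` is a timezone-aware UTC datetime.
    """
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # If-None-Match takes precedence over If-Modified-Since.
        # Weak validators (W/ prefix) compare equal to their strong form here.
        candidates = [tag.strip().removeprefix("W/")
                      for tag in if_none_match.split(",")]
        return "*" in candidates or etag.removeprefix("W/") in candidates

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return False  # Unparseable date: send the full file.
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution, so drop microseconds.
        return last_modified.replace(microsecond=0) <= since

    return False


if __name__ == "__main__":
    modified = datetime(2014, 12, 25, 12, 0, 0, tzinfo=timezone.utc)
    request = {"If-None-Match": '"abc123"'}
    status = 304 if is_not_modified(request, '"abc123"', modified) else 200
    print(status)
```

When the function returns True, the response would carry the 304 status and the caching headers but no body.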