Caching with Varnish, Drupal 7 and Cache Actions

Tuesday, January 31, 2012 - 22:35

Drupal 7 can be used with Varnish and other reverse proxy servers if configured correctly. This blog post highlights how you can control your cache with Drupal, the Varnish module and the Cache Actions module.

Using a reverse proxy cache is an efficient way to cache your web site in order to get faster response times. Drupal 7 works with reverse proxy cache servers like Varnish out of the box and the integration can be extended by using the Varnish module. I'm going to show you how the integration works in this blog post.

Set up Drupal to work with Varnish

Start off with a clean Drupal installation and a working Varnish installation in front of it. You need to change some settings in Drupal to get it up and running with Varnish. Go to Administration => Configuration => Development => Performance and make the following settings:

  • Check the "Cache pages for anonymous users" checkbox
  • Set "Expiration of cached pages" to any value other than "none". This is important, because reverse proxy caching won't work at all otherwise.
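The same two settings can also be pinned in settings.php, which overrides whatever is chosen on the Performance page. This is only a sketch: the 900 second value is an example chosen to match the header discussed in this post, not a recommendation.

```php
// settings.php: sketch of the two Performance page settings above.
$conf['cache'] = 1;                     // "Cache pages for anonymous users"
$conf['page_cache_maximum_age'] = 900;  // "Expiration of cached pages", in seconds (example value)
```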

You should now be good to go. Open up another browser (or log out), and go to your site. Open up Firebug in Firefox or the developer tools in Chrome/Chromium and go to the network tab. Check the headers for the request to your Drupal front page (/). If everything is working, the response should include a Cache-Control header and an X-Varnish header.

Notice the Cache-Control header. Drupal sends this header to indicate that the page may be cached for up to 900 seconds (in this example). If you don't see a header that looks similar to this, something is wrong with your caching settings.

Also look at the X-Varnish header. If it contains two IDs, the page was served from the Varnish cache. That the request took 1 millisecond is also an indicator =). You don't have to worry about the X-Drupal-Cache header. It will always report a miss, since Varnish serves all cached pages.
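If you would rather check the headers from a script than from the browser, the two checks described above can be sketched in plain PHP. The function names here are hypothetical helpers, not part of Drupal or the Varnish module, and the input is assumed to be an array of header name => value, such as the one PHP's get_headers($url, 1) returns.

```php
<?php
/**
 * Look up one header case-insensitively. When a header is repeated,
 * get_headers() gives an array of values; the last one is used here.
 */
function header_value(array $headers, $name) {
  $headers = array_change_key_case($headers, CASE_LOWER);
  $name = strtolower($name);
  if (!isset($headers[$name])) {
    return NULL;
  }
  $value = $headers[$name];
  return is_array($value) ? (string) end($value) : (string) $value;
}

/**
 * TRUE if X-Varnish carries two IDs, which the post uses as the
 * sign that the page was served from the Varnish cache.
 */
function looks_like_varnish_hit(array $headers) {
  $value = header_value($headers, 'X-Varnish');
  if ($value === NULL) {
    return FALSE;
  }
  $ids = preg_split('/\s+/', trim($value), -1, PREG_SPLIT_NO_EMPTY);
  return count($ids) >= 2;
}

/**
 * Return the max-age from Cache-Control in seconds, or NULL if absent.
 */
function cache_max_age(array $headers) {
  $value = header_value($headers, 'Cache-Control');
  if ($value !== NULL && preg_match('/max-age=(\d+)/i', $value, $matches)) {
    return (int) $matches[1];
  }
  return NULL;
}
```

Request the same page twice with get_headers() and pass the second result to these helpers; the first request typically only fills the cache.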

Cache invalidation

You now have a working Varnish and Drupal setup, congratulations! The cache will be automatically invalidated after the time you set in the caching settings has expired, and new content will be fetched. This is not good enough for some cases, though. You probably want new content to show up as soon as it is added, updated or deleted, in a way that resembles the standard Drupal caching behavior, which works like this:

  • All pages are stored in the Drupal cache (cache_page table) when they are first visited. This table stores the full HTML output.
  • All of the cache is flushed when content is added/updated/deleted, unless you set the "Minimum cache lifetime" setting on the performance page. This setting forces the cache to live for at least the given amount of time.

This does not work with Varnish, since Drupal has no built-in way of telling Varnish when content has changed. This is where the Varnish module comes in. This module lets Drupal communicate directly with Varnish through an admin socket, which Varnish provides on a specific port, usually 6082. Follow these steps to get started with it:

  • Download the Varnish module as usual and enable it.
  • Go to Administration => Configuration => Development => Varnish (admin/config/development/varnish) and enter the settings that match your Varnish setup. The module should support Varnish 2.0, 2.1 and 3.x. The administration page has an indicator that tells you if your configuration is working.
  • The default Varnish configuration usually requires that you know a secret key to access the administration socket. On Debian it is found in /etc/varnish/secret. Copy it and paste it into the "Varnish secret" box.
  • Make sure that you have selected the "Drupal Default" caching option.
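If you prefer to keep these values in code rather than in the admin form, they can probably be pinned in settings.php as well. This is a sketch under a loud assumption: the variable names below are what I understand the module's admin form to store, so verify them against your version of the module before relying on them. The address and the key are placeholders.

```php
// settings.php: hypothetical overrides for the Varnish module settings.
// The variable names are assumptions; check them against the module's admin form.
$conf['varnish_version'] = '3';                        // Varnish version you are running
$conf['varnish_control_terminal'] = '127.0.0.1:6082';  // admin socket host:port (assumed local)
$conf['varnish_control_key'] = 'contents-of-your-secret-file';  // placeholder, not a real key
```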

The Varnish module comes with an alternative, so-called "cache backend" for Drupal. Cache backends are swappable in Drupal, which means you can replace the standard caching functionality with another cache implementation. In order for the module to work as expected, you need to add the following to your settings.php:

$conf['cache_backends'] = array('sites/all/modules/varnish/varnish.cache.inc');
$conf['cache_class_cache_page'] = 'VarnishCache';

Change the path to the Varnish module to whatever fits your installation.
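Drupal 7 core also has reverse proxy settings in settings.php that are commonly set alongside the cache backend, so that Drupal sees the visitor's address instead of the proxy's. A minimal sketch, assuming Varnish runs on the same machine as Drupal (the 127.0.0.1 address is an assumption; use the address your Varnish server connects from):

```php
// settings.php: core reverse proxy settings (sketch, adjust to your setup).
$conf['reverse_proxy'] = TRUE;                          // trust the proxy's forwarded client address
$conf['reverse_proxy_addresses'] = array('127.0.0.1');  // assumed address of the Varnish server
```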

We should be all good to go now. Create some nodes on your site, and open another browser that is not logged in. Go to the new nodes and make sure they get cached by watching their headers. Then change some of your content and refresh your logged out session. Your content should have been updated, and it will be cached again directly after it has been viewed the first time. You now have the equivalent of the Drupal caching functionality together with Varnish.

Advanced usage

The regular caching functionality works well for smaller sites and mimics the default Drupal behavior, but if you are running a big website with a lot of visitors, you probably want to avoid the standard functionality. Having the cache for your whole website flushed at the same time can be devastating for larger websites. The easy way out is to add a Minimum cache lifetime. This ensures that content is not purged too often, while still making sure it is purged through the Varnish admin socket. The other alternative is to simply not use the Varnish module at all.
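The Minimum cache lifetime can be set on the Performance page, or pinned in settings.php as sketched here. The 300 second value is only an example, not a recommendation.

```php
// settings.php: "Minimum cache lifetime" from the Performance page.
$conf['cache_lifetime'] = 300;  // seconds; example value, 0 means no minimum
```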

The Varnish module also comes with support for the Expire module. This module will only purge the relevant pages where your content should exist. It is still under heavy development for Drupal 7, so you should probably drop by the issue queue and help out.

Another method for purging entries in Varnish is to use the Cache Actions module, which provides cache-clearing actions for Rules. It had special support for Varnish in Drupal 6, but that is no longer needed in Drupal 7, since you can use the Drupal caching system directly, independently of which cache backend you are using. This is great since you can use the same technique without Varnish. The following step-by-step guide shows you how to purge the page of a particular node when it is updated, but you can do a lot more, since Rules is very powerful:

  • Go to the Varnish settings page (admin/config/development/varnish) and select "None" under "Varnish Cache Clearing". This will make sure that all pages are not flushed when content is altered.
  • Enable the Cache Actions and the Rules UI modules.
  • Go to Administration => Workflow => Rules and click "Create a new rule"
  • Select "After updating existing content" under "Node" as the reacting event
  • Add the "Clear a specific cache cid" action
  • Select cache_page as your cache bin.
  • Put "node/[node:nid]" in the value text area.
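For those who prefer code over the Rules UI, roughly the same thing can be sketched in a small custom module. Everything named mymodule here is hypothetical, and this is not how Cache Actions itself is implemented. The sketch also clears the node's URL alias, and it assumes that your page cache backend accepts paths as cache IDs.

```php
<?php
/**
 * Build the cache IDs to clear for one node: its system path and,
 * if it differs, its URL alias. Hypothetical helper.
 */
function mymodule_node_cache_ids($nid, $alias = NULL) {
  $path = 'node/' . (int) $nid;
  $cids = array($path);
  if ($alias !== NULL && $alias !== '' && $alias !== $path) {
    $cids[] = $alias;
  }
  return $cids;
}

/**
 * Implements hook_node_update().
 *
 * Clears only the page cache entries of the node that was updated.
 */
function mymodule_node_update($node) {
  // drupal_get_path_alias() returns the system path itself when no alias exists.
  $alias = drupal_get_path_alias('node/' . $node->nid);
  foreach (mymodule_node_cache_ids($node->nid, $alias) as $cid) {
    cache_clear_all($cid, 'cache_page');
  }
}
```

Pages that merely list the node, such as Views pages or blocks, are not covered by this sketch and would need their own entries.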

Try to edit one of your nodes. Everything should still work. The difference now is that we are only purging the entry for the particular node that has been edited, not the whole page cache. You can verify this by visiting another node and checking the Varnish headers: it should still be cached.

Alternative solutions

Another interesting module to look at is the Purge module. It does the same thing as the Varnish module does, but it uses HTTP purges instead of the administration socket in Varnish.
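To give an idea of what an HTTP purge looks like, here is a minimal sketch in plain PHP. It is not the Purge module's code. It assumes your Varnish VCL has been configured to accept the PURGE method from the sending host, and the function names and addresses are hypothetical.

```php
<?php
/**
 * Build HTTP stream context options for a PURGE request.
 * $host is the site's host name, sent as the Host header.
 */
function purge_context_options($host) {
  return array(
    'http' => array(
      'method' => 'PURGE',
      'header' => 'Host: ' . $host . "\r\n",
      'ignore_errors' => TRUE,  // return the response even on 4xx/5xx
      'timeout' => 2,           // seconds
    ),
  );
}

/**
 * Send a PURGE for one path to Varnish and return the HTTP status line,
 * or FALSE if no response was received.
 */
function send_purge($varnish_address, $host, $path) {
  $context = stream_context_create(purge_context_options($host));
  @file_get_contents('http://' . $varnish_address . $path, FALSE, $context);
  // The HTTP wrapper fills $http_response_header in the local scope.
  return isset($http_response_header[0]) ? $http_response_header[0] : FALSE;
}
```

A call such as send_purge('127.0.0.1:80', 'example.com', '/node/1') would then ask Varnish to drop that page, provided the VCL allows it.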

Comments

"Set the "Expiration of cached pages" to a value higher that isn't none."

what does this expression about value mean?

Hello,

I followed this article and it works well: when I update a node, the Varnish cache is invalidated for node/[node:nid], but it doesn't work for the URL alias of the same node.
Do you have any idea why?

Thanks

You may want to add another step to the cache actions rule. Only clearing the node url is not enough. Some parts of this node may be visible on some other pages (Views pages, blocks etc). You have to clear those pages as well.

Thank you for your tutorial!

I notice that this page http://drupal.org/project/varnish shows some additional mods to the settings.php. Perhaps this tutorial could/should be updated?