Making WordPress shine with the Varnish caching system: Part 1
One of our sites at work (Weddingbee.com) is a simple WordPress blog with a big audience, an audience that likes to refresh the home page. A lot. Seriously, a large percentage of page views happen on pages 1 and 2 of the blog. Because so much of the traffic is essentially "static", we decided to put a caching layer in front of Apache to serve this repeat content. Like any good journey, though, it wasn't a walk in the park along the pavement; it was more of a stroll through a forest on a dirt road. Everything was straightforward, but a few times I needed to stop and figure out what the next step was.
Our WordPress site has a lot of visitors. Most of them are "lurkers": unregistered users who enjoy our content but don't add to the conversation by logging in and commenting. We love 'em, but frankly, they get the most boring (from a backend code POV) view of the site. Varnish is perfect for them: it shows each one the same version of the page that it shows everyone else. We handle anything dynamic with ajax calls that fill in content. The rest of our users have accounts: they log in, comment sometimes, discuss topics on the boards, share private messages and basically take part in a social network for weddings. These are the people who need a dynamic page that uses cookies, and we needed to set up Varnish to do just that.
I'll be going over our various configurations and why we did what we did for them.
Let's start with the subroutine vcl_recv. This is the "first" part of the VCL that a request goes through.
set req.http.X-Forwarded-For = client.ip;
if (req.http.Cookie && req.http.Cookie ~ "wordpress_") {
    set req.http.Cookie = regsuball(req.http.Cookie, "wordpress_test_cookie=", "; wpjunk=");
}
First, we set the X-Forwarded-For header so that the visitor's actual IP is sent to our backend. We have some PHP code that writes this into PHP's $_SERVER['REMOTE_ADDR'] variable so it can be used normally. In our WordPress installation, the presence of a cookie named wordpress_XXXXX indicates that a user is logged in. Unfortunately, there is also a cookie called wordpress_test_cookie. I don't want to care about that one, so I rename it to wpjunk so that it's ignored when we deal with cookies later.
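As a rough illustration of what that rename does to the header, here is the same snippet with a hand-traced example in the comments. The incoming Cookie header is hypothetical and its values are made up for illustration.

```vcl
sub vcl_recv {
    # Hypothetical header from a logged-out visitor:
    #   Cookie: wordpress_test_cookie=WP+Cookie+check; vendor_region=ca
    if (req.http.Cookie && req.http.Cookie ~ "wordpress_") {
        set req.http.Cookie = regsuball(req.http.Cookie, "wordpress_test_cookie=", "; wpjunk=");
    }
    # After the rewrite:
    #   Cookie: ; wpjunk=WP+Cookie+check; vendor_region=ca
    # The string "wordpress_" is gone from the header, so a later test like
    #   req.http.Cookie ~ "wordpress_"
    # no longer matches for this visitor. A real login cookie
    # (wordpress_XXXXX) is untouched and still matches.
}
```

The stray leading "; " is harmless for the pattern tests that follow.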
## always cache these images & static assets
if (req.request == "GET" && req.url ~ "\.(css|js|gif|jpg|jpeg|bmp|png|ico|img|tga|wmf)$") {
    remove req.http.cookie;
    return(lookup);
}
if (req.request == "GET" && req.url ~ "(xmlrpc.php|wlmanifest.xml)") {
    remove req.http.cookie;
    return(lookup);
}
There are a lot of things that we don't want Varnish to ever cache.
The admin pages, POST requests, Apache's status page and a number of other pages should be let
through with a pipe or a pass. You'll also notice that I remove the cookies on the static-file requests above.
Since they're just static files, we don't need to send those extra bits over to our backend server.
Since the difference between PASS and PIPE is never really spelled out anywhere, here it is:
PASS still puts the request (and the response) through the rest of Varnish,
adding variables, modifying headers and doing all kinds of Varnish-y things; it just doesn't cache the result.
PIPE steps out of the way and just moves bits, unaltered, between the client and the backend until the connection closes. Now you know.
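One practical consequence of PIPE moving bits unaltered: once a connection is piped, later requests the client sends over that same keep-alive connection are not inspected by VCL either. A common way to limit this, sketched below, is to ask the backend to close the connection after the piped request. This vcl_pipe block is not part of the configuration in this post; treat it as an assumption to check against the documentation for your Varnish version.

```vcl
sub vcl_pipe {
    # Assumption: not from the original config.
    # Ask the backend to close the connection once this request is done, so
    # the client's next request gets a fresh trip through vcl_recv instead
    # of riding along on the pipe.
    set bereq.http.Connection = "close";
    return (pipe);
}
```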
#never cache POST requests
if (req.request == "POST")
{
    return(pass);
}
### do not cache these files:
##never cache the admin pages, or the server-status page
if (req.request == "GET" && (req.url ~ "(wp-admin|bb-admin|server-status)"))
{
    return(pipe);
}
### don't cache authenticated sessions
if (req.http.Cookie && req.http.Cookie ~ "(wordpress_|PHPSESSID)") {
    return(pass);
}
Similarly, there are requests that we ALWAYS want looked up, because the content is static; the static-asset rules shown earlier take care of those.
We also make sure to PASS requests from authenticated users (those with the cookie wordpress_XXXX) and from users who have used
a part of the site that requires a PHP session (those with a PHPSESSID cookie).
if (req.http.Cookie)
{
    set req.http.Cookie = ";" req.http.Cookie;
    set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
    set req.http.Cookie = regsuball(req.http.Cookie, ";(vendor_region|themetype2)=", "; \1=");
    set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");
    if (req.http.Cookie == "") {
        remove req.http.Cookie;
    }
}
Here we remove a lot of the cookies. The only cookies we keep are the ones needed for generating
dynamic content on the backend: vendor_region and themetype2. Everything else
gets removed, and if nothing is left we drop the Cookie header entirely.
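To make the regex steps concrete, here is the same block again with a hand-traced example in the comments. The incoming header is hypothetical: the __utma cookie and all of the values are made up for illustration, and only vendor_region and themetype2 come from the configuration above.

```vcl
sub vcl_recv {
    # Hypothetical incoming header:
    #   Cookie: __utma=123; vendor_region=ca; themetype2=dark
    if (req.http.Cookie) {
        # 1. Prefix a ";" so every cookie starts the same way:
        #    ;__utma=123; vendor_region=ca; themetype2=dark
        set req.http.Cookie = ";" req.http.Cookie;
        # 2. Drop the spaces after each ";":
        #    ;__utma=123;vendor_region=ca;themetype2=dark
        set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
        # 3. Put a space back in front of the cookies we want to keep:
        #    ;__utma=123; vendor_region=ca; themetype2=dark
        set req.http.Cookie = regsuball(req.http.Cookie, ";(vendor_region|themetype2)=", "; \1=");
        # 4. Delete every cookie that is NOT marked with that space:
        #    ; vendor_region=ca; themetype2=dark
        set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
        # 5. Trim leftover ";" and spaces at both ends:
        #    vendor_region=ca; themetype2=dark
        set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");
        # 6. If no cookie survived, drop the header altogether.
        if (req.http.Cookie == "") {
            remove req.http.Cookie;
        }
    }
}
```

A header containing only unlisted cookies (say, `__utma=123; __utmz=456`) comes out of step 5 as an empty string, which is why step 6 removes it. Note that this is Varnish 2.x syntax; later major versions join strings with `+` and use `unset` in place of `remove`.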
Wow, that's a lot of work for one simple function. Here's the whole function in its entirety:
### Called when a client request is received
sub vcl_recv {
    ## always cache these images & static assets
    if (req.request == "GET" && req.url ~ "\.(css|js|gif|jpg|jpeg|bmp|png|ico|img|tga|wmf)$") {
        remove req.http.cookie;
        return(lookup);
    }
    if (req.request == "GET" && req.url ~ "(xmlrpc.php|wlmanifest.xml)") {
        remove req.http.cookie;
        return(lookup);
    }
    set req.http.X-Forwarded-For = client.ip;
    #never cache POST requests
    if (req.request == "POST")
    {
        set req.backend = nitro;
        return(pass);
    }
    ### do not cache these files:
    ##never cache the admin pages, or the server-status page
    if (req.request == "GET" && (req.url ~ "(wp-admin|bb-admin|server-status)"))
    {
        return(pipe);
    }
    #DO cache this ajax request
    if(req.http.X-Requested-With == "XMLHttpRequest" && req.url ~ "recent_reviews")
    {
        return (lookup);
    }
    #dont cache ajax requests
    if(req.http.X-Requested-With == "XMLHttpRequest" || req.url ~ "nocache" || req.url ~ "(control.php|wp-comments-post.php|wp-login.php|bb-login.php|bb-reset-password.php|register.php)")
    {
        return (pass);
    }
    if (req.http.Cookie && req.http.Cookie ~ "wordpress_") {
        set req.http.Cookie = regsuball(req.http.Cookie, "wordpress_test_cookie=", "; wpjunk=");
    }
    ### don't cache authenticated sessions
    if (req.http.Cookie && req.http.Cookie ~ "(wordpress_|PHPSESSID)") {
        return(pass);
    }
    ### parse accept encoding rulesets to make it look nice
    if (req.http.Accept-Encoding) {
        if (req.http.Accept-Encoding ~ "gzip") {
            set req.http.Accept-Encoding = "gzip";
        } elsif (req.http.Accept-Encoding ~ "deflate") {
            set req.http.Accept-Encoding = "deflate";
        } else {
            # unknown algorithm
            remove req.http.Accept-Encoding;
        }
    }
    if (req.http.Cookie)
    {
        set req.http.Cookie = ";" req.http.Cookie;
        set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
        set req.http.Cookie = regsuball(req.http.Cookie, ";(vendor_region|PHPSESSID|themetype2)=", "; \1=");
        set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
        set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");
        if (req.http.Cookie == "") {
            remove req.http.Cookie;
        }
    }
    return(lookup);
}
Really helpful, thanks :)
Thanks for that.
I cannot get my WP site caching because even when logged out, my lookups (from various browsers) return a PHPSESSID cookie and sometimes a w3tc_referrer cookie. I can see these in the Varnish log. Is there a safe way to remove these in the vcl file?
Incidentally I see PHPSESSID has appeared in line 64 of your script, but it was not in the snippet version. Is that a typo?
John
Is it possible to pass if a cookie does not exist?
For example, the first time someone visits the site, an affiliate cookie is set.
As long as they have that cookie we want them to be served cached content.
Thanks
Yes, you would just need to do something along the lines of:
if (req.http.Cookie && req.http.Cookie ~ "served_cached_content") {
    return(lookup);
}
About time someone did a proper wordpress varnish config.
Great job. I didn't use yours but I took some pointers from it for using over at http://primaryblogger.co.uk
Thanks!