
Varnish Foo

A book about Varnish Cache-stuff, by Kristian Lyngstøl.


Version: c3-posted-77-g4a0a3bf
Date: 2017-01-16
Contents
1 Introduction 6
1.1 Target audience and format 7
1.2 What is Varnish 8
1.3 History 9
1.4 More than just cache 10
1.5 Where to get help 11
2 Working with HTTP caching 12
2.1 Tools: The browser 13
2.2 Tools: The command line tool 17
2.3 Tools: A web server 20
2.4 Tools: Varnish 22
2.5 Conditional GET requests 23
2.6 Cache control, age and grace 26
2.7 The Cache-Control header 34
2.8 stale-while-revalidate 36
2.9 Vary 39
2.10 Request methods 42
2.11 Cached status codes 43
2.12 Cookies and authorization 44
2.13 Summary 45
3 Architecture and operation 46
3.1 Architecture 47
3.2 Design principles of Varnish 48
3.3 The different categories of configuration 49
3.4 Command line arguments 50
3.5 Cache size and storage backend 52
3.6 Other useful varnishd arguments 53
3.7 Summary of varnishd arguments 55
3.8 Startup scripts 56
3.9 Parameters 57
3.10 Tools: varnishadm 59
3.11 Tools: varnishstat 60
3.12 A note on threads 62
3.13 Tools: varnishlog 63
3.14 Tools: varnishtop 69
3.15 Tools: varnishncsa and varnishhist 70
3.16 More on VSL queries 71
3.17 Summary 75
4 Introducing VCL 76
4.1 Working with VCL 77
4.2 Hello World 79
4.3 Basic language constructs 82
4.4 More on return-statements 85
4.5 Built-in VCL 86
4.6 Client requests 87
4.6.1 vcl_recv 87
4.6.2 vcl_recv - Massaging a request 89
4.6.3 vcl_pipe 95
4.6.4 vcl_hash 95
4.6.5 vcl_hit 96
4.6.6 vcl_miss 97
4.6.7 vcl_pass 97
4.6.8 vcl_synth 98
4.6.9 vcl_deliver 99
4.7 Backend requests 101
4.7.1 vcl_backend_fetch 101
4.7.2 vcl_backend_response 101
4.7.3 vcl_backend_error 103
4.8 Housekeeping 105
4.9 Varnish Modules 106
4.10 Bringing it together 107
4.11 Summary 110
5 Intelligent traffic routing 111
5.1 A closer look at a backend 112
5.2 Health probes 114
5.2.1 Reviewing health probe status 115
5.2.2 Forcing state 118
5.3 Load balancing 121
5.3.1 Basic round-robin and random load balancing 121
5.3.2 Health probes and directors 122
5.3.3 Dynamic use of directors 122
5.3.4 Fallback director 123
5.3.5 Stacking directors 124
6 Appendix A: State machine graphs 128
6.1 cache_req_fsm 130
6.2 cache_fetch 133
6.3 cache_http1_fsm 134
7 Appendix B: Varnish Three Letter Acronyms 135
8 Appendix C: Built-in VCL for Varnish 4.1.1 137
9 Appendix D: Regular expression cheat sheet 141
10 Appendix X: License 143
Page 6 Chapter 1 Introduction

1 Introduction
This is the only chapter written in first person.
I've worked on Varnish since late 2008, first for Redpill Linpro, then Varnish Software, then, after a brief
pause, for Redpill Linpro again. Over the years I've written code, written Varnish modules and blog posts,
tried to push the boundaries of what Varnish can do, debugged or analyzed countless Varnish sites,
probably held more training courses than anyone else, written training material, and generally helped
shape the Varnish community.
Today I find myself in a position where the training material I once maintained is no longer my
responsibility. But I still love writing, and there's an obvious need for documentation for Varnish.
I came up with a simple solution: I will write a book. Because I couldn't imagine that I would ever finish it if
I attempted writing a whole book in one go, I decided I would publish one chapter at a time on my blog.
This is the first chapter of that book.
You will find the source on https://github.com/KristianLyng/varnishfoo. While the format will be that of a
book, I intend to keep it alive with revisions.
I intend to cover as much Varnish-related content as possible, from administration to web development
and infrastructure. My hope is that one day, this will be good enough that it will be worth printing as more
than just a leaflet.
I am writing this in my spare time, and I retain full ownership of the material. For now, the material is
available under the Creative Commons "CC-BY-SA" license, which allows you as the reader full use,
including copies and modifications, as long as attribution is provided.
I hope you will enjoy this book, and I would appreciate any feedback you could give, positive or negative.
Chapter 1.1 Target audience and format Page 7

1.1 Target audience and format


This book covers a large range of subjects related to Varnish. The first few chapters are general enough
to be of interest to all, while later chapters focus on certain aspects of Varnish usage.
Each chapter stands well on its own, but there are some cross-references. The book focuses on best
practices and good habits that will help you beyond what just a few examples or explanations will do.
Each chapter provides both theory and practical examples. Each example is tested with a recent Varnish
version where relevant, and is based on experience from real-world Varnish installations.

1.2 What is Varnish


Varnish is a web server.
Unlike most web servers, Varnish does not read content from a hard drive or run programs that generate
content from SQL databases. Varnish acquires the content from other web servers. Usually it will keep a
copy of that content around in memory for a while to avoid fetching the same content multiple times, but
not necessarily.
There are numerous reasons you might want Varnish:

1. Your web server/application is a beastly nightmare where performance is measured in page views
per hour - on a good day.
2. Your content needs to be available from multiple geographically diverse locations.
3. Your web site consists of numerous different little parts that you need to glue together in a sensible
manner.
4. Your boss bought a service subscription and now has to justify the budget post.
5. You like Varnish.
6. ???
Varnish is designed around two simple concepts: Give you the means to fix or work around technical
challenges. And speed. Speed was largely handled very early on, and Varnish is quite simply fast. This is
achieved by being, at the core, simple. The less you have to do for each request, the more requests you
can handle.
The name suggests what it's all about:

From The Collaborative International Dictionary of English v.0.48 [gcide]:

Varnish \Var"nish\, v. t. [imp. & p. p. {Varnished}; p. pr. &
   vb. n. {Varnishing}.] [Cf. F. vernir, vernisser. See
   {Varnish}, n.]
[1913 Webster]
1. To lay varnish on; to cover with a liquid which produces,
when dry, a hard, glossy surface; as, to varnish a table;
to varnish a painting.
[1913 Webster]

2. To cover or conceal with something that gives a fair
   appearance; to give a fair coloring to by words; to gloss
over; to palliate; as, to varnish guilt. "Beauty doth
varnish age." --Shak.
[1913 Webster]

Varnish can be used to smooth over rough edges in your stack, to give a fair appearance.

1.3 History
The Varnish project began in 2005. The issue to be solved was that of a large Norwegian news site (or
alternatively a tiny international site). The first release came in 2006, and worked well for www.vg.no. In
2008, Varnish 2.0 came, which opened Varnish up to more sites, as long as they looked and behaved
similarly to www.vg.no. As time progressed and more people started using Varnish, it has been
adapted to a large and varied set of use cases.
From the beginning, the project was administered by Redpill Linpro, with the majority of development
being done by Poul-Henning Kamp through his own company and his Varnish Moral License. In 2010,
Varnish Software sprang out of Redpill Linpro. Varnish Cache has always been a free software project,
and while Varnish Software has been custodians of the infrastructure and large contributors of code and
cash, the project is independent and has a completely open development process.
Varnish Plus was born some time during 2011, although it didn't go by that name at the time. It was the
result of somewhat conflicting interests. Varnish Software had customer obligations that required features,
and the development power to implement them, but these did not necessarily align with the goals and time
frames of Varnish Cache. Varnish Plus became a commercial test-bed for features that were not yet in
Varnish Cache for various reasons. As time passed, many of the features that began life in Varnish Plus
trickled into Varnish Cache proper in one way or another (streaming, surrogate keys, and more),
and some have yet to make it. Some may never make it. This book focuses on Varnish Cache proper, but
will reference Varnish Plus where it makes sense.
With Varnish 3.0, released in 2011, Varnish modules suddenly became very popular. These are modules
that are not part of the Varnish Cache code base, but are loaded at run-time to add features such as
cryptographic hash functions (vmod-digest) and memcached.
Varnish would not be where it is today without a large number of people and businesses. Varnish
Software has contributed, and continues to contribute, numerous tools, vmods, and core features.
Poul-Henning Kamp is still the gatekeeper of Varnish Cache code, and does the majority of the
architectural work. Over the years, there have been too many companies and individuals involved to list
them all here.
Today, Varnish is used by CDNs and newspapers, APIs and blogs.

1.4 More than just cache


Varnish caches content, but can do much more. In 2008, it was used to rewrite URLs, normalize HTTP
headers and similar things. Today, it is used to implement paywalls (whether you like them or not), API
metering, load balancing, CDNs, and more.
Varnish has a powerful configuration language, the Varnish Configuration Language (VCL). VCL isn't
parsed the traditional way a configuration file is, but is translated to C code, compiled and linked into the
running Varnish. From the beginning, it was possible to bypass the entire translation process and provide
C code directly, though this was never recommended. Much of the experimental in-line C code from past
Varnish versions has found new life in Varnish modules since their introduction.
There is also an often overlooked Varnish agent that provides an HTTP REST interface for managing
Varnish. This can be used to extract metrics, review or optionally change configuration, stop and start
Varnish, and more. The agent lives at https://github.com/varnish/vagent2, and is packaged for most
distributions today.
Using Varnish to gracefully handle operational issues is common. Serving cached content past its expiry
time while a web server is down, or switching to a different server, will give your users a better browsing
experience. And in a worst case scenario, at least the user can be presented with a real error message
instead of a refused or timed out connection.
Edge Side Includes is a means to build a single HTTP object (like an HTML page) from multiple smaller
objects, with different caching properties. This lets content writers provide more fine-grained caching
strategies without having to be too smart about it.

1.5 Where to get help


The official Varnish documentation is available both as manual pages (run man -k varnish on a
machine with a properly installed Varnish package) and as online documentation found under
http://varnish-cache.org/docs/. You will also find a user guide and a tutorial in the same online
documentation.
Varnish Software publishes their official training material, which is called "The Varnish Book" (not to be
confused with this book about Varnish). It is available freely through their site at
http://varnish-software.com, after registration.
Another less known source of information for Varnish is the flow charts/dot-graphs used to document the
VCL state engine. The only official location for these is the source code of Varnish, under
doc/graphviz/. They can be generated, assuming you have graphviz installed:

# git clone http://github.com/varnish/Varnish-Cache/


Cloning into 'Varnish-Cache'...
(...)
# cd Varnish-Cache/
# cd doc/graphviz/
# for a in *dot; do dot -Tpng $a > $(echo $a | sed s/.dot/.png/); done
# ls *png

Alternatively, replace -Tpng and .png with -Tsvg and .svg respectively to get vector graphics, or
-Tpdf/.pdf for PDFs.
For convenience, the graphs from Varnish 4.1 are included in Appendix A. If you don't quite grasp what
these tell you yet, don't be too alarmed. These graphs are provided early as they are useful to have
around as reference material and because there is no official location to find them pre-generated. A brief
explanation for each is included, mostly to help you in later chapters.

2 Working with HTTP caching


Before you dig into the inner workings of Varnish, it's important to make sure you have the tools you need
and some background information on basic caching.
This chapter looks at how HTTP caching works at multiple points in the delivery chain, and how these
mechanisms work together. Not every aspect of HTTP caching is covered, but those relevant to Varnish
are covered in detail, including several browser-related concerns.
There is a multitude of tools to choose from when you are working with Varnish. This chapter provides a
few suggestions and a quick guide to each tool, but makes no claim on whether one tool is better than
another. The goal is to establish what sort of tasks your chosen tool needs to be able to accomplish.
Only the absolute minimum of actual Varnish configuration is covered - yet several mechanisms to control
Varnish through backend responses are provided. Most of these mechanisms are well defined in the
HTTP/1.1 standard, RFC 2616.

2.1 Tools: The browser


A browser is an important tool. Most of today's web traffic is, unsurprisingly, through a web browser.
Therefore, it is important to be able to dig deeper into how browsers work with regard to caching. Most
browsers have a developer or debug console, but we will focus on Chrome.
Both Firefox and Chrome will open the debug console if you hit <F12>. It's a good habit to test and
experiment with more than one browser, and luckily these consoles are very similar. A strong case in
favor of Chrome is Incognito Mode, activated through <Ctrl>+<Shift>+N. This is an advantage both
because it removes old cookies and because most extensions are disabled. Most examples use Chrome
to keep things consistent and simple, but could just as well have been performed on Firefox.
The importance of Incognito Mode can be easily demonstrated. The following is a test with a typical
Chrome session:

Notice the multiple extensions that are active; one of them is inserting a bogus call to
socialwidgets.css. The exact same test in Incognito Mode:

The extra request is gone. Regardless of browser choice, your test environment should be devoid of most
extensions and let you easily get rid of all cookies.
You will also quickly learn that a refresh isn't always just a refresh. In both Firefox and Chrome, a refresh
triggered by <F5> or <Ctrl>+r will be "cache aware". What does that mean?
Look closer at the screenshots above, especially the return code. The return code is a
304 Not Modified, not a 200 OK. The browser had the image in cache already and issued a
conditional GET request. A closer inspection:

The browser sends Cache-Control: max-age=0 and an If-Modified-Since header. The web
server correctly responds with 304 Not Modified. We'll look closer at those shortly, but for now, let's
use a different type of refresh: <Shift>+<F5> in Chrome or <Shift>+<Ctrl>+r in Firefox:

The cache-related headers have changed somewhat, and the browser is no longer sending an
If-Modified-Since header. The result is a 200 OK with a response body instead of an empty
304 Not Modified.
These details are both the reason you need to test with a browser - because this is how they operate -
and why a simpler tool is needed in addition to the browser.

2.2 Tools: The command line tool


The browser does a lot more than issue HTTP requests, especially with regard to caching. A good request
synthesizer is a must to debug and experiment with HTTP and HTTP caching without stumbling over the
browser. There are countless alternatives available.
Your requirements for a simple HTTP request synthesizer are:

• Complete control over request headers and request method - even invalid input.
• Stateless behavior - no caching at all.
• Show complete response headers.
Some suggestions for Windows users are curl in PowerShell, Charles Web Debugging Proxy, the "Test
and Rest Client" in PhpStorm, an "Advanced REST client" Chrome extension, or simply SSH'ing to a
GNU/Linux VM and using one of the many tools available there. The list goes on, and so it could for Mac
OS X and Linux too.
HTTPie is a small CLI tool which has the above properties. It's used throughout this book because it is a
good tool, but also because it's easy to see what's going on without knowledge of the tool.
HTTPie is available on Linux, Mac OS X and Windows. On a Debian or Ubuntu system, HTTPie can be
installed with apt-get install httpie. For other platforms, see http://httpie.org. Testing HTTPie is
simple:

$ http http://kly.no/misc/dummy.png
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Connection: keep-alive
Content-Length: 178
Content-Type: image/png
Date: Wed, 25 Nov 2015 18:49:33 GMT
Last-Modified: Wed, 02 Sep 2015 06:46:21 GMT
Server: Really new stuff so people don't complain
Via: 1.1 varnish-v4
X-Cache: MISS from access-gateway.hospitality.swisscom.com
X-Varnish: 15849590

+-----------------------------------------+
| NOTE: binary data not shown in terminal |
+-----------------------------------------+

In many situations the actual data is not that interesting, while the full set of request headers is very
interesting. HTTPie can show you exactly what you want:

$ http -p Hh http://kly.no/misc/dummy.png
GET /misc/dummy.png HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: kly.no
User-Agent: HTTPie/0.8.0

HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 81
Connection: keep-alive
Content-Length: 178
Content-Type: image/png
Date: Wed, 25 Nov 2015 18:49:33 GMT
Last-Modified: Wed, 02 Sep 2015 06:46:21 GMT
Server: Really new stuff so people don't complain
Via: 1.1 varnish-v4
X-Cache: HIT from access-gateway.hospitality.swisscom.com
X-Varnish: 15849590

The -p option to http can be used to control output. Specifically:

• -p H will print request headers.
• -p h will print response headers.
• -p B will print request body.
• -p b will print response body.
These can be combined. In the above example, -p H and -p h combine to form -p Hh. See
http --help and man http for details. Be aware that there has been some mismatch between the actual
command line arguments and what the documentation claims in the past; this depends on the version of
HTTPie.
The example shows the original request headers and full response headers.
Faking a Host header is frequently necessary to avoid changing DNS just to test a Varnish setup. A
decent request synthesizer like HTTPie can do this:

$ http -p Hh http://kly.no/ "Host: example.com"


GET / HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: example.com
User-Agent: HTTPie/0.8.0

HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Connection: keep-alive
Content-Encoding: gzip
Content-Type: text/html
Date: Wed, 25 Nov 2015 18:58:10 GMT
Last-Modified: Tue, 24 Nov 2015 20:51:14 GMT
Server: Really new stuff so people don't complain
Transfer-Encoding: chunked
Via: 1.1 varnish-v4
X-Cache: MISS from access-gateway.hospitality.swisscom.com
X-Varnish: 15577233

Adding other headers is done the same way:



$ http -p Hh http://kly.no/ "If-Modified-Since: Tue, 24 Nov 2015 20:51:14 GMT"


GET / HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: kly.no
If-Modified-Since: Tue, 24 Nov 2015 20:51:14 GMT
User-Agent: HTTPie/0.8.0

HTTP/1.1 304 Not Modified
Age: 5
Connection: keep-alive
Content-Encoding: gzip
Content-Type: text/html
Date: Wed, 25 Nov 2015 18:59:28 GMT
Last-Modified: Tue, 24 Nov 2015 20:51:14 GMT
Server: Really new stuff so people don't complain
Via: 1.1 varnish-v4
X-Cache: MISS from access-gateway.hospitality.swisscom.com
X-Varnish: 15880392 15904200

We just simulated what our browser did, and verified that it really was the If-Modified-Since header
that made the difference earlier. To send multiple headers, just list them one after another:

$ http -p Hh http://kly.no/ "Host: example.com" "User-Agent: foo" "X-demo: bar"


GET / HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: example.com
User-Agent: foo
X-demo: bar

HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 10
Connection: keep-alive
Content-Encoding: gzip
Content-Length: 24681
Content-Type: text/html
Date: Wed, 25 Nov 2015 19:01:08 GMT
Last-Modified: Tue, 24 Nov 2015 20:51:14 GMT
Server: Really new stuff so people don't complain
Via: 1.1 varnish-v4
X-Cache: MISS from access-gateway.hospitality.swisscom.com
X-Varnish: 15759349 15809060

2.3 Tools: A web server


Regardless of what web server is picked as an example in this book, it's the wrong one. So the first on an
alphabetical list was chosen: Apache.
Any decent web server will do what you need. What you want is a web server where you can easily
modify response headers. If you are comfortable doing that with NodeJS or some other slightly more
modern tool than Apache, then go ahead. If you really don't care and just want a test environment, then
keep reading. To save some time, these examples are oriented around Debian and/or Ubuntu-systems,
but largely apply to any modern GNU/Linux distribution (and other UNIX-like systems).
Note that commands that start with # are executed as root, while commands starting with $ can be run
as a regular user. This means you either have to log in as root directly, through su - or sudo -i, or
prefix the command with sudo if you've set up sudo on your system.
The first step is getting it installed and configured:

# apt-get install apache2


(...)
# a2enmod cgi
# cd /etc/apache2
# sed -i 's/80/8080/g' ports.conf sites-enabled/000-default.conf
# service apache2 restart

This installs Apache httpd, enables the CGI module, changes the listening port from port 80 to 8080, then
restarts the web server. The listening port is changed because eventually Varnish will take up residence
on port 80.
You can verify that it works through two means:

# netstat -nlpt
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp6       0      0 :::8080                 :::*                    LISTEN      1101/apache2
# http -p Hh http://localhost:8080/
GET / HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:8080
User-Agent: HTTPie/0.8.0

HTTP/1.1 200 OK
Accept-Ranges: bytes
Connection: Keep-Alive
Content-Encoding: gzip
Content-Length: 3078
Content-Type: text/html
Date: Wed, 25 Nov 2015 20:23:09 GMT
ETag: "2b60-525632b42b90d-gzip"
Keep-Alive: timeout=5, max=100
Last-Modified: Wed, 25 Nov 2015 20:19:01 GMT
Server: Apache/2.4.10 (Debian)
Vary: Accept-Encoding

netstat reveals that apache2 is listening on port 8080. The second command issues an actual
request. Both are useful to ensure the correct service is answering.
To provide a platform for experimenting with response headers, it's time to drop in a CGI script:

# cd /usr/lib/cgi-bin
# cat > foo.sh <<_EOF_
#!/bin/bash
echo "Content-type: text/plain"
echo
echo "Hello. Random number: ${RANDOM}"
date
_EOF_
# chmod a+x foo.sh
# ./foo.sh
Content-type: text/plain

Hello. Random number: 12111
Wed Nov 25 20:26:59 UTC 2015

You may want to use an editor, like nano, vim or emacs instead of using cat. To clarify, the exact
content of foo.sh is:

#!/bin/bash
echo "Content-type: text/plain"
echo
echo "Hello. Random number: ${RANDOM}"
date

We then change permissions for foo.sh, making it executable by all users, then verify that it does what
it's supposed to. If everything is set up correctly, scripts under /usr/lib/cgi-bin are accessible
through http://localhost:8080/cgi-bin/:

# http -p Hhb http://localhost:8080/cgi-bin/foo.sh


GET /cgi-bin/foo.sh HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:8080
User-Agent: HTTPie/0.8.0

HTTP/1.1 200 OK
Connection: Keep-Alive
Content-Length: 57
Content-Type: text/plain
Date: Wed, 25 Nov 2015 20:31:00 GMT
Keep-Alive: timeout=5, max=100
Server: Apache/2.4.10 (Debian)

Hello. Random number: 12126
Wed Nov 25 20:31:00 UTC 2015

If you've been able to reproduce the above example, you're ready to start testing and experimenting.

2.4 Tools: Varnish


We need an intermediary cache, and what better example than Varnish? We'll refrain from configuring
Varnish beyond the defaults for now, though.
Let's just install Varnish. This assumes you're using a Debian or Ubuntu system and that you
have a web server listening on port 8080, since Varnish expects a backend on port 8080 by default:

# apt-get install varnish


# service varnish start
# http -p Hhb http://localhost:6081/cgi-bin/foo.sh
GET /cgi-bin/foo.sh HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:6081
User-Agent: HTTPie/0.8.0

HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Wed, 25 Nov 2015 20:38:09 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 5

Hello. Random number: 26
Wed Nov 25 20:38:09 UTC 2015

As you can see from the above example, a typical Varnish installation listens on port 6081 by default, and
uses 127.0.0.1:8080 as the backend web server. If the above example doesn't work, you can change
the listening port of Varnish by altering the -a argument in /etc/default/varnish and issuing
service varnish restart. The backend web server can be changed in
/etc/varnish/default.vcl, followed by the same restart. We'll cover both of these files in detail in
later chapters.

2.5 Conditional GET requests


In the tool examples earlier we saw real examples of conditional GET requests. In many ways, they are
quite simple mechanisms that allow an HTTP client - typically a browser - to verify that it has the most
up-to-date version of an HTTP object. There are two different types of conditional GET requests:
If-Modified-Since and If-None-Match.
If a server sends a Last-Modified header, the client can issue an If-Modified-Since header on
later requests for the same content, indicating that the server only needs to transmit the response body if
it's been updated.
Sometimes it isn't trivial to know the modification time, but you might be able to uniquely identify the
content anyway. For that matter, the content might have been changed back to a previous state. This is
where the entity tag, or ETag response header, is useful.
An Etag header can be used to provide an arbitrary ID for an HTTP response, and the client can then
re-use that in an If-None-Match request header.
Modifying /usr/lib/cgi-bin/foo.sh, we can make it provide a static ETag header:

#!/bin/bash
echo "Content-type: text/plain"
echo "Etag: testofetagnumber1"
echo
echo "Hello. Random number: ${RANDOM}"
date

Let's see what happens when we talk directly to Apache:

# http http://localhost:8080/cgi-bin/foo.sh
HTTP/1.1 200 OK
Connection: Keep-Alive
Content-Length: 57
Content-Type: text/plain
Date: Wed, 25 Nov 2015 20:43:25 GMT
Etag: testofetagnumber1
Keep-Alive: timeout=5, max=100
Server: Apache/2.4.10 (Debian)

Hello. Random number: 51126
Wed Nov 25 20:43:25 UTC 2015

# http http://localhost:8080/cgi-bin/foo.sh
HTTP/1.1 200 OK
Connection: Keep-Alive
Content-Length: 57
Content-Type: text/plain
Date: Wed, 25 Nov 2015 20:43:28 GMT
Etag: testofetagnumber1
Keep-Alive: timeout=5, max=100
Server: Apache/2.4.10 (Debian)

Hello. Random number: 12112
Wed Nov 25 20:43:28 UTC 2015

Two successive requests yielded updated content, but with the same Etag. Now let's see how Varnish
handles this:

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Wed, 25 Nov 2015 20:44:53 GMT
Etag: testofetagnumber1
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 32770

Hello. Random number: 5213
Wed Nov 25 20:44:53 UTC 2015

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 2
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Wed, 25 Nov 2015 20:44:53 GMT
Etag: testofetagnumber1
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 32773 32771

Hello. Random number: 5213
Wed Nov 25 20:44:53 UTC 2015

It's pretty easy to see the difference in the output: the second request was served from cache, as
revealed by the non-zero Age header and the repeated random number. Also note that the Etag doesn't
matter for this test, because we never sent If-None-Match! So our http command got a 200 OK, not
the 304 Not Modified that we were looking for. Let's try that again:

# http http://localhost:6081/cgi-bin/foo.sh "If-None-Match: testofetagnumber1"
HTTP/1.1 304 Not Modified
Age: 0
Connection: keep-alive
Content-Type: text/plain
Date: Wed, 25 Nov 2015 20:48:52 GMT
Etag: testofetagnumber1
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 8

Now we see Etag and If-None-Match at work. Also note the absence of a body: we just saved
bandwidth.

Let's try to change our If-None-Match header a bit:

# http http://localhost:6081/cgi-bin/foo.sh "If-None-Match: testofetagnumber2"


HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Wed, 25 Nov 2015 20:51:10 GMT
Etag: testofetagnumber1
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 11

Hello. Random number: 12942
Wed Nov 25 20:51:10 UTC 2015

Content!
To summarize:

Server           Client               Server response
Last-Modified    If-Modified-Since    200 OK with full response body, or
ETag             If-None-Match        304 Not Modified with no response body.

Warning
The examples above also demonstrate that supplying static Etag headers or bogus
Last-Modified headers can have unexpected side effects. foo.sh provides new content every
time. Talking directly to the web server resulted in the desired behavior of the client getting the
updated content, but only because the web server ignored the conditional part of the request.
The danger is not necessarily Varnish, but proxy servers outside of the control of the web site,
sitting between the client and the web server. Even if a web server ignores If-None-Match and
If-Modified-Since headers, there is no guarantee that other proxies do! Make sure to only
provide Etag and Last-Modified headers that are correct, or don't provide them at all.

2.6 Cache control, age and grace


An HTTP object has an age. This is how long it is since the object was fetched or validated from the origin
source. In most cases, an object starts acquiring age once it leaves a web server.
Age is measured in seconds. The HTTP response header Age is used to forward the information
regarding age to HTTP clients. You can specify maximum age allowed both from a client and server. The
most interesting aspect of this is the HTTP header Cache-Control. This is both a response- and
request-header, which means that both clients and servers can emit this header.
The Age header has a single value: the age of the object measured in seconds. The Cache-Control
header, on the other hand, has a multitude of variables and options. We'll begin with the simplest:
max-age=. This is a variable that can be used both in a request-header and response-header, but is most
useful in the response header. Most web servers and many intermediary caches (including Varnish)
ignore a max-age field received in an HTTP request-header.
Setting max-age=0 effectively disables caching, assuming the cache obeys:

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Cache-Control: max-age=0
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 15:41:53 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 32776

Hello. Random number: 19972


Fri Nov 27 15:41:53 UTC 2015

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Cache-Control: max-age=0
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 15:41:57 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 32779

Hello. Random number: 92124


Fri Nov 27 15:41:57 UTC 2015

This example issues two requests against a modified http://localhost:6081/cgi-bin/foo.sh.
The modified version has set max-age=0 to tell Varnish - and browsers - not to cache the content at all.
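For reference, the CGI script behind these examples might look something like the sketch below. The exact contents of foo.sh come from an earlier chapter, so treat this as an assumption-laden illustration rather than the real script:

```shell
#!/bin/bash
# Sketch of a foo.sh that emits a Cache-Control header. emit_response
# prints the CGI headers, a blank line, then a small dynamic body.
emit_response() {
    echo "Cache-Control: max-age=0"
    echo "Content-Type: text/plain"
    echo                               # a blank line terminates the headers
    echo "Hello. Random number: ${RANDOM}"
    date
}
emit_response
```

Changing the value in the echoed Cache-Control line is all that is needed to produce the max-age=10 and later variants.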
A similar example can be used for max-age=10:

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Cache-Control: max-age=10
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 15:44:32 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 14

Hello. Random number: 19982


Fri Nov 27 15:44:32 UTC 2015

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 8
Cache-Control: max-age=10
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 15:44:32 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 32782 15

Hello. Random number: 19982


Fri Nov 27 15:44:32 UTC 2015

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 12
Cache-Control: max-age=10
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 15:44:32 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 19 15

Hello. Random number: 19982


Fri Nov 27 15:44:32 UTC 2015

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 2
Cache-Control: max-age=10
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 15:44:44 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 65538 20

Hello. Random number: 9126


Fri Nov 27 15:44:44 UTC 2015

This example demonstrates several things:

• Varnish emits an Age header, telling you how old the object is.
• Varnish now caches.
• Varnish delivers a 12-second-old object, despite max-age=10!
• Varnish then delivers a 2-second-old object, despite no other request in between.
What this example shows is Varnish's default grace mode. The simple explanation is that Varnish
keeps an object a little longer (10 seconds by default) than the regular cache duration. If the object is
requested during this period, the cached variant of the object is sent to the client, while Varnish issues a
request to the backend server in parallel. This is also called stale while revalidate. This happens even with
zero configuration for Varnish, and is covered in detail in later chapters. For now, it's good to just get used
to issuing an extra request to Varnish after the expiry time to see the update take place.
Let's do another example of this, using a browser, 60 seconds of max age, and an ETag header set
to something random so our browser can do conditional GET requests:

On the first request we get a 27-second-old object.



The second request is a conditional GET request because we had it in cache. Note that our browser has
already exceeded the max-age, but still made a conditional GET request. A cache (browser or otherwise)
may keep an object longer than the suggested max-age, as long as it verifies the content before using it.
The result is the same object, now with an age of 65 seconds.

The third request takes place just 18 seconds later. This is not a conditional GET request, most likely
because our browser correctly saw that the Age of the previous object was 65, while max-age=60
instructed the browser to only keep the object until it reached an age of 60 - a time which had already
passed. Our browser thus did not keep the object at all this time.

Similarly, we can modify foo.sh to emit max-age=3600 and Age: 3590, pretending to be a cache.
Speaking directly to Apache:

# http http://localhost:8080/cgi-bin/foo.sh
HTTP/1.1 200 OK
Age: 3590
Cache-Control: max-age=3600
Connection: Keep-Alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 16:07:36 GMT
ETag: 11235
Keep-Alive: timeout=5, max=100
Server: Apache/2.4.10 (Debian)

Hello. Random number: 54251


Fri Nov 27 16:07:36 UTC 2015

# http http://localhost:8080/cgi-bin/foo.sh
HTTP/1.1 200 OK
Age: 3590
Cache-Control: max-age=3600
Connection: Keep-Alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 16:07:54 GMT
ETag: 12583
Keep-Alive: timeout=5, max=100
Server: Apache/2.4.10 (Debian)

Hello. Random number: 68323


Fri Nov 27 16:07:54 UTC 2015

Nothing too exciting, but the requests return what we should have learned to expect by now.
Let's try three requests through Varnish:

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 3590
Cache-Control: max-age=3600
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 16:08:50 GMT
ETag: 9315
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 65559

Hello. Random number: 22609


Fri Nov 27 16:08:50 UTC 2015

The first request is almost identical to the one we issued to Apache, except for a few added headers.

15 seconds later, we issue the same command again:

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 3605
Cache-Control: max-age=3600
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 16:08:50 GMT
ETag: 9315
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 32803 65560

Hello. Random number: 22609


Fri Nov 27 16:08:50 UTC 2015

Varnish replies with a version from grace, and has issued an update to Apache in the background. Note
that the Age header is now increased, and is clearly beyond the age limit of 3600.
4 seconds later, the third request:

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 3594
Cache-Control: max-age=3600
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 16:09:05 GMT
ETag: 24072
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 65564 32804

Hello. Random number: 76434


Fri Nov 27 16:09:05 UTC 2015

Updated content!
The lessons to pick up from this are:

• Age is not just an informative header. It is used by intermediary caches and by browser caches.
• max-age is relative to Age, not to when the request was made.
• You can have multiple tiers of caches, and max-age=x will be correct for the end user if all
intermediary caches correctly obey it and add to Age.
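The arithmetic behind the second lesson can be sketched in a couple of lines:

```shell
# A client or downstream cache computes remaining freshness as
# max-age minus Age. Using the values from the example above
# (Age: 3590, Cache-Control: max-age=3600):
max_age=3600
age=3590
remaining=$((max_age - age))
echo "Fresh for another ${remaining} seconds"   # 10 seconds
```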

2.7 The Cache-Control header


The Cache-Control header has a multitude of possible values, and can be supplied as both a
request-header and response-header. Varnish ignores any Cache-Control header received from a client -
other caches might obey them.
It is defined in RFC2616, 14.9. As Varnish ignores all Cache-Control headers in a client request, we
will focus on the parts relevant to an HTTP response. Here's an excerpt from RFC2616:

Cache-Control = "Cache-Control" ":" 1#cache-directive

cache-directive = cache-request-directive
| cache-response-directive

(...)

cache-response-directive =
"public" ; Section 14.9.1
| "private" [ "=" <"> 1#field-name <"> ] ; Section 14.9.1
| "no-cache" [ "=" <"> 1#field-name <"> ]; Section 14.9.1
| "no-store" ; Section 14.9.2
| "no-transform" ; Section 14.9.5
| "must-revalidate" ; Section 14.9.4
| "proxy-revalidate" ; Section 14.9.4
| "max-age" "=" delta-seconds ; Section 14.9.3
| "s-maxage" "=" delta-seconds ; Section 14.9.3
| cache-extension ; Section 14.9.6

cache-extension = token [ "=" ( token | quoted-string ) ]

Among the above directives, Varnish only obeys s-maxage and max-age by default. It's worth looking
especially closely at must-revalidate. It allows a client to cache the content, but requires it to send a
conditional GET request before actually using the content.
s-maxage is of special interest to Varnish users. It instructs intermediate caches, but not clients (e.g.:
browsers). Varnish will pick the value of s-maxage over max-age, which makes it possible for a web
server to emit a Cache-Control header that gives different instructions to browsers and Varnish:

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Cache-Control: s-maxage=3600,max-age=5
Connection: keep-alive
Content-Type: text/plain
Date: Fri, 27 Nov 2015 23:21:47 GMT
Server: Apache/2.4.10 (Debian)
Transfer-Encoding: chunked
Via: 1.1 varnish-v4
X-Varnish: 2

Hello. Random number: 7684


Fri Nov 27 23:21:47 UTC 2015

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 8
Cache-Control: s-maxage=3600,max-age=5
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 23:21:47 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 5 3

Hello. Random number: 7684


Fri Nov 27 23:21:47 UTC 2015

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 16
Cache-Control: s-maxage=3600,max-age=5
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 23:21:47 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 7 3

Hello. Random number: 7684


Fri Nov 27 23:21:47 UTC 2015

The first request populates the cache, the second returns a cache hit after 8 seconds, while the third
confirms that no background fetch has caused an update by returning the same object a third time.
Two important things to note here:

• The Age header is accurately reported. This effectively disables client-side caching after Age has
reached 5 seconds.
• There could be other intermediate caches that would also use s-maxage.
The solution to both these issues is the same: Remove or reset the Age-header and remove or reset the
s-maxage-part of the Cache-Control header. Varnish does not do this by default, but we will do both
in later chapters. For now, just know that these are challenges.

2.8 stale-while-revalidate
In addition to RFC2616, there's also the more recent RFC5861 which defines two additional variables for
Cache-Control:

stale-while-revalidate = "stale-while-revalidate" "=" delta-seconds

and:

stale-if-error = "stale-if-error" "=" delta-seconds

These two variables map very well to Varnish's grace mechanics, which existed a few years before
RFC5861 came about.
Varnish 4.1 implements stale-while-revalidate for the first time, but not stale-if-error.
Varnish has a default stale-while-revalidate value of 10 seconds. Earlier examples ran into this:
You could see responses that were a few seconds older than max-age, while a request to revalidate the
response was happening in the background.
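The resulting object lifecycle can be sketched as follows. The function below is just an illustration of the logic, not Varnish code:

```shell
# Classify a cached object by its age, given max-age and a
# stale-while-revalidate (grace) window, all in seconds.
object_state() {
    local age="$1" max_age="$2" swr="$3"
    if [ "$age" -le "$max_age" ]; then
        echo "fresh"       # served directly from cache
    elif [ "$age" -le $((max_age + swr)) ]; then
        echo "grace"       # served stale, revalidated in the background
    else
        echo "expired"     # fetched synchronously from the backend
    fi
}
object_state 8 5 10    # → grace
```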
A demo of default grace, pay attention to the Age header:

# http -p h http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Cache-Control: max-age=5
Connection: keep-alive
Content-Length: 56
Content-Type: text/plain
Date: Sun, 29 Nov 2015 15:10:56 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 2

# http -p h http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 4
Cache-Control: max-age=5
Connection: keep-alive
Content-Length: 56
Content-Type: text/plain
Date: Sun, 29 Nov 2015 15:10:56 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 5 3

# http -p h http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 8
Cache-Control: max-age=5
Connection: keep-alive
Content-Length: 56
Content-Type: text/plain
Date: Sun, 29 Nov 2015 15:10:56 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 32770 3

# http -p h http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 4
Cache-Control: max-age=5
Connection: keep-alive
Content-Length: 56
Content-Type: text/plain
Date: Sun, 29 Nov 2015 15:11:03 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 65538 32771

On the third request, Varnish is returning an object that is 8 seconds old, despite a max-age of 5 seconds.
When this request was received, Varnish immediately fired off a request to the web server to revalidate
the object, but returned the result from cache. This is also demonstrated by the fourth request, where Age
is already 4. The fourth request gets the result from the backend-request started when the third request
was received. So:

1. Request: Nothing in cache. Varnish requests content from backend, waits, and responds with that
result.
2. Request: Standard cache hit.
3. Request: Varnish sees that the object in cache is stale, initiates a request to a backend server, but
does NOT wait for the response. Instead, the result from cache is returned.
4. Request: By now, the backend-request initiated from the third request is complete. This is thus a
standard cache hit.
This behavior means that slow backends will not affect client requests if content is cached.
If this behavior is unwanted, you can disable grace by setting stale-while-revalidate=0:

# http -p h http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Cache-Control: max-age=5, stale-while-revalidate=0
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Thu, 03 Dec 2015 12:50:36 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 12

# http -p h http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 3
Cache-Control: max-age=5, stale-while-revalidate=0
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Thu, 03 Dec 2015 12:50:36 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 32773 13

# http -p h http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Cache-Control: max-age=5, stale-while-revalidate=0
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Thu, 03 Dec 2015 12:50:42 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 32775

# http -p h http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 1
Cache-Control: max-age=5, stale-while-revalidate=0
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Thu, 03 Dec 2015 12:50:42 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 15 32776

This was added in Varnish 4.1.0. We can now see that no background fetching was done at all, and no
stale objects were delivered. In other words:

1. Request: Nothing in cache. Varnish requests content from backend, waits, and responds with that
result.
2. Request: Standard cache hit.
3. Request: Nothing in cache. Varnish fetches content from the backend, waits, and responds with that
result.
4. Request: Standard cache hit.

2.9 Vary
The Vary-header is exclusively meant for intermediate caches, such as Varnish. It is a comma-separated
list of references to request headers that will cause the web server to produce a different variant of the
same content. An example is needed:

# http -p Hhb http://localhost:6081/cgi-bin/foo.sh "X-demo: foo"


GET /cgi-bin/foo.sh HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:6081
User-Agent: HTTPie/0.8.0
X-demo: foo

HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 6
Cache-Control: s-maxage=3600
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 23:56:47 GMT
Server: Apache/2.4.10 (Debian)
Vary: X-demo
Via: 1.1 varnish-v4
X-Varnish: 12 32771

Hello. Random number: 21126


Fri Nov 27 23:56:47 UTC 2015

# http -p Hhb http://localhost:6081/cgi-bin/foo.sh "X-demo: bar"


GET /cgi-bin/foo.sh HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:6081
User-Agent: HTTPie/0.8.0
X-demo: bar

HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Cache-Control: s-maxage=3600
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 23:56:57 GMT
Server: Apache/2.4.10 (Debian)
Vary: X-demo
Via: 1.1 varnish-v4
X-Varnish: 32773

Hello. Random number: 126


Fri Nov 27 23:56:57 UTC 2015

# http -p Hhb http://localhost:6081/cgi-bin/foo.sh "X-demo: foo"


GET /cgi-bin/foo.sh HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:6081
User-Agent: HTTPie/0.8.0
X-demo: foo

HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 15
Cache-Control: s-maxage=3600
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 23:56:47 GMT
Server: Apache/2.4.10 (Debian)
Vary: X-demo
Via: 1.1 varnish-v4
X-Varnish: 14 32771

Hello. Random number: 21126


Fri Nov 27 23:56:47 UTC 2015

# http -p Hhb http://localhost:6081/cgi-bin/foo.sh "X-demo: bar"


GET /cgi-bin/foo.sh HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:6081
User-Agent: HTTPie/0.8.0
X-demo: bar

HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 8
Cache-Control: s-maxage=3600
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 23:56:57 GMT
Server: Apache/2.4.10 (Debian)
Vary: X-demo
Via: 1.1 varnish-v4
X-Varnish: 32776 32774

Hello. Random number: 126


Fri Nov 27 23:56:57 UTC 2015

These four requests demonstrate that two objects are entered into the cache for the same URL,
accessible by modifying the arbitrarily chosen X-demo request header - which is not a real header.

The most important use-case for Vary is to support content encoding such as gzip. In earlier versions of
Varnish, the web server needed to do the compression and Varnish would store the compressed content
and (assuming a client asked for it), the uncompressed content. This was supported through the Vary
header, which the server would set to Vary: Accept-Encoding. Today, Varnish understands gzip and
this isn't needed. There are two more examples of Vary-usage.
Mobile devices are often served different variants of the same contents, so-called mobile-friendly pages.
To make sure intermediate caches support this, the web server must emit a Vary: User-Agent header,
indicating that for each different User-Agent header sent, a unique variant of the content must be
cached.
The second such header is the nefarious Cookie header. Whenever a page is rendered differently
based on a cookie, the web server should send Vary: Cookie. However, hardly anyone does this in the
real world, which has resulted in cookies being treated differently. Varnish does not cache any content if
it's requested with a cookie by default, nor does it cache any response with a Set-Cookie-header. This
clearly needs to be overridden, and will be covered in detail in later chapters.
The biggest problem with the Vary-header is the lack of semantic details. The Vary header simply states
that any variation in the request header, however small, mandates a new object in the cache. This causes
numerous headaches. Here are some examples:

• Accept-Encoding: gzip,deflate and Accept-Encoding: deflate,gzip will result in two
different variants.
• Vary: User-Agent will cause a tremendous amount of variants, since the level of detail in modern
User-Agent headers is extreme.
• It's impossible to say that only THAT cookie will matter, not the others.
Many of these things can be remedied or at least worked around in Varnish. All of it will be covered in
detail in separate chapters.
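As an illustration, the usual workaround for the first problem is to normalize the header before it is used as a Vary key. In Varnish this is done in VCL (covered in later chapters); the shell function below only sketches the idea:

```shell
# Collapse the many possible Accept-Encoding spellings into a small,
# fixed set, so that equivalent requests share one cached variant.
normalize_accept_encoding() {
    case "$1" in
        *gzip*)    echo "gzip" ;;
        *deflate*) echo "deflate" ;;
        *)         echo "" ;;      # no supported encoding requested
    esac
}
normalize_accept_encoding "deflate,gzip"   # → gzip
```

With this in place, gzip,deflate and deflate,gzip both map to the single variant "gzip".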
On a last note, Varnish has a special case where it refuses to cache any content with a response header of
Vary: *.

2.10 Request methods


Only the GET request method is cached. However, Varnish will re-write a HEAD request to a GET
request, cache the result and strip the response body before answering the client. A HEAD request is
supposed to be exactly the same as a GET request, with the response body stripped, so this makes
sense. To see this effect, issue a HEAD request first directly to Apache:

# http -p Hhb HEAD http://localhost:8080/cgi-bin/foo.sh


HEAD /cgi-bin/foo.sh HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:8080
User-Agent: HTTPie/0.8.0

HTTP/1.1 200 OK
Connection: Keep-Alive
Content-Length: 29
Content-Type: text/plain
Date: Sat, 28 Nov 2015 00:30:33 GMT
Keep-Alive: timeout=5, max=100
Server: Apache/2.4.10 (Debian)

# tail -n1 /var/log/apache2/access.log


::1 - - [28/Nov/2015:00:30:33 +0000] "HEAD /cgi-bin/foo.sh HTTP/1.1" 200 190 "-" "HTTPie

The access log shows a HEAD request. Issuing the same request to Varnish:

# http -p Hhb HEAD http://localhost:6081/cgi-bin/foo.sh


HEAD /cgi-bin/foo.sh HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:6081
User-Agent: HTTPie/0.8.0

HTTP/1.1 200 OK
Age: 0
Connection: keep-alive
Content-Length: 29
Content-Type: text/plain
Date: Sat, 28 Nov 2015 00:32:05 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 2

# tail -n1 /var/log/apache2/access.log


127.0.0.1 - - [28/Nov/2015:00:32:05 +0000] "GET /cgi-bin/foo.sh HTTP/1.1" 200 163 "-" "H

The client sees the same result, but the web server has logged a GET request. Please note that
HEAD-requests include a Content-Length as if a GET-request was issued. It is only the response body
itself that is absent.

2.11 Cached status codes


Only a subset of response codes allow caching, even if an s-maxage or similar is provided. Quoting
directly from the Varnish source code, specifically bin/varnishd/cache/cache_rfc2616.c, the list is:

case 200: /* OK */
case 203: /* Non-Authoritative Information */
case 204: /* No Content */
case 300: /* Multiple Choices */
case 301: /* Moved Permanently */
case 304: /* Not Modified - handled like 200 */
case 404: /* Not Found */
case 410: /* Gone */
case 414: /* Request-URI Too Large */

That means that if you provide s-maxage on a 500 Internal Server Error, Varnish will still not
cache it by default. Varnish will cache the above status codes even without any cache control headers.
The default cache duration is 2 minutes.
In addition to the above, there are two more status codes worth mentioning:

case 302: /* Moved Temporarily */


case 307: /* Temporary Redirect */
/*
* https://tools.ietf.org/html/rfc7231#section-6.1
*
* Do not apply the default ttl, only set a ttl if Cache-Control
* or Expires are present. Uncacheable otherwise.
*/
expp->ttl = -1.;

Responses with status codes 302 Moved Temporarily or 307 Temporary Redirect are only
cached if Cache-Control or Expires explicitly allows it; they are not cached by default.
In other words:

• max-age=10 + 500 Internal Server Error: Not cached


• max-age=10 + 302 Moved Temporarily: Cached
• No Cache-Control + 302 Moved Temporarily: Not cached
• No Cache-Control + 404 Not Found: Cached
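The rules above can be condensed into a small lookup, mirroring the case list quoted from cache_rfc2616.c (a sketch for illustration, not Varnish code):

```shell
# Given a status code and whether Cache-Control/Expires permit caching
# ("yes" or "no"), answer whether Varnish caches the response by default.
cacheable_by_default() {
    local status="$1" cc_allows="$2"
    case "$status" in
        200|203|204|300|301|304|404|410|414) echo "yes" ;;
        302|307) echo "$cc_allows" ;;   # only with Cache-Control/Expires
        *) echo "no" ;;                 # e.g. 500, regardless of headers
    esac
}
cacheable_by_default 500 yes   # → no
```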

2.12 Cookies and authorization


Requests with a Cookie-header or an HTTP basic authorization header are tricky at best to cache. Varnish
takes a "better safe than sorry" approach, and by default does not cache responses to requests with either a
Cookie-header or an Authorization-header. Responses with Set-Cookie are not cached either.
Because cookies are so common, this will generally mean that any modern site is not cached by default.
Fortunately, Varnish has the means to override that default. We will investigate that in detail in later
chapters.

2.13 Summary
There are a few other headers worth mentioning. The ancient Pragma header is still seen; it is
completely ignored by Varnish and has generally been replaced by Cache-Control. One header Varnish does
care about is Expires. Expires is generally deprecated, but still valid.
If s-maxage and max-age are missing from Cache-Control, then Varnish will use an Expires
header. The format of the Expires header is that of an absolute date - the same format as Date and
Last-Modified. Don't use this unless you want a headache.
In other words, to cache by default:

• The request method must be GET or HEAD.


• There can be no Cookie-header or Authorization-header in the request.
• There can be no Set-Cookie on the reply.
• The status code needs to be 200, 203, 204, 300, 301, 304, 404, 410, 414.
• OR the status code can be 302 or 307 IF Cache-Control or Expires enables caching.
• Vary must NOT be *.
Varnish decides cache duration (TTL) in the following order:

• If Cache-Control has s-maxage, that value is used.


• Otherwise, if Cache-Control has max-age, that value is used.
• Otherwise, if Expires is present, that value is used.
• Lastly, Varnish uses a default fall-back value. This is 2 minutes by default, as dictated by the
default_ttl parameter.
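The order above can be sketched as a small shell function. This is a simplification for illustration; real header parsing in Varnish is considerably more involved:

```shell
# Pick a TTL the way Varnish does: s-maxage first, then max-age, then a
# delta derived from Expires, and finally the default_ttl parameter (120s).
pick_ttl() {
    local cc="$1" expires_delta="$2" default_ttl=120
    local s_maxage max_age
    s_maxage=$(printf '%s' "$cc" | grep -o 's-maxage=[0-9]*' | cut -d= -f2)
    max_age=$(printf '%s' "$cc" | grep -o 'max-age=[0-9]*' | cut -d= -f2)
    if [ -n "$s_maxage" ]; then echo "$s_maxage"
    elif [ -n "$max_age" ]; then echo "$max_age"
    elif [ -n "$expires_delta" ]; then echo "$expires_delta"
    else echo "$default_ttl"; fi
}
pick_ttl "s-maxage=3600,max-age=5" ""   # → 3600
```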
Our goal when designing cache policies is to push as much of the logic as possible to the right place. The right place
for setting cache duration is usually in the application, not in Varnish. A good policy is to use s-maxage.

3 Architecture and operation


If you are working with Varnish, you need to know how to get information from it, how to start and stop it
and how to configure it.
This chapter explains the architecture of Varnish, how Varnish deals with logs, best practices for running
Varnish and debugging your Varnish installation.
When you're done reading this chapter, you'll know how to tell what goes into Varnish and what comes out
the other end. You'll have an idea of what it takes to operate Varnish in the long term and some very
basic tuning needs.
What you will not learn in this chapter is what every single Varnish parameter is for. Advanced tuning is a
topic for a later chapter, as it's mostly not relevant for every-day operation. Neither will you see much of
the Varnish Configuration Language (VCL). VCL requires a chapter or two all by itself.
The Varnish developers use numerous three-letter acronyms for the components and concepts that are
covered in this chapter. We will only use them sparingly and where they make sense. Many of them are
ambiguous and some refer to different things depending on context. An effort is made to keep a list of the
relevant acronyms and their meaning. That list can be found at
https://www.varnish-cache.org/trac/wiki/VTLA, with a copy attached in Appendix B.

3.1 Architecture
Varnish architecture is not just of academic interest. Every single tool you use is affected by it, and
understanding the architecture will make it easier to understand how to use Varnish.
Varnish operates using two separate processes: the management process and the child process. The
child is where all the work gets done. A simplified overview:

The management process, which is also the parent process, handles initialization, parsing of VCL,
interactive administration through the CLI interface, and basic monitoring of the child process.
Varnish has two different logging mechanisms. The manager process will typically log to syslog, like you
would expect, but the child logs to a shared memory log instead. This shared memory can be accessed by
Varnish itself and any tool that knows where to find the log and how to parse it.
A shared memory log was chosen over a traditional log file for two reasons. First of all, it is quite fast, and
doesn't eat up disk space. The second reason is that a traditional log file is often limited in information.
Compromises have to be made because it is written to disk and could take up a great deal of space if
everything you might need during a debug session was always included. With a shared memory log,
Varnish can add all the information it has, always. If you are debugging, you can extract everything you
need, but if all you want is statistics, that's all you extract.
The shared memory log, abbreviated shmlog, is a round-robin style log file which is typically about 85MB
in size. It is split in two parts. The smallest bit is the part for counters, used to keep track of any part of
Varnish that can be described by a number, e.g. number of cache hits, number of objects, and so forth.
This part of the shmlog is roughly 2MB by default, but varies depending on setup.
The biggest part of the shmlog is reserved for fifo-style log entries, typically directly related to requests.
This is 80MB by default. Once those 80MB are filled, Varnish will continue writing to the log from the top. If
you wish to preserve any of the data, you need to extract it before it's overwritten. This is where the
various Varnish tools come into the picture.
VCL is not a traditionally parsed configuration format, but a shim layer on top of C and the Varnish run
time library (VRT). You are not so much configuring Varnish with VCL as programming it. Once you've
written your VCL file, Varnish will translate it to C and compile it, then link it directly into the child process.
Both varnishadm and varnish-agent are tools that can influence a running Varnish instance, while
any tool that only works on the shmlog is purely informational and has no direct impact on the running
Varnish instance.

3.2 Design principles of Varnish


Varnish is designed to solve real problems and then largely get out of your way. If the solution to your
problem is to buy more RAM, then Varnish isn't going to try to work around that issue. Varnish relies on
features provided by 64-bit operating systems and multi-core hardware.
Varnish also uses a great deal of assert() statements and other fail safes in the code base. An
assert() statement is a very simple mechanism. assert(x == 0); means "make sure x is 0". If x is
not 0, Varnish will abort. In most cases, that means the entire child process shuts down, only to have the
manager start it back up. You lose all connections, you lose all cache.
Hopefully, you won't run into assert errors. They are there to handle what is believed to be the unthinkable.
A more realistic example can be:

• Create an object, called foo. Set foo.magic to 0x123765.


• Store foo in the cache.
• (time passes)
• Read foo from the cache.
• Assert that foo.magic is still 0x123765.
This is a crude safeguard against memory corruption, and is used for almost all data structures that are kept around for a while in Varnish. An arbitrary magic value is picked during development, and whenever the object is used, that value is read back and checked. If it doesn't match, your memory was corrupted, either by something Varnish did or by the host it's running on.
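The pattern can be loosely mimicked in a few lines of shell. This is only an analogy, not Varnish code: the "object" is a file, the magic is a plain string, and a mismatch aborts much like an assert() would.

```shell
MAGIC="0x123765"
OBJ=$(mktemp)

# "Create an object, called foo. Set foo.magic." Then store it.
printf '%s payload-data\n' "$MAGIC" > "$OBJ"

# (time passes)

# Read foo back and assert that foo.magic is still 0x123765.
read -r magic payload < "$OBJ"
if [ "$magic" != "$MAGIC" ]; then
    echo "assert failed: magic mismatch, memory corrupted?" >&2
    exit 1    # Varnish would abort the whole child process here
fi
echo "object intact: $payload"
```

The important property is the same as in Varnish: the check costs almost nothing when everything is fine, and fails loudly instead of letting corrupt data propagate.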
Assert errors are there to make sure that you don't keep using a corrupt system. The theory is that if something happens that is so bad the code doesn't account for it, it's better to just stop and start up. You might
lose some up-time (usually in the order of a couple of seconds), but at least your Varnish instance is back
up in a predictable state.
Other typical assert statements will check the return codes for library calls that should never fail.
If Varnish does hit an assert error, it will (try to) log it to syslog. In addition to that, it keeps the last panic
message available through varnishadm panic.show:

# varnishadm panic.show
Child has not panicked or panic has been cleared
Command failed with error code 300
#

3.3 The different categories of configuration


Varnish has three categories of configuration settings. Certain things must be configured before Varnish
starts and can't be changed during run-time. These settings are few in number, and are provided on the
command line. Even among command line arguments, several can be changed during run time to some
degree. The working directory to be used and the management interface are among the settings that are
typically provided as command line arguments.
The second category of configuration Varnish uses is the run-time parameters. These can be changed
after Varnish has started, but depending on the nature of the parameter it could take some time before the
change is visible. Parameters can be changed through the CLI, but need to be added as a command line
argument in a startup script to be permanent.
Parameters usually control some purely operational aspect of Varnish, not policy. Default values for
Varnish parameters are frequently tuned between Varnish releases as feedback from real-world use
reaches developers. As such, most parameters can be left at their default values. Some examples of what parameters can modify are the number of threads Varnish can use, the size of the shared memory log, what user to run as, and default timeout values.
Many of the command line arguments passed to varnishd are actually short-hands for their respective
parameters.
The third type of configuration is the Varnish Configuration Language script, usually just referred to as
your VCL or VCL file. This is where you will specify caching policies, what backends you have and how to
pick a backend. VCL can be changed at run-time with little or no penalty to performance, but like
parameters, changes are not retroactive. If your VCL says "cache this for 5 years" and the content is
cached, then changing your VCL to "cache this for 1 minute" isn't going to alter the cache duration for
content that has already been cached.
VCL is easily the most extensive part of Varnish, but you can get a lot done with a few simple techniques.
In this chapter, VCL is not a focus, but is only briefly mentioned and used to avoid building bad habits.
To summarize:
Command line arguments
Stored in startup-scripts. Takes effect on (re)starting Varnish. Some can be modified after startup,
some can not. Often just a short-hand for setting default values for parameters. Examples: "how
much memory should Varnish use", "what port should the management interface use", "what are the
initial values for parameters"
Parameters
Stored in startup-scripts, but can be changed at run-time. Upon re-start, the values from the startup
scripts are used. Changes operational aspects of Varnish, often in great detail. Examples: "how large
should the stack for a thread be", "what are the default values for cache duration", "what is the
maximum amount of headers Varnish supports".
Varnish Configuration Language
Stored in one or more separate VCL files, usually in /etc/varnish/. Can be changed on-the-fly.
Uses a custom-made configuration language to define caching policies. Examples: "Retrieve content
for www.example.com from backend server at prod01.example.net", "Strip Cookie headers for these
requests", "Output an error message for this URL".

3.4 Command line arguments


Command line arguments are rarely entered directly, but usually kept in
/lib/systemd/system/varnish.service or similar startup scripts. Before we look at startup scripts,
we'll look at running varnishd by hand.
Varnish hasn't got the best track record of verifying command line arguments. Just because Varnish starts
with the arguments you provided doesn't mean Varnish actually used them as you expected. Make sure
you double check if you deviate from the standard usage.
-a specifies what port Varnish listens to. Most installations simply use -a :80, but it's worth noting that
you can have Varnish listening on multiple sockets. This is especially useful in Varnish 4.1 where you can
have Varnish listen for regular HTTP traffic on port 80, and SSL-terminated traffic through the PROXY
protocol on 127.0.0.1:1443 (for example). In Varnish 4.0, this is accomplished by having a white-space
separated list of address:port pairs:

varnishd -b localhost:8080 ... -a "0.0.0.0:80 127.0.0.1:81"

In Varnish 4.1, you can supply multiple -a options instead.


Be careful. Varnish 4.0 will still accept multiple -a options, but only the last one will be used.
Another subtle detail worth noting is that the varnishd default value for -a is to listen on port 80. But as we have seen in previous installations, a default Varnish installation listens on port 6081, not port 80.
This is because port 6081 is a convention specified in startup scripts. Here's an example from a default
Debian Jessie installation's /lib/systemd/system/varnish.service:

ExecStart=/usr/sbin/varnishd -a :6081 -T localhost:6082 \


-f /etc/varnish/default.vcl \
-S /etc/varnish/secret \
-s malloc,256m

In addition to telling Varnish where to listen, you need to tell it where to get content. You can achieve this
through the -b <address[:port]> argument, but that is typically limited to testing. In almost all other
cases you will want to specify an -f file option instead. -f file tells Varnish where to find the VCL
file it should use, and that VCL file will have to list any backend servers Varnish should use. When you
use -b, Varnish generates a simple VCL file for you behind the scenes:

# varnishd -b pathfinder.kly.no:6085
# varnishadm vcl.show boot
vcl 4.0;
backend default {
.host = "pathfinder.kly.no:6085";
}

The -T option specifies a listening socket for Varnish's management CLI. Since its introduction, the
convention has been to run the CLI interface on 127.0.0.1:6082, and this is seen in most Varnish
distributions. However, the actual default for the varnishd binary in Version 4 and newer is a random
port and secret file.
The -S argument lets you specify a file which contains a shared secret that management tools can use to
authenticate to Varnish. This is referred to as the secret file and should contain data, typically 256 bytes
randomly generated at installation. The content is never sent over the network, but used to verify clients.
All tools that are to interact with Varnish must be able to read the content of this file.
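Generating such a secret file can be as simple as reading 256 random bytes. The sketch below writes to a temporary path rather than /etc/varnish/secret so it can run unprivileged; a real installation would use the conventional path.

```shell
# Write 256 random bytes to a secret file and restrict its permissions.
# A real installation would target /etc/varnish/secret instead of mktemp.
SECRET=$(mktemp)
dd if=/dev/urandom of="$SECRET" bs=256 count=1 2>/dev/null
chmod 600 "$SECRET"
wc -c < "$SECRET"    # should report 256 bytes
```

The exact size is not important; what matters is that the content is random, kept private, and readable by varnishd and the management tools that need it.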
varnishadm and other tools that use the management port will read the -T and -S argument from the
shmlog if you don't provide them on the command line. As seen here:

# varnishd -b localhost:8080
# netstat -nlpt
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.1:37860 0.0.0.0:* LISTEN 2172/varnishd
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN -
tcp6 0 0 :::80 :::* LISTEN -
tcp6 0 0 ::1:35863 :::* LISTEN 2172/varnishd
# varnishadm status
Child in state running
# varnishadm -T localhost:37860 status
Authentication required
# varnishadm -T localhost:37860 -S /var/lib/varnish/c496eeac1030/_.secret status
Child in state running

varnishadm works with zero arguments, but if you add -T you also have to specify the -S argument.
varnishadm can re-use multiple options from varnishd (-T, -S, -n).
Many Varnish installations default to using -S /etc/varnish/secret. This is a good habit in case you
end up with multiple Varnish instances over multiple machines.

3.5 Cache size and storage backend


The -s argument is used to set how large Varnish's cache will be, and what underlying method is used to
cache. Varnish provides three storage backends, called malloc, file and persistent. The most
used, by far, is malloc. It works by allocating the memory needed with the malloc() library call, and
adds as little logic as possible on top of it. Under the hood, Varnish uses the jemalloc library to achieve
better performance for multi-threaded applications. If you specify a larger cache than you have physical
memory, it is up to your operating system to utilize swap instead.
The second alternative is file. This allocates a file on your file system, then uses mmap() to map it into
memory. Varnish never makes an attempt to commit the content to disk. The file is merely provided in
case your cache is larger than your physical memory. It is not possible to re-use a file previously used with
-s file to regain the cached content you had before a restart or similar event. What is written to the file
is for all practical purposes random.
The last alternative is persistent. This is by far the most complex alternative, and is meant to provide a
persistent storage of cache between restarts. It doesn't make a guarantee that all of the content is there,
though, only that the majority is there and that what's there is intact.
As of Varnish 4.1, both persistent and file are deprecated. Persistent is deprecated because it is
very complex and has not received nearly enough testing and feedback to be regarded as production
quality. It is used by several large Varnish installations, but use at your own risk. For file, the
deprecation is less severe. The malloc alternative is simply better for most use cases, and maintaining
two different methods with similar properties was deemed unnecessary. Unlike persistent, file is
considered quite stable, just sub-optimal.
If you do end up using -s malloc, the next question is usually "how large should the cache be?". There
is no easy answer to this, but as a rule, starting out with 80% of the memory your machine has available is
usually safe. Varnish will use a little more memory than just what you specify for -s malloc, so you need
to anticipate that too. How much more depends on your traffic. Many small objects have a larger
overhead, while large objects have less of an overhead.
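As a back-of-the-envelope sketch, the 80% rule is simple arithmetic. The machine size here is hypothetical, for illustration only:

```shell
# Hypothetical machine with 16GB of RAM; numbers are illustrative only.
MEM_MB=16384
CACHE_MB=$((MEM_MB * 80 / 100))
echo "suggested argument: -s malloc,${CACHE_MB}m"
```

Remember that Varnish's total memory footprint will exceed whatever you pass to -s malloc because of per-object overhead, so this figure is a starting point, not a ceiling.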

3.6 Other useful varnishd arguments


-n dir is used to control the Varnish working directory and name. The directory argument can either just
be a simple name, like -n frontserver, in which case Varnish will use a working directory named
frontserver in its default path, typically /var/lib/varnish/frontserver/. You can also provide a
full path instead. Whenever you alter -n, you need to provide that same -n argument to any Varnish-tool
you want to use. There are two use cases for -n:

1. Running multiple Varnish instances on the same machine. Give each a different -n to make this
work.
2. Run varnishd as a user that doesn't have access to the default working directory. This can be
handy during development or testing to avoid having to start Varnish as the root user.
If you look in the working directory, you can see your shmlog file and the compiled VCL, among other
things:

# varnishd -b localhost:8080
# varnishd -b localhost:8110 -a :81 -n test
# ls /var/lib/varnish/
3da4db675c6b test
# ls /var/lib/varnish/3da4db675c6b/
_.secret _.vsm vcl.QakoKN_T.so
# ls /var/lib/varnish/test/
_.secret _.vsm vcl.Lnayret_.so

A common task is to verify that your VCL is correct before you try loading it. This can be done with the -C
option. It will either give you a syntax error for your VCL or a whole lot of C code, which happens to be
your VCL translated to C. However, that isn't very useful alone. The following script is slightly more useful:

#!/bin/bash
if [ -z "$1" ]; then
echo "Usage: $0 <file>"
exit 1;
fi
FOO=$(varnishd -C -f "$1" 2>&1)
ret=$?
if [ "x$ret" = "x0" ]; then
echo "Syntax OK"
exit 0
fi
echo "$FOO"
exit $ret

Running it:

# ./check_syntax.sh bad.vcl
Message from VCC-compiler:
VCL version declaration missing
Update your VCL to Version 4 syntax, and add
vcl 4.0;
on the first line of the VCL files.
('input' Line 1 Pos 1)
bad
###

Running VCC-compiler failed, exited with 2

VCL compilation failed

# ./check_syntax.sh good.vcl
Syntax OK

Similar scripts are usually part of the "reload" scripts used in various start-up scripts.

3.7 Summary of varnishd arguments


There are more command line arguments than these, and they are all documented in the manual page
varnishd(1). This is a summary of the ones used in this chapter:
-a <listen address>
Listen address. Typically set to :80. Format for specifying multiple listening sockets varies between
Varnish 4.0 and 4.1.
-b <address[:port]>
Specify backend address. Mostly for testing, mutually exclusive with -f (VCL).
-f <vclfile>
Specify path to VCL file to use at startup.
-T address:port
Set management/CLI listening address. Used for controlling Varnish. varnishd default is random,
but 127.0.0.1:6082 is a common value used in default installations.
-S <secret file>
Used to secure the management CLI. Points to a file with random data that both varnishd and
management clients like varnishadm must have access to. Often set to /etc/varnish/secret.
Shouldn't matter where it is as long as varnishadm can read it and the shmlog.
-s <method,options>
Used to control how large the cache can be and the storage engine. Alternatives are
-s persistent,(options), -s file,(options) and -s malloc,(size).
-s malloc,256m (or more) is strongly recommended.
-n <name or path>
Specifies working directory, and/or name of the instance. Only needed if multiple varnishd
instances run on the same machine.
-C -f <vclfile>
Only parse the VCL, then exit. If the VCL file compiles (i.e.: The syntax is correct), it outputs the raw
C code then exits with a return code of 0 (true), otherwise describes the syntax error and exits with a
non-0 status code (false).

3.8 Startup scripts


Varnish Cache development focuses on GNU/Linux and FreeBSD, with some occasional attention
directed towards Solaris.
But the vast majority of Varnish Cache operational focus is on GNU/Linux, more specifically on
Fedora-derived systems, such as Red Hat Enterprise Linux (RHEL), Fedora and CentOS, or on Debian
and Ubuntu. These are the distributions where Varnish packaging is best maintained and they deliver
top-quality Varnish packages.
The startup scripts provided for those distributions are solid, and should be used whenever possible.
Since before GNU/Linux existed, System V-styled init scripts have been used to boot Unix-like machines.
This has been the case for GNU/Linux too, until recently, when upstart and systemd came around.
By now, all the major GNU/Linux distributions use or are preparing to use systemd. That means that if
you have older installations, the specific way Varnish is started will be different than how it's started on
newer installations. In the end, though, it all boils down to one thing: you have to know into which file you
need to add your varnishd start-up arguments, and what commands to use to start and stop it.
Where your distribution keeps its configuration will vary, but in short:

• They all keep VCL and secret files in /etc/varnish by default.


• With systemd, startup arguments are kept in /lib/systemd/system/varnish.service for
both distribution families. That file should be copied to
/etc/systemd/system/varnish.service if you mean to modify it.
• Recent RHEL/Fedora packages use /etc/varnish/varnish.params. A similar strategy is
expected for other distributions too in the future.
• Before systemd, Debian/Ubuntu kept startup arguments in /etc/default/varnish.
• Before systemd, Red Hat Enterprise Linux/CentOS/Fedora kept startup arguments in
/etc/sysconfig/varnish.
For starting and stopping, it's a little simpler:

• If you have systemd, use systemctl <start|stop|reload|restart> varnish.service.


• If you have System V scripts, use service varnish <stop|start|reload|restart>.
To enable or disable starting Varnish at boot, you can use systemctl
<enable|disable> varnish.service on Systemd-systems.

3.9 Parameters
Run-time parameters in Varnish allow you to modify aspects of Varnish that should normally be left alone.
The default values are meant to suit the vast majority of installations, and usually do. However,
parameters exist for a reason.
Varnish 4.0 has 93 parameters, which can be seen using varnishadm on a running Varnish server:

# varnishadm param.show
acceptor_sleep_decay 0.9 (default)
acceptor_sleep_incr 0.001 [s] (default)
acceptor_sleep_max 0.050 [s] (default)
(...)

You can also get detailed information on individual parameters:

# varnishadm param.show default_ttl


default_ttl
Value is: 120.000 [seconds] (default)
Default is: 120.000
Minimum is: 0.000

The TTL assigned to objects if neither the backend nor the VCL
code assigns one.

NB: This parameter is evaluated only when objects are


created. To change it for all objects, restart or ban
everything.

Changing a parameter takes effect immediately, but is not always immediately visible, as the above
description of default_ttl demonstrates. Changing default_ttl will affect any new object entered into the
cache, but not what is already there.
Many of the parameters Varnish exposes are meant for tweaking very intricate parts of Varnish, and even the developers may not know the exact consequences of modifying them. This is usually flagged through a warning, e.g.:

# varnishadm param.show timeout_linger


timeout_linger
Value is: 0.050 [seconds] (default)
Default is: 0.050
Minimum is: 0.000

How long time the workerthread lingers on an idle session


before handing it over to the waiter.
When sessions are reused, as much as half of all reuses happen
within the first 100 msec of the previous request completing.
Setting this too high results in worker threads not doing
anything for their keep, setting it too low just means that
more sessions take a detour around the waiter.

NB: We do not know yet if it is a good idea to change this


parameter, or if the default value is even sensible. Caution
is advised, and feedback is most welcome.

You can change parameters using varnishadm param.set:

# varnishadm param.set default_ttl 15


# varnishadm param.show | grep default_ttl
default_ttl 15.000 [seconds]

However, this is stored exclusively in the memory of the running Varnish instance. If you want to make it permanent, you need to add it to the varnishd command line as a -p argument. E.g.:

# varnishd -b localhost:1111 -p default_ttl=10 -p prefer_ipv6=on

The usual work flow for adjusting parameters is:

1. Start Varnish
2. Modify parameters through varnishadm
3. Test
4. Go back to step 2 if it doesn't work as intended
5. When it works as intended, save the changes to your startup script as -p arguments.
Most parameters can and should be left alone, but reading over the list is a good idea. The relevant
parameters are referenced when we run across the functionality.

3.10 Tools: varnishadm


Controlling a running Varnish instance is accomplished with the varnishadm tool, which talks to the
management process through the CLI interface.
You can run varnishadm in two different modes: interactive, or with the CLI command you wish to issue
as part of the varnishadm command line. The examples have so far used the latter form, e.g.:

# varnishadm status
Child in state running

If you just type varnishadm, you enter the interactive mode:

# varnishadm
200
-----------------------------
Varnish Cache CLI 1.0
-----------------------------
Linux,4.2.0-0.bpo.1-amd64,x86_64,-smalloc,-smalloc,-hcritbit
varnish-4.0.2 revision bfe7cd1

Type 'help' for command list.


Type 'quit' to close CLI session.

varnish> status
200
Child in state running
varnish> quit
500
Closing CLI connection

Both modes are functionally identical. One benefit of using the interactive mode is that you don't have to worry about yet another level of quotation marks once you start dealing with more complex commands than vcl.load and param.list. For now, it's just a matter of style. Another difference is that varnishadm in interactive mode also offers rudimentary command line completion, something your shell might not.
The CLI, and varnishadm by extension, uses HTTP-like status codes. If a command is issued successfully, you will get a 200 in return. These codes are only similar to HTTP status codes, though, and do not match them fully.
When you are using varnishadm, you are communicating with Varnish through the management
process, over a regular TCP connection. It is possible to run varnishadm from a remote host, even if it
is not generally advised. To accomplish this, you must:

• Use a -T option that binds the CLI to an externally-available port. E.g.: Not -T localhost:6082.
• Copy the secret file from the Varnish host to the one you wish to run varnishadm from.
• Make sure all firewalls etc. are open.
• Issue varnishadm with -T and -S.
CLI communication is not encrypted. The authentication is reasonably secure, in that it is not directly
vulnerable to replay attacks (the shared secret is never transmitted), but after authentication, the
connection can be hijacked. Never run varnishadm over an untrusted network. The best practice is to
keep it bound to localhost.
You do not need root-privileges to run varnishadm, the user just needs read-permission to the secret file
and either read permission to the shmlog or knowledge of the -T and -S arguments.

3.11 Tools: varnishstat


varnishstat is the simplest, yet one of the most useful log-related tools. With no arguments,
varnishstat opens an interactive view of Varnish-counters:

varnishstat reads counters from the shmlog and makes sense of them. It can also be accessed in
manners better suited for scripting, either varnishstat -1 (plain text), varnishstat -j (JSON) or
varnishstat -x (XML). The real-time mode collects data over time, to provide you with meaningful
interpretation. Knowing that you have had 11278670 cache hits over the last six and a half days might be
interesting, but knowing that you have 25.96 cache hits per second right now is far more useful. The
same can be achieved through varnishstat -1 and similar by executing the command twice and comparing the values.
Looking at the upper left corner of the screenshot above, you'll see some durations:

This tells you the uptime of the management and child process. Every once in a while, these numbers
might differ. That could happen if you manually issue a stop command followed by a start command
through varnishadm, or if Varnish is hitting a bug and throwing an assert() error.
In the upper right corner, you will see six numbers:

The first line tells you the time frame of the second line. It will start at "1 1 1" and grow to eventually read "10
100 1000". When you start varnishstat, it only has one second of data, but it collects up to a thousand
seconds.
The avg(n) line tells you the cache hit rate during the last n seconds, where n refers to the line above. In this example, we have a cache hit rate of 1.0 (aka: 100%) for the last 10 seconds, 0.9969
(99.69%) for the last 100 seconds and 0.9951 (99.51%) for the last 236 seconds. Getting a high cache hit

rate is almost always good, but it can be a bit tricky. It reports how many client requests were served by
cache hits, but it doesn't say anything about how many backend requests were triggered. If you are using
grace mode, cache hit rate can be 100% while you are issuing requests to the web server.
The main area shows 7 columns:
NAME
This one should be obvious. The name of the counter.
CURRENT
The actual value. This is the only value seen in varnishstat -j and similar.
CHANGE
"Change per second". Or put another way: the difference between the current value and the value read a second earlier. Can be read as "cache hits per second" or "client requests per second".
AVERAGE
Average change of the counter, since start-up. The above example has had 19 client requests per
second on average. It's basically CURRENT divided by MAIN.uptime.
AVERAGE_n
Similar to the cache hit rate, this is the average over the last n seconds. Note that the header says
AVERAGE_1000 immediately, but the actual time period is the same as the Hitrate n: line, so it
depends on how long varnishstat has been running.
An interactive varnishstat does not display all counters by default. It will hide any counter with a value
of 0, in the interest of saving screen real-estate. In addition to hiding counters without a value, each
counter has a verbosity level attached to it. By default, it only displays informational counters.
A few key bindings are worth mentioning:
<UP>/<DOWN>/<Page UP>/<Page Down>
Scroll the list of counters.
<d>
Toggle displaying unseen counters.
<v>
Similar to <d>, but cycles through verbosity levels instead of toggling everything.
<q>
Quit.

3.12 A note on threads


Now that you've been acquainted with parameters and counters, it might be worth looking at threads.
Varnish uses one worker thread per active TCP connection. A typical user can easily set up 5 or more
concurrent TCP sessions, depending on the content and browser. Varnish also organizes worker threads
into thread pools. Each pool of threads is managed by a separate thread, and can grow and shrink on
demand. By default, Varnish uses two thread pools; this can be tuned with the thread_pools
parameter.
Each thread pool starts up with thread_pool_min threads, by default, that is 100 threads. The upper
limit for threads used per thread pool is thread_pool_max, which in turn defaults to 5000. Even when
this limit is reached, Varnish has several layers of queues that will be used. You can see the state of the
session queue in the counter called MAIN.thread_queue_len. You can also observe how many
threads are used by looking at MAIN.threads. Since Varnish also removes threads that are unused,
looking at MAIN.threads_created is also interesting. If you see a high number of threads created,
that means Varnish is spawning new threads frequently, only to remove them later.
Traditionally, thread parameters were some of the few parameters that always made sense to tune. This
is no longer the case. Originally, Varnish shipped with very conservative default values where Varnish
would start with just 10 threads total. Today, it uses 200 by default, with a maximum of 10000. Even 200
can be a bit low, but it's nowhere near as drastic as what the old default of 10 threads was. As such, most
sites will operate very well using default thread parameters today.
It's worth repeating a small detail here: The thread parameters are per thread pool. That means that:

• Setting thread_pools=1 and thread_pool_min=10 gives you a minimum of 10 threads.


• Setting thread_pools=2 and thread_pool_min=100 gives you a minimum of 200 threads.
(this is the default).
• Setting thread_pools=5 and thread_pool_min=10 gives you a minimum of 50 threads.
And so forth. If you search the web, you might also run into pages that suggest setting thread_pools to
the same number as the number of CPU cores you have available. This was believed to be
advantageous, but further testing and experience have demonstrated that the biggest gain is changing it
from 1 thread pool to 2. Any number above 2 doesn't seem to make a significant difference. On the other
hand, a value of 2 is known to work very well.
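The arithmetic behind those bullet points is just multiplication of the two parameters; a quick sanity check with the default values:

```shell
# Default values: two pools of at least 100 threads each.
thread_pools=2
thread_pool_min=100
echo "minimum worker threads: $((thread_pools * thread_pool_min))"
```

The same multiplication applies to thread_pool_max, which is why both parameters are documented as being per pool.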
In addition to worker threads, which make up the bulk of the threads Varnish uses, there are several other
more specialized threads that you rarely have to deal with. That can be the ban lurker thread, expiry
thread or acceptor thread, for example. Looking at a Varnish 4.0 installation on GNU/Linux, you can see
the consequence of this:

# varnishstat -1f MAIN.threads


MAIN.threads 200 . Total number of threads
# pidof varnishd
19 18
# grep Threads /proc/19/status
Threads: 217

The MAIN.threads counter states 200 threads, but investigating the /proc filesystem, you can see
that the worker process is actually using 217 threads. The worker threads are the only ones that we
usually have to worry about, though.
In summary: Threads rarely need tuning in Varnish 4, and the old best practices no longer apply. Varnish
will use one thread per active TCP connection, and scale automatically.

3.13 Tools: varnishlog


Where varnishstat is a simple way to view and work with counters in Varnish, varnishlog is a
simple way to view and work with the rest of the shmlog. With no arguments, it will output all log data in
real time in a semi-ordered manner. However, most Varnish installations have far too much traffic for that to
be useful. You need to be able to filter and group data to be able to use varnishlog productively.
Normally varnishlog will only parse new data. Since the shmlog contains up to 80MB of old data, it's sometimes useful to look at this data too. This can be achieved with the -d argument.
You can also select if you want backend-traffic (-b), client-traffic (-c) or everything. By default, you get
everything. Let's take a look at a single request:

# varnishlog -cd
* << Request >> 2
- Begin req 1 rxreq
- Timestamp Start: 1450446455.943883 0.000000 0.000000
- Timestamp Req: 1450446455.943883 0.000000 0.000000
- ReqStart ::1 59310
- ReqMethod GET
- ReqURL /
- ReqProtocol HTTP/1.1
- ReqHeader Host: localhost
- ReqHeader Connection: keep-alive
- ReqHeader Accept-Encoding: gzip, deflate
- ReqHeader Accept: */*
- ReqHeader User-Agent: HTTPie/0.8.0
- ReqHeader X-Forwarded-For: ::1
- VCL_call RECV
- VCL_return hash
- ReqUnset Accept-Encoding: gzip, deflate
- ReqHeader Accept-Encoding: gzip
- VCL_call HASH
- VCL_return lookup
- Debug "XXXX MISS"
- VCL_call MISS
- VCL_return fetch
- Link bereq 3 fetch
- Timestamp Fetch: 1450446455.945022 0.001139 0.001139
- RespProtocol HTTP/1.1
- RespStatus 200
- RespReason OK
- RespHeader Date: Fri, 18 Dec 2015 13:47:35 GMT
- RespHeader Server: Apache/2.4.10 (Debian)
- RespHeader Last-Modified: Thu, 03 Dec 2015 12:43:12 GMT
- RespHeader ETag: "2b60-525fdbbd7f800-gzip"
- RespHeader Vary: Accept-Encoding
- RespHeader Content-Encoding: gzip
- RespHeader Content-Type: text/html
- RespHeader X-Varnish: 2
- RespHeader Age: 0
- RespHeader Via: 1.1 varnish-v4
- VCL_call DELIVER
- VCL_return deliver
- Timestamp Process: 1450446455.945037 0.001154 0.000015
- Debug "RES_MODE 8"
- RespHeader Transfer-Encoding: chunked
- RespHeader Connection: keep-alive
- RespHeader Accept-Ranges: bytes
- Timestamp Resp: 1450446455.945157 0.001274 0.000119
- Debug "XXX REF 2"
- ReqAcct 130 0 130 356 3092 3448
- End

This is a lot of data, but it represents a single client request. If your Varnish server is slightly busier than
this one, you will have far more log entries.
The very first column is used to help you group requests. The single * tells you that this particular line is
just informing you about the following grouping. << Request >> 2 tells you that the following is
grouped as a request, and the vxid is 2. A vxid is an ID attached to all log records. You will also see it in
the response header X-Varnish.
Next, you see the more typical entries. Each log line starts with a - to indicate that it's related to the
above grouping, using the same vxid. Other grouping methods might have more dashes here to indicate
what happened first and last. The actual grouping is logic done in the varnishlog tool itself, using
information from the shmlog. This is useful, because the shmlog is the result of hundreds, potentially
thousands of threads writing to a log at the same time. Without grouping, tracking a single request would
be very hard.
Each line starts with a vxid followed by a log tag. Each type of tag has a different format, documented in
the vsl(7) manual page. In our example, the first real log line has the tag Begin.
You can tell varnishlog to only output some tags using the -i command line argument:

# varnishlog -d -i ReqURL
* << BeReq >> 3

* << Request >> 2
- ReqURL /

* << Session >> 1

* << BeReq >> 32771

* << Request >> 32770
- ReqURL /demo/

* << Session >> 32769

This also demonstrates why grouping is sometimes unwanted. You can change the grouping method using -g,
or disable it entirely with -g raw:

# varnishlog -d -g raw -i ReqURL


2 ReqURL c /
32770 ReqURL c /demo/

Here you can see the vxid directly, instead of a -.


You can also exclude individual tags with -x, or use a regular expression to match their content using -I.
The latter can be interesting if you want to look at a specific header.
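For instance (a sketch against a running Varnish; flags as documented in varnishlog(1)), you could drop the noisy Debug tag entirely, or only show header lines matching a given name:

```
# Exclude all Debug entries from the output:
varnishlog -d -x Debug

# Only show ReqHeader lines whose content matches "User-Agent":
varnishlog -d -I ReqHeader:User-Agent
```

The -I option takes an optional tag list before the colon, so without the ReqHeader: prefix it would match the regular expression against every tag.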

More important, however, is the -q option, used to specify a VSL query. VSL stands for Varnish
Shared memory Log and refers to the part of the log we are working with, and a VSL query allows you to
filter it intelligently. It is documented in the manual page vsl-query(7).
Let's look at the difference between the default (vxid) grouping and request grouping, while using a
VSL query:

# varnishlog -d -q 'ReqUrl eq "/demo/"'


* << Request >> 32770
- Begin req 32769 rxreq
- Timestamp Start: 1450447223.693214 0.000000 0.000000
- Timestamp Req: 1450447223.693214 0.000000 0.000000
- ReqStart ::1 59320
- ReqMethod GET
- ReqURL /demo/
- ReqProtocol HTTP/1.1
- ReqHeader Host: localhost
- ReqHeader Connection: keep-alive
- ReqHeader Accept-Encoding: gzip, deflate
- ReqHeader Accept: */*
- ReqHeader User-Agent: HTTPie/0.8.0
- ReqHeader X-Forwarded-For: ::1
- VCL_call RECV
- VCL_return hash
- ReqUnset Accept-Encoding: gzip, deflate
- ReqHeader Accept-Encoding: gzip
- VCL_call HASH
- VCL_return lookup
- Debug "XXXX MISS"
- VCL_call MISS
- VCL_return fetch
- Link bereq 32771 fetch
- Timestamp Fetch: 1450447223.693667 0.000454 0.000454
- RespProtocol HTTP/1.1
- RespStatus 404
- RespReason Not Found
- RespHeader Date: Fri, 18 Dec 2015 14:00:23 GMT
- RespHeader Server: Apache/2.4.10 (Debian)
- RespHeader Content-Type: text/html; charset=iso-8859-1
- RespHeader X-Varnish: 32770
- RespHeader Age: 0
- RespHeader Via: 1.1 varnish-v4
- VCL_call DELIVER
- VCL_return deliver
- Timestamp Process: 1450447223.693677 0.000463 0.000010
- RespHeader Content-Length: 280
- Debug "RES_MODE 2"
- RespHeader Connection: keep-alive
- Timestamp Resp: 1450447223.693712 0.000499 0.000036
- Debug "XXX REF 2"
- ReqAcct 135 0 135 232 280 512
- End

With the default grouping, we see just the client request and response. Reading the details, the
Link bereq 32771 fetch line tells us that this request is linked to a different one with vxid 32771.
Also, the VCL_return fetch indicates that (the default) VCL told Varnish to fetch the data.
Using a different grouping mode, you can see the linked backend request as well. Switching to -g request,
the output includes it:

# varnishlog -d -g request -q 'ReqUrl eq "/"'


* << Request >> 2
- Begin req 1 rxreq
- Timestamp Start: 1450446455.943883 0.000000 0.000000
- Timestamp Req: 1450446455.943883 0.000000 0.000000
- ReqStart ::1 59310
- ReqMethod GET
- ReqURL /
- ReqProtocol HTTP/1.1
- ReqHeader Host: localhost
- ReqHeader Connection: keep-alive
- ReqHeader Accept-Encoding: gzip, deflate
- ReqHeader Accept: */*
- ReqHeader User-Agent: HTTPie/0.8.0
- ReqHeader X-Forwarded-For: ::1
- VCL_call RECV
- VCL_return hash
- ReqUnset Accept-Encoding: gzip, deflate
- ReqHeader Accept-Encoding: gzip
- VCL_call HASH
- VCL_return lookup
- Debug "XXXX MISS"
- VCL_call MISS
- VCL_return fetch
- Link bereq 3 fetch
- Timestamp Fetch: 1450446455.945022 0.001139 0.001139
- RespProtocol HTTP/1.1
- RespStatus 200
- RespReason OK
- RespHeader Date: Fri, 18 Dec 2015 13:47:35 GMT
- RespHeader Server: Apache/2.4.10 (Debian)
- RespHeader Last-Modified: Thu, 03 Dec 2015 12:43:12 GMT
- RespHeader ETag: "2b60-525fdbbd7f800-gzip"
- RespHeader Vary: Accept-Encoding
- RespHeader Content-Encoding: gzip
- RespHeader Content-Type: text/html
- RespHeader X-Varnish: 2
- RespHeader Age: 0
- RespHeader Via: 1.1 varnish-v4
- VCL_call DELIVER
- VCL_return deliver
- Timestamp Process: 1450446455.945037 0.001154 0.000015
- Debug "RES_MODE 8"
- RespHeader Transfer-Encoding: chunked
- RespHeader Connection: keep-alive
- RespHeader Accept-Ranges: bytes
- Timestamp Resp: 1450446455.945157 0.001274 0.000119
- Debug "XXX REF 2"
- ReqAcct 130 0 130 356 3092 3448
- End
** << BeReq >> 3
-- Begin bereq 2 fetch
-- Timestamp Start: 1450446455.943931 0.000000 0.000000
-- BereqMethod GET
-- BereqURL /
-- BereqProtocol HTTP/1.1
-- BereqHeader Host: localhost
-- BereqHeader Accept: */*
-- BereqHeader User-Agent: HTTPie/0.8.0
-- BereqHeader X-Forwarded-For: ::1
-- BereqHeader Accept-Encoding: gzip
-- BereqHeader X-Varnish: 3
-- VCL_call BACKEND_FETCH
-- VCL_return fetch
-- BackendOpen 17 default(127.0.0.1,,8080) 127.0.0.1 54806
-- Backend 17 default default(127.0.0.1,,8080)
-- Timestamp Bereq: 1450446455.944036 0.000105 0.000105
-- Timestamp Beresp: 1450446455.944924 0.000993 0.000888
-- BerespProtocol HTTP/1.1
-- BerespStatus 200
-- BerespReason OK
-- BerespHeader Date: Fri, 18 Dec 2015 13:47:35 GMT
-- BerespHeader Server: Apache/2.4.10 (Debian)
-- BerespHeader Last-Modified: Thu, 03 Dec 2015 12:43:12 GMT
-- BerespHeader ETag: "2b60-525fdbbd7f800-gzip"
-- BerespHeader Accept-Ranges: bytes
-- BerespHeader Vary: Accept-Encoding
-- BerespHeader Content-Encoding: gzip
-- BerespHeader Content-Length: 3078
-- BerespHeader Content-Type: text/html
-- TTL RFC 120 -1 -1 1450446456 1450446456 1450446455 0 0
-- VCL_call BACKEND_RESPONSE
-- VCL_return deliver
-- Storage malloc s0
-- ObjProtocol HTTP/1.1
-- ObjStatus 200
-- ObjReason OK
-- ObjHeader Date: Fri, 18 Dec 2015 13:47:35 GMT
-- ObjHeader Server: Apache/2.4.10 (Debian)
-- ObjHeader Last-Modified: Thu, 03 Dec 2015 12:43:12 GMT
-- ObjHeader ETag: "2b60-525fdbbd7f800-gzip"
-- ObjHeader Vary: Accept-Encoding
-- ObjHeader Content-Encoding: gzip
-- ObjHeader Content-Type: text/html
-- Fetch_Body 3 length stream
-- Gzip u F - 3078 11104 80 80 24554
-- BackendReuse 17 default(127.0.0.1,,8080)
-- Timestamp BerespBody: 1450446455.945101 0.001169 0.000177
-- Length 3078
-- BereqAcct 133 0 133 283 3078 3361
-- End

Now you see both the client request and the backend request. The "top" request is the client request. The
backend request starts with ** << BeReq >> 3. The two stars indicate that it's nested one level
deeper than the request above, as do the two leading dashes on its log lines.
Using a VSL query with -g raw will be similar to -i or -I:

# varnishlog -d -g raw -q 'ReqUrl eq "/"'


2 ReqURL c /

Another option for grouping is -g session. This will behave similarly to -g request for many tests, but
it groups by a single HTTP session. In other words: if a client re-uses a connection to issue multiple HTTP
requests, -g request will separate each request, while -g session will group them all together.
To summarize grouping:
-g raw
Disables grouping altogether.
-g vxid
Default grouping mode. Based on Varnish ID numbers, so each client request and backend request is
separate, as is the session data.
-g request
Groups each request together, including backend requests triggered by client requests.
-g session
Group by HTTP (or TCP) session. Will frequently produce huge amounts of data.
VSL queries are used in other tools too, as are many of the options that apply to varnishlog.

3.14 Tools: varnishtop


To quote the manual page:

The varnishtop utility reads varnishd(1) shared memory logs and
presents a continuously updated list of the most commonly occurring
log entries. With suitable filtering using the -I, -i, -X and -x
options, it can be used to display a ranking of requested documents,
clients, user agents, or any other information which is recorded in
the log.

This is the output of varnishtop -i ReqURL:

list length 7 e979e205720e

2.86 ReqURL /?1
0.72 ReqURL /?25556
0.70 ReqURL /?5879
0.70 ReqURL /?12292
0.69 ReqURL /?26317
0.67 ReqURL /?30808
0.50 ReqURL /?12592

The number on the left is a decaying average; then you see the log tag (ReqURL) and the value. This
shows us that /?1 has been requested more frequently than any of the other URLs. Over time, the
number on the left will approach zero if no tag matching that value is seen again.
A few very useful examples:
varnishtop -i BereqURL
See URLs requested from a backend. Want to tweak your cache hit rate? Start at the top of this list.
varnishtop -I ReqHeader:User-Agent
See User-Agent headers from clients.
varnishtop -i ReqURL
Frequently requested URLs.
varnishtop -I ReqHeader:Host
Frequently requested hosts.

3.15 Tools: varnishncsa and varnishhist


If you need or want traditional access logs, varnishncsa is the tool for you. Most distributions provide
startup scripts that will run varnishncsa in the background, in which case all you have to do is enable
them. With systemd, that would be systemctl enable varnishncsa.service.
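varnishncsa also accepts a custom log format with -F. As a sketch (format specifiers per varnishncsa(1); the exact fields chosen here are just an example), you could log the cache handling and time-to-first-byte of each request:

```
# varnishncsa -d -F '%h %r %s %{Varnish:handling}x %{Varnish:time_firstbyte}x'
```

%{Varnish:handling}x expands to hit, miss, pass, pipe or synth, which makes this a quick way to eyeball your hit rate in an access-log style.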
The varnishhist tool can draw a histogram of response time distribution, size distribution or any other
number-based tag. It can make for an interesting demo, but is not particularly useful unless you have very
specific questions that you need answered.

3.16 More on VSL queries


While varnishlog -q 'ReqURL eq "/foo"' is useful, you can also do more advanced searches.
VSL queries are valid for varnishlog and the other log tools, with varying effects.
The string operators eq and neq can be used to evaluate exact matches, or you can use regular
expressions, either regular matching using ~ or negated matching using !~ for comparison:

# varnishncsa -d -q 'ReqURL ~ "/?[0-9]"'

::1 - - [18/Dec/2015:14:23:33 +0000] "GET http://localhost/?12592 HTTP/1.1" 200 3092 "-"
::1 - - [18/Dec/2015:14:23:42 +0000] "GET http://localhost/?30808 HTTP/1.1" 200 3092 "-"
(...)

Another helpful way to use a VSL query is to investigate the details of the Timestamp tag. Quoting
directly from the vsl(7) manual page:

Timestamp - Timing information

Contains timing information for the Varnish worker threads.

Time stamps are issued by Varnish on certain events, and
show the absolute time of the event, the time spent since the
start of the work unit, and the time spent since the last
timestamp was logged. See vsl(7) for information about the
individual timestamps.

The format is:

%s: %f %f %f
| | | |
| | | +- Time since last timestamp
| | +---- Time since start of work unit
| +------- Absolute time of event
+----------- Event label

Looking at this, you can see that a regular expression might not be the most useful tool. However, you
could extract the actual field you want using a [field] syntax:

# varnishlog -d -c -q 'Timestamp[3] >= 1.0'


* << Request >> 16
- Begin req 15 rxreq
- Timestamp Start: 1450454500.617483 0.000000 0.000000
- Timestamp Req: 1450454500.617483 0.000000 0.000000
- ReqStart ::1 60074
- ReqMethod GET
- ReqURL /cgi-bin/foo.sh
- ReqProtocol HTTP/1.1
- ReqHeader Host: localhost
- ReqHeader Connection: keep-alive
- ReqHeader Accept-Encoding: gzip, deflate
- ReqHeader Accept: */*
- ReqHeader User-Agent: HTTPie/0.8.0
- ReqHeader X-Forwarded-For: ::1
- VCL_call RECV
- VCL_return hash
- ReqUnset Accept-Encoding: gzip, deflate
- ReqHeader Accept-Encoding: gzip
- VCL_call HASH
- VCL_return lookup
- Debug "XXXX MISS"
- VCL_call MISS
- VCL_return fetch
- Link bereq 17 fetch
- Timestamp Fetch: 1450454501.623769 1.006286 1.006286
- RespProtocol HTTP/1.1
- RespStatus 200
- RespReason OK
- RespHeader Date: Fri, 18 Dec 2015 16:01:40 GMT
- RespHeader Server: Apache/2.4.10 (Debian)
- RespHeader Content-Type: text/plain
- RespHeader X-Varnish: 16
- RespHeader Age: 0
- RespHeader Via: 1.1 varnish-v4
- VCL_call DELIVER
- VCL_return deliver
- Timestamp Process: 1450454501.623783 1.006300 0.000014
- RespHeader Content-Length: 57
- Debug "RES_MODE 2"
- RespHeader Connection: keep-alive
- RespHeader Accept-Ranges: bytes
- Timestamp Resp: 1450454501.623817 1.006334 0.000034
- Debug "XXX REF 2"
- ReqAcct 144 0 144 224 57 281
- End

The above example extracts the third field of the Timestamp tag and matches if it has a value of 1.0 or
higher. This is very useful if you need to investigate reports of slow requests.
It's worth noting that "1" and "1.0" are not necessarily the same. If you use just "1", you are likely doing an
integer comparison, which means that any digit after the decimal point is ignored. So
Timestamp[3] > 1.0 will match if Timestamp[3] is 1.006334, as seen here, but
Timestamp[3] > 1 will not, because it will be treated as the integer comparison 1 > 1. In short: use
1.0 instead of just 1.
Another nifty way to use VSL queries is to investigate the TTL tag. This log tag is used to report how an
object gets its cache duration:

# varnishlog -g raw -d -i TTL


3 TTL b RFC 120 -1 -1 1450446456 1450446456 1450446455 0 0
32771 TTL b RFC 120 -1 -1 1450447224 1450447224 1450447223 0 0
32774 TTL b RFC 120 -1 -1 1450448614 1450448614 1450448613 0 0

These lines tell us that the objects in question all got a TTL of 120 seconds. Let's modify some
headers on the backend and try again:

# varnishlog -d -q 'TTL[2] > 120'


* << BeReq >> 32790
- Begin bereq 32789 fetch
- Timestamp Start: 1450455456.550332 0.000000 0.000000
- BereqMethod GET
- BereqURL /cgi-bin/foo.sh
- BereqProtocol HTTP/1.1
- BereqHeader Host: localhost
- BereqHeader Accept: */*
- BereqHeader User-Agent: HTTPie/0.8.0
- BereqHeader X-Forwarded-For: ::1
- BereqHeader Accept-Encoding: gzip
- BereqHeader X-Varnish: 32790
- VCL_call BACKEND_FETCH
- VCL_return fetch
- BackendClose 17 default(127.0.0.1,,8080) toolate
- BackendOpen 17 default(127.0.0.1,,8080) 127.0.0.1 55746
- Backend 17 default default(127.0.0.1,,8080)
- Timestamp Bereq: 1450455456.550474 0.000142 0.000142
- Timestamp Beresp: 1450455456.552757 0.002426 0.002283
- BerespProtocol HTTP/1.1
- BerespStatus 200
- BerespReason OK
- BerespHeader Date: Fri, 18 Dec 2015 16:17:36 GMT
- BerespHeader Server: Apache/2.4.10 (Debian)
- BerespHeader Cache-Control: max-age=3600
- BerespHeader Age: 10
- BerespHeader Content-Length: 56
- BerespHeader Content-Type: text/plain
- TTL RFC 3600 -1 -1 1450455457 1450455447 1450455456 0 3600
- VCL_call BACKEND_RESPONSE
- VCL_return deliver
- Storage malloc s0
- ObjProtocol HTTP/1.1
- ObjStatus 200
- ObjReason OK
- ObjHeader Date: Fri, 18 Dec 2015 16:17:36 GMT
- ObjHeader Server: Apache/2.4.10 (Debian)
- ObjHeader Cache-Control: max-age=3600
- ObjHeader Content-Type: text/plain
- Fetch_Body 3 length stream
- BackendReuse 17 default(127.0.0.1,,8080)
- Timestamp BerespBody: 1450455456.552814 0.002482 0.000057
- Length 56
- BereqAcct 151 0 151 172 56 228
- End

You can still see the TTL tag, but now it reads 3600. Unfortunately, there's a mismatch between the
documentation and the implementation in Varnish 4.0 and 4.1. The documentation suggests that the first
number should take Age into account, but as we just demonstrated, that is clearly not happening (if it
were, the first number of the TTL line should have read 3590). However, the other numbers are
correct, so you can infer the Age from them, but not really use it directly in a VSL query.
Combining multiple queries is also possible:

# varnishlog -cdq 'ReqHeader:User-agent ~ "HTTP" and Hit and ReqUrl ~ "demo"'


* << Request >> 65541
- Begin req 65540 rxreq
- Timestamp Start: 1450457044.299308 0.000000 0.000000
- Timestamp Req: 1450457044.299308 0.000000 0.000000
- ReqStart ::1 60290
- ReqMethod GET
- ReqURL /demo
- ReqProtocol HTTP/1.1
- ReqHeader Host: localhost
- ReqHeader Connection: keep-alive
- ReqHeader Accept-Encoding: gzip, deflate
- ReqHeader Accept: */*
- ReqHeader User-Agent: HTTPie/0.8.0
- ReqHeader X-Forwarded-For: ::1
- VCL_call RECV
- VCL_return hash
- ReqUnset Accept-Encoding: gzip, deflate
- ReqHeader Accept-Encoding: gzip
- VCL_call HASH
- VCL_return lookup
- Hit 2147549187
- VCL_call HIT
- VCL_return deliver
- RespProtocol HTTP/1.1
- RespStatus 404
- RespReason Not Found
- RespHeader Date: Fri, 18 Dec 2015 16:44:02 GMT
- RespHeader Server: Apache/2.4.10 (Debian)
- RespHeader Content-Type: text/html; charset=iso-8859-1
- RespHeader X-Varnish: 65541 65539
- RespHeader Age: 1
- RespHeader Via: 1.1 varnish-v4
- VCL_call DELIVER
- VCL_return deliver
- Timestamp Process: 1450457044.299346 0.000038 0.000038
- RespHeader Content-Length: 279
- Debug "RES_MODE 2"
- RespHeader Connection: keep-alive
- Timestamp Resp: 1450457044.299370 0.000062 0.000024
- Debug "XXX REF 2"
- ReqAcct 134 0 134 238 279 517
- End

These examples are mostly meant to get you started and give you an idea of what you can do. The best
reference pages for these tools are the manual pages, particularly vsl-query(7) and vsl(7), even if
they sometimes get out of date.

3.17 Summary
You have seen how to modify Varnish parameters and command line arguments, how to use the various
tools, and you have been introduced to the architecture of Varnish.
Perhaps the most important lesson to pick up from this chapter, however, is that you do not want to alter
Varnish parameters and startup scripts unless you have a very strong reason to do so. With the exception
of how much memory Varnish is to use and the default listening port provided by startup scripts, the
default values are tuned for real web sites and can be used even on quite high-traffic sites.
Of the tools demonstrated here, varnishlog, varnishadm and varnishstat are the real
workhorses. Mastering a few simple VSL queries will make operating on the shared memory log a breeze,
even when your site is serving thousands of requests per second and you need to find that one URL that's
misbehaving.
When you are introduced to the Varnish Configuration Language in the chapters to come, these tools will
be right at the center of your workflow.

4 Introducing VCL

Warning
I expect this chapter to change significantly throughout its creation, and possibly throughout the creation
of the rest of the book.
I advise against reviewing the pedagogical aspects of the chapter until the text is complete (as in: a
summary exists), or until this warning is removed.
That said, the content itself is correct and you are welcome to read it and comment.

The Varnish Configuration Language is a small custom programming language that gives you the
mechanism to hook into Varnish's request handling state engine at various crucial stages.
Mastering VCL is a matter of learning the language itself, understanding what the different states mean
and how you can utilize the tools that are at your disposal.
This chapter is split into several logical parts. It begins with a short introduction and some example code,
goes on to detail the individual VCL states available in both client request handling and backend request
handling, then closes up by tying it all together in more practical terms.
VCL is officially documented in the vcl manual page (man vcl), but you would do well to revisit the
state diagrams provided in appendix A. In addition to the flow chart, each sub-chapter detailing an
individual VCL state starts with a simple table giving you an overview of relevant facts about that state,
and a list of the most typical uses of that state, if there are any.
What you will not find in this chapter is an extensive description of every keyword and operator available.
That is precisely what the manual page is for.
Since VCL leans heavily on regular expressions, there is also a small cheat sheet in appendix D, including
VCL snippets.
Chapter 4.1 Working with VCL Page 77

4.1 Working with VCL


VCL is normally stored in /etc/varnish/. Most startup scripts refer to
/etc/varnish/default.vcl, but you are free to call the file whatever you want, as long as your
startup scripts refer to it.
To use new VCL, you have two choices:

1. Restart Varnish, losing all cache
2. Reload the VCL without restarting Varnish
During development of entirely new VCL, the first option is usually the best. Reloading VCL without
dropping the cache is a benefit in production, but when you are testing your VCL, old (potentially "wrong")
objects can add a level of confusion that is best avoided. An example of this is if you are trying to fix
rewrite rules: you might end up caching content incorrectly due to the rewrite rules, then fix your rules but
still get served the old content cached under the previously wrong VCL.
Reloading VCL is always done through the CLI, but most startup scripts provide shorthands that do the
job for you. You can do it manually using varnishadm:

# varnishadm vcl.list
active 0 boot

# varnishadm vcl.load foo-1 /etc/varnish/default.vcl
VCL compiled.
# varnishadm vcl.list
active 0 boot
available 0 foo-1

# varnishadm vcl.use foo-1
VCL 'foo-1' now active
# varnishadm vcl.list
available 0 boot
active 0 foo-1

This also demonstrates that Varnish operates with multiple loaded VCLs, but only one can be active at a
time. The VCL needs a run-time name, which can be anything. The boot name refers to the VCL Varnish
booted up with initially.
Compiling and loading the VCL is done with vcl.load <name> <file>, and this is where any syntax
errors would be detected. After it is loaded, you need to call vcl.use to make it the active VCL. You can
also switch back to the previous one with vcl.use if you like.
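Following the transcript above, switching back is just another vcl.use referring to the name of the previously active VCL:

```
# varnishadm vcl.use boot
VCL 'boot' now active
```

Since the old VCL is still loaded, no recompilation takes place and the switch is instant.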
A more practical way is to use your startup scripts. E.g:

# systemctl reload varnish

# varnishadm vcl.list
available 0 boot
available 0 foo-1
active 0 4ca9d8e9-25d0-4b52-b4b1-247f038061a6

This example from Debian demonstrates that the startup script will pick a random VCL name, load it and
then issue vcl.use for you.
Over time, VCL files might "pile up" in Varnish, taking up some resources. This is especially true for
backends, where even unused VCL will have active health checks if health checks are defined in the
relevant VCL. You can explicitly discard old VCL with vcl.discard:

# varnishadm vcl.list
available 0 boot
available 0 foo-1
active 0 4ca9d8e9-25d0-4b52-b4b1-247f038061a6

# varnishadm vcl.discard boot

# varnishadm vcl.list
available 0 foo-1
active 0 4ca9d8e9-25d0-4b52-b4b1-247f038061a6

This is not necessary if you restart Varnish instead of reloading.


As of Varnish 4.1.1, Varnish also has a concept of cooldown time, where old VCL will be set in a "cold"
state after a period of time. While "cold", health checks are not active.
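In Varnish 4.1 you can also manage this state manually through the CLI with the vcl.state command; a sketch, assuming a loaded-but-unused VCL named foo-1:

```
# Force the VCL cold right away, stopping its health checks:
varnishadm vcl.state foo-1 cold

# Hand control back to Varnish's automatic cooldown handling:
varnishadm vcl.state foo-1 auto
```

The available states are auto, cold and warm; see the varnish-cli documentation for the details.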

4.2 Hello World


VCL can be split into two parts, the global scope and the request handling. Both of those can again be
further divided.
Backends, access control lists (ACLs), initialization and finalization functions are all defined in a global
scope, but referenced in the request handling.
The request handling is where you will do most of your VCL-work. With Varnish 4.0, that is further divided
into client requests and backend requests. In older versions of Varnish, there was no separation between
backend and client requests, but today they represent two somewhat isolated state machines and are
executed in different threads.
The following is a minimal VCL that defines a backend and sets a custom response header:

vcl 4.0;

backend foo {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_deliver {
    set resp.http.X-hello = "Hello, world";
}

The first line is a VCL version string. Right now, there is only one valid VCL version. Even for Varnish 4.1,
the VCL version is 4.0. This is intended to make transitions to newer versions of Varnish simpler. Every
VCL file starts with vcl 4.0; until a significant change in the VCL language is announced.
Next up, we define a backend server named foo. This is where Varnish will fetch content from. We set
the IP and port of the backend. You can have multiple backends, as long as they have different names. If
you only define a single backend, you don't need to explicitly reference it anywhere, but if you have multiple
backends you need to be explicit about which one to use when. We will deal primarily with simple backends
in this chapter.
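To illustrate being explicit about backend selection, a minimal sketch (the hostname and second port are invented for the example) could set req.backend_hint in vcl_recv:

```
vcl 4.0;

backend foo {
    .host = "127.0.0.1";
    .port = "8080";
}

backend bar {
    .host = "127.0.0.1";
    .port = "8081";
}

sub vcl_recv {
    # With more than one backend defined, pick one per request.
    if (req.http.host == "static.example.com") {
        set req.backend_hint = bar;
    } else {
        set req.backend_hint = foo;
    }
}
```

req.backend_hint is the Varnish 4 variable for selecting a backend; vcl_recv itself is covered in detail later in this chapter.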
Last, but not least, we provide some code for the vcl_deliver state. If you look at the
cache_req_fsm.svg in appendix A, you will find vcl_deliver at the bottom left. It is the last VCL
state executed before the response is delivered back to the client.

The set resp.http.X-hello = "Hello, world"; line demonstrates how you can alter variables.
set <variable> = <value>; is the general syntax here. Each VCL state has access to different
variables. The different variables are split up in families: req, bereq, beresp, resp, obj, client and
server.
In the state diagram (again, see Appendix A), looking closer at the box where vcl_deliver is listed,
you will find resp.* and req.* listed, suggesting that those families of variables are available to us in
vcl_deliver.
In our specific example, resp.http.X-hello refers to the artificial response header X-hello, which
we just invented. You can set any response header you want, but as a general rule (and per RFC), prefixing
custom headers with X- is the safest choice to avoid conflicts with other potential intermediaries that are
out of your control.
Let's see how it looks:

# http -p h localhost
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Connection: keep-alive
Content-Encoding: gzip
Content-Type: text/html
Date: Sat, 06 Feb 2016 22:26:04 GMT
ETag: "2b60-52b20c692a380-gzip"
Last-Modified: Sat, 06 Feb 2016 21:37:34 GMT
Server: Apache/2.4.10 (Debian)
Transfer-Encoding: chunked
Vary: Accept-Encoding
Via: 1.1 varnish-v4
X-Varnish: 2
X-hello: Hello, world

And there you are, a custom VCL header. You can also use unset variable; to remove headers, and
set to overwrite existing ones.

vcl 4.0;

backend foo {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_deliver {
    set resp.http.X-hello = "Hello, world";
    unset resp.http.X-Varnish;
    unset resp.http.Via;
    unset resp.http.Age;
    set resp.http.Server = "Generic Webserver 1.0";
}

The result would be:

# systemctl restart varnish


# http -p h localhost:6081

HTTP/1.1 200 OK
Accept-Ranges: bytes
Connection: keep-alive
Content-Encoding: gzip
Content-Type: text/html
Date: Sun, 07 Feb 2016 12:24:36 GMT
ETag: "2b60-52b20c692a380-gzip"
Last-Modified: Sat, 06 Feb 2016 21:37:34 GMT
Server: Generic Webserver 1.0
Transfer-Encoding: chunked
Vary: Accept-Encoding
X-hello: Hello, world

4.3 Basic language constructs


Grab a rain coat, you are about to get a bucket full of information thrown at you. Many of the concepts in
the following example will be expanded upon greatly.

# Comments start with hash
// Or C++ style //
/* Or
* multi-line C-style comments
* like this.*/

# Remember, always start with "vcl 4.0;". The VCL version. Even in
# Varnish 4.1
vcl 4.0;

# White space is largely optional
backend foo{.host="localhost";.port="80";}

# vcl_recv is another VCL state you can modify. It is the first
# one in the request chain, and we will discuss it in great detail
# shortly.
sub vcl_recv {
    # You can use tilde (~) to do regular expression matching
    # on text strings, or various other "logical" matching on
    # things such as IP addresses
    if (req.url ~ "^/foo") {
        set req.http.x-test = "foo";
    } elsif (req.url ~ "^/bar") {
        set req.http.x-test = "bar";
    }
}

# You can define the same VCL function as many times as you want.
# Varnish will concatenate them together into one big function.
sub vcl_recv {
    # Use regsub() to do regular expression substitution.
    # regsub() returns a string and takes the format of
    # regsub(<input>,<expression>,<substitution>)
    set req.url = regsub(req.url, "cat","dog");

    # The input of regsub() doesn't have to match where you
    # are storing it, even if it is the most common form.
    set req.http.x-base-url = regsub(req.url, "\?.*$","");

    # Be warned: regsub() only does a single substitution. If
    # you want to substitute all occurrences of the pattern, you
    # need to use regsuball() instead. So regsuball() is
    # equivalent to the "/g" option you might have seen in
    # other languages.
    set req.http.X-foo = regsuball(req.url,"foo","bar");
}

# You can define your own subroutines, but they can't start with
# vcl_. Varnish reserves all VCL function names that start with
# vcl_ for itself.

sub check_request_method {
    # Custom subroutines can be accessed anywhere, as long as
    # the variables and return methods used are valid where the
    # subroutine is called.
    if (req.method == "POST" || req.method == "PUT") {
        # The "return" statement is a terminating statement
        # and serves to exit the VCL processing entirely,
        # until the next state is reached.
        #
        # Different VCL states have different return
        # statements available to them. A return statement
        # tells Varnish what to do next.
        #
        # In this specific example, return (pass); tells
        # Varnish to bypass the cache for this request.
        return (pass);
    }
}

sub vcl_recv {
# Calling the custom-sub is simple.
# There are no arguments or return values, because under
# the hood, "call" just copies the VCL into where the call
# was made. It is not a true function call.
call check_request_method;

# As a consequence, you can not write recursive custom


# functions.

# You can use == to check for exact matches. Both for


# strings and numbers. Varnish either does the right thing
# or throws a syntax error at you.
if (req.method == "POST") {
# This will never execute. The 'check_request_method'
# already checked the request method and if it was
# POST, it would have issued "return(pass);"
# already, thereby terminating the VCL state and
# never reaching this code.
set req.http.x-post = "yes";
}

# The Host header contains the verbatim Host header, as
# supplied by the client. Sometimes, that includes a port
# number, but typically only if it is user-visible (e.g.:
# the user entered http://www.example.com:8080/)
if (req.http.host == "www.example.com" && req.url == "/login") {
# return (pass) is another return statement. It
# instructs Varnish to by-pass the cache for this
# request.
return (pass);
}
}

# Last but not least: You do not have to specify all VCL functions.
# Varnish provides a built-in which is always appended to your own
# VCL, and it is designed to be sensible and safe.

Note
All VCL code examples are tested for syntax errors against Varnish 4.1.1, and are provided in
complete form, with the only exception being that smaller examples will leave out the backend
and vcl 4.0; lines to preserve brevity.
Chapter 4.4 More on return-statements Page 85

4.4 More on return-statements


A central mechanism of VCL is the return-statement, sometimes referred to as a terminating statement. It
is important to understand just what this means.
All states end with a return-statement. If you do not provide one, VCL execution will "fall through" to the
built-in VCL, which always provides a return-statement.
Similarly, if you provide multiple definitions of vcl_recv or some other function, they will all be glued
together as a single block of code. Any call foo; statement will be inlined (copied into the code). In
other words, the following two examples produce the same C code:
With custom function:

sub clean_host_header {
# Strip leading www in host header to avoid caching the same
# content twice if it is accessed both with and without a
# leading www.
set req.http.Host = regsub(req.http.Host, "^www\.","");
}

sub vcl_recv {
call clean_host_header;
}

Without:

sub vcl_recv {
set req.http.Host = regsub(req.http.Host, "^www\.","");
}

Which form you choose is a matter of style. However, it is usually helpful to split logical bits of code into
separate custom functions. This lets you split cleaning of Host header into a single block of code that
doesn't get mixed with device detection (for example).
But because the custom functions are in-lined, a return (pass); issued in a custom-function would
mean that the custom function never returned - that VCL state was terminated and Varnish would move
on to the next phase of request handling.
Each state has different return statements available. You can see these in the request flow chart, at the
bottom of each box.

4.5 Built-in VCL


Varnish works out of the box with no VCL, as long as a backend is provided. This is because Varnish
provides built-in VCL, sometimes confusingly referred to as the default VCL for historic reasons.
This VCL can never be removed or overwritten, but it can be bypassed. You can find it in
/usr/share/doc/varnish/builtin.vcl or similar for your distribution. It is included in Appendix C
for your convenience.
The built-in VCL is designed to make Varnish behave safely on any site. It is a good habit to let it execute
whenever possible. Chapter 1 already demonstrated how you can influence the cache with no VCL at all,
and it should be a goal to provide as simple VCL as possible.
Each of the built-in VCL functions will be covered individually when we are dealing with the individual
states.

4.6 Client requests


With Varnish 4.0, VCL was split into two different state engines, so to speak. The client-side processing
and the backend-processing are isolated, and can in fact take place in parallel in the case of a background
fetch.
There are a number of states in each of these code paths, some more critical than others. Before we can
begin looking at the more complex ways to utilize VCL, we will go through each function, starting with the
client side.

4.6.1 vcl_recv

vcl_recv
Context Client request
Variables req, req_top, client, server, local, remote, storage, now
Return statements purge, hash, pass, pipe, synth
Next state vcl_hash, vcl_pass, vcl_pipe, vcl_synth
Typical use
• Request validation
• Request normalization
• Cookie normalization/cleanup
• URL rewrites
• Backend selection
• Purging
• Request classification (Mobile, IP, etc)
• Request-based cache policies
The first VCL function that is run after a request is received is called vcl_recv. The only processing
Varnish has done at this point is parse the request into manageable structures.
As the extensive list of typical use cases suggests, it is one of the most versatile VCL functions available.
Almost every Varnish server has a good chunk of logic and policy in vcl_recv.
Let's go through the built-in vcl_recv function:

sub vcl_recv {
if (req.method == "PRI") {
/* We do not support SPDY or HTTP/2.0 */
return (synth(405));
}
if (req.method != "GET" &&
req.method != "HEAD" &&
req.method != "PUT" &&
req.method != "POST" &&
req.method != "TRACE" &&
req.method != "OPTIONS" &&
req.method != "DELETE") {
/* Non-RFC2616 or CONNECT which is weird. */
return (pipe);
}
if (req.method != "GET" && req.method != "HEAD") {
/* We only deal with GET and HEAD by default */
return (pass);
}
if (req.http.Authorization || req.http.Cookie) {
/* Not cacheable by default */
return (pass);
}
return (hash);
}

The built-in VCL is meant to provide a safe, standards-compliant cache that works with most sites.
However, what it is not meant to do is provide a perfect cache hit rate.
Walking through the list from the top, it starts out by checking if the request method is PRI, which is a
request method used by SPDY and HTTP/2.0. This is currently unsupported, so Varnish
terminates the VCL state with a synth(405).
This will cause Varnish to synthesize an error message with a pre-set status code of 405. If you leave out
the status message (e.g. "File Not Found" or "Internal Server Error"), Varnish will pick the standard
response message matching that status code.
You can provide your own error message and even change the status code later if you decide to add a
vcl_synth function.
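As a sketch, you can also set the status and reason directly from vcl_recv. The /admin path and the reason text here are purely hypothetical:

```vcl
sub vcl_recv {
    # Hypothetical: block an admin path with a custom
    # status code and reason instead of the default text.
    if (req.url ~ "^/admin") {
        return (synth(403, "Not for you"));
    }
}
```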
Next, Varnish checks if the request method is one of the valid RFC 2616 request methods (with the
exception of CONNECT). If it is not, then Varnish issues return (pipe);, which causes Varnish to enter
"pipe mode".
In pipe mode, Varnish connects the client directly to the backend and stops interpreting the data stream at
all. This is best suited for situations where you need to do something Varnish doesn't support, and should
be a last resort. If you do issue a pipe return, then you should probably also have
set req.http.Connection = "close";. This will tell your origin server to close the connection after
a single request. If you do not, then the client will be free to issue other, potentially cacheable, requests
without Varnish being any the wiser.
In short: If in doubt, don't use pipe.
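If you do need pipe, a minimal sketch following the advice above could look like this. The request method is just an example of something Varnish does not handle natively:

```vcl
sub vcl_recv {
    if (req.method == "MKCALENDAR") {
        # Ask the origin to close the connection after this
        # request, so later requests on the same client
        # connection go through Varnish normally again.
        set req.http.Connection = "close";
        return (pipe);
    }
}
```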
Next, Varnish checks if the request method is GET or HEAD. If it is not, then Varnish issues
return (pass);. This is the best method of disabling cache based on client input. Unlike in pipe mode,
Varnish still parses the request and potentially buffers it if you use pass. In fact, it goes through all the
normal VCL states as any other request, allowing you to do things like retry the request if the backend
failed.
Next comes the biggest challenge with the built-in VCL. If the request has an Authorization
header, indicating HTTP Basic Authentication, or if the request has a Cookie header, the request is
passed (not cached). Since almost all web sites today will have clients sending cookies, this is one of the
most important jobs a VCL author has.
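A common first step, sketched here under the assumption that static assets on your site never depend on cookies, is to remove the Cookie header where it is known to be irrelevant, letting the built-in VCL cache those requests:

```vcl
sub vcl_recv {
    # Assumption: these file types never vary on cookies.
    if (req.url ~ "\.(png|jpg|gif|css|js)$") {
        unset req.http.Cookie;
    }
    # No return statement: execution falls through to the
    # built-in vcl_recv, which will now return (hash) for
    # these requests.
}
```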
At the end, if none of the other return statements have been issued, Varnish issues a return (hash);.
This tells Varnish to create a cache hash and look it up in the cache. Exactly how that cache hash is
constructed is defined in vcl_hash.
To summarize the built in VCL:

• Reject SPDY / HTTP/2.0 requests
• Pipe unknown (possibly unsafe) request methods directly to the backend
• By-pass cache for anything except GET and HEAD requests
• By-pass cache for requests with Authorization or Cookie headers.

And the valid return statements are:

• return (synth()); to generate a response from Varnish. E.g: error messages and more.
• return (pipe); to connect the client directly to the backend. Avoid if possible.
• return (pass); to bypass the cache, but otherwise process the request as normal.
• return (hash); to get ready to check the cache for content.
• return (purge); to invalidate matching content in the cache (covered in greater detail later).

4.6.2 vcl_recv - Massaging a request


A typical thing to do in vcl_recv is to handle URL rewrites, and to normalize a request. For example,
your site might be available on both www.example.com and example.com. Varnish has no way of
knowing that these host names are the same so without intervention, they would take up two separate
namespaces in your cache: you would cache the content twice.
Similarly, you might offer sports news under both http://example.com/sports/ and
http://sports.example.com/. Same problem.
The best solution to this problem is to do internal rewriting in Varnish, changing one of them to the other.
This is quite easy in VCL.

sub normalize_sports {
if (req.http.host == "sports.example.com") {
set req.http.host = "example.com";
set req.url = "/sports" + req.url;
}
}

sub strip_www {
set req.http.host = regsub(req.http.host,"^www\.","");
}

sub vcl_recv {
call normalize_sports;
call strip_www;
}

Notice how the above VCL split the logically separate problems into two different subroutines. We could
just as easily have placed them both directly in vcl_recv, but the above form will yield a VCL file that is
easier to read and organize over time.
In normalize_sports we do an exact string comparison between the client-provided Host header
and sports.example.com. In HTTP, the name of the header is case insensitive, so it doesn't matter if
you type req.http.host, req.http.Host or req.http.HoST. Varnish will figure it out.
If the Host header does match the sports-domain, we change the Host header to the primary domain
name, example.com, and then set the url to be the same as it was, but with "/sports" prefixed. Note how
the example uses "/sports", not "/sports/". That is because req.url always starts with a /.
The second function, strip_www, uses the regsub() function to do a regular expression substitution.
The result of that substitution is stored back onto the Host header.
regsub() takes three arguments: the input, the regular expression, and the replacement. If you are
unfamiliar with regular expressions, there's a brief introduction and cheat sheet in appendix D.

Note how we do not check if the Host header contains www. before we issue the regsub(). That is
because the cost of checking and the cost of substituting are essentially the same, so there would be no gain.
Testing your work as you go is crucial. You have many alternatives to test this. I have modified the
foo.sh CGI script to output HTTP headers, so I can see what the backend sees. Here's an example:

# http localhost/cgi-bin/foo.sh "Host: example.com"


HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Cache-Control: max-age=10
Connection: keep-alive
Content-Encoding: gzip
Content-Type: text/plain
Date: Tue, 09 Feb 2016 21:19:41 GMT
Server: Apache/2.4.10 (Debian)
Transfer-Encoding: chunked
Vary: Accept-Encoding
Via: 1.1 varnish-v4
X-Varnish: 2

Hello. Random number: 13449


Tue Feb 9 21:19:41 UTC 2016
HTTP_ACCEPT='*/*'
HTTP_ACCEPT_ENCODING=gzip
HTTP_HOST=example.com
HTTP_USER_AGENT=HTTPie/0.8.0
HTTP_X_FORWARDED_FOR=::1
HTTP_X_VARNISH=3

# http localhost/cgi-bin/foo.sh "Host: www.example.com"


HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 3
Cache-Control: max-age=10
Connection: keep-alive
Content-Encoding: gzip
Content-Length: 175
Content-Type: text/plain
Date: Tue, 09 Feb 2016 21:19:41 GMT
Server: Apache/2.4.10 (Debian)
Vary: Accept-Encoding
Via: 1.1 varnish-v4
X-Varnish: 32770 3

Hello. Random number: 13449


Tue Feb 9 21:19:41 UTC 2016
HTTP_ACCEPT='*/*'
HTTP_ACCEPT_ENCODING=gzip
HTTP_HOST=example.com
HTTP_USER_AGENT=HTTPie/0.8.0
HTTP_X_FORWARDED_FOR=::1
HTTP_X_VARNISH=3

# http localhost/cgi-bin/foo.sh "Host: example.com"



HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 6
Cache-Control: max-age=10
Connection: keep-alive
Content-Encoding: gzip
Content-Length: 175
Content-Type: text/plain
Date: Tue, 09 Feb 2016 21:19:41 GMT
Server: Apache/2.4.10 (Debian)
Vary: Accept-Encoding
Via: 1.1 varnish-v4
X-Varnish: 32772 3

Hello. Random number: 13449


Tue Feb 9 21:19:41 UTC 2016
HTTP_ACCEPT='*/*'
HTTP_ACCEPT_ENCODING=gzip
HTTP_HOST=example.com
HTTP_USER_AGENT=HTTPie/0.8.0
HTTP_X_FORWARDED_FOR=::1
HTTP_X_VARNISH=3

The test issues three requests. The first is a cache miss for example.com, the second is a cache hit
for www.example.com. Looking at the content, we can easily see that it's the same. Our rewrite apparently
worked!
The third request is again for example.com and is also a cache hit. This is included so you can
look closer at what happens to the X-Varnish header.
In the cache miss, it had a value of "2", however, the backend reports that HTTP_X_VARNISH=3. The
second request gets a X-Varnish response of X-Varnish: 32770 3. The first number is the xid of
the request being processed, while the second is the xid of the backend request that generated the
content. You can verify that the two last requests give the same content by looking at that header instead
of the content.
We can also see this in varnishlog. Since we already covered varnishlog in detail, we aren't going
to repeat that, except as it pertains to VCL. This is from the above requests:

* << Request >> 32770


- Begin req 32769 rxreq
- Timestamp Start: 1455052784.964533 0.000000 0.000000
- Timestamp Req: 1455052784.964533 0.000000 0.000000
- ReqStart ::1 46964
- ReqMethod GET
- ReqURL /cgi-bin/foo.sh
- ReqProtocol HTTP/1.1
- ReqHeader Connection: keep-alive
- ReqHeader Host: www.example.com
- ReqHeader Accept-Encoding: gzip, deflate
- ReqHeader Accept: */*
- ReqHeader User-Agent: HTTPie/0.8.0
- ReqHeader X-Forwarded-For: ::1
- VCL_call RECV
- ReqUnset Host: www.example.com
- ReqHeader host: example.com
- VCL_return hash
- ReqUnset Accept-Encoding: gzip, deflate
- ReqHeader Accept-Encoding: gzip
- VCL_call HASH
- VCL_return lookup
- Hit 2147483651
- VCL_call HIT
- VCL_return deliver
- RespProtocol HTTP/1.1
- RespStatus 200
- RespReason OK
- RespHeader Date: Tue, 09 Feb 2016 21:19:41 GMT
- RespHeader Server: Apache/2.4.10 (Debian)
- RespHeader Cache-Control: max-age=10
- RespHeader Vary: Accept-Encoding
- RespHeader Content-Encoding: gzip
- RespHeader Content-Type: text/plain
- RespHeader X-Varnish: 32770 3
- RespHeader Age: 3
- RespHeader Via: 1.1 varnish-v4
- VCL_call DELIVER
- VCL_return deliver
- Timestamp Process: 1455052784.964572 0.000039 0.000039
- RespHeader Content-Length: 175
- Debug "RES_MODE 2"
- RespHeader Connection: keep-alive
- RespHeader Accept-Ranges: bytes
- Timestamp Resp: 1455052784.964609 0.000076 0.000037
- Debug "XXX REF 2"
- ReqAcct 151 0 151 304 175 479
- End

What you want to take special notice of is this bit:

- VCL_call RECV
- ReqUnset Host: www.example.com
- ReqHeader host: example.com
- VCL_return hash

This tells you that the RECV function in VCL was called, or vcl_recv if you'd like, then it tells you that
the Host header was first unset, then set again with a changed value, and last, it reveals the return
statement from vcl_recv: hash.
Testing the other rewrite is also pretty easy:

# http localhost/cgi-bin/foo.sh "Host: sports.example.com"


HTTP/1.1 404 Not Found
Age: 0
Connection: keep-alive
Content-Length: 298
Content-Type: text/html; charset=iso-8859-1
Date: Tue, 09 Feb 2016 21:32:03 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 2

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">


<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL /sports/cgi-bin/foo.sh was not found on this server.</p>
<hr>
<address>Apache/2.4.10 (Debian) Server at example.com Port 8080</address>
</body></html>

# varnishlog -d -g session -q 'ReqHeader:Host ~ "sports.example.com"'


* << Session >> 1
- Begin sess 0 HTTP/1
- SessOpen ::1 46980 :80 ::1 80 1455053523.467424 12
- Link req 2 rxreq
- SessClose REM_CLOSE 0.008
- End
** << Request >> 2
-- Begin req 1 rxreq
-- Timestamp Start: 1455053523.467464 0.000000 0.000000
-- Timestamp Req: 1455053523.467464 0.000000 0.000000
-- ReqStart ::1 46980
-- ReqMethod GET
-- ReqURL /cgi-bin/foo.sh
-- ReqProtocol HTTP/1.1
-- ReqHeader Connection: keep-alive
-- ReqHeader Host: sports.example.com
-- ReqHeader Accept-Encoding: gzip, deflate
-- ReqHeader Accept: */*
-- ReqHeader User-Agent: HTTPie/0.8.0
-- ReqHeader X-Forwarded-For: ::1
-- VCL_call RECV
-- ReqUnset Host: sports.example.com
-- ReqHeader host: example.com
-- ReqURL /sports/cgi-bin/foo.sh
-- ReqUnset host: example.com
-- ReqHeader host: example.com
-- VCL_return hash
-- ReqUnset Accept-Encoding: gzip, deflate
-- ReqHeader Accept-Encoding: gzip
-- VCL_call HASH
-- VCL_return lookup
-- Debug "XXXX MISS"
-- VCL_call MISS
-- VCL_return fetch
-- Link bereq 3 fetch
-- Timestamp Fetch: 1455053523.467898 0.000435 0.000435
-- RespProtocol HTTP/1.1
-- RespStatus 404
-- RespReason Not Found
-- RespHeader Date: Tue, 09 Feb 2016 21:32:03 GMT
-- RespHeader Server: Apache/2.4.10 (Debian)
-- RespHeader Content-Type: text/html; charset=iso-8859-1
-- RespHeader X-Varnish: 2
-- RespHeader Age: 0
-- RespHeader Via: 1.1 varnish-v4
-- VCL_call DELIVER
-- VCL_return deliver
-- Timestamp Process: 1455053523.467942 0.000478 0.000043
-- RespHeader Content-Length: 298
-- Debug "RES_MODE 2"
-- RespHeader Connection: keep-alive
-- Timestamp Resp: 1455053523.467967 0.000503 0.000025
-- Debug "XXX REF 2"
-- ReqAcct 154 0 154 228 298 526
-- End
*** << BeReq >> 3
--- Begin bereq 2 fetch
--- Timestamp Start: 1455053523.467534 0.000000 0.000000
--- BereqMethod GET
--- BereqURL /sports/cgi-bin/foo.sh
--- BereqProtocol HTTP/1.1
--- BereqHeader Accept: */*
--- BereqHeader User-Agent: HTTPie/0.8.0
--- BereqHeader X-Forwarded-For: ::1
--- BereqHeader host: example.com
--- BereqHeader Accept-Encoding: gzip
--- BereqHeader X-Varnish: 3
--- VCL_call BACKEND_FETCH
--- VCL_return fetch
--- BackendOpen 17 default(127.0.0.1,,8080) 127.0.0.1 46558
--- Backend 17 default default(127.0.0.1,,8080)
--- Timestamp Bereq: 1455053523.467666 0.000133 0.000133
--- Timestamp Beresp: 1455053523.467808 0.000274 0.000142
--- BerespProtocol HTTP/1.1
--- BerespStatus 404
--- BerespReason Not Found
--- BerespHeader Date: Tue, 09 Feb 2016 21:32:03 GMT
--- BerespHeader Server: Apache/2.4.10 (Debian)
--- BerespHeader Content-Length: 298
--- BerespHeader Content-Type: text/html; charset=iso-8859-1
--- TTL RFC 120 -1 -1 1455053523 1455053523 1455053523 0 0
--- VCL_call BACKEND_RESPONSE
--- VCL_return deliver
--- Storage malloc s0
--- ObjProtocol HTTP/1.1
--- ObjStatus 404
--- ObjReason Not Found
--- ObjHeader Date: Tue, 09 Feb 2016 21:32:03 GMT
--- ObjHeader Server: Apache/2.4.10 (Debian)
--- ObjHeader Content-Type: text/html; charset=iso-8859-1
--- Fetch_Body 3 length stream
--- BackendReuse 17 default(127.0.0.1,,8080)
--- Timestamp BerespBody: 1455053523.467896 0.000362 0.000087
--- Length 298
--- BereqAcct 156 0 156 161 298 459
--- End

The backend more or less confirmed it for us, since the 404 message contains the rewritten URL, but it is a
good idea to get used to varnishlog.
Future examples will not include quite as verbose testing transcripts, though.

4.6.3 vcl_pipe

vcl_pipe
Context Client request
Variables req, bereq, req_top, client, server, local, remote, storage, now
Return statements pipe, synth
Next state vcl_synth, delivery
Typical use

In pipe mode, Varnish opens a connection to the backend and starts moving data between the client and
backend without any interference. It is used as a last resort if what you need to do isn't supported by
Varnish. Once in pipe mode, the client can send unfiltered data to the server and get replies without
Varnish interpreting them - for better or worse.
In HTTP 1.1, keep-alive is the default connection mode. This means a client can send multiple requests
serialized over the same TCP connection. For pipe mode, Varnish asks the server to disable this by
adding Connection: close to the backend request before entering vcl_pipe. If it didn't, then subsequent
requests after the piped request would also bypass the cache completely.
You can override this in vcl_pipe if you really want to, but there isn't any good reason to do so that the
author is aware of. The built-in VCL for vcl_pipe is empty, save for a comment and the return statement:

sub vcl_pipe {
# By default Connection: close is set on all piped requests, to stop
# connection reuse from sending future requests directly to the
# (potentially) wrong backend. If you do want this to happen, you can undo
# it here.
# unset bereq.http.connection;
return (pipe);
}

4.6.4 vcl_hash

vcl_hash
Context Client request
Variables req, req_top, client, server, local, remote, storage, now
Return statements lookup
Next state vcl_hit, vcl_miss, vcl_pass, vcl_purge
Typical use
• Adding the Cookie to the hash

If you return hash or purge in vcl_recv, Varnish will immediately execute the vcl_hash function. It
has a very simple purpose: defining what identifies a unique object in the cache. You add items to the cache
hash, and as long as two requests add up to the same hash, they are treated as the same object.
The built-in VCL shows us what it's all about:

sub vcl_hash {
hash_data(req.url);
if (req.http.host) {
hash_data(req.http.host);
} else {
hash_data(server.ip);
}
return (lookup);
}

The hash_data() keyword is used to add items to the hash. The built-in VCL is simple enough. It adds the
URL and either the server IP or the Host header.
In other words: If the URL and the Host header are the same, the object is the same.
It is rare that you need to add extra logic to vcl_hash. The most common use case is when you want to
cache content generated based on cookies.
The only valid return statement in vcl_hash is return (lookup);, telling Varnish that it's time to look
the hash up in cache to see if it's a cache hit or not.
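As a sketch of that cookie use case, assume a hypothetical X-Language header (or a value you extracted from a cookie earlier) that the content varies on:

```vcl
sub vcl_hash {
    # Hypothetical: cached objects differ per language.
    if (req.http.X-Language) {
        hash_data(req.http.X-Language);
    }
    # No return (lookup); here: execution falls through to
    # the built-in vcl_hash, which adds the URL and Host
    # header and then returns lookup.
}
```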

4.6.5 vcl_hit

vcl_hit
Context Client request
Variables req, req_top, client, server, local, remote, storage, now, obj
Return statements synth, restart, pass, deliver, miss, fetch (deprecated, use miss
instead)
Next state vcl_deliver, vcl_miss, vcl_synth
Typical use
• Overriding grace mode
After Varnish looks up the content in cache, one out of three things can happen:

• Varnish finds the content in cache. This is a cache hit and vcl_hit is run.
• Varnish does not find the content in cache. This is a cache miss and vcl_miss is run.
• Varnish finds a special hit-for-pass object in the cache. This is the result of a previous decision not to
cache responses for that hash. vcl_pass is run and content is fetched from the backend.
It is rare that you need to modify these VCL states, but it happens. The built-in VCL for vcl_hit is a bit
strange.

sub vcl_hit {
if (obj.ttl >= 0s) {
// A pure unadultered hit, deliver it
return (deliver);
}
if (obj.ttl + obj.grace > 0s) {
// Object is in grace, deliver it
// Automatically triggers a background fetch
return (deliver);
}
// fetch & deliver once we get the result
return (miss);
}

This VCL is all about grace mode. Once an object is inserted into the cache, it has a Time To Live, a TTL.
This is the regular cache duration. On top of TTL, there is the grace period. This is an extended period of
time in which the object is kept in cache. During grace mode, the object can be delivered to clients, but a
request to the backend will be initiated in the background to update the content.
In addition to grace mode, Varnish also supports conditional requests to the backend. If Varnish
has an old object in cache with either an ETag or Last-Modified header, Varnish can issue a conditional
GET request, potentially saving bandwidth. This happens automatically in grace mode.
The total duration Varnish keeps an object is:

• TTL - regular cache duration
• + grace - Grace period
• + keep - Extra period for conditional GET requests
This is why, in vcl_hit, there is still a chance to return a miss. This typically happens if the object found
is outside the TTL and outside the grace period, but it's still within the keep period.
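One common way to override grace here, sketched under the assumption that you have the std vmod available and a backend with a health probe configured, is to only use the full grace period when the backend is sick:

```vcl
import std;

sub vcl_hit {
    if (obj.ttl >= 0s) {
        // A normal hit, deliver it.
        return (deliver);
    }
    if (std.healthy(req.backend_hint)) {
        // Backend is healthy: only accept objects up to
        // 10 seconds past their TTL.
        if (obj.ttl + 10s > 0s) {
            return (deliver);
        }
        return (miss);
    }
    // Backend is sick: fall back to the full grace period.
    if (obj.ttl + obj.grace > 0s) {
        return (deliver);
    }
    return (miss);
}
```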

4.6.6 vcl_miss

vcl_miss
Context Client request
Variables remote, req, req_top, server, client, local
Return statements synth, restart, pass, fetch
Next state vcl_deliver, vcl_pass, vcl_synth
Typical use

The built-in vcl_miss again demonstrates its simplicity.

sub vcl_miss {
return (fetch);
}

The content was not found in cache. Go fetch it from the backend.
The next VCL state from the perspective of the client request is vcl_deliver, but after vcl_miss is
done, the backend request will be initiated and that has a set of VCL states all of its own. The first state in
the backend request handling is vcl_backend_fetch.

4.6.7 vcl_pass

vcl_pass
Context Client request
Variables remote, req, req_top, server, client, local
Return statements synth, restart, fetch
Next state vcl_deliver, vcl_synth
Typical use

In vcl_pass, Varnish is bypassing the cache.
Like vcl_miss, the built-in VCL for vcl_pass simply issues a fetch:

sub vcl_pass {
return (fetch);
}

There are three ways to enter vcl_pass. Either directly from vcl_recv by explicitly calling
return (pass);, by calling return (pass); in vcl_hit or vcl_miss, or lastly by finding a
hit-for-pass object in the cache.
A hit-for-pass object is an object in the cache with no content that only serves to force Varnish into pass
mode.
A cache miss and a pass both result in a backend request. The difference between them is that with a
cache miss, Varnish assumes that the backend response can be used to satisfy multiple client requests. If
multiple clients request the same resource at the same time, Varnish will only send a single request to the
backend if they are cache misses. If the response is cacheable, then all client requests will get the same
object returned.
If, however, the result can not be cached, Varnish needs to send one backend request for each client
request. To avoid serializing these requests, Varnish stores a hit-for-pass object in cache, telling Varnish
that requests for this object are not cacheable, should bypass the cache and be executed independently of
other requests for the same object.
We will look at this later in more detail.
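As a preview, a hit-for-pass object is typically created in vcl_backend_response, roughly like this sketch. The Set-Cookie trigger and the two-minute duration are just common examples:

```vcl
sub vcl_backend_response {
    if (beresp.http.Set-Cookie) {
        // Mark the response as uncacheable and keep a
        // hit-for-pass marker for two minutes, so matching
        // requests bypass the cache without serializing.
        set beresp.uncacheable = true;
        set beresp.ttl = 120s;
        return (deliver);
    }
}
```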

4.6.8 vcl_synth

vcl_synth
Context Client request
Variables remote, resp, req, req_top, server, client, local
Return statements restart, deliver
Next state delivery
Typical use
• Customizing error messages
• Generating 301/302 redirects
vcl_synth is called whenever Varnish needs to synthesize a response instead of delivering one fetched
from a backend server.
In its simplest form it is just a different error message, but it can be used for more than that. The built-in
VCL provides the default error message you might have already seen:

/*
* We can come here "invisibly" with the following errors: 413, 417 & 503
*/

sub vcl_synth {
set resp.http.Content-Type = "text/html; charset=utf-8";
set resp.http.Retry-After = "5";
synthetic( {"<!DOCTYPE html>
<html>
<head>
<title>"} + resp.status + " " + resp.reason + {"</title>
</head>
<body>
<h1>Error "} + resp.status + " " + resp.reason + {"</h1>
<p>"} + resp.reason + {"</p>
<h3>Guru Meditation:</h3>
<p>XID: "} + req.xid + {"</p>
<hr>
<p>Varnish cache server</p>
</body>
</html>
"} );
return (deliver);
}

Note that vcl_synth can also be called without vcl_recv ever being called first if certain specific error
situations occur.
Normally, vcl_synth is only called upon if you explicitly call return (synth()); from some other
VCL state.
A common use case for vcl_synth is to redirect clients to the proper URL that you want them to access
the content from. This is different from URL rewriting which is internal to Varnish. A redirect causes
Varnish to send a regular HTTP response to the client, which will then make another request using the
provided location.
A very simple variant of this can be done like this:

sub vcl_recv {
if (req.http.host ~ "^www\.") {
return (synth(301));
}
}

sub vcl_synth {
if (resp.status == 301) {
set resp.http.Location =
regsub(req.http.host, "^www\.","") + req.url;
}
}

In vcl_recv we check if the Host-header starts with a leading "www". If it does, we issue a
return (synth(301));. Next up, Varnish enters vcl_synth.
In vcl_synth we check if the response code is 301 - the one we provided in vcl_recv. If it is, we set a
Location response header, which the client will use to re-request the content. The Location-header is
a combination of the Host-header with the leading "www." stripped away, and the url stored in req.url.

4.6.9 vcl_deliver

vcl_deliver
Context Client request
Variables req, req_top, client, server, local, remote, storage, now, obj.hits,
obj.uncacheable, resp
Return statements deliver, synth, restart
Next state vcl_synth, delivery
Typical use
• Adding or removing response headers
• Restarting the request in case of errors
You saw in the Hello world VCL what vcl_deliver is all about. It is the very last VCL state to execute
before Varnish starts sending data to the client. The built-in VCL for vcl_deliver is completely empty.

sub vcl_deliver {
return (deliver);
}

A very popular thing to do in vcl_deliver is to add a response header indicating if the request was a
cache hit or a cache miss. This can be done by evaluating the obj.hits variable, which is a reference
to the cached object (if any), and how many times it has been hit. If this was a cache hit, the value will be 1
or greater.

sub vcl_deliver {
if (obj.hits > 0) {
set resp.http.X-Cache-Hit = "true";
set resp.http.X-Cache-Hits = obj.hits;
} else {
set resp.http.X-Cache-Hit = "false";
}
}

Other than obj.hits and obj.uncacheable, you do not have direct access to the object. You do,
however, have most of what you need in resp.*. The cached object is always read-only, but the resp
data structure represents this specific response, not the cached object itself. As such, you can modify it.
The obj.uncacheable variable can be used to identify if the response was cacheable at all. If you
issued return (hash); in vcl_recv, and neither the backend response nor the relevant VCL prevented
caching, the value will be false.

4.7 Backend requests


In addition to the state engine provided for client requests, there is also a smaller one for backend
requests. You should consider them isolated, with only minimal interaction.
The main interaction between them happens when a cache is empty. In addition, there is some interaction
when you are using "stale-while-revalidate" type of logic.
If you have a cache hit for a perfectly normal object, no backend thread is even affected. On the other
hand, in the case of a cache miss, the client-thread will have to wait for the backend-thread to finish
executing.
A third form of interaction takes place when a client hits a stale object: an object in cache that has
expired, but is still usable for "stale-while-revalidate". In these cases, the client thread will notify a backend
thread, then deliver the stale object to the client. The backend thread will then continue to request the
resource from the backend and populate the cache, even if there are no clients waiting for it.
It is also worth remembering that multiple client threads can be waiting for the same object to be fetched
by a single backend thread.
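The window for this stale-delivery behavior is controlled by the object's grace period. A minimal sketch,
with an arbitrary one-hour value, would set it on the backend response:

```vcl
sub vcl_backend_response {
    # Allow delivery of objects up to 1 hour past their TTL while
    # a background fetch refreshes them. The value is an example.
    set beresp.grace = 1h;
}
```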
But despite all this, the basic VCL of the backend fetcher is pretty straightforward, with one or two minor
exceptions.

4.7.1 vcl_backend_fetch

vcl_backend_fetch
Context: Backend request
Variables: bereq, server, now
Return statements: fetch, abandon
Next state: vcl_backend_response, vcl_backend_error
Typical use: (none in particular; the built-in VCL is empty)

vcl_backend_fetch is called right before a backend request is initiated. It has a copy of the client
request in bereq, with some modifications where relevant. It can, for example, add
If-Modified-Since or If-None-Match headers if a conditional GET request can be made.
The built-in VCL is empty:

sub vcl_backend_fetch {
    return (fetch);
}
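Even though the built-in VCL does nothing here, this is a natural place to adjust bereq before the fetch
happens. A small sketch, using a made-up header name:

```vcl
sub vcl_backend_fetch {
    # Hypothetical example: strip an internal header that the
    # origin server should never see.
    unset bereq.http.X-Internal-Debug;
}
```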

4.7.2 vcl_backend_response

vcl_backend_response
Context: Backend request
Variables: bereq, beresp, server, now
Return statements: deliver, retry, abandon
Next state: vcl_backend_error
Typical use:
• Override cache duration
• Clean up backend response
• Set grace and keep periods
• Decide what to do with errors
vcl_backend_response is executed right after a response from a backend has been received, but
before it is inserted into the cache. The beresp data structure represents the backend response which is
potentially soon to be the cached object. Before vcl_backend_response is executed, Varnish has
parsed the Cache-Control and Expires headers associated with the response and set the Time To
Live (TTL) accordingly. Any change to TTL that you make in vcl_backend_response will override
default values.
If you have a perfect backend there is little or no reason to add anything to vcl_backend_response.
In the real world, it turns out that vcl_backend_response is, along with vcl_recv, one of the most
important VCL states you have.
The built-in VCL provides a safety net:

sub vcl_backend_response {
    if (beresp.ttl <= 0s ||
        beresp.http.Set-Cookie ||
        beresp.http.Surrogate-control ~ "no-store" ||
        (!beresp.http.Surrogate-Control &&
         beresp.http.Cache-Control ~ "no-cache|no-store|private") ||
        beresp.http.Vary == "*") {
        /*
         * Mark as "Hit-For-Pass" for the next 2 minutes
         */
        set beresp.ttl = 120s;
        set beresp.uncacheable = true;
    }
    return (deliver);
}

In other words, if any of the following conditions are true, Varnish will not cache this response:

• The TTL is 0 or less, as set by RFC 2616 rules (see the summary of chapter 2).
• The response has a Set-Cookie header.
• The response has a Surrogate-Control header with "no-store" set.
• The response has a Vary header with the exact value of *.
• The response does not have a Surrogate-Control header, but does have a Cache-Control
  header with either no-cache, no-store or private set.
Note that when not caching, Varnish sets the TTL to 120s, then sets beresp.uncacheable = true;.
This is how a hit-for-pass object is born. For the next 2 minutes, requests matching this cache hash will
not be cached.
It is tempting to set beresp.uncacheable = true; if your backend server is serving an error that you
believe to be intermittent, but this is not without problems. This will tell Varnish that the resource is
uncacheable in the future too. If you set beresp.uncacheable = true;, you should also set
beresp.ttl to the period of time you want to disable caching. For intermittent errors, you want a very
low beresp.ttl. Perhaps as low as 1s.
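A sketch of that advice might look like the following. Treating all 5xx responses as intermittent errors is
an assumption on my part; adjust the condition to your application:

```vcl
sub vcl_backend_response {
    # Assume 5xx responses are intermittent errors: mark them
    # uncacheable, but only for one second at a time.
    if (beresp.status >= 500) {
        set beresp.uncacheable = true;
        set beresp.ttl = 1s;
    }
}
```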

We will look closer at how to handle this soon.

4.7.3 vcl_backend_error

vcl_backend_error
Context: Backend request
Variables: bereq, beresp, server, now
Return statements: deliver, retry, abandon
Next state: vcl_backend_fetch
Typical use:
• Change error messages
• Retry failed requests
• Decide what to do with errors
If Varnish fails to fetch a resource from a server, for instance if the server doesn't respond or doesn't
respond with HTTP, Varnish will execute vcl_backend_error. This allows Varnish to generate a
synthetic response, or put more plainly: an error message.
It is worth emphasising that this is only called if the server doesn't respond in any reasonable manner at
all, or times out before the response is complete. If a server returns "500 Internal Server Error", then
vcl_backend_response is run instead.
In vcl_backend_error, you have a beresp object, representing a synthetic backend response. You
also have the original bereq object, representing the backend request that triggered the error.
The built-in VCL just returns a standard error message:

sub vcl_backend_error {
    set beresp.http.Content-Type = "text/html; charset=utf-8";
    set beresp.http.Retry-After = "5";
    synthetic( {"<!DOCTYPE html>
<html>
<head>
<title>"} + beresp.status + " " + beresp.reason + {"</title>
</head>
<body>
<h1>Error "} + beresp.status + " " + beresp.reason + {"</h1>
<p>"} + beresp.reason + {"</p>
<h3>Guru Meditation:</h3>
<p>XID: "} + bereq.xid + {"</p>
<hr>
<p>Varnish cache server</p>
</body>
</html>
"} );
    return (deliver);
}

In addition to return (deliver);, you can also use retry to make another attempt at fetching the
resource, increasing the bereq.retries counter. If bereq.retries exceeds the
max_retries parameter, then no more attempts are made.
The last alternative, return (abandon);, is a bit special. It means that the result is discarded
entirely. This is highly useful if you have stale objects in the cache. If you use return (deliver);, the
stale object would be replaced by the error message, while using return (abandon); does not
replace the stale object, allowing you to use that instead.
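A sketch combining these return statements could look like this. The retry limit of 2 is an arbitrary
choice, and in practice the max_retries parameter already caps retries:

```vcl
sub vcl_backend_error {
    # Try the fetch again a couple of times, then give up without
    # overwriting any stale object that may still be in the cache.
    if (bereq.retries < 2) {
        return (retry);
    }
    return (abandon);
}
```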

4.8 Housekeeping
There are two more VCL "states" that fall outside of the backend and client scope. These are special
states that are almost exclusively used by Varnish modules (VMODs).
Since they are both tiny, there's little point dedicating a chapter to each.
The built-in VCL for both of them looks like this:

sub vcl_init {
}

sub vcl_fini {
    return (ok);
}

You will mostly deal with them when you use Varnish modules. Some of these vmods are very
common, such as the ones used for load balancing, and we will cover them as they become relevant.
The vcl_init state is executed at VCL initialization, while vcl_fini is run when VCL is unloaded.

4.9 Varnish Modules


With Varnish 4, Varnish Modules have become quite mature. Varnish Modules are basically VCL
extensions, but with a little extra on the side. They can be used to solve anything from converting text
from lowercase to uppercase, to cryptographic hash functions, to memcached integration. Whether that is
a good idea or not is a different question.
Varnish already ships with two vmods. The standard vmod, or "std" vmod, provides a number of small but
highly useful utility functions. It is documented in the manual page vmod_std(3), or on
https://varnish-cache.org/docs/4.1/reference/vmod_std.generated.html .
It can be used to convert text to numbers, log data to syslog, extract port numbers from an IP, and so
forth. You will see references to it several times.
The other vmod Varnish ships with by default is the "directors" vmod. This used to be an integrated part of
Varnish, but is now split off into a module. The directors vmod provides a common set of load balancing
functions, allowing you to treat a set of multiple origin servers as a single logical entity. It provides a
random director, a round-robin director, a hash director and a fallback director. It is documented in the
manual page vmod_directors(5), or on
https://varnish-cache.org/docs/4.1/reference/vmod_directors.generated.html#varnish-directors-module .
Using a vmod is simple once it is installed. All you have to do is add "import std;" to your VCL, and the
"std" namespace is available to you.
We will look at both of these vmods when they are relevant, in addition to a few other commonly used
modules. For now, just know that they exist.

4.10 Bringing it together


There isn't any perfect way to write VCL. This chapter tries to provide a mixture of a reference guide that
you can use to look up the individual VCL states, and some small examples of what you can do with them.
The list of typical uses of a state should reveal a lot about how often you will need to tweak it.
An example of a complete, working VCL that makes a lot of sense for many sites could look like this:

vcl 4.0;

# The standard vmod, std, provides several small but important features.
import std;

# Define a single backend, with a health probe with default settings
backend origin {
    .host = "192.168.1.0";
    .port = "8080";
    .probe = {
        .url = "/healthcheck";
    }
}

# A list of IPs that we allow to probe us for state
acl monitors {
    "192.168.0.0"/24;
}

# A list of IPs for clients that will get some extra debug information.
# Presumably developers or sysadmins.
acl debuggers {
    "192.168.1.0"/24;
    "192.168.100.0"/24;
}

sub vcl_recv {
    # This site only needs cookies under the "/admin" url.
    # Removing the entire Cookie-header when you don't intend to use
    # it makes caching a lot safer and easier.
    if (req.url !~ "^/admin") {
        unset req.http.cookie;
    }

    # Answer health checks
    if (req.url ~ "/healthcheck") {
        # Only answer health checks from monitor-ips.
        if (client.ip ~ monitors) {
            # Use the std-vmod to check if the backend is
            # healthy, as per the health probe. If it is,
            # return 200 OK. Otherwise, 503.
            if (std.healthy(origin)) {
                return (synth(200));
            } else {
                return (synth(503));
            }
        } else {
            # 401 Unauthorized if someone outside of the
            # "monitors" ACL asks for health state.
            return (synth(401));
        }
    }
    # Otherwise, fall through to the built-in VCL and let that
    # handle the rest.
}

sub vcl_backend_response {
    # If the backend request wasn't for "/admin", then remove any
    # "Set-Cookie" header.
    if (bereq.url !~ "^/admin") {
        unset beresp.http.set-cookie;
    }
    # Be very cautious about hit-for-pass objects. Only allow
    # Varnish to disable the cache for 1 second at a time. Unless
    # the backend itself provides a max-age.
    if (beresp.uncacheable && beresp.http.cache-control !~ "max-age") {
        set beresp.ttl = 1s;
    }
    # If the request was a success (200 OK) and it was for an image,
    # ensure a minimum cache time.
    if (beresp.status == 200 && bereq.url ~ "^/images") {
        if (beresp.ttl < 600s) {
            set beresp.ttl = 600s;
        }
    }
    # Same as with vcl_recv: Fall through to the built-in VCL if
    # possible.
}

# Custom-routine to add some debug-headers on the response.
sub add_debug_headers {
    if (obj.hits > 0) {
        set resp.http.X-Hits = obj.hits;
        set resp.http.X-Hit = "true";
    } else {
        set resp.http.X-Hit = "false";
    }
    set resp.http.X-Age = resp.http.Age;
}

# Custom-routine to remove debug-headers.
sub remove_debug_headers {
    unset resp.http.X-Varnish;
    unset resp.http.Via;
}

sub vcl_deliver {
    # Only add debug-headers for clients in the "debuggers" subnet.
    if (client.ip ~ debuggers) {
        call add_debug_headers;
    } else {
        call remove_debug_headers;
    }
    # Remove the Age-header, as we want clients to behave as if any
    # content from us is 100% fresh with regards to cache duration.
    unset resp.http.Age;
}

This example isn't meant as a best practice type of guide, but to give you some inspiration as to how you
can use VCL.
Of special note is the lack of return statements. This is a good habit to establish, even if it isn't always
possible to stick with it. The idea is that you make your modifications first, then the built-in VCL provides a
safety net in case you forgot some corner case or misinterpreted the outcome. If the built-in VCL is getting
in your way, you should first understand exactly why, then see if you can modify the request or response
so that the built-in VCL will do what you want. Doing this is generally safer than just bypassing the built-in
VCL entirely.

4.11 Summary
VCL is mostly about cache policy. You work with a single request at a time and the goal is to keep the
VCL as small and generic as possible.
There are a large number of states you can modify, but in practical usage, it's rare that you end up using
more than three or four of them.
In chapters to come, we will go through a number of scenarios that are both common and uncommon. But
because VCL is a language, there isn't a finite set of tasks you can use it for. It's really up to how your
application works.
Hopefully, this chapter can function as a reference for your future VCL needs, even when there are no
examples available for the problem you are trying to solve.

5 Intelligent traffic routing

Warning
I expect this chapter to change significantly throughout its creation, and possibly throughout the creation
of the rest of the book. It should be seen in context with chapter 4, the introduction to VCL.
I advise against reviewing the pedagogical aspects of the chapter until the text is complete (as in: a
summary exists). Or until this warning is removed.
As of this writing, I have not done a thorough technical review of the details as they relate to
Varnish 4.1 or newer. The principles explained should all be correct, but some constants and
commands might have changed.

Now that we've looked at basic VCL, it's time to start talking about how to direct traffic. While many
Varnish installations only have a single backend, all the big ones have numerous backend servers.
Varnish supports this quite well, and provides a small but powerful set of tools that will allow you to direct
traffic.
There are two terms that are at the core of this chapter. The backend director, usually just called director,
which is a collection of multiple backend servers, and the health check probe.
Both directors and probes can be either very simple or quite elaborate. We'll start off looking closely at
probes, then move on to directors to tie it together.
Two often overlooked features are the fact that directors are now dynamic - you can add and remove
(predefined) backends from regular VCL - and that directors can be nested. We'll look at both of these
features and what they offer you.

5.1 A closer look at a backend


So far, we've only looked briefly at backends, specifying just the host and port options. But
backends can have a few more options.

• host (mandatory): Host name or IP of the backend. Has to resolve to
  at most 1 IPv4 address and 1 IPv6 address. In other words: your
  backends can be dual-stack (both IPv4/IPv6), but not load-balanced
  with DNS round-robin.
• port (default: 80): TCP port number to connect to.
• host_header (optional): Host header to add. This is mainly useful
  for health probes, if the backend requires a valid or specific
  Host header.
• connect_timeout (default: from parameters, 3.5s): Timeout waiting
  for the TCP connection to be established. Should be low, as this is
  usually handled by the operating system. Relevant factors:
  geographic distance, virtualization.
• first_byte_timeout (default: from parameters, 60s): Timeout waiting
  for the very first byte of a reply. This is application-dependent.
  Typically, an application will send the entire response in one go
  after generating it, so this is basically how long you expect/allow
  the application to generate a response.
• between_bytes_timeout (default: from parameters, 60s): The timeout
  between individual read operations after the backend has started
  sending data. Should rarely be long, depending on the application.
  This is essentially a means to detect stalled connections.
• max_connections (default: unlimited): The maximum number of
  connections Varnish will open to a given backend.
• probe: Health check definition or reference. Covered in detail in
  the next sub-chapter.

All the timeout values default to whatever the matching parameters are set to. The default values in
parentheses are the default parameter values of Varnish 4.1.
Over a number of years, the default values for various timeouts have been tweaked frequently to adapt to
what has proven to work. This is especially true for connect_timeout. Every once in a while you might
run across systems with greatly increased timeouts. A few questions you should ask yourself:

1. How long would a user actually wait?
2. What are you actually waiting for?
3. In what circumstances would you want to send traffic to a backend this slow?
If you are working with actual users, a connect_timeout of 600s is just pointless. First, the connection is
usually established by the operating system, which means that even if the application is heavily loaded,
the actual TCP connection would be fast. Secondly, none of your users will wait for 10 minutes to get
cat-pictures.
If the application is used by non-humans (e.g.: API interface), then allowing slightly higher timeouts
generally make sense.

At the end of the day, timeouts are there to avoid using up resources on the Varnish-host waiting for
backends that will never respond. By timing out, you might be able to deliver stale content to a user (who
would be none the wiser) instead of waiting until the user leaves.
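Putting some of these options together, a backend definition with explicit timeouts could look like the
sketch below. All values here are examples, not recommendations:

```vcl
backend api {
    .host = "192.168.0.10";
    .port = "8080";
    # Fail fast on connect, give the application a modest window
    # to produce the first byte, and cap concurrent connections.
    .connect_timeout = 1s;
    .first_byte_timeout = 10s;
    .between_bytes_timeout = 5s;
    .max_connections = 100;
}
```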

5.2 Health probes


A health check allows you to test that a backend is working as it should before you start fetching
resources from it. In its simplest form, it is just a URL on a backend:

backend foo {
    .host = "192.168.0.1";
    .probe = {
        .url = "/healthcheck";
    }
}

This will set up a health check where Varnish will send frequent requests to /healthcheck. As long as
the server responds with 200 OK within the standard time frame, traffic will go as normal. If it stops
responding, or doesn't respond with 200 OK, then Varnish will not send traffic to it at all, except health
checks.
You can also provide more details for the probe.

backend foo {
    .host = "192.168.0.1";
    .probe = {
        .request = "GET /healthcheck HTTP/1.1"
                   "Host: example.com";
        .expected_response = 206;
        .interval = 10s;
        .threshold = 5;
        .window = 15;
    }
}

This probe definition uses a complete request instead of just a URL, which can be useful if your health
check needs some special headers for example. It also overrides the expected response code, expecting
206 instead of 200. None of the probe options are mandatory, however.

• url (default: "/"): The URL to request.
• request: The exact request, which overrides the URL if specified.
  Each string will have \r\n added at the end.
• expected_response (default: 200): Response code that the backend
  needs to reply with for Varnish to consider it healthy.
• timeout (default: 2s): The timeout for the probe.
• interval (default: 5s): How often to send a probe.
• window (default: 8): How many recent probes to consider when
  determining if a backend is healthy.
• threshold (default: 3): How many probes within the last window must
  have been successful to consider the backend healthy.
• initial (default: threshold - 1): When starting up, how many probes
  in the window should be considered good. If set to 0, the backend
  will not get any traffic until Varnish has probed it "threshold"
  amount of times.

Window, threshold and initial are all related. The idea of a window is that you might not want to disable a
backend just because it fails a single probe. With the default setting, Varnish will evaluate the last 8
probes sent when checking if a backend is healthy. If at least 3 of them were OK, then the backend is
considered healthy.
One issue with this logic is that when Varnish starts up, there are no health probes in the history at all.
With only "window" and "threshold", this would require Varnish to send at least 3 probes by default before
it starts sending traffic to a server. That would mean some considerable downtime if you restarted your
Varnish server.
To solve this problem, Varnish has the "initial" value. When there is no history, Varnish will consider
"initial" amount of health probes good. The default value is relative to "threshold" in such a way that just a
single probe needs to be sent for Varnish to consider the backend healthy.
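To make the effect of initial concrete, here is a sketch that deliberately sets it to 0, so a freshly
started Varnish sends no traffic to this backend until the probe history is good enough:

```vcl
backend slowstart {
    .host = "192.168.0.1";
    .probe = {
        .url = "/healthcheck";
        .window = 8;
        .threshold = 3;
        # With initial = 0, the backend is considered sick after a
        # restart until 3 good probes have been seen.
        .initial = 0;
    }
}
```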
As you can imagine, if you have to define all these options for all your backends, you end up with a lot of
identical copy-pasted code blocks. This can be avoided by using named probes instead.

probe myprobe {
    .url = "/healthcheck";
    .interval = 2s;
    .window = 5;
    .threshold = 2;
}

backend one {
    .host = "192.168.2.1";
    .probe = myprobe;
}

5.2.1 Reviewing health probe status


There are a few different ways to review health state. Let's start with varnishlog:

# varnishlog -g raw -i Backend_health


0 Backend_health - default Still healthy 4--X-RH 8 3 8 0.000425 0.000562 HTTP/1.1 200 O
0 Backend_health - default Still healthy 4--X-RH 8 3 8 0.000345 0.000508 HTTP/1.1 200 O
0 Backend_health - default Still healthy 4--X-RH 8 3 8 0.000401 0.000481 HTTP/1.1 200 O
0 Backend_health - default Still healthy 4--X-RH 8 3 8 0.000437 0.000470 HTTP/1.1 200 O
0 Backend_health - default Still healthy 4--X-RH 8 3 8 0.000381 0.000448 HTTP/1.1 200 O
0 Backend_health - default Still healthy 4--X-RH 8 3 8 0.000334 0.000419 HTTP/1.1 200 O
0 Backend_health - default Still healthy 4--X-RH 8 3 8 0.000298 0.000389 HTTP/1.1 200 O

This is fairly cryptic, but you get the general idea I suppose. Note the -g raw which is necessary
because the Backend_health log-tag is not part of a session, so grouping by session wouldn't work.
You'll see one line like this for each health probe sent.
A closer look at 4--X-RH will tell you how the probe was handled. The 4 tells you it's IPv4, the X says it
was sent OK, the R tells you a response was read OK and the H says the health probe was "healthy":
The response was what we expected. In this case, a 200 OK.
You can get similar information from varnishadm, in two different ways. The first is the oldest way, and is
"hidden":

# varnishadm
200
-----------------------------
Varnish Cache CLI 1.0
-----------------------------
Linux,4.6.0-0.bpo.1-amd64,x86_64,-smalloc,-smalloc,-hcritbit
varnish-4.0.2 revision bfe7cd1

Type 'help' for command list.
Type 'quit' to close CLI session.

varnish> help
200
help [command]
ping [timestamp]
auth response
quit
banner
status
start
stop
vcl.load <configname> <filename>
vcl.inline <configname> <quoted_VCLstring>
vcl.use <configname>
vcl.discard <configname>
vcl.list
param.show [-l] [<param>]
param.set <param> <value>
panic.show
panic.clear
storage.list
vcl.show <configname>
backend.list
backend.set_health matcher state
ban <field> <operator> <arg> [&& <field> <oper> <arg>]...
ban.list

varnish> help -d
200
debug.panic.master
debug.sizeof
debug.panic.worker
debug.fragfetch
debug.health
hcb.dump
debug.listen_address
debug.persistent
debug.vmod
debug.xid
debug.srandom

varnish> debug.health
200
Backend default is Healthy
Current states good: 8 threshold: 3 window: 8
Average responsetime of good probes: 0.000486
Oldest Newest
================================================================
4444444444444444444444444444444444444444444444444444444444444444 Good IPv4
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX Good Xmit
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR Good Recv
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Happy

varnish>

The debug.health command has been around for a long time, but was never really intended for
general use.
It does give you a history, though.
Let's see what happens if we disable our front page, which is what we're probing:

# chmod 000 /var/www/html/index.html


# varnishlog -g raw -i Backend_health
0 Backend_health - default Still healthy 4--X-R- 6 3 8 0.000402 0.000408 HTTP/1.1 403 F
0 Backend_health - default Still healthy 4--X-R- 5 3 8 0.000323 0.000408 HTTP/1.1 403 F
0 Backend_health - default Still healthy 4--X-R- 4 3 8 0.000297 0.000408 HTTP/1.1 403 F
0 Backend_health - default Still healthy 4--X-R- 3 3 8 0.000294 0.000408 HTTP/1.1 403 F
0 Backend_health - default Went sick 4--X-R- 2 3 8 0.000407 0.000408 HTTP/1.1 403 Forbi
0 Backend_health - default Still sick 4--X-R- 1 3 8 0.000307 0.000408 HTTP/1.1 403 Forb
0 Backend_health - default Still sick 4--X-R- 0 3 8 0.000385 0.000408 HTTP/1.1 403 Forb
0 Backend_health - default Still sick 4--X-R- 0 3 8 0.000350 0.000408 HTTP/1.1 403 Forb
0 Backend_health - default Still sick 4--X-R- 0 3 8 0.000290 0.000408 HTTP/1.1 403 Forb

First, observe that the 4--X-RH tag has changed to 4--X-R- instead. This tells you that Varnish is still
able to send the probe and it still receives a valid HTTP response, but it isn't happy about it - it's not a
200 OK.
Further, look at the next three numbers. Further up they were 8 3 8. Now they start out at 6 3 8
(because I was a bit slow to start the varnishlog command).
The first number is the number of good health probes in the window (6), the next is the threshold (3), and
the last is the size of the window (8). For each bad health probe, the number of good health probes
goes down by 1. Once it breaches the threshold, Varnish reports that the backend "Went sick". Up until
that point, Varnish would still send traffic to that backend. The number of good health probes goes all the
way down to 0.
If we fix our backend, let's see the reverse happening:

# chmod a+r /var/www/html/index.html ; varnishlog -g raw -i Backend_health


0 Backend_health - default Still sick 4--X-RH 1 3 8 0.000365 0.000398 HTTP/1.1 200 OK
0 Backend_health - default Still sick 4--X-RH 2 3 8 0.000330 0.000381 HTTP/1.1 200 OK
0 Backend_health - default Back healthy 4--X-RH 3 3 8 0.000329 0.000368 HTTP/1.1 200 OK
0 Backend_health - default Still healthy 4--X-RH 4 3 8 0.000362 0.000366 HTTP/1.1 200 O
0 Backend_health - default Still healthy 4--X-RH 5 3 8 0.000327 0.000357 HTTP/1.1 200 O
0 Backend_health - default Still healthy 4--X-RH 6 3 8 0.000366 0.000359 HTTP/1.1 200 O
0 Backend_health - default Still healthy 4--X-RH 7 3 8 0.000332 0.000352 HTTP/1.1 200 O
0 Backend_health - default Still healthy 4--X-RH 8 3 8 0.000358 0.000354 HTTP/1.1 200 O
0 Backend_health - default Still healthy 4--X-RH 8 3 8 0.000295 0.000339 HTTP/1.1 200 O

Even though our backend starts behaving well immediately, Varnish will consider it "sick" until it has
reached the threshold for number of health probes needed.
The other numbers in the log output are timing for sending and receiving the response.

The threshold and window-mechanism is there to avoid "flapping". But it is far from perfect.

Warning
You generally do not want to use the debug-commands unless you really know what you are
doing. Things such as debug.panic.master will kill Varnish (by design), and are included
exclusively for development, QA and testing. Similarly, debug.srandom will let you forcibly set
the "random seed" of Varnish, making the random numbers predictable. Useful for unit tests,
horrible for production.

5.2.2 Forcing state


You can forcibly set the state of a backend to sick if you want to remove it from rotation. This is easily
done with varnishadm:

# varnishadm backend.list
Backend name Admin Probe
boot.default probe Healthy 6/8

# varnishadm backend.set_health boot.default sick


# varnishadm backend.list
Backend name Admin Probe
boot.default sick Healthy 8/8

# varnishadm backend.set_health boot.default healthy


# varnishadm backend.list
Backend name Admin Probe
boot.default healthy Healthy 8/8

# varnishadm backend.set_health boot.default probe


# varnishadm backend.list
Backend name Admin Probe
boot.default probe Healthy 8/8

In the above example we first list the backends. The naming scheme is <vcl>.<name>. The "Admin"
column lists the administrative state. It starts out as "probe" - use whatever the probe state is. The next
column is the probe state itself. In the beginning you see that the probe is considering the backend
healthy, with 6 out of 8 healthy probes (Varnish was just restarted).
We can use backend.set_health <name> <state> to modify the state. The states available are
healthy, sick and probe.
Here you can see it in action:

# http -ph http://localhost:6081/?$RANDOM


HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Connection: keep-alive
Content-Encoding: gzip
Content-Length: 3041
Content-Type: text/html
Date: Mon, 15 Aug 2016 10:57:29 GMT
ETag: "29cd-53a19199f0a80-gzip"
Last-Modified: Mon, 15 Aug 2016 09:46:02 GMT
Server: Apache/2.4.23 (Debian)
Vary: Accept-Encoding
Via: 1.1 varnish-v4
X-Varnish: 15

# varnishadm backend.set_health boot.default sick

# http -ph http://localhost:6081/?$RANDOM


HTTP/1.1 503 Backend fetch failed
Age: 0
Connection: keep-alive
Content-Length: 282
Content-Type: text/html; charset=utf-8
Date: Mon, 15 Aug 2016 10:57:37 GMT
Retry-After: 5
Server: Varnish
Via: 1.1 varnish-v4
X-Varnish: 32775

Alternatively, let's set up an incorrect health probe, by using a bogus .url in the VCL:

# http -ph http://localhost:6081/?$RANDOM


HTTP/1.1 503 Backend fetch failed
Age: 0
Connection: keep-alive
Content-Length: 278
Content-Type: text/html; charset=utf-8
Date: Mon, 15 Aug 2016 10:59:14 GMT
Retry-After: 5
Server: Varnish
Via: 1.1 varnish-v4
X-Varnish: 2

# varnishadm backend.list
Backend name Admin Probe
boot.default probe Sick 2/8

# varnishadm backend.set_health boot.default healthy

# http -ph http://localhost:6081/?$RANDOM


HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Connection: keep-alive
Content-Encoding: gzip
Content-Length: 3041
Content-Type: text/html
Date: Mon, 15 Aug 2016 10:59:52 GMT
ETag: "29cd-53a19199f0a80-gzip"
Last-Modified: Mon, 15 Aug 2016 09:46:02 GMT
Server: Apache/2.4.23 (Debian)
Vary: Accept-Encoding
Via: 1.1 varnish-v4
X-Varnish: 32773

# varnishadm backend.list
Backend name Admin Probe
boot.default healthy Sick 0/8

As you can see, Varnish starts out failing, because it believes the backend to be down. Once we forcibly
set it to healthy, then everything works, even if backend.list still reveals that the health checks are
failing.
Using backend.set_health is mainly meant to help you take things out of production temporarily.
Especially when you put backends back into production, it is important to remember that you want to use
backend.set_health <name> probe, not backend.set_health <name> healthy. The latter
will essentially make your probes worthless.

5.3 Load balancing


Varnish has always offered a few different ways to provide load balancing of backends. With Varnish 4,
this is done through Varnish modules. Mostly through the directors vmod, though any vmod can offer
it.
The idea is simple enough: provide multiple backends that share the load of a single application.
In Varnish, a load balancing scheme is usually referred to as a backend director, or just a director.

5.3.1 Basic round-robin and random load balancing


Round-robin load balancing will rotate which backend is used. At the end of the day, all backends will
have received the same number of requests.
The random director is almost as simple. The traffic is randomly distributed among the backends, which
means that, at the end of the day, each backend will have received roughly the same number of requests.
The biggest difference between the two is that the random director also provides you with a means to
adjust the weight of the distribution. You can tell it to send more traffic to a more powerful backend than
the rest, for example.

import directors;

backend one {
    .host = "192.168.2.1";
    .port = "80";
}
backend two {
    .host = "192.168.2.2";
    .port = "80";
}
sub vcl_init {
    new rrdirector = directors.round_robin();
    rrdirector.add_backend(one);
    rrdirector.add_backend(two);
}

sub vcl_recv {
    set req.backend_hint = rrdirector.backend();
}

This example creates a director object called rrdirector, of the round-robin type, and then adds two backends to it.
In vcl_recv, we tell Varnish to use this director as the backend.
You can do similar things with the random director.

import directors;

backend one {
    .host = "192.168.2.1";
    .port = "80";
}
backend two {
    .host = "192.168.2.2";
    .port = "80";
}

sub vcl_init {
    new radirector = directors.random();
    radirector.add_backend(one, 5.0);
    radirector.add_backend(two, 1.0);
}

sub vcl_recv {
    set req.backend_hint = radirector.backend();
}

Notice the second argument to radirector.add_backend(). This is the relative weight. You can pick
basically any scale you want, as long as it is relative to the other backends. In this example, the backend
called one will get five times as much traffic as the one called two.
You can add any number of backends to the same director, and you can use any number of directors.

5.3.2 Health probes and directors


Health probes and directors are meant for each other. If you have two or more application servers that can
serve the same site, putting them in a single director and adding health probes to them allows you to
ensure that only known good backends are used.
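As a sketch of that combination, reusing the round-robin example from above with a simple hypothetical probe added to each backend:

```
import directors;

probe myprobe { .url = "/"; }

backend one {
    .host = "192.168.2.1";
    .port = "80";
    .probe = myprobe;
}
backend two {
    .host = "192.168.2.2";
    .port = "80";
    .probe = myprobe;
}

sub vcl_init {
    new rrdirector = directors.round_robin();
    rrdirector.add_backend(one);
    rrdirector.add_backend(two);
}

sub vcl_recv {
    set req.backend_hint = rrdirector.backend();
}
```

If one of the backends fails its probe, the director simply stops handing it out until it recovers.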
This does offer some challenges, however. First of all, if you are doing weighted load balancing, the total weight changes when a backend drops out. Let's assume you have 5 backends:

Name      Weight
red       1
blue      2
orange    4
yellow    8
green     16

Total weight would be 31. Normally, "green" will take 16/31, or roughly 51% of the traffic. The "yellow" backend will take half that, and so forth. If "orange" suddenly went down, the new total weight would be 27, and the extra load would be distributed in proportion to the remaining weights: 16/27 would go to "green" (59%), 1/27 would go to "red" (3.7%).
Generally speaking, it is better to keep the weights more or less equal. It makes it simpler for you to estimate load whenever you change your stack.
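The arithmetic above can be checked with a short calculation, written in plain Python rather than VCL, since this is just math:

```python
# Relative weights from the table above.
weights = {"red": 1, "blue": 2, "orange": 4, "yellow": 8, "green": 16}

def share(name, healthy):
    """Fraction of the traffic a backend gets among the healthy backends."""
    return healthy[name] / sum(healthy.values())

# All five backends healthy: total weight is 31.
print(round(share("green", weights), 3))  # 0.516, i.e. roughly 51%

# "orange" fails its probe and drops out: total weight becomes 27.
healthy = {name: w for name, w in weights.items() if name != "orange"}
print(round(share("green", healthy), 3))  # 0.593, i.e. 59%
print(round(share("red", healthy), 3))    # 0.037, i.e. 3.7%
```

Note how the heaviest backend absorbs most of the extra load, which is exactly why roughly equal weights make capacity planning easier.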

5.3.3 Dynamic use of directors


Directors are dynamic, even if backends are not.
An example of this is that you can change the makeup of a director from VCL:

import directors;

backend one {
    .host = "192.168.2.1";
    .port = "80";
}
backend two {
    .host = "192.168.2.2";
    .port = "80";
}

sub vcl_init {
    new radirector = directors.random();
    radirector.add_backend(one, 5.0);
    radirector.add_backend(two, 1.0);
}

acl admins {
    "192.168.0.0"/24;
}

sub vcl_recv {
    set req.backend_hint = radirector.backend();
    if (req.url ~ "^/admin" && client.ip ~ admins) {
        if (req.url ~ "/add_one") {
            radirector.add_backend(one, 1.0);
        }
        if (req.url ~ "/remove_one") {
            radirector.remove_backend(one);
        }
        if (req.url ~ "/add_two") {
            radirector.add_backend(two, 1.0);
        }
        if (req.url ~ "/remove_two") {
            radirector.remove_backend(two);
        }
        return (synth(200));
    }
}

The above VCL allows someone in the IP range 192.168.0.0/24 to issue HTTP requests to /admin/add_one, /admin/remove_one, /admin/add_two or /admin/remove_two to affect the load balancing. This is not very common, but a sensible use-case for it could be automatic deployment, where a script first removes a backend from the director, then upgrades it, then puts it back in production.
This, however, can also be achieved with backend.set_health.

5.3.4 Fallback director


The fallback director is a simple thing. You can add any number of backends to it, but it will always give
you the first one that is healthy.
The idea is that you have a primary backend that should always be used, but if that fails, you can
potentially serve alternate content from a different backend.

import directors;

probe myprobe { .url = "/"; }

backend primary {
    .host = "192.168.0.1";
    .probe = myprobe;
}

backend secondary {
    .host = "127.0.0.1";
    .port = "8080";
    .probe = myprobe;
}

sub vcl_init {
    new fbdirector = directors.fallback();
    fbdirector.add_backend(primary);
    fbdirector.add_backend(secondary);
}

sub vcl_recv {
    set req.backend_hint = fbdirector.backend();
}

You can have any number of backends in a fallback director.

5.3.5 Stacking directors


Varnish treats all backends as directors, and vice versa. Wherever you can add a backend, you can add a director.
As a result, you can add a director to another director. Most of the time, this makes little sense.
The one situation where it makes a ton of sense, however, is when you combine it with the fallback director.
Let's say you have two regular application servers, but also periodically generate a static version of your site (or parts of it), which you put on a simple web server in case your application goes down.
You can use a regular director for the first two backends, but there are two ways to use the third. One is simple VCL, using std.healthy:

import directors;
import std;

probe myprobe { .url = "/"; }

backend primary {
    .host = "192.168.0.1";
    .probe = myprobe;
}

backend secondary {
    .host = "192.168.0.2";
    .probe = myprobe;
}

backend fallback {
    .host = "127.0.0.1";
    .port = "8080";
    .probe = myprobe;
}

sub vcl_init {
    new rrdirector = directors.round_robin();
    rrdirector.add_backend(primary);
    rrdirector.add_backend(secondary);
}

sub vcl_recv {
    if (std.healthy(rrdirector.backend())) {
        set req.backend_hint = rrdirector.backend();
    } else {
        set req.backend_hint = fallback;
    }
}

This works, but can easily clutter your VCL.


An alternative implementation using nested directors can be written as such:

import directors;

probe myprobe { .url = "/"; }

backend primary {
    .host = "192.168.0.1";
    .probe = myprobe;
}

backend secondary {
    .host = "192.168.0.2";
    .probe = myprobe;
}

backend fallback {
    .host = "127.0.0.1";
    .port = "8080";
    .probe = myprobe;
}

sub vcl_init {
    new rrdirector = directors.round_robin();
    rrdirector.add_backend(primary);
    rrdirector.add_backend(secondary);

    new fbdirector = directors.fallback();
    fbdirector.add_backend(rrdirector.backend());
    fbdirector.add_backend(fallback);
}

sub vcl_recv {
    set req.backend_hint = fbdirector.backend();
}

In this example, the end result is the same, but all backend logic is handled before you start working with vcl_recv.
Another example where this is useful is if you have a set of application servers and a set of servers for static content, where the static content is also present on the application servers. You might want to have a director for the static-only servers and a separate one for the application servers. Then a director for the combined result can be used for static resources:

import directors;

probe myprobe { .url = "/health"; }

backend app1 { .host = "192.168.0.1"; .probe = myprobe; }
backend app2 { .host = "192.168.0.2"; .probe = myprobe; }
backend app3 { .host = "192.168.0.3"; .probe = myprobe; }
backend app4 { .host = "192.168.0.4"; .probe = myprobe; }
backend app5 { .host = "192.168.0.5"; .probe = myprobe; }

backend static1 { .host = "192.168.1.1"; .probe = myprobe; }
backend static2 { .host = "192.168.1.2"; .probe = myprobe; }
backend static3 { .host = "192.168.1.3"; .probe = myprobe; }
backend static4 { .host = "192.168.1.4"; .probe = myprobe; }

sub vcl_init {
    new appservers = directors.round_robin();
    appservers.add_backend(app1);
    appservers.add_backend(app2);
    appservers.add_backend(app3);
    appservers.add_backend(app4);
    appservers.add_backend(app5);

    new staticonly = directors.round_robin();
    staticonly.add_backend(static1);
    staticonly.add_backend(static2);
    staticonly.add_backend(static3);
    staticonly.add_backend(static4);

    new static = directors.fallback();
    static.add_backend(staticonly.backend());
    static.add_backend(appservers.backend());
}

sub vcl_recv {
    if (req.url ~ "^/static") {
        set req.backend_hint = static.backend();
    } else {
        set req.backend_hint = appservers.backend();
    }
}

In the above scenario, you have three directors:

• appservers contains only the application servers, and is used by default.
• staticonly contains the servers that only have static content.
• static is the combination of the above - all servers that could deliver static content.
By writing your VCL like this, you separate the load balancing from the actual routing of traffic. In vcl_recv, you just say "this is static content, fetch it from a server that has static files". If you later wanted to change the balancing so that the application servers got traffic for static content even if a static-only server was up, you could make that change in vcl_init without having to adjust vcl_recv at all.
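As a hypothetical sketch of such a change, building on the example above: swapping the fallback director for a round-robin director spreads static traffic across all servers that carry the files, without touching vcl_recv:

```
sub vcl_init {
    # ... appservers and staticonly set up exactly as before ...

    # A round-robin over both directors instead of a fallback:
    new static = directors.round_robin();
    static.add_backend(staticonly.backend());
    static.add_backend(appservers.backend());
}
```

This works because directors are themselves valid backends, so they can be added to other directors.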

6 Appendix A: State machine graphs


See chapter 1 for information on how to generate these for yourself. They are included for your
convenience.
These graphs are generated with:

# git clone http://github.com/varnish/Varnish-Cache/
Cloning into 'Varnish-Cache'...
(...)
# cd Varnish-Cache/
# cd doc/graphviz/
# for a in *dot; do dot -Tsvg $a > $(echo $a | sed s/.dot/.svg/); done
# ls *svg

A PNG version of each is also available at:

• https://varnishfoo.info/img/a/cache_req_fsm.png
• https://varnishfoo.info/img/a/cache_fetch.png
• https://varnishfoo.info/img/a/cache_http1_fsm.png

And SVG versions for download:

• https://varnishfoo.info/img/a/cache_req_fsm.svg
• https://varnishfoo.info/img/a/cache_fetch.svg
• https://varnishfoo.info/img/a/cache_http1_fsm.svg

6.1 cache_req_fsm
[Figure: the cache_req_fsm graph - the client-side request state machine. It shows the cnt_* states (restart, recv, pipe, hash, lookup, purge, hit, miss, pass, deliver and synth), the VCL subroutines invoked in each, and the return statements that connect them to FETCH, BGFETCH, SYNTH and final delivery.]

cache_req_fsm details the client-specific part of the VCL state engine, and can be used as a reference when writing VCL. Look for the blocks that read vcl_ to identify VCL functions. The lines tell you how a return statement in VCL will affect the VCL state engine at large, and which return statements are available where. You can also see which objects are available where.

6.2 cache_fetch

[Figure: the cache_fetch graph - the backend-fetch state machine. It shows the vbf_stp_* states and how vcl_backend_fetch, vcl_backend_response and vcl_backend_error connect, through fetch, retry, abandon and deliver, to FETCH_DONE, FETCH_FAIL and RETRY.]

cache_fetch has the same format as the cache_req_fsm, but from the perspective of a backend
request.

6.3 cache_http1_fsm
[Figure: the cache_http1_fsm graph - the internal state machine of an HTTP/1 request, from the acceptor through http1_wait, request handling (S_STP_* and R_STP_* states) and http1_cleanup, including the hash waiter used when disembarking on a busy object.]

Of the three, cache_http1_fsm is the least practical flow chart, mainly included for completeness. It does not document much related to VCL or practical Varnish usage, but rather the internal state engine of an HTTP request in Varnish. It can sometimes be helpful for debugging internal Varnish issues.

7 Appendix B: Varnish Three Letter Acronyms


Varnish acronyms are plentiful, and somewhat documented at https://www.varnish-cache.org/trac/wiki/VTLA
This is a snapshot of that page for your convenience.
VAV
Varnish Arg Vector -- Argv parsing.
VBE
Varnish Back End -- Code for contacting backends (bin/varnishd/cache_backend.c)
VBP
Varnish Backend Polling -- Health checks of backends (bin/varnishd/cache_backend_poll.c)
VCA
Varnish Connection Acceptor -- The code that receives/accepts the TCP connections
(bin/varnishd/cache_acceptor.c)
VCC
VCL to C Compiler -- The code that compiles VCL to C code. (lib/libvcl)
VCL
Varnish Configuration Language -- The domain-specific programming language used for configuring a
varnishd.
VCT
Varnish CType(3) -- Character classification for RFC2616 and XML parsing.
VDD
Varnish (Core) Developer Day -- Quarterly invite-only meeting strictly for Varnish core (C) developers,
packagers and VMOD hackers.
VEV
Varnish EVent -- library functions to implement a simple event-dispatcher.
VGB
Varnish Governing Board -- May or may not exist. If you need to ask, you are not on it.
VGC
Varnish Generated Code -- Code generated by VCC from VCL.
VIN
Varnish Instance Naming -- Resolution of -n arguments.
VLU
Varnish Line Up -- library functions to collect stream of bytes into lines for processing.
(lib/libvarnish/vlu.c)
VRE
Varnish Regular Expression -- library functions for regular expression based matching and substring
replacement. (lib/libvarnish/vre.c)
VRT
Varnish Run Time -- functions called from compiled code. (bin/varnishd/cache_vrt.c)
VRY
VaRY -- Related to processing of Vary: HTTP headers. (bin/varnishd/cache_vary.c)
VSL
Varnish Shared memory Log -- The log written into the shared memory segment for
varnish{log,ncsa,top,hist} to see.

VSB
Varnish string Buffer -- a copy of the FreeBSD "sbuf" library, for safe string handling.
VSC
Varnish Statistics Counter -- counters for various stats, exposed via varnishapi.
VSS
Varnish Session Stuff -- library functions to wrap DNS/TCP. (lib/libvarnish/vss.c)
VTC
Varnish Test Code -- a test-specification for the varnishtest program.
VTLA
Varnish Three Letter Acronym -- No rule without an exception.
VUG
Varnish User Group meeting -- Half-yearly event where the users and developers of Varnish Cache
gather to share experiences and plan future development.
VWx
Varnish Waiter 'x' -- A code module to monitor idle sessions.
VWE
Varnish Waiter Epoll -- epoll(2) (linux) based waiter module.
VWK
Varnish Waiter Kqueue -- kqueue(2) (freebsd) based waiter module.
VWP
Varnish Waiter Poll -- poll(2) based waiter module.
VWS
Varnish Waiter Solaris -- Solaris ports(2) based waiter module.

8 Appendix C: Built-in VCL for Varnish 4.1.1


This is the built-in VCL that shipped with Varnish 4.1.1.
The up-to-date VCL of your Varnish version can usually be found in
/usr/share/doc/varnish/builtin.vcl or similar directory.

/*-
* Copyright (c) 2006 Verdens Gang AS
* Copyright (c) 2006-2015 Varnish Software AS
* All rights reserved.
*
* Author: Poul-Henning Kamp <[email protected]>
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*

*
* The built-in (previously called default) VCL code.
*
* NB! You do NOT need to copy & paste all of these functions into your
* own vcl code, if you do not provide a definition of one of these
* functions, the compiler will automatically fall back to the default
* code from this file.
*
* This code will be prefixed with a backend declaration built from the
* -b argument.
*/

vcl 4.0;

#######################################################################
# Client side

sub vcl_recv {
    if (req.method == "PRI") {
        /* We do not support SPDY or HTTP/2.0 */
        return (synth(405));
    }
    if (req.method != "GET" &&
      req.method != "HEAD" &&
      req.method != "PUT" &&
      req.method != "POST" &&
      req.method != "TRACE" &&
      req.method != "OPTIONS" &&
      req.method != "DELETE") {
        /* Non-RFC2616 or CONNECT which is weird. */
        return (pipe);
    }

    if (req.method != "GET" && req.method != "HEAD") {
        /* We only deal with GET and HEAD by default */
        return (pass);
    }
    if (req.http.Authorization || req.http.Cookie) {
        /* Not cacheable by default */
        return (pass);
    }
    return (hash);
}

sub vcl_pipe {
    # By default Connection: close is set on all piped requests, to stop
    # connection reuse from sending future requests directly to the
    # (potentially) wrong backend. If you do want this to happen, you can undo
    # it here.
    # unset bereq.http.connection;
    return (pipe);
}

sub vcl_pass {
    return (fetch);
}

sub vcl_hash {
    hash_data(req.url);
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }
    return (lookup);
}

sub vcl_purge {
    return (synth(200, "Purged"));
}

sub vcl_hit {
    if (obj.ttl >= 0s) {
        // A pure unadultered hit, deliver it
        return (deliver);
    }
    if (obj.ttl + obj.grace > 0s) {
        // Object is in grace, deliver it
        // Automatically triggers a background fetch
        return (deliver);
    }
    // fetch & deliver once we get the result
    return (miss);
}

sub vcl_miss {
    return (fetch);
}

sub vcl_deliver {
    return (deliver);
}

/*
 * We can come here "invisibly" with the following errors: 413, 417 & 503
 */
sub vcl_synth {
    set resp.http.Content-Type = "text/html; charset=utf-8";
    set resp.http.Retry-After = "5";
    synthetic( {"<!DOCTYPE html>
<html>
  <head>
    <title>"} + resp.status + " " + resp.reason + {"</title>
  </head>
  <body>
    <h1>Error "} + resp.status + " " + resp.reason + {"</h1>
    <p>"} + resp.reason + {"</p>
    <h3>Guru Meditation:</h3>
    <p>XID: "} + req.xid + {"</p>
    <hr>
    <p>Varnish cache server</p>
  </body>
</html>
"} );
    return (deliver);
}

#######################################################################
# Backend Fetch

sub vcl_backend_fetch {
    return (fetch);
}

sub vcl_backend_response {
    if (beresp.ttl <= 0s ||
      beresp.http.Set-Cookie ||
      beresp.http.Surrogate-control ~ "no-store" ||
      (!beresp.http.Surrogate-Control &&
        beresp.http.Cache-Control ~ "no-cache|no-store|private") ||
      beresp.http.Vary == "*") {
        /*
         * Mark as "Hit-For-Pass" for the next 2 minutes
         */
        set beresp.ttl = 120s;
        set beresp.uncacheable = true;
    }
    return (deliver);
}

sub vcl_backend_error {
    set beresp.http.Content-Type = "text/html; charset=utf-8";
    set beresp.http.Retry-After = "5";
    synthetic( {"<!DOCTYPE html>
<html>
  <head>
    <title>"} + beresp.status + " " + beresp.reason + {"</title>
  </head>
  <body>
    <h1>Error "} + beresp.status + " " + beresp.reason + {"</h1>
    <p>"} + beresp.reason + {"</p>
    <h3>Guru Meditation:</h3>
    <p>XID: "} + bereq.xid + {"</p>
    <hr>
    <p>Varnish cache server</p>
  </body>
</html>
"} );
    return (deliver);
}

#######################################################################
# Housekeeping

sub vcl_init {
}

sub vcl_fini {
    return (ok);
}

9 Appendix D: Regular expression cheat sheet


Varnish uses Perl-style regular expressions, or "PCRE". There are books written about the subject, but here's a small list of common tasks you may want to accomplish with Varnish.

sub vcl_recv {
    # Regular matching. Use ^ to anchor it to the beginning of the line.
    if (req.url ~ "^/html") {
        # ...
    }

    # Or $ to match the end of the line
    if (req.url ~ ".gif$") {
        # ...
    }

    # . matches any single character, * means "zero or more of the previous
    # character".
    # Combined, .* matches everything or nothing.
    if (req.url ~ "^/content/.*txt$") {
        # Matches /content/txt, /content/asfasftxt, /content/foo.txt,
        # but not /content/txt?foo=bar.
    }

    # \ can be used to "escape" the next character.
    if (req.url ~ ".html") {
        # This matches /foo.html, but also /html/blatti
        # because . is a wildcard.
    }
    if (req.url ~ "\.html") {
        # Matches /foo.html, but not /html/blatti, but does match
        # /html/blatti.html and /foo.html?foo=bar
    }

    # regsub(string,regex,replacement) to replace the content of a
    # string.
    # This creates a "X-Url" request-header that is identical to the
    # URL, but changes upper-case HTML to lowercase html.
    # The URL itself is unchanged.
    set req.http.x-url = regsub(req.url, "HTML","html");

    # Remove redundant leading www
    set req.http.host = regsub(req.http.host,"^www\.","");

    # regsuball() is identical to regsub(), but replaces ALL
    # occurrences of the regular expression, not just the first.
    # This changes /foo/bar/foo/blatti to /bar/bar/bar/blatti.
    set req.url = regsuball(req.url, "foo","bar");

    # Parentheses can be used to group logic, and | as a logical
    # "or" operator.
    if (req.url ~ "^/(html|css|img)") {
        # Matches /html, /css and /img
    }

    # You can also reference groups in regsub(), and use [] as
    # character-classes. A character class is a list of characters.
    # If it starts with ^, it matches any characters EXCEPT the ones
    # listed.

    # The following extracts the "magic" get-argument from a URL and
    # puts it into a request header called "x-magic".
    # It does this by first ensuring we've passed the question-mark,
    # then skipping all characters up until "magic=", then starting
    # a group using parentheses. The content of the group is one or
    # more of any character except &.
    # + acts just like *, but where * means "0 or more", + means "1
    # or more".
    set req.http.x-magic = regsub(req.url, "\?.*magic=([^&]+)","\1");
}

10 Appendix X: License
The Copyright of this book belongs to Kristian Lyngstøl, except where stated otherwise. It is provided
under a Creative Commons "CC-BY-SA" license. A copy of that license is provided:

Attribution-ShareAlike 4.0 International

=======================================================================

Creative Commons Corporation ("Creative Commons") is not a law firm and


does not provide legal services or legal advice. Distribution of
Creative Commons public licenses does not create a lawyer-client or
other relationship. Creative Commons makes its licenses and related
information available on an "as-is" basis. Creative Commons gives no
warranties regarding its licenses, any material licensed under their
terms and conditions, or any related information. Creative Commons
disclaims all liability for damages resulting from their use to the
fullest extent possible.

Using Creative Commons Public Licenses

Creative Commons public licenses provide a standard set of terms and


conditions that creators and other rights holders may use to share
original works of authorship and other material subject to copyright
and certain other rights specified in the public license below. The
following considerations are for informational purposes only, are not
exhaustive, and do not form part of our licenses.

Considerations for licensors: Our public licenses are


intended for use by those authorized to give the public
permission to use material in ways otherwise restricted by
copyright and certain other rights. Our licenses are
irrevocable. Licensors should read and understand the terms
and conditions of the license they choose before applying it.
Licensors should also secure all rights necessary before
applying our licenses so that the public can reuse the
material as expected. Licensors should clearly mark any
material not subject to the license. This includes other CC-
licensed material, or material used under an exception or
limitation to copyright. More considerations for licensors:
wiki.creativecommons.org/Considerations_for_licensors

Considerations for the public: By using one of our public


licenses, a licensor grants the public permission to use the
licensed material under specified terms and conditions. If
the licensor's permission is not necessary for any reason--for
example, because of any applicable exception or limitation to
copyright--then that use is not regulated by the license. Our
licenses grant only permissions under copyright and certain
other rights that a licensor has authority to grant. Use of
the licensed material may still be restricted for other
reasons, including because others have copyright or other
rights in the material. A licensor may make special requests,
such as asking that all changes be marked or described.

Although not required by our licenses, you are encouraged to


respect those requests where reasonable. More_considerations
for the public:
wiki.creativecommons.org/Considerations_for_licensees

=======================================================================

Creative Commons Attribution-ShareAlike 4.0 International Public


License

By exercising the Licensed Rights (defined below), You accept and agree
to be bound by the terms and conditions of this Creative Commons
Attribution-ShareAlike 4.0 International Public License ("Public
License"). To the extent this Public License may be interpreted as a
contract, You are granted the Licensed Rights in consideration of Your
acceptance of these terms and conditions, and the Licensor grants You
such rights in consideration of benefits the Licensor receives from
making the Licensed Material available under these terms and
conditions.

Section 1 -- Definitions.

a. Adapted Material means material subject to Copyright and Similar


Rights that is derived from or based upon the Licensed Material
and in which the Licensed Material is translated, altered,
arranged, transformed, or otherwise modified in a manner requiring
permission under the Copyright and Similar Rights held by the
Licensor. For purposes of this Public License, where the Licensed
Material is a musical work, performance, or sound recording,
Adapted Material is always produced where the Licensed Material is
synched in timed relation with a moving image.

b. Adapter's License means the license You apply to Your Copyright


and Similar Rights in Your contributions to Adapted Material in
accordance with the terms and conditions of this Public License.

c. BY-SA Compatible License means a license listed at


creativecommons.org/compatiblelicenses, approved by Creative
Commons as essentially the equivalent of this Public License.

d. Copyright and Similar Rights means copyright and/or similar rights


closely related to copyright including, without limitation,
performance, broadcast, sound recording, and Sui Generis Database
Rights, without regard to how the rights are labeled or
categorized. For purposes of this Public License, the rights
specified in Section 2(b)(1)-(2) are not Copyright and Similar
Rights.

e. Effective Technological Measures means those measures that, in the


absence of proper authority, may not be circumvented under laws
fulfilling obligations under Article 11 of the WIPO Copyright
Treaty adopted on December 20, 1996, and/or similar international
agreements.

f. Exceptions and Limitations means fair use, fair dealing, and/or


any other exception or limitation to Copyright and Similar Rights
that applies to Your use of the Licensed Material.

g. License Elements means the license attributes listed in the name


of a Creative Commons Public License. The License Elements of this
Public License are Attribution and ShareAlike.

h. Licensed Material means the artistic or literary work, database,


or other material to which the Licensor applied this Public
License.

i. Licensed Rights means the rights granted to You subject to the


terms and conditions of this Public License, which are limited to
all Copyright and Similar Rights that apply to Your use of the
Licensed Material and that the Licensor has authority to license.

j. Licensor means the individual(s) or entity(ies) granting rights


under this Public License.

k. Share means to provide material to the public by any means or


process that requires permission under the Licensed Rights, such
as reproduction, public display, public performance, distribution,
dissemination, communication, or importation, and to make material
available to the public including in ways that members of the
public may access the material from a place and at a time
individually chosen by them.

l. Sui Generis Database Rights means rights other than copyright


resulting from Directive 96/9/EC of the European Parliament and of
the Council of 11 March 1996 on the legal protection of databases,
as amended and/or succeeded, as well as other essentially
equivalent rights anywhere in the world.

m. You means the individual or entity exercising the Licensed Rights


under this Public License. Your has a corresponding meaning.

Section 2 -- Scope.

a. License grant.

1. Subject to the terms and conditions of this Public License,


the Licensor hereby grants You a worldwide, royalty-free,
non-sublicensable, non-exclusive, irrevocable license to
exercise the Licensed Rights in the Licensed Material to:

a. reproduce and Share the Licensed Material, in whole or


in part; and

b. produce, reproduce, and Share Adapted Material.

2. Exceptions and Limitations. For the avoidance of doubt, where



Exceptions and Limitations apply to Your use, this Public


License does not apply, and You do not need to comply with
its terms and conditions.

3. Term. The term of this Public License is specified in Section


6(a).

4. Media and formats; technical modifications allowed. The


Licensor authorizes You to exercise the Licensed Rights in
all media and formats whether now known or hereafter created,
and to make technical modifications necessary to do so. The
Licensor waives and/or agrees not to assert any right or
authority to forbid You from making technical modifications
necessary to exercise the Licensed Rights, including
technical modifications necessary to circumvent Effective
Technological Measures. For purposes of this Public License,
simply making modifications authorized by this Section 2(a)
(4) never produces Adapted Material.

5. Downstream recipients.

a. Offer from the Licensor -- Licensed Material. Every


recipient of the Licensed Material automatically
receives an offer from the Licensor to exercise the
Licensed Rights under the terms and conditions of this
Public License.

b. Additional offer from the Licensor -- Adapted Material.


Every recipient of Adapted Material from You
automatically receives an offer from the Licensor to
exercise the Licensed Rights in the Adapted Material
under the conditions of the Adapter's License You apply.

c. No downstream restrictions. You may not offer or impose
any additional or different terms or conditions on, or
apply any Effective Technological Measures to, the
Licensed Material if doing so restricts exercise of the
Licensed Rights by any recipient of the Licensed
Material.

6. No endorsement. Nothing in this Public License constitutes or
may be construed as permission to assert or imply that You
are, or that Your use of the Licensed Material is, connected
with, or sponsored, endorsed, or granted official status by,
the Licensor or others designated to receive attribution as
provided in Section 3(a)(1)(A)(i).

b. Other rights.

1. Moral rights, such as the right of integrity, are not
licensed under this Public License, nor are publicity,
privacy, and/or other similar personality rights; however, to
the extent possible, the Licensor waives and/or agrees not to
assert any such rights held by the Licensor to the limited
extent necessary to allow You to exercise the Licensed
Rights, but not otherwise.

2. Patent and trademark rights are not licensed under this
Public License.

3. To the extent possible, the Licensor waives any right to
collect royalties from You for the exercise of the Licensed
Rights, whether directly or through a collecting society
under any voluntary or waivable statutory or compulsory
licensing scheme. In all other cases the Licensor expressly
reserves any right to collect such royalties.

Section 3 -- License Conditions.

Your exercise of the Licensed Rights is expressly made subject to the
following conditions.

a. Attribution.

1. If You Share the Licensed Material (including in modified
form), You must:

a. retain the following if it is supplied by the Licensor
with the Licensed Material:

i. identification of the creator(s) of the Licensed
Material and any others designated to receive
attribution, in any reasonable manner requested by
the Licensor (including by pseudonym if
designated);

ii. a copyright notice;

iii. a notice that refers to this Public License;

iv. a notice that refers to the disclaimer of
warranties;

v. a URI or hyperlink to the Licensed Material to the
extent reasonably practicable;

b. indicate if You modified the Licensed Material and
retain an indication of any previous modifications; and

c. indicate the Licensed Material is licensed under this
Public License, and include the text of, or the URI or
hyperlink to, this Public License.

2. You may satisfy the conditions in Section 3(a)(1) in any
reasonable manner based on the medium, means, and context in
which You Share the Licensed Material. For example, it may be
reasonable to satisfy the conditions by providing a URI or
hyperlink to a resource that includes the required
information.

3. If requested by the Licensor, You must remove any of the
information required by Section 3(a)(1)(A) to the extent
reasonably practicable.

b. ShareAlike.

In addition to the conditions in Section 3(a), if You Share
Adapted Material You produce, the following conditions also apply.

1. The Adapter's License You apply must be a Creative Commons
license with the same License Elements, this version or
later, or a BY-SA Compatible License.

2. You must include the text of, or the URI or hyperlink to, the
Adapter's License You apply. You may satisfy this condition
in any reasonable manner based on the medium, means, and
context in which You Share Adapted Material.

3. You may not offer or impose any additional or different terms
or conditions on, or apply any Effective Technological
Measures to, Adapted Material that restrict exercise of the
rights granted under the Adapter's License You apply.

Section 4 -- Sui Generis Database Rights.

Where the Licensed Rights include Sui Generis Database Rights that
apply to Your use of the Licensed Material:

a. for the avoidance of doubt, Section 2(a)(1) grants You the right
to extract, reuse, reproduce, and Share all or a substantial
portion of the contents of the database;

b. if You include all or a substantial portion of the database
contents in a database in which You have Sui Generis Database
Rights, then the database in which You have Sui Generis Database
Rights (but not its individual contents) is Adapted Material,
including for purposes of Section 3(b); and

c. You must comply with the conditions in Section 3(a) if You Share
all or a substantial portion of the contents of the database.

For the avoidance of doubt, this Section 4 supplements and does not
replace Your obligations under this Public License where the Licensed
Rights include other Copyright and Similar Rights.

Section 5 -- Disclaimer of Warranties and Limitation of Liability.

a. UNLESS OTHERWISE SEPARATELY UNDERTAKEN BY THE LICENSOR, TO THE
EXTENT POSSIBLE, THE LICENSOR OFFERS THE LICENSED MATERIAL AS-IS
AND AS-AVAILABLE, AND MAKES NO REPRESENTATIONS OR WARRANTIES OF
ANY KIND CONCERNING THE LICENSED MATERIAL, WHETHER EXPRESS,
IMPLIED, STATUTORY, OR OTHER. THIS INCLUDES, WITHOUT LIMITATION,
WARRANTIES OF TITLE, MERCHANTABILITY, FITNESS FOR A PARTICULAR
PURPOSE, NON-INFRINGEMENT, ABSENCE OF LATENT OR OTHER DEFECTS,
ACCURACY, OR THE PRESENCE OR ABSENCE OF ERRORS, WHETHER OR NOT
KNOWN OR DISCOVERABLE. WHERE DISCLAIMERS OF WARRANTIES ARE NOT
ALLOWED IN FULL OR IN PART, THIS DISCLAIMER MAY NOT APPLY TO YOU.

b. TO THE EXTENT POSSIBLE, IN NO EVENT WILL THE LICENSOR BE LIABLE
TO YOU ON ANY LEGAL THEORY (INCLUDING, WITHOUT LIMITATION,
NEGLIGENCE) OR OTHERWISE FOR ANY DIRECT, SPECIAL, INDIRECT,
INCIDENTAL, CONSEQUENTIAL, PUNITIVE, EXEMPLARY, OR OTHER LOSSES,
COSTS, EXPENSES, OR DAMAGES ARISING OUT OF THIS PUBLIC LICENSE OR
USE OF THE LICENSED MATERIAL, EVEN IF THE LICENSOR HAS BEEN
ADVISED OF THE POSSIBILITY OF SUCH LOSSES, COSTS, EXPENSES, OR
DAMAGES. WHERE A LIMITATION OF LIABILITY IS NOT ALLOWED IN FULL OR
IN PART, THIS LIMITATION MAY NOT APPLY TO YOU.

c. The disclaimer of warranties and limitation of liability provided
above shall be interpreted in a manner that, to the extent
possible, most closely approximates an absolute disclaimer and
waiver of all liability.

Section 6 -- Term and Termination.

a. This Public License applies for the term of the Copyright and
Similar Rights licensed here. However, if You fail to comply with
this Public License, then Your rights under this Public License
terminate automatically.

b. Where Your right to use the Licensed Material has terminated under
Section 6(a), it reinstates:

1. automatically as of the date the violation is cured, provided
it is cured within 30 days of Your discovery of the
violation; or

2. upon express reinstatement by the Licensor.

For the avoidance of doubt, this Section 6(b) does not affect any
right the Licensor may have to seek remedies for Your violations
of this Public License.

c. For the avoidance of doubt, the Licensor may also offer the
Licensed Material under separate terms or conditions or stop
distributing the Licensed Material at any time; however, doing so
will not terminate this Public License.

d. Sections 1, 5, 6, 7, and 8 survive termination of this Public
License.

Section 7 -- Other Terms and Conditions.

a. The Licensor shall not be bound by any additional or different
terms or conditions communicated by You unless expressly agreed.

b. Any arrangements, understandings, or agreements regarding the
Licensed Material not stated herein are separate from and
independent of the terms and conditions of this Public License.

Section 8 -- Interpretation.

a. For the avoidance of doubt, this Public License does not, and
shall not be interpreted to, reduce, limit, restrict, or impose
conditions on any use of the Licensed Material that could lawfully
be made without permission under this Public License.

b. To the extent possible, if any provision of this Public License is
deemed unenforceable, it shall be automatically reformed to the
minimum extent necessary to make it enforceable. If the provision
cannot be reformed, it shall be severed from this Public License
without affecting the enforceability of the remaining terms and
conditions.

c. No term or condition of this Public License will be waived and no
failure to comply consented to unless expressly agreed to by the
Licensor.

d. Nothing in this Public License constitutes or may be interpreted
as a limitation upon, or waiver of, any privileges and immunities
that apply to the Licensor or You, including from the legal
processes of any jurisdiction or authority.

=======================================================================

Creative Commons is not a party to its public
licenses. Notwithstanding, Creative Commons may elect to apply one of
its public licenses to material it publishes and in those instances
will be considered the “Licensor.” The text of the Creative Commons
public licenses is dedicated to the public domain under the CC0 Public
Domain Dedication. Except for the limited purpose of indicating that
material is shared under a Creative Commons public license or as
otherwise permitted by the Creative Commons policies published at
creativecommons.org/policies, Creative Commons does not authorize the
use of the trademark "Creative Commons" or any other trademark or logo
of Creative Commons without its prior written consent including,
without limitation, in connection with any unauthorized modifications
to any of its public licenses or any other arrangements,
understandings, or agreements concerning use of licensed material. For
the avoidance of doubt, this paragraph does not form part of the
public licenses.

Creative Commons may be contacted at creativecommons.org.
