The LPIC-2 Exam Prep
Chapter 8
This topic has a weight of 11 points and contains the following objectives:
Objective 208.1; Basic Apache Configuration (4 points) Candidates should be able to install and configure a web server. This
objective includes monitoring the server’s load and performance, restricting client user access, configuring support for
scripting languages as modules and setting up client user authentication. Also included is configuring server options to
restrict usage of resources. Candidates should be able to configure a web server to use virtual hosts and customize file
access.
Objective 208.2; Apache configuration for HTTPS (3 points) Candidates should be able to configure a web server to provide
HTTPS.
Objective 208.3; Implementing Squid as a caching proxy (2 points) Candidates should be able to install and configure a proxy
server, including access policies, authentication and resource usage.
Objective 208.4; Implementing Nginx as a web server and a reverse proxy (2 points) Candidates should be able to install
and configure a reverse proxy server, Nginx. Basic configuration of Nginx as a HTTP server is included.
Candidates should be able to install and configure a web server. This objective includes monitoring the server’s load and perfor-
mance, restricting client user access, configuring support for scripting languages as modules and setting up client user authenti-
cation. Also included is configuring server options to restrict usage of resources. Candidates should be able to configure a web
server to use virtual hosts and customise file access.
• access.log or access_log
• error.log or error_log
• .htaccess
• httpd.conf
• mod_auth
• mod_authn_file
• mod_access_compat
• htpasswd
• AuthUserFile, AuthGroupFile
• apache2ctl
• httpd
Resources: LinuxRef06; LPIC2sybex2nd; Coar00; Poet99; Wilson00; Engelschall00; PerlRef01; Krause01; apachedoc; apache24upgrade; the man pages for the various commands.
Building Apache from source was routinely done when Apache emerged. Nowadays Apache is available in binary format for
most modern (post 2005) Linux distributions. Installing programs from source is already covered in 206.1. Therefore, we will
concentrate on working with rpm and apt package managers and tools during this chapter. Do not underestimate the importance
of building Apache from source though. Depending on requirements and (lack of) availability, it might still be necessary to
compile Apache and Apache modules from source. The Apache binary httpd can be invoked with certain command-line options
that affect the behaviour of the server. But in general Apache is started by other scripts that serve as a wrapper for httpd. These
scripts should take care of passing required flags to httpd. The behaviour of the server is configured by setting various options
called directives. These directives are declared in configuration files. The location of configuration files and how
they are organized varies. Red Hat and similar distributions have their configuration files in the /etc/httpd/conf directory.
Other locations which are or have been used are /etc/apache/config, /etc/httpd/config and /etc/apache2.
Depending on your Linux distribution and enabled repositories, your distribution may come with Apache 2.2 or Apache 2.4
or both. Apache 2.0 does fall within the LPIC-2 Apache 2.x scope due to its name, but Apache 2.0 is no longer maintained.
It is therefore not recommended to use Apache 2.0. Instead, the Apache Foundation recommends using Apache 2.4. As a
Linux administrator, you may however still encounter Apache 2.0 on servers. It is therefore recommended to be familiar
with the configuration differences between the different versions. The Apache Foundation does provide guidance: via
https://httpd.apache.org/docs/ upgrade documents can be accessed that address the necessary steps when upgrading from Apache
2.0 to 2.2, and from Apache 2.2 to 2.4.
It is important to distinguish between (global) directives that affect the Apache server processes, and options that affect a specific
component of the Apache server, i.e. an option that only affects a specific website. The way the configuration files are laid
out can often be a clue as to where which settings are configured. Despite this presumed obviousness, it is also important not to
make assumptions. Always familiarize yourself with all the configured options. When in doubt about a specific option, use the
documentation or a web search to find out more and consider whether the option is configured appropriately.
On many (Red Hat based) distributions the main Apache configuration file is httpd.conf, other (Debian based) distributions
favour the apache2.conf filename. Depending on your distribution and installation it might be one big file or a small generic
one with references to other configuration files via Include and/or IncludeOptional statements. The difference between
these two directives lies in the optional part. If Apache is configured to Include all *.conf files from a certain directory, there
has to be at least one file that matches that pattern to include. Otherwise, the Apache server will fail to start. As an alternative, the
IncludeOptional directive can be used to include configuration files if they are present and accessible. The Apache main
configuration file can configure generic settings like servername, listening port(s) and which IP addresses these ports should be
bound to. There may also be a separate ports.conf configuration file though, so always follow the Include directives and
familiarize yourself with the contents of all configuration files. The user and group Apache should run as can also be configured
from the main configuration file. Apache can be configured to switch to these accounts after startup: the software is started
as the root user, but then switches to a dedicated “www”, “httpd” or “apache” user to adhere to the principle of least privilege. There
are also various directives to influence the way Apache serves files from its document tree. For example, there are Directory
directives that control whether PHP files located within them may be executed. The default configuration file is meant to be
self explanatory and contains a lot of valuable information. In regards to the LPIC-2 exam, you are required to be familiar with
the most common Apache directives. We shall cover some of those in the sections to come. At the time of this writing, Apache
2.4 is the latest stable version and the recommended version to use according to its distributor, the Apache Foundation. Where
applicable, this book will try to point out the differences between the various versions.
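A minimal sketch of such a main configuration file (the directory names behind the Include statements are illustrative; your distribution will use its own layout):

```apache
# Generic settings in the main configuration file
ServerName www.example.com
Listen 80

# At least one file must match this pattern, or the server fails to start
Include conf.d/*.conf

# Matching files are included only if they are present and accessible
IncludeOptional sites-enabled/*.conf
```

The distinction between Include and IncludeOptional only matters when a wildcard matches nothing: Include then aborts the startup, IncludeOptional silently continues.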
An additional method to set options for a subdivision of the document tree is by means of an .htaccess file. For security
reasons you will also need to enable the use of .htaccess files in the main configuration file by setting the AllowOverride
directive for that Directory context. All options in an .htaccess file influence files in the directory and the ones below it,
unless they are overridden by another .htaccess file or directives in the main configuration file.
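For example, enabling .htaccess processing for a directory tree might look like this (the path is chosen for illustration):

```apache
# Main configuration: allow .htaccess files below this directory
# to override authentication-related directives
<Directory "/var/www/html/private">
    AllowOverride AuthConfig
</Directory>
```

A .htaccess file placed in that directory, or any subdirectory of it, may then contain authentication directives such as AuthType, AuthUserFile and Require.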
Modularity
Apache has a modular source code architecture. You can custom-build a server with only the modules you really want. Many
modules are available on the Internet and you could also write your own.
Modules are compiled objects written in C. If you have questions about the development of Apache modules, join the Apache-
modules mailing list at http://httpd.apache.org/lists.html. Remember to do your homework first: research past messages and
check all the documentation on the Apache site before posting questions.
Special modules exist for the use of interpreted languages like Perl and Tcl. They allow Apache to run interpreted scripts natively
without having to reload an interpreter every time a script runs (e.g. mod_perl and mod_tcl). These modules include an API
to allow for modules written in an interpreted (scripted) language.
The modular structure of Apache’s source code should not be confused with the functionality of run-time loading of Apache
modules. Run-time modules are loaded after the core functionality of Apache has started and are a relatively new feature. In
older versions, to use the functionality of a module, it needed to be compiled in during the build phase. Current implementations
of Apache are capable of run-time module loading. The section on DSO has more details.
Most modern Unix derivatives have a mechanism for the on-demand linking and loading of so-called Dynamic Shared Objects
(DSO). This is a way to load a special program into the address space of an executable at run-time. This can usually be done in
two ways: either automatically by a system program called ld.so when the executable is started, or manually from within the
executing program with the system calls dlopen() and dlsym().
In the latter method the DSOs are usually called shared objects or DSO files and can be named with an arbitrary extension. By
convention the extension .so is used. These files are usually installed in a program-specific directory. The executable program
manually loads the DSO at run-time into its address space via dlopen().
Tip
How to run Apache-SSL as a shareable (DSO) module: Install the appropriate package:
packagemanager installcommand modulename
Depending on your distribution, the configuration file(s) might or might not have been adjusted accordingly. Always check for
the existence of a LoadModule line in one of the configuration files:
LoadModule apache_ssl_module modules/libssl.so
This line might belong in the Apache main configuration file, or one of the included configuration files. A construction
that has been receiving much support lately is the use of separate modules-available and modules-enabled
directories. These directories are subdirectories inside the Apache configuration directory. Modules are installed in the
modules-available directory, and an Include reference is made to a symbolic link inside the modules-enabled
directory. This symbolic link then points back to the module. The Include reference might be a wildcard, including all files
from a certain directory.
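The mechanism can be sketched in a scratch directory (all names here are illustrative; real systems use e.g. /etc/apache2 and their own file naming):

```shell
# Sketch of the modules-available/modules-enabled mechanism in a
# throwaway directory; directory and file names are made up
confdir=$(mktemp -d)
mkdir "$confdir/modules-available" "$confdir/modules-enabled"

# the module's LoadModule line lives in modules-available
echo 'LoadModule ssl_module modules/mod_ssl.so' \
    > "$confdir/modules-available/ssl.load"

# "enabling" the module is simply creating the symbolic link
ln -s "$confdir/modules-available/ssl.load" "$confdir/modules-enabled/ssl.load"

# Apache would then Include modules-enabled/* from its main configuration
cat "$confdir/modules-enabled/ssl.load"
```

Disabling the module again is just removing the symbolic link; the file in modules-available stays untouched.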
Another construction is similar, but uses a conf.modules.d directory inside the Apache configuration directory, containing
configuration files that load the modules. The modules directory itself may in fact be a symbolic link, pointing to a directory
inside the Apache program directory somewhere else on the filesystem. An example from a Red Hat based host:
Include conf.modules.d/*.conf
Again, the implementations you could encounter might differ significantly from each other. Various aspects such as the Linux
distribution used, the Apache version installed or whether Apache is installed from packages or source may influence
the way Apache is implemented. Not to mention the administrator on duty. Important to remember is that Apache often uses
configuration files that may be nested, but that there will always be one main Apache configuration file at the top of the
hierarchy.
Tip
To see whether your version of Apache supports DSOs, execute the command httpd -l which lists the modules that have been
compiled into Apache. If mod_so.c appears in the list of modules then your Apache server can make use of dynamic modules.
apxs is a support tool, available from Apache 1.3 onwards, which can be used to build an Apache module as a DSO outside
the Apache source-tree. It knows the platform dependent build parameters for making DSO files and provides an easy way to run
the build commands with them.
An Open Source system that can be used to periodically monitor the page performance of web-servers is Cricket. Cricket can be easily set up
to record page-load times, and it has a web-based grapher that will generate charts to display the data in several formats. It is
based on RRDtool whose ancestor is MRTG (short for “Multi-Router Traffic Grapher”). RRDtool (Round Robin Data Tool) is a
package that collects data in “round robin” databases; each data file is fixed in size so that running Cricket does not slowly fill up
your disks. The database tables are sized when created and do not grow larger over time. As the data ages, it is averaged.
Lack of available RAM may result in memory swapping. A swapping webserver will perform badly, especially if the disk
subsystem is not up to par, causing users to hit stop and reload and thereby further increasing the load. You can use the MaxClients
setting to limit the number of children your server may spawn, hence reducing its memory footprint. It is advised to grep through the
Apache main configuration file for all directives that start with Min or Max. These settings define the MINimum and MAXimum
boundaries for each affected setting. The default values should provide a sensible balance between server load when idle on one
hand, and the ability to handle heavy load on the other. As a chain is only as strong as its weakest link, the underlying
system should be adequately configured to handle the expected load. The LPIC-2 exam focuses more on the detection of these
performance bottlenecks in chapter 200.1.
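For instance, a prefork MPM section with such Min/Max boundaries might look like the following (the values are illustrative, not tuning advice; note that Apache 2.4 renamed MaxClients to MaxRequestWorkers):

```apache
# Illustrative prefork MPM boundaries; adjust to available RAM.
# In Apache 2.4 MaxClients has been renamed to MaxRequestWorkers.
<IfModule mpm_prefork_module>
    StartServers            5
    MinSpareServers         5
    MaxSpareServers        10
    MaxClients            150
    MaxRequestsPerChild  1000
</IfModule>
```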
The access_log contains a generic overview of page requests for your web-server. The format of the access log is highly
configurable. The format is specified using a format string that looks much like a C-style printf format string. A typical
configuration for the access log might look like the following:
LogFormat "%h %l %u %t \"%r\" %>s %b" common
CustomLog logs/access_log common
This defines the nickname common and associates it with a particular log format string. The format as shown is known as the
Common Log Format (CLF). It is a standard format produced by many web servers and can be read by most log analysis
programs. Log file entries produced in CLF will look similar to this line:
127.0.0.1 - bob [10/Oct/2000:13:55:36 -0100] "GET /apache_pb.gif HTTP/1.0" 200 2326
The server error log, whose name and location is set by the ErrorLog directive, is a very important log file. This is the file to
which Apache httpd will send diagnostic information and record any errors that it encounters in processing requests. It is a good
place to look when a problem occurs starting the server or while operating the server. It will often contain details of what went
wrong and how to fix it.
The error log is usually written to a file (typically error_log on Unix systems and error.log on Windows). On Unix systems it is
also possible to have the server send errors to syslog or pipe them to a program.
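A few alternative ErrorLog configurations, shown side by side; only one would be active at a time, and the syslog facility and program path are examples:

```apache
# Written to a file; relative paths are taken relative to ServerRoot
ErrorLog logs/error_log

# Sent to syslog, using the local1 facility:
# ErrorLog syslog:local1

# Piped to a (hypothetical) program that processes each line:
# ErrorLog "|/usr/local/bin/log-handler"
```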
The format of the error log is relatively free-form and descriptive. But there is certain information that is contained in most error
log entries. For example, here is a typical message:
[Wed Oct 11 14:32:52 2000] [error] [client 127.0.0.1] client denied by server \
configuration: /export/home/live/ap/htdocs/test
The first item in the log entry is the date and time of the message. The second item lists the severity of the error being reported.
The LogLevel directive is used to control the types of errors that are sent to the error log by restricting the severity level. The
third item gives the IP address of the client that generated the error. Beyond that is the message itself, which in this case indicates
that the server has been configured to deny the client access. The server reports the file-system path (as opposed to the web path)
of the requested document.
A very wide variety of different messages can appear in the error log. Most look similar to the example above. The error log will
also contain debugging output from CGI scripts. Any information written to stderr by a CGI script will be copied directly to the
error log.
It is not possible to customize the error log by adding or removing information. However, error log entries dealing with particular
requests have corresponding entries in the access log. For example, the above example entry corresponds to an access log entry
with status code 403. Since it is possible to customize the access log, you can obtain more information about error conditions
using that log file.
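For example, assuming the Common Log Format shown earlier, the status code is the ninth whitespace-separated field, so the access log entries corresponding to "client denied" errors can be extracted like this (sample data made up):

```shell
# Create a sample access log with two CLF entries (illustrative data)
log=$(mktemp)
printf '%s\n' \
  '127.0.0.1 - - [11/Oct/2000:14:32:52 -0100] "GET /test HTTP/1.0" 403 211' \
  '127.0.0.1 - bob [10/Oct/2000:13:55:36 -0100] "GET /apache_pb.gif HTTP/1.0" 200 2326' \
  > "$log"

# In CLF the status code is the ninth whitespace-separated field
awk '$9 == 403' "$log"
```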
During testing, it is often useful to continuously monitor the error log for any problems. On Unix systems, you can accomplish
this using:
tail -f error_log
Knowing how to customize Apache logging may prove to be a very useful skill. Manually reviewing Apache logs is not for the
faint of heart. For a low-traffic server this may still be doable, but looking for information by sifting through the logs of
a busy server that serves multiple websites can become a very intense text-file-manipulation exercise. This creates a paradox:
with little to no logging, hardly any input is available when looking for the cause of a problem; with very elaborate logging in
place, the information may be overwhelming. For this reason, Apache logs are often interpreted by external facilities. The logs
are either sent to or read by a system that has the capability to visualize statistics and recognize patterns. To ensure the provided
logging is adequate, customizing the Apache logging first may be necessary.
Apache 2.3.6 and later provide the possibility to enable different kinds of LogLevel configurations on a per-module or
per-directory basis. The Apache documentation regarding the LogLevel directive is outstanding and there is not much we could
add to that.
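A brief sketch of what such a configuration could look like (module name, path and levels chosen for illustration):

```apache
# Global threshold is warn, but mod_ssl logs at info;
# one directory tree is debugged in detail
LogLevel warn ssl:info
<Directory "/var/www/html/app">
    LogLevel debug
</Directory>
```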
Discretionary Access Control (DAC) A system that employs DAC allows users to set object permissions themselves. They can
change these at their discretion.
Mandatory Access Controls (MAC) A system that employs MAC has all its objects (e.g., files) under strict control of a system
administrator. Users are not allowed to set any permissions themselves.
Apache takes a liberal stance and defines discretionary controls to be controls based on usernames and passwords, and mandatory
controls to be based on static or quasi-static data like the IP address of the requesting client.
Apache uses modules to authenticate and authorise users. First of all, the difference between authentication and authorization
should be clear. Authentication is the process in which a user should validate their identity. This is the who part. Authorization
is the process of deciding who is allowed to do what. Authorization either allows or denies requests made to the Apache server.
Authorization depends on authentication to make these decisions.
The Apache modules that serve the purpose of autheNtication, follow the naming convention of mod_authn_*. The mod-
ules that serve the purpose of authoriZation, follow the convention of mod_authz_*. An exception to this rule is the mod_
authnz_ldap module. As you might have guessed, due to the nature of LDAP this module can aid in both authentication as
well as authorization.
The location of these modules on the filesystem may vary. Most distributions create a modules, modules.d or modules-available
directory within the Apache configuration directory. This directory can very well be a symbolic link to a directory somewhere
else on the filesystem. This can be determined by invoking pwd -P or ls -ld from within the modules directory as shown by the
following example:
[user@redhatbased /etc/httpd]$ pwd -P
/usr/lib64/httpd/modules
In the example above, the symbolic link /etc/httpd/modules provides for easy reference to the modules from within
Apache configuration files. Apache modules are loaded using the LoadModule directive. This directive expects the path to the
module to be relative to the Apache configuration directory declared by the ServerRoot directive.
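For example, assuming the Red Hat style ServerRoot shown earlier:

```apache
# With this ServerRoot, the relative LoadModule path resolves to
# /etc/httpd/modules/mod_auth_basic.so
ServerRoot "/etc/httpd"
LoadModule auth_basic_module modules/mod_auth_basic.so
```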
In general, modules will use some form of database to store and retrieve credential data. The mod_authn_file module for
instance uses text files, whereas mod_authn_dbm employs a Unix DBM database.
Below is a list of some modules that are included as part of the standard Apache distribution.
mod_authn_file (DAC) This is the basis for most Apache security modules; it uses ordinary text files for the authentication
database.
mod_access (MAC) This used to be the only module in the standard Apache distribution which applies what Apache defines
as mandatory controls. It used to allow you to list hosts, domains, and/or IP addresses or networks that were permitted
or denied access to documents. As of Apache 2.4, this module is no longer used. Apache 2.4 and newer use an updated
authentication and authorization model. This new model also comes with new modules, new directives and new syntax.
The mod_access module is still an LPIC-2 exam objective, so the pre-2.4 syntax should still be familiar to you. In
order to aid the migration towards Apache 2.4, a module called mod_access_compat ships with Apache 2.4. This
module serves the purpose of still accepting the pre-2.4 syntax on Apache 2.4 servers. If you encounter mod_access
related errors after upgrading to Apache 2.4 from a previous version, make sure the Apache 2.4 configuration loads this
compatibility module with a line similar to:
LoadModule access_compat_module modules/mod_access_compat.so
mod_authn_anon (DAC) This module mimics the behaviour of anonymous FTP. Rather than having a database of valid cre-
dentials, it recognizes a list of valid usernames (i.e., the way an FTP server recognizes “ftp” and “anonymous”) and grants
access to any of those with virtually any password. This module is more useful for logging access to resources and keeping
robots out than it is for actual access control.
mod_authn_dbm (DAC) Like mod_auth_db, except that credentials are stored in a Unix DBM file.
mod_auth_digest (DAC) This module implements HTTP Digest Authentication (RFC 2617), which used to provide a more
secure alternative to the mod_auth_basic functionality. The explanation that follows is nice to know but outdated. The
whole point of digest authentication was to prevent user credentials from travelling over the wire via unencrypted HTTP. The
hashing algorithms used by the digest module are however seriously outdated. Using digest authentication instead of basic
HTTP authentication does not offer as many advantages in terms of security as the use of HTTPS would. The following
documentation page provides more detail: http://httpd.apache.org/docs/2.4/mod/mod_auth_digest.html.
After receiving a request and a user name, the server will challenge the client by sending a nonce. The contents of a
nonce can be any (preferably base 64 encoded) string, and the server may use the nonce to prevent replay attacks. A nonce
might, for example, be constructed using an encrypted timestamp within a resolution of a minute, i.e. ’201611291619’.
The timestamp (and maybe other static data identifying the requested URI) might be encrypted using a private key known
only to the server.
Upon receipt of the nonce the client calculates a hash (by default an MD5 checksum) of the received nonce, the username,
the password, the HTTP method, and the requested URI and sends the result back to the server. The server will gather the
same data from session data and password data retrieved from a local digest database. To reconstruct the nonce the server
will try twice: the first try will use the current clocktime, the second try (if necessary) will use the current clocktime minus
one minute. One of the tries should give the exact same hash the client calculated. If so, access to the page will be granted.
This restricts validity of the challenge to one minute and prevents replay attacks.
Please note that the contents of the nonce can be chosen by the server at will. The example provided is one of many
possibilities. Like with mod_auth, the credentials are stored in a text file (the digest database). Digest database files are
managed with the htdigest tool. Please refer to the module documentation for more details.
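The calculation described above can be sketched using md5sum (a simplified RFC 2617 exchange without the qop extensions; the username, realm, password, URI and nonce are all made up):

```shell
# HA1 = MD5(username:realm:password) -- this is what htdigest stores
ha1=$(printf '%s' 'bob:Protected:secret' | md5sum | cut -d' ' -f1)

# HA2 = MD5(method:URI)
ha2=$(printf '%s' 'GET:/index.html' | md5sum | cut -d' ' -f1)

# response = MD5(HA1:nonce:HA2) -- this is what the client sends back;
# the server repeats the same calculation and compares the results
nonce='201611291619'
response=$(printf '%s' "$ha1:$nonce:$ha2" | md5sum | cut -d' ' -f1)
echo "$response"
```

Because only the hash travels over the wire, the password itself is never sent; the weakness lies in the dated MD5 algorithm, as noted above.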
mod_authz_host The mod_authz_host module may be used to Require a certain source of request towards Apache.
The mod_authz_host module is quite flexible about the arguments provided. Due to the name of the module, it may
seem logical to provide a hostname. While this certainly works, it may not be the preferred choice. Not only does this
module need to perform a forward DNS lookup on the provided hostname to resolve it to a numerical IP, the module is
also configured to perform a reverse DNS lookup on the resolved numerical IP after the forward lookup is performed.
Providing a hostname thus leads to at least two DNS lookups for every affected webserver request. And if the reverse
DNS result differs from the provided hostname, the request will be denied despite what the configuration may allow. To
circumvent this requirement regarding forward and reverse DNS records matching, the forward-dns option may be
used when providing a hostname. Luckily, mod_authz_host not only accepts hostnames as an argument. It can also
handle (partial) IP addresses, both IPv4 and IPv6, and CIDR style notations. There is also an argument available called
local. This will translate to the 127.0.0.0/8 or ::1 loopback addresses as well as the configured IP addresses of
the server. This setting may come in handy when restricting connections to the local host. Because of the liberal
way that IP addresses are interpreted, it is recommended to be as explicit as possible when using this module. For instance,
all of the following is regarded as valid input and will be interpreted by the rules that apply:
Require host snow.nl
Require ip 10.6.6
Require ip 172
One of the noteworthy differences between Apache 2.2 and 2.4 lies in the directives used for authorization. The authorization
functionality is provided by Apache mod_authz_* modules. Where previous versions of Apache used the Order, Allow
from, Deny from and Satisfy directives, Apache 2.4 uses the Require directive with providers called all, env, host and ip. These new
directives have a significant impact on the syntax of configuration files. In order to aid the transition towards Apache 2.4, the
mod_access_compat module can still interpret the previously used authorization directives. This module has to be explicitly
enabled though. In doing so, backwards compatibility with the previous authorization configuration directives is maintained.
The current authorization directives provide the possibility of a more granular configuration in regards to who is authorized to
do what. This added granularity mostly comes from the availability of the Require directive. This directive could already
be used before Apache 2.4 for authentication purposes. Since Apache 2.4 though, this directive can also be interpreted by the
authorization modules.
The following example compares the old and new syntax, while providing the same functionality.
First, the pre-2.4 style:
<Directory /lpic2bookdev>
Order deny,allow
Deny from all
Allow from 10.6.6.0/24
Require group employees
Satisfy any
</Directory>
And now the same codeblock, but using the Apache 2.4 style syntax:
<Directory /lpic2bookdev>
<RequireAny>
Require ip 10.6.6.0/24
Require group employees
</RequireAny>
</Directory>
The benefit of the new syntax is all about efficiency. By accomplishing the same functionality with fewer lines, the processing
of those lines will be handled more effectively by both humans and computers. The computers benefit from spending fewer
processing cycles while accomplishing the same result. Humans benefit from a shorter configuration section. Long configurations
are more prone to contain errors that may be overlooked. By creating sections within configuration files using the RequireAll,
RequireAny, and RequireNone directives, these configurations can contain granular rules while at the same time preserving
their readability.
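As an illustration of such nesting (the path and addresses are made up), the following allows the 10.6.6.0/24 network, but never the single host 10.6.6.66:

```apache
# Allow the office network except one specific host
<Directory "/var/www/html/intranet">
    <RequireAll>
        Require ip 10.6.6.0/24
        <RequireNone>
            Require ip 10.6.6.66
        </RequireNone>
    </RequireAll>
</Directory>
```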
Another 2.4 change that is worth mentioning has to do with the LPIC-2 exam objective regarding the mod_auth module.
Starting with Apache 2.1, the functionality of the mod_auth module has been superseded by more specific modules. One of
these modules, mod_authn_file, now provides the functionality that was previously offered by mod_auth. mod_authn_
file allows for the use of a file that holds usernames and passwords as part of the authorization process. This file can be created
and the contents may be maintained by the htpasswd utility. When using mod_auth_digest instead of mod_auth_basic,
the htdigest utility should be used instead. This book will focus on the mod_auth_basic option. The htpasswd -c option
will create a file with the provided argument as a filename during creation of a username and password pair. htpasswd allows for
passwords to be hashed using the crypt, MD5 or SHA-1 algorithms. As of Apache 2.4.4, it is also possible to use bcrypt as the password
hashing algorithm. Plaintext passwords can also be stored using the htpasswd -p option, but these will only work when Apache
is hosted on Netware or MS Windows platforms. The crypt algorithm used to be the htpasswd default algorithm up to Apache
version 2.2.17, but is considered insecure: crypt limits the provided password to the first eight characters, and every part of the
password string from the ninth character onwards is ignored. Crypt password strings are subject to fast brute-force cracking and
therefore pose a considerable security risk. The use of the crypt algorithm should be avoided whenever possible. Instead, the
bcrypt algorithm should be considered when available. On a system with Apache 2.4.4 or later, the following syntax can be used
to create a new password file htpasswdfile, supply it with the user “bob” and set the password for the user account using the
bcrypt algorithm:
htpasswd -c -B /path/outside/document/root/htpasswdfile bob
The system will ask for the new password twice. To update this file anytime later by adding the user “alice”, the -c option can
be omitted to prevent the file from being rewritten:
htpasswd -B /path/outside/document/root/htpasswdfile alice
Using the bcrypt algorithm with htpasswd also enables the use of the -C option. Using this option, the computing time used to
calculate the password hash may be influenced. By default, the system uses a setting of 5. A value between 4 and 31 may be
provided. Depending on the available resources, a value up to 18 should be acceptable to generate whilst increasing security. To
add the user eve to the existing htpasswdfile while increasing the computing time to a value of 18, the following syntax may
be used:
htpasswd -B -C18 /path/outside/document/root/htpasswdfile eve
In the examples above, it is suggested that the password file is created outside of the webserver document tree. Otherwise, it
could be possible for clients to download the password file.
To use the generated password file for authentication purposes, Apache has to be aware of the htpasswdfile file. This can be
accomplished by defining the AuthUserFile directive. This directive may be defined in either the Apache configuration files,
or in a separate .htaccess file. That .htaccess file should be located inside the directory of the document root it should
represent. The Apache config responsible for that document root should have the AllowOverride directive specified. This
way Apache will override directives from its configuration for directories that have .htaccess documents in them. The syntax
for the .htaccess documents is the same as for Apache configuration files. A code block to use for user authentication could
look as follows:
<Directory /web/document/root>
AuthName "Authentication Required"
AuthType Basic
AuthUserFile /path/outside/document/root/htpasswdfile
Require valid-user
</Directory>
Consult the contents of your Apache modules directory for the presence of mod_auth* files. There are multiple authentication
and authorization modules available. Each has its own purpose, and some depend on each other. Each module adds functionality
within Apache. This functionality can be addressed by using specific module-specific directives. Refer to the Apache docu-
mentation website https://httpd.apache.org/docs/2.4/mod/ for detailed usage options regarding the modules available for Apache
2.4.
Apache security modules are configured by configuration directives. These are read from either the centralized configuration files (usually found under the /etc/ directory) or from decentralized .htaccess files. The latter are mostly used to restrict access to directories and are placed in the top-level directory of the tree they help to protect. For example, authentication modules
will read the location of their databases using the AuthUserFile or AuthDBMGroupFile directives.
Centralized configuration This is an example of a configuration as it might occur in a centralized configuration file:
<Directory /home/johnson/public_html>
<Files foo.bar>
AuthName "Foo for Thought"
AuthType Basic
AuthUserFile /home/johnson/foo.htpasswd
Require valid-user
</Files>
</Directory>
The resource being protected is “any file named foo.bar” in the /home/johnson/public_html directory or any underlying
subdirectory. Likewise, the file specifies who is authorized to access foo.bar: any user that has credentials in the /home/johnson/foo.htpasswd file.
Decentralized configuration The alternate approach is to place a .htaccess file in the top level directory of any document
tree that needs access protection. Note that you must set the directive AllowOverride in the central configuration to enable
this.
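The central configuration change this refers to might look as follows (the directory path is illustrative; AuthConfig is the override class that permits authentication directives in .htaccess files):

<Directory /web/document/root>
AllowOverride AuthConfig
</Directory>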
The first section of .htaccess determines which authentication type should be used. It can contain the name of the password
or group file to be used, e.g.:
AuthUserFile {path to passwd file}
AuthGroupFile {path to group file}
AuthName {title for dialog box}
AuthType Basic
The second section of .htaccess ensures that only user {username} can access (GET) the current directory:
<Limit GET>
require user {username}
</Limit>
The Limit section can contain other directives to restrict access to certain IP addresses or to a group of users.
The following would permit any client on the local network (IP addresses 10.*.*.*) to access the foo.html page and require a
username and password for anyone else:
<Files foo.html>
Order Deny,Allow
Deny from All
Allow from 10.0.0.0/8
AuthName "Insiders Only"
AuthType Basic
AuthUserFile /usr/local/web/apache/.htpasswd-foo
Require valid-user
Satisfy Any
</Files>
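The Files block above uses the older Order/Deny/Allow syntax, which in Apache 2.4 is only available through mod_access_compat. A sketch of the equivalent policy in native Apache 2.4 syntax, using mod_authz_core directives (same paths as above), could look like this:

<Files foo.html>
AuthName "Insiders Only"
AuthType Basic
AuthUserFile /usr/local/web/apache/.htpasswd-foo
<RequireAny>
Require ip 10.0.0.0/8
Require valid-user
</RequireAny>
</Files>

Here the RequireAny container expresses the same "either local network or valid credentials" logic that Satisfy Any provided in the older syntax.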
User files
The mod_auth module uses plain text files that contain lists of valid users. The htpasswd command can be used to create
and update these files. The resulting files are plain text files, which can be read by any editor. They contain entries of the form
“username:password”, where the password is encrypted. Additional fields are allowed, but ignored by the software.
htpasswd encrypts passwords using, among other algorithms, a version of MD5 modified for Apache, bcrypt (the -B option shown earlier) or the older crypt() routine. Different algorithms can be mixed within one password file.
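The file format itself can be illustrated without Apache. The sketch below writes two sample entries with placeholder hashes (not output of a real digest algorithm) and extracts the usernames, showing that everything before the first colon is the username:

```shell
# Write a sample htpasswd-style file: one "username:hash" entry per line.
# The hash values here are placeholders, not real password digests.
printf 'joe:$apr1$x1y2z3$PLACEHOLDER1\nstephan:$apr1$a4b5c6$PLACEHOLDER2\n' > /tmp/htpasswd.sample

# Everything before the first colon is the username; the rest is ignored
# by tools that only need the user list.
cut -d: -f1 /tmp/htpasswd.sample
```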
SYNOPSIS
htpasswd [ -c ] passwdfile username
Here are two examples of using htpasswd for creating an Apache password file. The first is for creating a new password file
while adding a user, the second is for changing the password for an existing user.
$ htpasswd -c /home/joe/public/.htpasswd joe
$ htpasswd /home/joe/public/.htpasswd stephan
Note
Using the -c option, the specified password file will be overwritten if it already exists!
Group files
Apache can work with group files. Group files contain group names followed by the names of the people in the group. By
authorizing a group, all users in that group have access. Group files are known as .htgroup files and by convention bear that
name - though you can use any name you want. Group files can be located anywhere in the directory tree but are normally
placed in the toplevel directory of the tree they help to protect. To allow the use of group files you will need to include some
directives in the Apache main configuration file. This will normally be inside the proper Directory definition. Both the AuthUserFile and the AuthGroupFile directive accept either an absolute or a relative path; a relative path is treated as relative to the ServerRoot. The AuthGroupFile functions as an addition to the AuthUserFile. The group file should contain one group per line: the group name followed by a colon, then the space-separated names of its members. An example:
Apache main configuration file:
...
AuthType Basic
AuthUserFile /var/www/.htpasswd
AuthGroupFile /var/www/.htgroup
Require group Management
...
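The group file referenced above (here /var/www/.htgroup) could then contain, for example (the second group is purely illustrative):

Management: bob alice
Accounting: joe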
Now the accounts “bob” and “alice” would have access to the resource, but the account “joe” would not: “joe” is not a member of the “Management” group demanded by the “Require group Management” statement in the main configuration file. For this to work, the users specified in the .htgroup file must have an entry in the .htpasswd file as well.
Note
A username can be in more than one group entry. This simply means that the user is a member of both groups.
To use a DBM database (as used by mod_auth_db) you may use dbmmanage. For other types of user files/databases, please
consult the documentation that comes with the chosen module.
Note
Make sure the various files are readable by the webserver.
Configuring mod_perl
mod_perl is another module for Apache, which loads the Perl interpreter into your Apache webserver, reducing spawning of
child processes and hence memory footprint and need for processor power. Another benefit is code-caching: modules and scripts
are loaded and compiled only once, and will be served from the cache for the rest of the webserver’s life.
Using mod_perl allows inclusion of Perl statements into your webpages, which will be executed dynamically if the page is
requested. A very basic page might look like this:
print "Content-type: text/plain\r\n\r\n";
print "Hello, you perly thing!\n";
mod_perl also allows you to write new modules in Perl. You have full access to the inner workings of the web server and
can intervene at any stage of request-processing. This allows for customized processing of (to name just a few of the phases)
URI->filename translation, authentication, response generation and logging. There is very little run-time overhead.
The standard Common Gateway Interface (CGI) within Apache can be replaced entirely with Perl code that handles the response
generation phase of request processing. mod_perl includes two general purpose modules for this purpose. The first is Apache::
Registry, which can transparently run well-written existing perl CGI scripts. If you have badly written scripts, you should
rewrite them. If you lack resources, you may choose to use the second module, Apache::PerlRun, instead, because it doesn't use caching and is far more permissive than Apache::Registry.
You can configure your httpd server and handlers in Perl using PerlSetVar, and <Perl> sections. You can also define your
own configuration directives, to be read by your own modules.
There are many ways to install mod_perl, e.g. as a DSO, either using APXS or not, from source or from RPM’s. Most of the
possible scenarios can be found in the Mod_perl Guide PerlRef01.
For building Apache from source code you should have downloaded the Apache source code, the source code for mod_perl and
have unpacked these in the same directory 1 . You’ll need a recent version of perl installed on your system. To build the module,
in most cases, these commands will suffice:
$ cd ${the-name-of-the-directory-with-the-sources-for-the-module}
$ perl Makefile.PL APACHE_SRC=../apache_x.x.x/src \
DO_HTTPD=1 USE_APACI=1 EVERYTHING=1
$ make && make test && make install
After building the module, you should also build the Apache server. This can be done using the following commands:
$ cd ${the-name-of-the-directory-with-the-sources-for-Apache}
$ make install
All that’s left then is to add a few configuration lines to httpd.conf (the Apache configuration file) and start the server. Which
lines you should add depends on the specific type of installation, but usually a few LoadModule and AddModule lines suffice.
As an example, these are the lines you would need to add to httpd.conf to use mod_perl as a DSO:
LoadModule perl_module modules/libperl.so
AddModule mod_perl.c
PerlModule Apache::Registry
The first two lines will add the mod_perl module when Apache starts. During startup, the PerlModule directive ensures that
the named Perl module is read in too. This usually is a Perl package file ending in .pm. The Alias keyword reroutes requests for
URIs in the form http://www.example.com/perl/file.pl to the directory /home/httpd/perl. Next, we define
settings for that location. By setting the SetHandler, all requests for a Perl file in the directory /home/httpd/perl now
will be redirected to the perl-script handler, which is part of the Apache::Registry module. The next line simply allows
execution of CGI scripts in the specified location instead of displaying this file. Any URI of the form http://www.example.
com/perl/file.pl will now be compiled once and cached in memory. The memory image will be refreshed by recompiling
the Perl routine whenever its source is updated on disk. Setting PerlSendHeader to On tells the server to send HTTP headers to the browser on every script invocation, but most of the time it is better to use either the $r->send_http_header method from the Apache Perl API or the $q->header method from the CGI.pm module.
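The Alias, SetHandler and PerlSendHeader directives discussed here do not appear in the three-line snippet above. A configuration matching this description might look as follows (the /perl URI and /home/httpd/perl path are the ones assumed in the text):

Alias /perl/ /home/httpd/perl/
<Location /perl>
SetHandler perl-script
PerlHandler Apache::Registry
Options +ExecCGI
PerlSendHeader On
</Location>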
PHP is a server-side, cross-platform, HTML embedded scripting language. PHP started as a quick Perl hack written by Rasmus
Lerdorf in late 1994. Later he rewrote his code in C and hence the "Personal Home Page/Forms Interpreter" (PHP/FI) was born.
1 The mod_perl module can be obtained at perl.apache.org, the source code for Apache at www.apache.org
Over the next two to three years, it evolved into PHP/FI 2.0. Zeev Suraski and Andi Gutmans wrote a new parser in the summer
of 1997, which led to the introduction of PHP 3.0. PHP 3.0 defined the syntax and semantics used in both versions 3 and 4. PHP
became the de facto programming language for millions of web developers. Yet another version of the (Zend) parser and much better support for object-oriented programming led to the introduction of version 5.0 in July 2004. Several minor versions followed, and work on a version 6 with native Unicode support was started but later abandoned. The release of version 7.0 was planned for 2015.
PHP can be called from the CGI interface, but the common approach is to configure PHP in the Apache web server as a (dynamic)
DSO module. To do this, you can either use pre-built modules extracted from RPM’s or roll your own from the source code2 .
You need to configure the make process first. To tell configure to build the module as a DSO, you need to tell it to use APXS:
./configure --with-apxs
.. or, in case you want to specify the location for the apxs binary:
./configure --with-apxs={path-to-apxs}/apxs
Next, you can compile PHP by running the make command. Once all the source files are successfully compiled, install PHP by
using the make install command.
Before Apache can use PHP, it has to know about the PHP module and when to use it. The apxs program took care of telling
Apache about the PHP module, so all that is left to do is tell Apache about .php files. File types are controlled in the httpd.
conf file, and it usually includes lines about PHP that are commented out. You may want to search for these lines and uncomment
them:
AddType application/x-httpd-php .php
Then restart Apache by issuing the apachectl restart command. The apachectl command is another way of passing commands
to the Apache server instead of using /etc/init.d/httpd. Consult the apachectl(8) manpage for more information.
To test whether it actually works, create the following page:
<HTML>
<HEAD><TITLE>PHP Test </TITLE></HEAD>
<BODY>
<?php phpinfo( ) ?>
</BODY>
</HTML>
Save the file as test.php in Apache’s htdocs directory and aim your browser at http://localhost/test.php. A
page should appear with the PHP logo and additional information about your PHP configuration. Notice that PHP commands are enclosed in <?php and ?> tags.
The httpd binary is the actual HTTP server component of Apache. During normal operation, it is recommended to use the
apachectl or apache2ctl command to control the httpd daemon. On some distributions the httpd binary is named apache2.
Apache used to be a daemon that forked child-processes only when needed. To allow better response times, nowadays Apache
can also be run in pre-forked mode. This means that the server will spawn a number of child-processes in advance, ready to serve
any communication requests. On most distributions the pre-forked mode is run by default.
The httpd.conf file contains a number of sections that allow you to configure the behavior of the Apache server. A number
of keywords/sections are listed below.
2 The source code for PHP4 can be obtained at www.php.net
MaxKeepAliveRequests The maximum number of requests to allow during a persistent connection. Set to 0 to allow an
unlimited amount.
StartServers The number of servers to start initially.
MinSpareServers, MaxSpareServers Used for server-pool size regulation. Rather than making you guess how many
server processes you need, Apache dynamically adapts to the load it sees. That is, it tries to maintain enough server
processes to handle the current load, plus a few spare servers to handle transient load spikes (e.g., multiple simultaneous
requests from a single browser). It does this by periodically checking how many servers are waiting for a request. If there
are fewer than MinSpareServers, it creates a new spare. If there are more than MaxSpareServers, the superfluous
spares are killed.
MaxClients Limit on total number of servers running, i.e., limit on the number of clients that can simultaneously connect. If
this limit is ever reached, clients will be locked out, so it should not be set too low. It is intended mainly as a brake to keep
a runaway server from taking the system with it as it spirals down.
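Combined, these directives might appear in a configuration as follows (the values shown are illustrative, not recommendations):

StartServers 5
MinSpareServers 5
MaxSpareServers 10
MaxClients 150
MaxKeepAliveRequests 100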
Note
In most Red Hat derivatives the Apache configuration is split across two subdirectories. The main configuration file httpd.conf is located in /etc/httpd/conf. The configuration of Apache modules is located in /etc/httpd/conf.d. Files in that directory with the suffix .conf are added to the Apache configuration during startup of Apache.
Virtual Hosting is a technique that provides the capability to host more than one domain on one physical host. There are two
methods to implement virtual hosting:
* Name-based virtual hosting With name-based virtual hosting, the HTTP server relies on the client (e.g. the browser) to
report the hostname as part of the HTTP request headers. By using name-based virtual hosting, one IP address may serve
multiple websites for different web domains. In other words: Name-based virtual hosts use the website address from the URL to
determine the correct virtual host to serve.
* IP-based virtual hosting Using IP-based virtual hosting, each configured web domain is committed to at least one IP address.
Since most host systems can be configured with multiple IP addresses, one host can serve multiple web domains. Each web
domain is configured to use a specific IP address or range of IP addresses. In other words: IP-based virtual hosts use the IP
address of the TCP connection to determine the correct virtual host to serve.
Name-based virtual hosting is a fairly simple technique. You need to configure your DNS server to map each domain name to
the correct IP address first. Then, configure the Apache HTTP Server to recognize the different domain names and serve the
appropriate websites.
Tip
Name-based virtual hosting eases the demand for scarce IPv4 addresses. Therefore you could (or should?) use name-based
virtual hosting unless there is a specific reason to choose IP-based virtual hosting, see IP-based Virtual Hosting.
To use name-based virtual hosting, you must designate the IP address (and possibly port) on the server that will be accepting
requests for the hosts. On Apache 2.x up to 2.4, this is configured using the NameVirtualHost directive. The NameVirtualHost directive is deprecated since Apache 2.4: each VirtualHost also implies a NameVirtualHost, so defining a VirtualHost is sufficient from Apache 2.4 on. Any available IP address can be used. There should be a balance between ease of config-
uration, use and administration on one hand, and security on the other. Using a wildcard as the listening IP address inside a
NameVirtualHost or VirtualHost segment will enable the functionality of that specific configuration on all IP addresses
specified by the Listen directive of Apache’s main configuration file. If the main configuration file also uses a wildcard for the
Listen option, this will result in the availability of the Apache HTTPD server, and therefore of the previously mentioned functionality, on all configured IP addresses of the server. Whether or not this
addresses, special care should be taken when configuring services. Every daemon exposing services to the network could contain
code based or configuration errors. These errors could be abused by someone with malicious intentions. By minimizing the so
called network footprint of the server, the available attack surface is also minimized. Whether or not the additional configuration
overhead of preventing wildcards is worth the effort, will always remain a trade off.
• Listen can be used to specify the IP addresses and ports to which an Apache listener should be opened in order to serve the
configured content.
The next step is to create a <VirtualHost> block for each different webdomain you would like to serve. The argument to the <VirtualHost> directive should be the same as the argument to the (pre-Apache 2.4) NameVirtualHost directive (i.e., an IP address, or * for all addresses). Inside each <VirtualHost> block you will need, at minimum, a ServerName
directive to designate which host is served and a DocumentRoot directive to point out where in the filesystem the content for
that webdomain can be found.
Suppose that both www.domain.tld and www.otherdomain.tld point to the IP address 111.22.33.44. You could then
add the following to httpd.conf or equivalent (included) configuration file:
NameVirtualHost 111.22.33.44
<VirtualHost 111.22.33.44>
ServerName www.domain.tld
DocumentRoot /www/domain
</VirtualHost>
<VirtualHost 111.22.33.44>
ServerName www.otherdomain.tld
DocumentRoot /www/otherdomain
</VirtualHost>
The IP address 111.22.33.44 could be replaced by * to match all IP addresses for this server. The implications of using wildcards in this way have been addressed above.
Many websites need to be accessible by more than one name. For instance, the organization behind domain.tld wants to facilitate blog.domain.tld. There are multiple ways to implement this functionality; one of them uses the ServerAlias directive, which is declared inside the <VirtualHost> section.
If, for example, you add the following to the first <VirtualHost> block above
ServerAlias domain.tld *.domain.tld
then requests for all hosts in the domain.tld domain will be served by the www.domain.tld virtual host. The wildcard
characters * and ? can be used to match names.
Tip
Of course, you can’t just make up names and place them in ServerName or ServerAlias. The DNS system must be
properly configured to map those names to the IP address(es) declared in the NameVirtualHost directive.
Finally, you can fine-tune the configuration of the virtual hosts by placing other directives inside the <VirtualHost> contain-
ers. Most directives can be placed in these containers and will then change the configuration only of the relevant virtual host.
Configuration directives set in the main server context (outside any <VirtualHost> container) will be used only if they are
not overridden by the virtual host settings.
Now when a request arrives, the server will first check if it is requesting an IP address that matches the NameVirtualHost. If
it is, then it will look at each <VirtualHost> section with a matching IP address and try to find one where the ServerName
or ServerAlias matches the requested hostname. If it finds one, it then uses the corresponding configuration for that server.
If no matching virtual host is found, then the first listed virtual host that matches the IP address will be used.
As a consequence, the first listed virtual host is the default virtual host. The DocumentRoot from the main server will never
be used when an IP address matches the NameVirtualHost directive. If you would like to have a special configuration for
requests that do not match any particular virtual host, put that configuration in a <VirtualHost> container and place it before
any other <VirtualHost> container specification in the Apache configuration.
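Such a catch-all configuration might look like this (placed before all other virtual host definitions; the server name and path are illustrative):

<VirtualHost 111.22.33.44>
ServerName catchall.domain.tld
DocumentRoot /www/default
</VirtualHost>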
Despite the advantages of name-based virtual hosting, there are some reasons why you might consider using IP-based virtual
hosting instead. These are niche scenarios though:
• Some older or exotic web clients are not compatible with name-based virtual hosting for HTTP or HTTPS. HTTPS name-based virtual hosting is implemented using an extension to the TLS protocol called Server Name Indication (SNI). Most modern browsers on modern operating systems support SNI at the time of this writing.
• Some operating systems and network equipment devices implement bandwidth management techniques that cannot differenti-
ate between hosts unless they are on separate IP addresses.
As the term IP-based indicates, the server must have a different IP address for each IP-based virtual host. This can be achieved by
equipping the machine with several physical network connections or by using virtual interfaces. Virtual interfaces are supported
by most modern operating systems (refer to the system documentation for details on IP aliasing and the ifconfig or ip
command).
There are two ways of running the Apache HTTP server to support multiple hosts: running a separate httpd daemon for each host, or running a single daemon that serves all virtual hosts. Running separate daemons makes sense when:
• There are security issues, e.g., if you want to maintain strict separation between the web pages of separate customers. In this case you would need one daemon per customer, each running with different User, Group, Listen and ServerRoot settings;
• You can afford the memory and file descriptor requirements of listening to every IP alias on the machine. It is only possible to Listen to the “wildcard” address or to specific IP addresses. So, if you need to restrict one webdomain to a specific IP address, all other webdomains need to be configured to use specific IP addresses as well.
Create a separate httpd installation for each virtual host. For each installation, use the Listen directive in the configuration file
to select which IP address (or virtual host) that daemon services:
Listen 123.45.67.89:80
The Listen directive may be defined as an IP:PORT combination separated by a colon, as above. Another option is to specify only the port number. In that case, the Apache server will default to activating listeners on all configured IP addresses on the specified port(s):
Listen 80
Listen 443
The above Listen configuration could also be defined using 0.0.0.0 as the IP address, again using the colon as a separator.
Another option of the Listen directive enables the exact specification of the protocol. In the previous example, ports 80 and 443 are used. By default, port 80 is configured for HTTP and port 443 for HTTPS in Apache. This configuration could be expanded with another HTTPS website on port 8443:
Listen 80
Listen 443
Listen 8443 https
When configuring one or more Apache daemons, the Listen directive may be used to specify one or more ports above 1024. This removes the need for root privileges for that daemon, provided no other ports below 1025 are specified and no key or certificate files that are only accessible with root privileges are included in the configuration. You will read more about this on the next page of this book.
As of Apache 2.4, the Listen directive is mandatory and should be specified. Previous versions of Apache would default to
port 80 for HTTP and 443 for HTTPS on all available IP addresses if no Listen directive was specified. Starting with Apache
2.4, the Apache server will fail to start if no valid Listen directive is specified.
In this case, a single httpd will service requests for the main server and all the virtual hosts. The VirtualHost directive in the configuration file is used to set the values of the ServerAdmin, ServerName, DocumentRoot, ErrorLog and TransferLog or CustomLog configuration directives to different values for each virtual host.
<VirtualHost www.snow.nl>
ServerAdmin [email protected]
DocumentRoot /groups/snow/www
ServerName www.snow.nl
ErrorLog /groups/snow/logs/error_log
TransferLog /groups/snow/logs/access_log
</VirtualHost>
<VirtualHost www.unix.nl>
ServerAdmin [email protected]
DocumentRoot /groups/unix_nl/www
ServerName www.unix.nl
ErrorLog /groups/unix_nl/logs/error_log
TransferLog /groups/unix_nl/logs/access_log
</VirtualHost>
Redirect allows you to tell clients about documents which used to exist in your server’s namespace, but do not anymore. This
allows you to tell the clients where to look for the relocated document.
Redirect {old-URI} {new-URI}
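For example, assuming a manual page that moved from /olddocs/ to a new location (the URIs are invented for illustration):

Redirect /olddocs/manual.html http://www.domain.tld/newdocs/manual.html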
Resources: LinuxRef08; Engelschall00; Krause01; SSL01; SSL02; SSL03; SSL04; wikipedia_apachemodules; mozsslconf;
raymii.org; cipherli.st; apachesslhowto; wikiCRIME; the man pages for the various commands.
Depending on the Linux distribution in use, the following files and directories may be used for configuration of Apache 2.x when
Apache is installed from packages:
httpd.conf
apache.conf
apache2.conf
/etc/httpd/
/etc/httpd/conf
/etc/httpd/conf.d
/etc/apache2/
Configuration files are expected to contain predefined directives. If a directive is not explicitly defined, Apache will use a default
setting. This default may vary per Linux distribution, so consult your distribution’s Apache documentation. /usr/share/doc
is a good place to start. Configuration files can be checked for syntax errors using either of the following commands:
$ sudo apachectl configtest
$ sudo service httpd configtest
Because Apache usually runs as a daemon that listens on ports below 1024, sudo or a root shell should be used to invoke all Apache-related commands. Refer to your system documentation to check for the availability of the apachectl or apache2ctl command.
If both exist, they might be symlinked. The naming difference for this command has a historical reason: apachectl was used for Apache 1.x, and when Apache 2 was released the command was renamed to match the new name. Now that Apache 2.x has become the standard, either apache2ctl has been renamed to apachectl or both commands are available for compatibility reasons. When available, the service facility may point to httpd on Red Hat based systems, or to apache2 on Debian-based systems. The apachectl command has many useful options. It is in fact a shell script that functions as a wrapper for the httpd
binary. Consult the man page for all available arguments and options. Just two more examples to get you started. To show all configured virtual hosts, the -S option can be used:
$ sudo apachectl -S
Be careful interpreting the output from the command above. That output shows the configuration of the currently running websites. There is no guarantee that the website configuration on disk has not changed since these websites were brought online. In other words: the output from the running processes does not necessarily match the contents of the configuration files (anymore).
In regards to the Apache configuration files, it is important to know about the different ways Apache may be installed and
configured. Depending on the Linux distribution and Apache2.x version in use, configuration files may be located and even
named differently across otherwise similar systems. As we will see further down this chapter, Apache often uses one main
configuration file. Within this file, other configuration files may be included using the Include /path/to/other/config directive. The configuration file syntax may be checked for errors by invoking the apachectl script as shown previously. Each configuration file that is included from the main configuration file in use will be checked for consistency and syntax. Consistency here means that if a dependent configuration file, certificate file or key file cannot be accessed properly by the user the httpd binary runs as, a warning will be shown. If apachectl does not appear in your $PATH, use the sudo find command with
apachectl or apache2ctl as an argument. Depending on the size of your storage volumes, it may be wiser to narrow this search
down to specific directories. You have been warned. If the service command is not available on your system, the Apache daemon
may be started, checked and stopped by a SysV script instead. Look within the /etc/init.d/ directory for a script called httpd, apache2 or equivalent. This script may then be called upon as follows, to reveal the available arguments:
$ sudo /etc/init.d/apache2
Apache can support SSL/TLS for (reasonably) secure online communication. While TLS version 1.2 is currently the preferable option, TLS-encrypted HTTPS sessions are still commonly referred to as 'SSL' encrypted sessions. TLS can in fact be seen as the successor to SSL (v3.0). So, just as with Apache versus Apache2, whenever Apache/SSL is mentioned in this chapter, TLS is implied as well, unless otherwise specified. We will cover the strengths and weaknesses of both protocols further down this chapter.
The Secure Sockets Layer protocol (SSL) is a protocol which may be placed between a reliable connection-oriented network layer
protocol (e.g., TCP/IP) and the application layer protocol (e.g., HTTP). SSL provides secure communication between client and
server by allowing mutual authentication and the use of digital signatures for integrity and encryption for privacy. Currently there
are two versions of SSL still in use: version 2 and version 3. Additionally, the successors to SSL, the TLS protocols (versions 1.0, 1.1 and 1.2, which are based on SSL), were designed by the IETF.
SSL/TLS uses Public Key Cryptography ( PKC), also known as asymmetric cryptography. Public key cryptography is used in
situations where the sender and receiver do not share a common secret, e.g., between browsers and web servers, but wish to
establish a trusted channel for their communication.
PKC defines an algorithm which uses two keys, each of which may be used to encrypt a message. If one key is used to encrypt a
message, then the other must be used to decrypt it. This makes it possible to receive secure messages by simply publishing one
key (the public key) and keeping the other key secret (the private key). Anyone may encrypt a message using the public key, but
only the owner of the private key will be able to read it. For example, Alice may send private messages to the owner of a key-pair
(e.g., your web server), by encrypting the messages using the public key your server publishes. Only the server will be able to
decrypt it using the corresponding private key.
A secure web server (e.g., Apache/SSL) uses HTTP over SSL/TLS, using port 443 by default. The SSL/TLS port can be
configured by defining the Listen directive inside the main configuration file. There should already be a listener configured
for port 80 (HTTP). On Debian-based systems, there is a dedicated file for defining the active listeners. This file is called
ports.conf and is included from the main configuration file. Apart from this file, individual websites should specify the listening address and port in their NameVirtualHost or VirtualHost declaration. Starting from Apache v2.4, NameVirtualHost has been deprecated in favour of VirtualHost. Such a declaration could look as follows: <VirtualHost *:443>. Within
the browser, the use of HTTPS is signified by the use of the https:// scheme in the URL. The public key is exchanged during
the set-up of the communication between server and client (browser). That public key should be signed (it contains a digital
signature e.g., a message digest) by a so-called valid CA (Certificate Authority). Each browser contains a number of so-called
root-certificates: these can be used to determine the validity of the CA’s that signed the key. Not every certificate out there is
signed by one of these valid CA’s. Especially for testing purposes, it is common to sign certificates without the intervention of a
valid CA. This is done in order to save both (validation) time and (registration fee) money. As of 2015, it has become easier to maintain a certificate signed by a valid CA. An organisation called Let's Encrypt is willing to sign certificates for free, as long as you play by the rules. Use your favourite web search engine to find out more about Let's Encrypt, after reading this chapter.
The Apache Software Foundation provides excellent documentation regarding the use of mod_ssl. We urge you to take the
time to read through the resources collected at the following URL: https://httpd.apache.org/docs/current/ssl/
The subject of encryption is so vast and complicated that entire books have been written about it. The added confidentiality and integrity only provide their value when encryption is implemented correctly. So-called 'best practices' in regards to
encryption may change overnight. In addition to the collection of resources listed at the URL above, we want to add the following
URL: http://httpd.apache.org/docs/trunk/ssl/ssl_howto.html
As you can see, this URL does not point to the current version of the documentation. Instead, it points to the trunk version.
At the time of this writing, this corresponds to the Apache 2.5 documentation. The trunk documentation will always point
towards the most recent Apache version in development. And while the trunk Apache code may not be recommended to use,
the documentation may be more recently updated than elsewhere. In regards to the subject of SSL/TLS, this results in more
up-to-date best practices than the 2.4 documentation provides.
The documentation provided by The Apache Software Foundation is vendor-neutral. So when the Apache documentation states
that the following directives should be present in the Apache main configuration file:
LoadModule ssl_module modules/mod_ssl.so
Listen 443
<VirtualHost *:443>
</VirtualHost>
It may very well be that these directives are spread across several configuration files. This depends on your Linux distribution. In addition to the documentation provided by The Apache Software Foundation, we will try to point out the configuration differences between Red Hat and Debian based distributions.
To use mod_ssl you will need to install the Apache and mod_ssl packages. On Red Hat based systems, this is done using the
following command:
$ sudo yum install httpd mod_ssl
After installation, make sure the OpenSSL module is enabled within Apache. The module should be available to the Apache daemon, and included to be loaded during daemon start-up. Again, there are several ways this can be achieved. A common way is similar to the sites-available and sites-enabled strategy. However, now we are dealing with mods-available and mods-enabled directories instead. As a plus, Debian-based systems come with a utility called a2enmod. Invoke this command as follows:
$ sudo a2enmod ssl
a2enmod will create symlinks within the mods-enabled directory, pointing to mods-available/ssl.conf and mods-available/ssl.load respectively. When Apache is reloaded, these symlinks will ensure the SSL module will be loaded as well.
Red Hat based systems use the LoadModule directive instead. This directive should be declared so it will be read during the
start of the Apache daemon. On a Red Hat based system, this could be achieved by a /etc/httpd/conf/httpd.conf that
holds the following INCLUDE directive:
Include conf.d/*.conf
The default file /etc/httpd/conf.d/ssl.conf could then contain the following LoadModule and Listen statements:
LoadModule ssl_module modules/mod_ssl.so
Listen 443
After reloading Apache, the SSL module should be loaded together with the Apache daemon. It is always a good practice to check for configuration errors before restarting the Apache daemon. This can be done using the apachectl configtest command and has been covered earlier. The output makes it clear whether Apache will encounter errors and, if so, why.
Then, generate a key and Certificate Signing Request (CSR). Either sign the csr file yourself, thus creating a 'self-signed' certificate, or have it signed by a valid Certificate Authority (CA). Depending on your audience, a self-signed certificate might cause browser warnings when presented via HTTPS. Having the csr signed by a valid CA prevents this from happening.
Some additional directives should be used to configure the secure server - for example the location of the key-files. It’s beyond
the scope of this book to document all of these directives. However, you should be familiar with most of the mod_ssl directives.
You can find best practices by searching the web and should also refer to your distribution’s specific mod_ssl documentation.
The generic mod_ssl documentation can be found on the mod_ssl website.
mod_ssl can also be used to authenticate clients using client certificates. These client certificates can be signed by your own
CA and mod_ssl will validate the certificates against this CA. To enable this functionality set the SSLVerifyClient to
require. Use the value none to turn it off.
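A sketch of such a client-certificate requirement for part of a site could look as follows (the path and location used here are illustrative):

```apache
# Require a client certificate signed by our own CA for /secure
<Location /secure>
    SSLVerifyClient require
    SSLVerifyDepth 1
    SSLCACertificateFile /etc/ssl/certs/our-client-ca.crt
</Location>
```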
Certificates that are installed as part of your Linux distribution are usually installed in /etc/ssl/certs on Debian-based
systems, and in /etc/pki/tls/certs on Red Hat based systems. The Red Hat based systems may have a symlink in place
that points /etc/ssl/certs to /etc/pki/tls/certs for convenience and compatibility.
Keys or key-files that are installed as part of your Linux distribution are in turn usually installed in /etc/ssl/private on
Debian-based systems and in /etc/pki/tls/private on Red Hat based systems. Other directories within /etc/ssl and
/etc/pki may also contain specific key files.
It is often considered a best practice to create subdirectories when working with specific keys and/or certificates, especially because specific cryptographic keys and certificates belong to each other. By devoting a dedicated subdirectory to each keypair, structure will be maintained within both the filesystem and the configuration files pointing to these files. These subdirectories may be created as part of the /etc/ssl or /etc/pki hierarchy, but creating subdirectories below /etc/apache2 or /etc/httpd works as well.
Directory /etc/ssl/
/etc/ssl$ ls -l
total 32
drwxr-xr-x 3 root root 16384 2011-03-06 15:31 certs
-rw-r--r-- 1 root root 9374 2010-09-24 22:05 openssl.cnf
drwx--x--- 2 root ssl-cert 4096 2011-03-06 13:19 private
The openssl program is a command line interface to the OpenSSL crypto library. You can use it to generate certificates, encrypt
and decrypt files, create hashes and much more. It is generally seen as “the Swiss Army knife” of cryptography. One of the
more common usages is to generate (self-signed) certificates for use on a secured webserver (to support the https protocol).
/etc/ssl/openssl.cnf is the standard location for its configuration file, where you can set defaults for the name of your
organization, the address etc.
Note
If you generate a certificate for a webserver you start by creating a Certificate Signing Request (.csr). The openssl tool
will prompt you for information it needs to create the request, using defaults it fetches from the configuration file. When you
generate such a signing request, make sure you enter the FQDN ("Fully Qualified Domain Name") of the server when openssl
prompts you for the “Common Name” or CN (which is part of the “Distinguished Name”). For example when you generate a
CSR for the web-site https://www.foo.example/, enter www.foo.example as the CN. Be aware that a certificate providing foo.example would not be valid for the website accessed via https://www.foo.example, nor would it be valid for the website behind the URL https://webmail.foo.example. Separate certificates for each domain should be put in place. To avoid this necessity, many organizations choose to use wildcard certificates, especially for internally hosted websites. When issuing a CSR for a certificate that could be used to serve any of the .foo.example websites, the request should be done for the CN value *.foo.example. Browsers will understand this wildcard certificate when presented, and decide accordingly. www.foo.example and webmail.foo.example could be configured to use this certificate. https://foo.example on the other hand, would trigger a browser warning with this certificate.
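Non-interactively, such a wildcard CSR can be generated by passing the Distinguished Name with -subj instead of answering the prompts. The file names here are made up:

```shell
# New key plus CSR in one go; the CN is a wildcard for .foo.example
openssl req -new -newkey rsa:2048 -nodes -keyout wild.key \
    -subj '/CN=*.foo.example' -out wild.csr
# Inspect the subject that ended up in the request
openssl req -in wild.csr -noout -subject
```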
While installing OpenSSL, the program openssl is installed on your system. This command can be used to create the necessary
files that implement a (self-signed) server certificate.
More specifically:
• You start by generating the RSA key file. It contains a pair of related keys, used to encrypt and decrypt messages to and from you. One half of the keypair (the public key) will be used to encrypt messages that will be sent to you. The other half (the private key) is used to decrypt these received messages. The public key will be made part of your digital certificate. This allows client systems to send encrypted messages to your webserver that only this webserver can decrypt, as it holds the related private key;
• Next you will create a Certificate Signing Request (CSR). This is a file which contains the public key and identifying informa-
tion like the name of your company, location etc;
• The CSR is sent to a Certificate Authority (CA) which should verify the correctness of the information you provided and
generate the certificate. This certificate contains a digital signature that allows verification that the CA has approved of the
contents of the certificate. The certificate will contain the data you provided (including your public key) and it is signed by the
CA using its private key. A certificate contains your RSA public key, your name, the name of the CA and is digitally signed by
your CA. Browsers that know the CA can verify the signature on that certificate, thereby obtaining your RSA public key. That
enables them to send messages which only you can decrypt.
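The steps above can be walked through with the openssl CLI alone. In this sketch we also play the role of the CA ourselves, with made-up file names; with a real CA only the CSR would leave your system:

```shell
# 1. Generate the server's RSA key pair (unencrypted here for brevity)
openssl genrsa -out server.key 2048
# 2. Create a CSR carrying the public key and identifying information
openssl req -new -key server.key -subj '/CN=www.example.test' -out server.csr
# 3a. Stand-in CA: a self-signed root certificate
openssl req -new -x509 -nodes -days 365 -subj '/CN=Example Root CA' \
    -keyout ca.key -out ca.crt
# 3b. The CA signs the CSR, producing the server certificate
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key \
    -CAcreateserial -days 365 -out server.crt
# A client that trusts ca.crt can now verify the certificate
openssl verify -CAfile ca.crt server.crt
```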
Note
You can create a signing request and then sign it yourself. In fact, that is what Certificate Authorities do when they create their root certificate. A root certificate is simply a certificate by which a CA asserts its own identity: they say they are whom they say they are. So, anybody can create a root certificate and put any credentials on it just as they please. The root certificate itself is no
proof of anything. You will need to ensure that it really was issued by a party you trust yourself. Either you visit them and get a
copy directly from them, or fetch it using another method you trust or you rely on others you trust to have done this for you. One
of the ways you implicitly “trust” a large number of CAs is by relying on their root certificates that are made part of your browser.
As an example: to create an RSA private key that has a keysize of 2048 bits, and which will be triple-DES (3DES) encrypted, stored in a file named server.key in the default format (which is known as PEM), type:
$ openssl genrsa -des3 -out server.key 2048
RSA keysizes below 2048 bits are considered out-of-date, and 1024-bit keys should no longer be used. 2048 bits is a reasonable default today, with 3072, 4096 and onwards being valid options if all involved components are able to handle these keysizes without exceeding performance thresholds.
openssl will ask for a pass-phrase, which will be used as the key to encrypt the private key. Please store this file in a secure backup location and remember the pass-phrase. If you lose the pass-phrase you will not be able to recover the key.
For testing purposes, it might be preferable to strip the pass-phrase from the key file. This can be accomplished by reading the key and exporting it as follows:
$ openssl rsa -in server.key -out stripped.key
The server.key file still holds the encrypted private key information in ciphertext. The stripped.key file is a plain text
file with the unencrypted private key information as its contents. Handle with care.
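The difference between the two files is easy to check: an encrypted PEM key advertises itself in its header, while the stripped key does not. A self-contained sketch (the pass-phrase 'secret' is for demonstration only):

```shell
# Create an encrypted key non-interactively, then strip the pass-phrase
openssl genrsa -des3 -passout pass:secret -out server.key 2048
openssl rsa -in server.key -passin pass:secret -out stripped.key
# The encrypted file mentions ENCRYPTED in its PEM header...
grep ENCRYPTED server.key
# ...the stripped file does not, so guard its permissions
chmod 600 stripped.key
```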
To create a Certificate Signing Request (CSR) with the server RSA private key (output will be PEM formatted), execute the
following:
$ openssl req -new -key server.key -out server.csr
The signing request can now either be sent to a real CA, which will sign the request and create a digital certificate, or you can
create your own CA and do it yourself. Note that if you do it yourself, you will also need to install the root certificate of your CA
into your clients (e.g. browser) to signal them that a certificate signed by your own CA can be trusted. If you omit this step, you
will be getting a lot of disturbing warnings about missing trust and insecurity.
You can provide the openssl parameters yourself, but that can be a daunting task for less experienced users. Hence, for convenience's sake, the OpenSSL software suite provides a perl script (CA.pl) to handle most CA-related tasks a lot more easily. It has a simplified syntax and supplies the more complex command line arguments to the underlying openssl command.
CA.pl will by default use values it reads from the standard OpenSSL configuration file /etc/ssl/openssl.cnf. To create your own CA, find the CA shellscript or CA.pl perlscript that should be part of the OpenSSL package. On Red Hat based systems, this script is located in the /etc/pki/tls/misc directory. Depending on your distribution, the script might not interpret filenames for arguments. The script then instead looks for predefined values for the key file and csr file. Page through the script source using a command like less or more and look for clues. The STDERR output might also show some valuable pointers. In the following example, newkey.pem and newreq.pem are used as file names by the CA.pl script:
# /usr/lib/ssl/misc/CA.pl -newca
CA certificate filename (or enter to create)
You have now created a certificate signed by your own CA (newcert.pem). You might want to rename the file to something more distinguishable, e.g. Certificate:ssltest.snow.nl. While at it, rename the server key file too, for example PrivateKey:ssltest.snow.nl. Especially if you maintain a lot of keys and certificates on a lot of servers, it really helps to be able to tell from the name of a file what is in it.
The Certificate Signing Request (CSR) could have been sent to an external Certificate Authority (CA) instead. You usually have
to post the CSR into a web form, pay for the signing and await a signed Certificate. There are non-profit CA’s that will perform
similar tasks free of charge, for example CAcert. However, their root certificate is not yet included into most browsers so you
will need to do that yourself if you are going to use their services.
The server.csr file is no longer needed. Now you have two files: server.key and newcert.pem. In your Apache’s
httpd.conf file you should refer to them using lines like these:
SSLCertificateFile /path/to/Certificate:ssltest.snow.nl
SSLCertificateKeyFile /path/to/PrivateKey:ssltest.snow.nl
It is considered a best practice to follow the 'least privilege' principle when managing key and certificate files. These files should preferably be stored in a way that only the user account that runs the web server can access them.
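In practice this comes down to restrictive ownership and file modes. A sketch using a scratch file in place of a real key (on a real server you would additionally chown the file to the account Apache runs as, e.g. root:www-data on Debian):

```shell
# Stand-in for a real private key file
touch server.key
# Owner read/write only; no group or world access
chmod 600 server.key
stat -c '%a' server.key
```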
SSLEngine This directive toggles the usage of the SSL/TLS Protocol Engine. This should be used inside a <VirtualHost> section to enable SSL/TLS for that virtual host. By default the SSL/TLS Protocol Engine is disabled for both the main server and all configured virtual hosts.
SSLCertificateKeyFile This directive points to the PEM-encoded private key file for the server. If the contained private
key is encrypted, the pass phrase dialog is forced at startup time. This directive can be used up to three times (referencing
different filenames) when an RSA, a DSA, and an ECC based private key is used in parallel. For each SSLCertificateKey-
File directive, there must be a matching SSLCertificateFile directive.
SSLCertificateFile This directive points to a file with certificate data in PEM format. At a minimum, the file must
include an end-entity (leaf) certificate. This directive can be used up to three times (referencing different filenames) when
an RSA, a DSA, and an ECC based server certificate is used in parallel.
Sometimes, it might be acceptable to use a self-signed SSL certificate with Apache. The following steps explain how to accom-
plish this on a Debian based system. First, create a directory to hold the SSL keys. On the system we use as an example, all
system-wide SSL certificates are stored in the directory /etc/ssl/certs. For our purpose, we create a new directory called
/etc/ssl/webserver and use it to store our new keypair:
# mkdir /etc/ssl/webserver
# openssl req -new -x509 -days 365 -nodes \
> -out /etc/ssl/webserver/apache.pem -keyout /etc/ssl/webserver/apache.key
Generating a 2048 bit RSA private key
...............................+++
.......+++
writing new private key to ’/etc/ssl/webserver/apache.key’
# ls /etc/ssl/webserver/
apache.key apache.pem
Note
During creation, openssl will use the contents of /etc/ssl/openssl.cnf to fill in some variables. Other values will be asked for by an interactive script. Be sure to use the proper FQDN here to distinguish this certificate from certificates with another purpose later on.
In order to be able to use SSL with Apache, a module called mod_ssl has to be loaded. On this system, we can check the
enabled modules by listing the contents of the /etc/apache2/mods-enabled directory. All currently available modules
can be checked by listing the contents of the /etc/apache2/mods-available directory:
# ls /etc/apache2/mods-enabled/
alias.conf autoindex.conf mime.conf reqtimeout.load
alias.load autoindex.load mime.load setenvif.conf
auth_basic.load cgi.load negotiation.conf setenvif.load
authn_file.load deflate.conf negotiation.load status.conf
authz_default.load deflate.load perl.load status.load
authz_groupfile.load dir.conf php5.conf
authz_host.load dir.load php5.load
authz_user.load env.load reqtimeout.conf
# ls /etc/apache2/mods-available/
actions.conf cgid.conf include.load proxy_ftp.conf
actions.load cgid.load info.conf proxy_ftp.load
alias.conf cgi.load info.load proxy_http.load
alias.load charset_lite.load ldap.conf proxy.load
asis.load dav_fs.conf ldap.load proxy_scgi.load
auth_basic.load dav_fs.load log_forensic.load reqtimeout.conf
auth_digest.load dav.load mem_cache.conf reqtimeout.load
ssl appears to be available but has not been enabled yet because both ssl files, ssl.load and ssl.conf, are still present
in the /etc/apache2/mods-available/ directory and not in the /etc/apache2/mods-enabled/ directory. We
could create a symlink to activate support for ssl ourselves, but Debian provides a utility written in perl called a2enmod that takes care of this. Consult the A2ENMOD(8) manpage for more information. Its counterpart, conveniently called a2dismod, does the opposite and disables Apache modules by removing the symlinks from /etc/apache2/mods-enabled/.
Let’s enable SSL:
# a2enmod ssl
Enabling module ssl.
See /usr/share/doc/apache2.2-common/README.Debian.gz on how to configure SSL \
and create self-signed certificates.
To activate the new configuration, you need to run:
service apache2 restart
# service apache2 restart
[ ok ] Restarting web server: apache2 ... waiting .
# apachectl status |grep -i ssl
Server Version: Apache/2.2.22 (Debian) PHP/5.4.4-15.1 mod_ssl/2.2.22 OpenSSL/
SSL has now been enabled on the Apache HTTP server. In order for a site to actually use SSL, its configuration has to be adjusted accordingly. HTTPS uses TCP port 443 by default, so we want to specify this in the Apache configuration on Debian. Add the following line to your /etc/apache2/ports.conf file:
Listen 443
Now, all sites that want to make use of SSL need to have their configuration files reconfigured. The following lines need to be added to each “enabled” site that should serve its content over HTTPS:
SSLEngine On
SSLCertificateFile /etc/ssl/webserver/apache.pem
SSLCertificateKeyFile /etc/ssl/webserver/apache.key
An example site configuration file for both a HTTP and HTTPS enabled site could be like the following:
NameVirtualHost *:80
NameVirtualHost *:443
<VirtualHost *:80>
ServerName webserver.intranet
DocumentRoot /srv/http
ErrorLog /var/log/apache2/error.log
</VirtualHost>
<VirtualHost *:443>
SSLEngine On
SSLCertificateFile /etc/ssl/webserver/apache.pem
SSLCertificateKeyFile /etc/ssl/webserver/apache.key
ServerName webserver.intranet
DocumentRoot /srv/http
ErrorLog /var/log/apache2/error.log
</VirtualHost>
Now, use apachectl configtest to test your site configuration and, if no errors occur, restart the Apache HTTP server. The SSL enabled sites should now be accessible by using the https URL instead of http.
Apart from the directives used above, the following Apache configuration directives should be familiar to you:
SSLCACertificateFile This directive sets the all-in-one file where you can assemble the certificates of Certification Au-
thorities (CA) whose clients you deal with. These are used for Client Authentication. Such a file is simply the concatenation
of the various PEM-encoded certificate files, in order of preference.
SSLCACertificatePath Sets the directory where you keep the certificates of Certification Authorities (CAs) whose clients
you deal with. These are used to verify the client certificate on Client Authentication.
SSLCipherSuite This complex directive uses a colon-separated cipher-spec string consisting of OpenSSL cipher specifica-
tions to configure the Cipher Suite the client is permitted to negotiate in the SSL handshake phase. Notice that this directive
can be used both in per-server and per-directory context. In per-server context it applies to the standard SSL handshake
when a connection is established. In per-directory context it forces a SSL renegotiation with the reconfigured Cipher Suite
after the HTTP request was read but before the HTTP response is sent.
SSLProtocol This directive can be used to control the SSL protocol flavors mod_ssl should use when establishing its server
environment. Clients then can only connect with one of the provided protocols.
ServerSignature The ServerSignature directive allows the configuration of a trailing footer line under server-generated
documents (error messages, mod_proxy ftp directory listings, mod_info output, ...). The reason why you would want to
enable such a footer line is that in a chain of proxies, the user often has no possibility to tell which of the chained servers
actually produced a returned error message.
ServerTokens This directive controls whether the Server response header field which is sent back to clients includes minimal
information, everything worth mentioning or somewhere in between. By default, the ServerTokens directive is set to
Full. By declaring this (global) directive and setting it to Prod, the supplied information will be reduced to the bare
minimum. During the first chapter of this subject the necessity for compiling Apache from source is mentioned. Modifying
the Apache Server response header field values could be a scenario that requires modification of source code. This could
very well be part of a server hardening process. As a result, the Apache server could provide different values as response
header fields.
TraceEnable This directive overrides the behavior of TRACE for both the core server and mod_proxy. The default TraceEnable on permits TRACE requests per RFC 2616, which disallows any request body to accompany the request. TraceEnable off causes the core server and mod_proxy to return a 405 (method not allowed) error to the client.
There is also the non-compliant setting extended which will allow message bodies to accompany the trace requests.
This setting should only be used for debugging purposes. Despite what a security scan may say, the TRACE method is part
of the HTTP/1.1 RFC 2616 specification and should therefore not be disabled without a specific reason.
As we saw before, the FQDN plays an important part in SSL. It has to match the CN value of the certificate. This certificate is presented to the browser when initiating the connection to the HTTPS server. Only when the certificate is valid, issued by a known, trusted and registered party, and matches the hostname will the browser initiate the connection without warnings. Otherwise, the browser should present a warning about an invalid or expired certificate, an unknown issuer, or an invalid hostname.
With IP-based virtual hosts we have a different IP/port combination for every virtual host, which means we can configure an SSL
certificate for every virtual host. The HTTPS connection will be initiated on a dedicated IP address after all.
When working with name based virtual hosts however, we have no unique identifier for the resource being requested except for the hostname. So the Apache HTTP server receives all requests for the virtual hosts it serves on the same IP/port combination. It isn't until after the SSL connection has been established that the HTTP server knows which virtual host is actually being requested, based on the hostname in the URL. By then it might be too late: because the certificate must be presented during the SSL/TLS handshake, before any HTTP request is exchanged, the client could have been offered a certificate with a different CN value.
Currently, an extension called SNI (Server Name Indication) can be used to circumvent this name based issue. Using this extension, the browser includes the requested hostname in the first message of its SSL handshake, as a UTF-8 encoded byte-string value representing the server_name attribute of the client hello message. This value should only consist of host and/or domain names; no IPv4 or IPv6 addresses should be used. Both the browser and Apache need to support SNI in order for the SNI mechanism to work. If SNI is used on the server and the browser doesn't support SNI, the browser could show an “untrusted certificate” warning. This depends on the certificate that is presented for the “default” website of the HTTP server. As of this writing, most browsers do support SNI. Exceptions are the Android 2.x default browser, MS Internet Explorer on MS Windows XP before SP3 and versions of Oracle Java before 1.7 on any operating system.
To use SNI on the Apache server and prevent “untrusted certificate” warnings due to non-SNI capable browsers, it is possible to use a multidomain certificate. This certificate should contain all the necessary domain names and should be used in the Apache configuration in a separate virtual host. In this virtual host no servername should be configured; because of this it will match all requests without a hostname, thereby serving all browsers without SNI support. Apache will still show the content of the requested (SNI) site, since the requested website is extracted from the URL. This extraction takes place after the encrypted session has been created using the multidomain certificate. This solution will probably never receive an award for its looks, but the science behind it at least works.
Without SNI, a web server can still serve multiple name based virtual hosts over HTTPS without browser warnings, with the requirement (or restriction) that the SSL certificate being used is the same for all virtual hosts. The virtual hosts also have to be part of the same domain, e.g. virtual01.example.com and virtual02.example.com. The SSL certificate has to be configured in such a way that the CN (Common Name) points to a wildcard for the domain being used, e.g. *.example.com.
The cryptographic security aspect of SSL is entirely based on trust. Mid-2013, there were about 650 Certificate Authorities. Every one of these authorities may pose as the “weakest link”, and therefore the security of your SSL certificate only goes as far as your trust in its CA.
As of Apache 2.4.8, the SSLCertificateChainFile directive has been deprecated, and it has been removed from the LPIC-2 objectives. This directive is good to know nevertheless. It is an addition to the SSLCertificateFile and SSLCertificateKeyFile
directives. The impact of self signed certificates for HTTPS sessions has been pointed out earlier. But even if you send a CSR
to a known CA and receive a validated certificate in return, set up your server correctly using this certificate and the correct
key file used to generate the CSR, this does not mean every browser will validate a session using this certificate as valid. This
is often due to the involved CA having signed the CSR using a certificate that is not known to your browser. This might for
instance be the case if the signing certificate is newer than your browser, but also because many CA's offer different types of certificates from their product portfolio. Browsers recognize the validity of HTTPS certificates based on the availability of so-called root certificates. These can be seen as the top level within the certificate chain. The CSR's on the other hand are often signed using a so-called intermediate certificate. These intermediate certificates are related to the root certificate, but exist on a lower level in the certificate chain. To restrict the number of certificates being shipped with browser software, not all certificates being used to sign CSR's are included by default. This could result in browser warnings about an incomplete certificate chain.
To remedy these warnings (actually errors), one or more intermediate certificates may be needed to reassemble the certificate chain for completeness. In order to facilitate this, CA's using intermediate certificates usually offer these intermediate certificates for download. The website of the CA and/or the email holding the signed certificate information should point to the appropriate intermediate certificate(s). The different levels are often referred to as Gn-level certificates, where n represents a certain number. These intermediate certificates fill the gap between your signed certificate and the root certificates known to all major browsers.
By using the SSLCertificateChainFile directive, you may point Apache to a file that holds two or more concatenated
certificates. By concatenating the missing certificates in the right order, the certificate chain gap will be closed and the certificate
chain will be complete again.
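The chain repair described above can be simulated end-to-end with openssl. In this hypothetical sketch we create a root, an intermediate and a leaf certificate ourselves (all names are made up); a client that only knows the root can then validate the leaf once the intermediate is supplied alongside it, which is exactly what a chain file arranges:

```shell
# Mark certificates signed with this extension file as CAs
printf 'basicConstraints=critical,CA:TRUE\n' > ca_ext.cnf

# Root certificate (the kind browsers ship)
openssl req -new -x509 -nodes -days 365 -subj '/CN=Example Root' \
    -keyout root.key -out root.crt

# Intermediate: a CSR signed by the root, marked as a CA itself
openssl req -new -nodes -subj '/CN=Example Intermediate G2' \
    -keyout int.key -out int.csr
openssl x509 -req -in int.csr -CA root.crt -CAkey root.key \
    -CAcreateserial -days 365 -extfile ca_ext.cnf -out int.crt

# Leaf: the web server's certificate, signed by the intermediate
openssl req -new -nodes -subj '/CN=www.example.test' \
    -keyout leaf.key -out leaf.csr
openssl x509 -req -in leaf.csr -CA int.crt -CAkey int.key \
    -CAcreateserial -days 365 -out leaf.crt

# The client knows only root.crt; the server must hand over int.crt,
# e.g. via a chain file referenced from the Apache configuration
cat int.crt > chain.crt
openssl verify -CAfile root.crt -untrusted chain.crt leaf.crt
```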
When Apache is configured as a Secure Server to serve content via the HTTPS protocol, the client and server negotiate upon
encryption initiation which ciphers can be used to secure the connection. Both the Apache server and the browser then offer a
list of their available encryption algorithms to each other. Depending on which settings are enforced, the server and browser then
initiate a secure channel using an overlapping encryption algorithm from the list of available ciphers. By setting the SSLHonorCipherOrder directive to a value of on, the server will honor the order of ciphers as specified by the SSLCipherSuite directive. Otherwise, the client will have the upper hand when deciding which ciphers will be used. This secure channel then is
used to transmit the encryption keys that will be used to secure the communication from here on forward. Those encryption keys
must also be in a cipher format that both the server and client can handle. As is often the case, trade-offs have to be made between
compatibility and security. Maximizing the amount of ciphers offered (and therefore browser compatibility) also increases the
possibility that one or more of the ciphers in use may be susceptible to attack vectors. The term 'weak ciphers' is often used to refer to these vulnerable encryption algorithms. The list of recommended (and discouraged) ciphers is heavily dependent on the publication of known vulnerabilities that abuse these ciphers. This leads to opinions that may change over time. It is therefore recommended to stay up to date about known vulnerabilities. Currently, the use of so-called Cipher Block Chaining ciphers is not recommended. These ciphers may be identified by a “CBC” part in their name. Neither is Arcfour (RC4 for short) a recommended cipher.
As far as protocols go, SSL v2 and SSL v3 are known to be vulnerable to a plethora of attack vectors. TLS v1.0 and v1.1 also
have their weaknesses. TLS v1.2 is currently the recommended protocol to use if security is a concern. With TLS v1.3 being
almost visible on the horizon. Apache allows for configuration of the ciphers being offered to clients. By using the following
directives, the list of ciphers that Apache offers to client software can be limited:
SSLCipherSuite
SSLProtocol
The following section shows an example configuration of these directives. The SSLCipherSuite directive is configured
to use strong cipher suites only, while explicitly disabling RC4. The SSLProtocol directive disables support for all
protocols, while explicitly enabling TLSv1.2 support. The order in which the ciphers should be evaluated for mutual support by
both the server and the client is determined by the server through use of the SSLHonorCipherOrder directive. Finally, the
SSLCompression directive is configured to disable SSL/TLS compression. Disabling this compression mitigates the known
CRIME (Compression Ratio Info-leak Made Easy) attack vector.
SSLCipherSuite EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH:!RC4
SSLProtocol -All +TLSv1.2
SSLHonorCipherOrder On
SSLCompression Off
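What a cipher string like the one above actually selects can be inspected locally with the openssl ciphers command. This is a quick way to validate, for instance, that no RC4 suites slip through (the exact list printed depends on the installed OpenSSL version):

```shell
# Expand the SSLCipherSuite string into the concrete list of cipher suites
# the local OpenSSL library would offer for it
openssl ciphers -v 'EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH:!RC4'
```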
From Apache 2.4 on, mod_ssl dropped support for SSLv2. Only SSLv3 and TLS are supported. Be aware that support for
recent TLS versions depends on the availability of recent OpenSSL libraries.
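A quick sketch of how to check which TLS versions the locally installed SSL library supports, using the protocol selection flags that s_client understands:

```shell
# Print the installed SSL library version
openssl version
# List the protocol flags s_client understands; recent builds
# include tls1_2 (and, on current versions, tls1_3)
openssl s_client -help 2>&1 | grep -Eo 'tls1(_[0-9])?' | sort -u
```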
A good reference before configuring a secure Apache server is the Mozilla Server Side TLS project on GitHub. This website has
example configuration strings for various major HTTP servers, including Apache. The project team members should keep the
example configurations up to date according to the latest recommendations. Another good reference is https://cipherli.st. This
website also offers example configurations.
With every example configuration it is important not to copy and paste settings without validating their content. As explained
earlier, a trade-off has to be made in most cases. Configuring Apache to strictly serve a modern protocol like TLS v1.2 will
mitigate most known attack vectors in regards to SSL and TLS connections. But not every browser on every operating system
will be able to meet this requirement. The adoption of TLS v1.2 in client software requires the availability of recent SSL
libraries, and not all software vendors keep these libraries up to date. The trade-off here is therefore between security and
compatibility. Using older protocols like SSL v2 and SSL v3 will probably increase the chances of encrypted connections being
vulnerable to known attacks, but at the same time it will maximize the number of clients that are able to set up these connections.
Another trade-off.
Apart from explicitly specifying which protocols and ciphers _can_ be used, the preference for using a certain protocol or cipher
may vary. The order in which the directives are specified has influence, but gives no guarantees. Servers and clients often use
a technique called opportunistic encryption to decide which of the protocols and ciphers will be used. At the same time, it is
possible for client software to specify exactly what protocol and ciphers should be used. Depending on the server configuration,
the server will respect these demands by the client. It is this very functionality that is the basis for a category of known attacks
called downgrade attacks.
After having set up your server, it is regarded as best practice to periodically scan the system for known vulnerabilities. This can
be done in multiple ways. An easy way is to use a public web service like Qualys SSL Labs: https://ssllabs.com. The output will
show you in detail whether, and if so which, weak protocols and/or ciphers were detected.
Candidates should be able to install and configure a proxy server, including access policies, authentication and resource usage.
The following is a partial list of the used files, terms and utilities:
• squid.conf
• acl
• http_access
Resources: Kiracofe01; Brockmeier01; Wessels01; Pearson00; the man pages for the various commands.
Web-caches
A web-cache, also known as an http proxy, is used to reduce bandwidth demands and often allows for finer-grained access
control. To use a proxy, the hostname and port number of the proxy must be specified in the client software. When the browser
tries to connect to a web server, the request will be sent to the specified proxy server, but to the user it looks like the request has
been sent to the requested web server directly. The proxy server now makes a connection to the specified web server, waits for
the answer and sends this back to the client. The proxy works like an interpreter: the client talks and listens to the proxy and the
proxy talks and listens to the web server the client wants to talk to. A proxy will also use locally cached versions of web-pages
if they have not yet expired and will also validate client requests.
Additionally, there are transparent proxies. Usually this is the tandem of a regular proxy and a redirecting router. In these cases,
a web request can be intercepted by the proxy, transparently. In this case there is no need to set up a proxy in the settings of the
client software. As far as the client software knows, it is talking directly to the target server, whereas it is actually talking to the
proxy.
squid
squid is a high-performance proxy caching server for web clients. squid supports more than just HTTP data objects: it supports
FTP and gopher objects too. squid handles all requests in a single, non-blocking, I/O-driven process. squid keeps meta
data and, especially, hot objects cached in RAM, it caches DNS lookups, supports non-blocking DNS lookups and implements
negative caching of failed requests. squid also supports SSL, extensive access controls and full request logging. By using the
lightweight Internet Cache Protocol, squid caches can be arranged in a hierarchy or mesh for additional bandwidth savings.
squid can be used for a number of things, including bandwidth saving, handling traffic spikes and caching sites that are occasion-
ally unavailable. squid can also be used for load balancing. Essentially, the first time squid receives a request from a browser, it
acts as an intermediary and passes the request on to the server. squid then saves a copy of the object. If no other clients request
the same object, no benefit will be gained. However, if multiple clients request the object before it expires from the cache, squid
can speed up transactions and save bandwidth. If you’ve ever needed a document from a slow site, say one located in another
country or hosted on a slow connection, or both, you will notice the benefit of having a document cached. The first request may
be slower than molasses, but the next request for the same document will be much faster, and the originating server’s load will
be lightened.
squid consists of a main server program squid, a Domain Name System lookup program dnsserver, some optional programs
for rewriting requests and performing authentication, and some management and client tools. When squid starts up, it spawns a
configurable number of dnsserver processes, each of which can perform a single, blocking Domain Name System (DNS) lookup.
This reduces the amount of time the cache waits for DNS lookups.
squid is normally obtained in source code format. On most systems a simple make install will suffice. After that, you will
also have a set of configuration files. In most distributions all the squid configuration files are, by default, kept in the directory
/usr/local/squid/etc. However, the location may vary, depending on the style and habits of your distribution. The
Debian packages, for example, place the configuration files in /etc, which is the normal home directory for .conf files.
Though there is more than one file in this directory, only one file is important to most administrators, namely the squid.conf
file. There are just about 125 option tags in this file but only eight options are really needed to get squid up and running. The
other options just give you additional flexibility.
squid assumes that you wish to use the default value if there is no occurrence of a tag in the squid.conf file. Theoretically, you
could even run squid with a zero-length configuration file. However, you will need to change at least one part of the configuration
file, since the default squid.conf denies access to all browsers. You will need to edit the Access Control Lists to allow your
clients to use the squid proxy. The most basic way to perform access control is to use the http_access option (see below).
Sections in the squid.conf file
http_port This option determines on which port(s) squid will listen for requests. By default this is port 3128. Another
commonly used port is port 8080.
cache_dir Used to configure specific storage areas. If you use more than one disk for cached data, you may need more
than one mount point (e.g., /usr/local/squid/cache1 for the first disk, /usr/local/squid/cache2 for the
second disk). squid allows you to have more than one cache_dir option in your config file. This option can have four
parameters:
cache_dir /usr/local/squid/cache/ 100 16 256
The first parameter determines in which directory the cache should be maintained. The second parameter is a size value in
Megabytes, where the default is 100 Megabytes. squid will store up to that amount of data in the specified directory. The last two
parameters set the number of subdirectories (first and second tier) to create in this directory. squid creates a large number
of directories and stores just a few files in each of them in an attempt to speed up disk access (finding the correct entry in
a directory with one million files in it is not efficient: it’s better to split the files up into lots of smaller sets of files).
http_access, acl The basic syntax of the option is http_access allow|deny [!]aclname. If you want to pro-
vide access to an internal network, and deny access to anyone else, your options might look like this:
acl home src 10.0.0.0/255.0.0.0
http_access allow home
The first line sets up an Access Control List class called “home” of an internal network range of ip addresses. The second
line allows access to that range of ip addresses. Assuming it’s the final line in the access list, all other clients will be denied.
See also the section on acl.
Tip
Note that squid’s default behavior is to do the opposite of your last access line if it can’t find a matching entry. For
example, if the last line is set to “allow” access for a certain set of network addresses, then squid will deny any client that
doesn’t match any of its rules. On the other hand, if the last line is set to “deny” access, then squid will allow access to
any client that doesn’t match its rules.
auth_param This option is used to specify which program to start up as an authenticator. You can specify the name of the
program and any parameters needed.
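Putting the options above together, a minimal working squid.conf might look like the sketch below. The cache directory path and the client network are assumptions for the example:

http_port 3128
cache_dir /usr/local/squid/cache 100 16 256

acl localnet src 192.168.1.0/255.255.255.0
http_access allow localnet
http_access deny all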
After you have made changes to your configuration, issue squid -k reconfigure so that squid will use the changes.
Redirectors
squid can be configured to pass every incoming URL through a redirector process that returns either a new URL or a blank line
to indicate no change. A redirector is an external program, e.g. a script that you wrote yourself. Thus, a redirector program is
NOT a standard part of the squid package. However, some examples are provided in the contrib/ directory of the source
distribution. Since everyone has different needs, it is up to the individual administrators to write their own implementation.
A redirector allows the administrator to control the web sites his users can get access to. It can be used in conjunction with
transparent proxies to deny the users of your network access to certain sites, e.g. porn-sites and the like.
The redirector program must read URLs (one per line) on standard input, and write rewritten URLs or blank lines on standard
output. Also, squid writes additional information after the URL which a redirector can use to make a decision. The input line
consists of four fields:
URL ip-address/fqdn ident method
• The URL originally requested.
• The IP address, and the fully qualified domain name (or "-" if unknown), of the requesting client.
• The results of any IDENT / AUTH lookup done for this client, if enabled.
• The HTTP method used in the request, e.g. GET.
A sample input line:
ftp://ftp.net.lboro.ac.uk/gnome/stable/releases/gnome-1.0.53/README 192.168.12.34/- - GET
It is possible to send an HTTP redirect to the new URL directly to the client, rather than have squid silently fetch the alternative
URL. To do this, the redirector should begin its response with 301: or 302: depending on the type of redirect.
A simple, very fast redirector called squirm is a good place to start; it uses the regex library to allow pattern matching.
The following Perl script may also be used as a template for writing your own redirector:
#!/usr/local/bin/perl
$|=1; # Unbuffer output
while (<>) {
    s@http://fromhost.com@http://tohost.org@;
    print;
}
Authenticators
squid can make use of authentication. Authentication can be done on various levels, e.g. network or user.
Browsers are capable of sending the user's authentication credentials using a special "authorization request header". This works as
follows. If squid gets a request, given there was an http_access rule list that points to a proxy_auth ACL, squid looks
for an authorization header. If the header is present, squid decodes it and extracts a username and password. If the header
is missing, squid returns an HTTP reply with status 407 (Proxy Authentication Required). The user agent (browser) receives
the 407 reply and then prompts the user to enter a name and password. The name and password are encoded, and sent in the
authorization header for subsequent requests to the proxy.
Authentication is actually performed outside of the main squid process. When squid starts, it spawns a number of authentication
subprocesses. These processes read usernames and passwords on stdin and reply with OK or ERR on stdout. This technique
allows you to use a number of different authentication schemes. The current supported schemes are: basic, digest, ntlm and
negotiate.
Squid ships with several back-ends for the basic authentication scheme, among them helpers that validate against htpasswd-style
(NCSA) files, PAM, LDAP and the system password database. The ntlm, negotiate and digest authentication schemes provide
more secure authentication methods, because passwords are not exchanged over the wire or air in plain text.
Configuration of each scheme is done via the auth_param directive in the config file. Each scheme has some global and scheme-
specific configuration options. The order in which authentication schemes are presented to the client depends on the order in
which each scheme first appears in the config file.
Example configuration file with multiple auth_param directives:
#Recommended minimum configuration per scheme:
#
#auth_param negotiate program < uncomment and complete this line to activate>
#auth_param negotiate children 20 startup=0 idle=1
#auth_param negotiate keep_alive on
#
#auth_param ntlm program < uncomment and complete this line to activate>
#auth_param ntlm children 20 startup=0 idle=1
#auth_param ntlm keep_alive on
#
#auth_param digest program < uncomment and complete this line>
#auth_param digest children 20 startup=0 idle=1
#auth_param digest realm Squid proxy-caching web server
#auth_param digest nonce_garbage_interval 5 minutes
#auth_param digest nonce_max_duration 30 minutes
#auth_param digest nonce_max_count 50
#
#auth_param basic program < uncomment and complete this line>
#auth_param basic children 5 startup=5 idle=1
#auth_param basic realm Squid proxy-caching web server
#auth_param basic credentialsttl 2 hours
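As a concrete sketch of the basic scheme, the configuration below ties an htpasswd-style password file to a proxy_auth ACL. The helper path and password file location vary per distribution and are assumptions here:

auth_param basic program /usr/lib/squid/basic_ncsa_auth /etc/squid/passwd
auth_param basic realm Squid proxy-caching web server
auth_param basic credentialsttl 2 hours

acl authenticated proxy_auth REQUIRED
http_access allow authenticated
http_access deny all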
Access policies
Many squid.conf options require the use of Access Control Lists (ACLs). Each ACL consists of a name, type and value (a
string or filename). ACLs are often regarded as the most difficult part of the squid cache configuration: the layout and
concept are not immediately obvious to most people. Additionally, the use of external authenticators and the default ACL adds
to the confusion.
ACLs can be seen as definitions of resources that may or may not gain access to certain functions in the web-cache. Allowing
the use of the proxy server is one of these functions.
To regulate access to certain functions, you will have to define an ACL first, and then add a line to deny or allow access to a
function of the cache, thereby using that ACL as a reference. In most cases the feature to allow or deny will be http_access,
which allows or denies a web browser access to the web-cache. The same principles apply to the other options, such as
icp_access (Internet Cache Protocol).
To determine whether a resource (e.g. a user) has access to the web-cache, squid works its way through the http_access list
from top to bottom. It will match the rules, until one is found that matches the user and either denies or allows access. Thus,
if you want to allow access to the proxy only to those users whose machines fall within a certain IP range you would use the
following:
acl ourallowedhosts src 192.168.1.0/255.255.255.0
acl all src 0.0.0.0/0.0.0.0

http_access allow ourallowedhosts
http_access deny all
If a user from 192.168.1.2 connects using TCP and requests a URL, squid will work its way through the list of http_access
lines from top to bottom, stopping at the first rule that matches. In this case, squid will match on the first http_access line.
Since the policy that matched is allow, squid will proceed to allow the request.
The src option on the first line is one of the options you can use to decide which domain the requesting user is in. You can
regulate access based on the source or destination IP address, domain or domain regular expression, hours, days, URL, port,
protocol, method, username or type of browser. ACLs may also require user authentication, specify an SNMP read community
string, or set a TCP connection limit.
For example, these lines would keep all internal IP addresses off the Web except during lunchtime:
acl allowed_hosts src 192.168.1.0/255.255.255.0
acl lunchtime time MTWHF 12:00-13:00
http_access allow allowed_hosts lunchtime
The MTWHF string denotes the days of the week, where M specifies Monday, T specifies Tuesday and so on (A is Saturday, S is
Sunday). WHFAS thus means Wednesday through Sunday. For more options have a look at the default configuration file squid
installs on your system.
Another example is the blocking of certain sites, based on their domain names:
acl adults dstdomain playboy.com sex.com
acl ourallowedhosts src 192.168.1.0/255.255.255.0
acl all src 0.0.0.0/0.0.0.0

http_access deny adults
http_access allow ourallowedhosts
http_access deny all
These lines deny users who request sites listed in the adults ACL access to the web-cache (http_access). If another site
is requested, the next line allows access if the user is in the range specified by the ACL ourallowedhosts. If the user
is not in that range, the third line will deny access to the web-cache.
To use an authenticator, you have to tell squid which program it should use to authenticate a user (using the authenticate_
program option in the squid.conf file). Next you need to set up an ACL of type proxy_auth and add a line to regulate
the access to the web-cache using that ACL. Here's an example:

authenticate_program /sbin/my_auth
acl name proxy_auth REQUIRED
http_access allow name
The ACL points to the external authenticator /sbin/my_auth. If a user wants access to the webcache (the http_access
function), you would expect that (as usual) the request is granted if the ACL name is matched. HOWEVER, this is not the case!
Authenticator Behaviour
The squid documentation warns that if the example above is implemented as it stands, the final line "http_access allow name"
implicitly adds a final rule "http_access deny all". If the external authenticator grants access, the access is not granted directly,
but the next rule is checked - and that next rule is the default deny rule if you do not specify one yourself! This means that
properly authorized people would be denied access. This exceptional behavior of squid is often misunderstood and puzzles many
novice squid administrators. A common solution is to add an extra line, like this:
http_access allow name
http_access allow all
squid uses lots of memory. For performance reasons this makes sense since it takes much, much longer to read something from
disk compared to reading directly from memory. A small amount of metadata for each cached object is kept in memory, the
so-called StoreEntry. For squid version 2 this is 56 bytes on "small" pointer architectures (Intel, Sparc, MIPS, etc.) and 88 bytes on
“large” pointer architectures (Alpha). In addition, there is a 16-byte cache key (MD5 checksum) associated with each StoreEntry.
This means there are 72 or 104 bytes of metadata in memory for every object in your cache. A cache with 1,000,000 objects
therefore requires 72 MB of memory for metadata only.
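The arithmetic above is easy to reproduce; a small shell sketch for the small-pointer case:

```shell
# 56-byte StoreEntry + 16-byte MD5 cache key = 72 bytes per cached object
objects=1000000
bytes_per_object=72
echo "$(( objects * bytes_per_object / 1000000 )) MB of metadata"
# prints: 72 MB of metadata
```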
In practice, squid requires much more than that. Other uses of memory by squid include in-transit objects, disk and network I/O buffers, the IP and FQDN caches, and per-request state information such as full request and reply headers.
You can use a number of parameters in squid.conf to determine squid’s memory utilization:
• The cache_mem parameter specifies how much memory to use for caching hot (very popular) requests. squid’s actual
memory usage depends strongly on your disk space (cache space) and your incoming request load. Reducing cache_mem
will usually also reduce squid’s process size, but not necessarily.
• The maximum_object_size option in squid.conf specifies the maximum file size that will be cached. Objects larger
than this size will NOT be saved on disk. The value is specified in kilobytes and the default is 4MB. If speed is more important
than saving bandwidth, you should leave this low.
• The minimum_object_size option specifies that objects smaller than this size will NOT be saved on disk. The value is
specified in kilobytes, and the default is 0 KB, which means there is no minimum (and everything will be saved to disk).
• The cache_swap option tells squid how much disk space it may use. If you have a large disk cache, you may find that you
do not have enough memory to run squid effectively. If it performs badly, consider increasing the amount of RAM or reducing
the cache_swap.
Candidates should be able to install and configure a reverse proxy server, Nginx. Basic configuration of Nginx as a HTTP server
is included.
Nginx
Reverse Proxy
Basic Web Server
• /etc/nginx/
• nginx
Resources: NginX01; NginX02; the man pages for the various commands.
NGINX
Nginx can be used as a webserver, an HTTP reverse proxy or as an IMAP/POP3 proxy. It is pronounced as engine-x. Nginx
performs so well that large sites such as Netflix, Wordpress and GitHub rely on it. It does not use threads, as most other
webserver software does, but an event-driven (asynchronous) architecture. It has a small footprint and
performs very well under load with predictable usage of resources. This predictable performance and small memory footprint
makes Nginx interesting in both small and large environments. Nginx is distributed as Open Source software. There is also
'Nginx Plus', which is the commercial edition. This book will focus on the Open Source edition.
Reverse Proxy
A proxy server is a go-between or intermediary server that forwards requests for content from multiple clients to different servers
across the Internet. A reverse proxy server is a type of proxy server that typically sits behind the firewall in a private network and
directs client requests to the appropriate back-end server. A reverse proxy provides an additional level of abstraction and control
to ensure the smooth flow of network traffic between clients and servers.
Common uses for a reverse proxy server include:
• Load balancing – A reverse proxy server can act as a “traffic cop,” sitting in front of your back-end servers and distributing
client requests across a group of servers in a manner that maximizes speed and capacity utilization while ensuring no server is
overloaded, which can degrade performance. If a server goes down, the load balancer redirects traffic to the remaining online
servers.
• Web acceleration – Reverse proxies can compress inbound and outbound data, as well as cache commonly requested content,
both of which speed up the flow of traffic between clients and servers. They can also perform additional tasks such as SSL
encryption to take load off of your web servers, thereby boosting their performance.
• Security and anonymity – By intercepting requests headed for your back-end servers, a reverse proxy server protects their
identities and acts as an additional defense against security attacks. It also ensures that multiple servers can be accessed from
a single record locator or URL regardless of the structure of your local area network.
Configuring Nginx as a reverse HTTP proxy is not hard. A very basic reverse proxy setup might look like this:
location / {
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $remote_addr;
    proxy_set_header Host $host;
    proxy_pass http://localhost:8000;
}
In this example all requests received by Nginx, depending on the configuration of the server parameters in /etc/nginx/
nginx.conf, are forwarded to an HTTP server running on localhost and listening on port 8000. The Nginx configuration file
looks like this:
server {
    listen 80;

    root /var/www/;
    index index.php index.html index.htm;

    location / {
        try_files $uri $uri/ /index.php;
    }

    location ~ \.php$ {
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_set_header Host $host;
        proxy_pass http://localhost:8080;
    }

    location ~ /\.ht {
        deny all;
    }
}
The line starting with location ~ /\.ht is added to prevent Nginx from displaying the content of Apache's .htaccess files.
The try_files line is used to attempt to serve whatever page the visitor requests. If Nginx is unable to, the request is passed
to the proxy.
A more complete server block might look like this:

server {
    # Here you can set a server name, you can use wildcards such as *.example.com
    # however remember if you use server_name *.example.com; you'll only match subdomains
    # to match both subdomains and the main domain use both example.com and *.example.com
    server_name example.com www.example.com;

    # It is best to place the root of the server block at the server level, and not the location level
    # any location block path will be relative to this root.
    root /usr/local/www/example.com;

    # Access and error logging. NB: Error logging cannot be turned off.
    access_log /var/log/nginx/example.access.log;
    error_log /var/log/nginx/example.error.log;

    location / {
        # Rewrite rules and other criteria can go here
        # Remember to avoid using if() where possible (http://wiki.nginx.org/IfIsEvil)
    }
}
For PHP support Nginx relies on a PHP FastCGI spawner. The preferred one is php-fpm, which can be found at http://php-fpm.org.
It has some unique features like adaptive process spawning and statistics, and it is able to start workers with different
uid/gid/chroot/environment settings and different php.ini options. The safe_mode directive can be replaced using this feature.
You can add the content below to nginx.conf. A better practice is to put the contents in a file and include this file into the
main configuration file of Nginx. Create a file, for example php.conf and include the next line at the end of the Nginx main
configuration file:
include php.conf;
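The included php.conf would then contain the FastCGI handler block. A sketch of what such a file might contain, assuming php-fpm listens on a local Unix socket (the socket path is an assumption and differs per distribution):

location ~ \.php$ {
    try_files $uri =404;
    fastcgi_pass unix:/var/run/php-fpm.sock;
    fastcgi_index index.php;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    include fastcgi_params;
}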
Web Services
1. What is the name of the file that contains Apache configuration items that can be set on a per directory basis?
.htaccess.
2. Which protocol is supported by the Apache web server that enables secure online communication between client and server?
SSL.
3. What is the principle behind PKC (Public Key Cryptography)?
PKC is based on a public and a secret key, in which the sender of a message encrypts the data with the public key of the
receiver. Only the owner of the corresponding private key can decipher this data.
4. What is the difference between Covalent’s Raven SSL Module and mod_ssl?
The difference is in the license; mod_ssl is open source and Covalent’s Raven SSL Module is not.
7. Which standard format is used to write entries in the Apache web server access log file?
Common Log Format.
8. Name the two methods of access control in use by an Apache web server.
Discretionary access control (DAC) and mandatory access control (MAC).
11. By which means can an Apache web server configuration file be checked for syntax errors?
apachectl configtest.