Paper 2
Eric Allman*
University of California, Berkeley
Mammoth Project
ABSTRACT
Routing mail through a heterogeneous internet presents many new problems. Among
the worst of these is that of address mapping. Historically, this has been handled on an
ad hoc basis. However, this approach has become unmanageable as internets grow.
Sendmail acts as a unified "post office" to which all mail can be submitted. Address inter-
pretation is controlled by a production system, which can parse both domain-based ad-
dressing and old-style ad hoc addresses. The production system is powerful enough to
rewrite addresses in the message header to conform to the standards of a number of
common target networks, including old (NCP/RFC733) Arpanet, new (TCP/RFC822)
Arpanet, UUCP, and Phonenet. Sendmail also implements an SMTP server, message
queueing, and aliasing.
Sendmail implements a general internetwork mail routing facility, featuring aliasing and forwarding,
automatic routing to network gateways, and flexible configuration.
In a simple network, each node has an address, and resources can be identified with a host-resource
pair; in particular, the mail system can refer to users using a host-username pair. Host names and numbers
have to be administered by a central authority, but usernames can be assigned locally to each host.
In an internet, multiple networks with different characteristics and managements must communicate.
In particular, the syntax and semantics of resource identification change. Certain special cases can be han-
dled trivially by ad hoc techniques, such as providing network names that appear local to hosts on other
networks, as with the Ethernet at Xerox PARC. However, the general case is extremely complex. For
example, some networks require point-to-point routing, which simplifies the database update problem since
only adjacent hosts must be entered into the system tables, while others use end-to-end addressing. Some
networks use a left-associative syntax and others use a right-associative syntax, causing ambiguity in mixed
addresses.
Internet standards seek to eliminate these problems. Initially, these standards proposed expanding the address
pairs to address triples of the form {network, host, resource}. Network numbers must be univer-
sally agreed upon, and hosts can be assigned locally on each network. The user-level presentation was
quickly expanded to address domains, comprised of a local resource identification and a hierarchical
domain specification with a common static root. The domain technique separates the issue of physical ver-
sus logical addressing. For example, an address of the form “eric@a.cc.berkeley.arpa” describes only the
logical organization of the address space.
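As an illustration of the domain form, the following minimal sketch splits an address into its local resource identification and its hierarchical domain. The function name and the choice to list the domain labels root first are assumptions made for this example, not part of sendmail.

```python
def split_domain_address(address):
    """Split 'local@domain' into the local part and the domain labels.

    The labels are returned root first (most general to most specific);
    that ordering is an illustrative choice only.
    """
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("not a domain-based address: %r" % (address,))
    labels = domain.split(".")
    return local, list(reversed(labels))


local, path = split_domain_address("eric@a.cc.berkeley.arpa")
print(local, path)
```

Nothing in the result says how to reach the host physically; the labels describe only the logical position of the address in the name space.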
Sendmail is intended to help bridge the gap between the totally ad hoc world of networks that know
nothing of each other and the clean, tightly-coupled world of unique network numbers. It can accept old
arbitrary address syntaxes, resolving ambiguities using heuristics specified by the system administrator, as
well as domain-based addressing. It helps guide the conversion of message formats between disparate net-
works. In short, sendmail is designed to assist a graceful transition to consistent internetwork addressing
schemes.
*A considerable part of this work was done while under the employ of the INGRES Project at the University of California at
Berkeley and at Britton Lee.
Section 1 discusses the design goals for sendmail. Section 2 gives an overview of the basic functions
of the system. In section 3, details of usage are discussed. Section 4 compares sendmail to other internet
mail routers, and an evaluation of sendmail is given in section 5, including future plans.
1. DESIGN GOALS
Design goals for sendmail include:
(1) Compatibility with the existing mail programs, including Bell version 6 mail, Bell version 7
mail [UNIX83], Berkeley Mail [Shoens79], BerkNet mail [Schmidt79], and hopefully UUCP
mail [Nowitz78a, Nowitz78b]. ARPANET mail [Crocker77a, Postel77] was also required.
(2) Reliability, in the sense of guaranteeing that every message is correctly delivered or at least
brought to the attention of a human for correct disposal; no message should ever be completely
lost. This goal was considered essential because of the emphasis on mail in our environment. It
has turned out to be one of the hardest goals to satisfy, especially in the face of the many
anomalous message formats produced by various ARPANET sites. For example, certain sites
generate improperly formatted addresses, occasionally causing error-message loops. Some hosts
use blanks in names, causing problems with UNIX mail programs that assume that an address is
one word. The semantics of some fields are interpreted slightly differently by different sites. In
summary, the obscure features of the ARPANET mail protocol really are used and are difficult
to support, but must be supported.
(3) Existing software to do actual delivery should be used whenever possible. This goal derives as
much from political and practical considerations as technical.
(4) Easy expansion to fairly complex environments, including multiple connections to a single net-
work type (such as with multiple UUCP or Ether nets [Metcalfe76]). This goal requires consid-
eration of the contents of an address as well as its syntax in order to determine which gateway
to use. For example, the ARPANET is bringing up the TCP protocol to replace the old NCP
protocol. No host at Berkeley runs both TCP and NCP, so it is necessary to look at the
ARPANET host name to determine whether to route mail to an NCP gateway or a TCP gateway.
(5) Configuration should not be compiled into the code. A single compiled program should be able
to run as is at any site (barring such basic changes as the CPU type or the operating system).
We have found this seemingly unimportant goal to be critical in real life. Besides the simple
problems that occur when any program gets recompiled in a different environment, many sites
like to “fiddle” with anything that they will be recompiling anyway.
(6) Sendmail must be able to let various groups maintain their own mailing lists, and let individuals
specify their own forwarding, without modifying the system alias file.
(7) Each user should be able to specify which mailer to execute to process mail being delivered for
him. This feature allows users who are using specialized mailers that use a different format to
build their environment without changing the system, and facilitates specialized functions (such
as returning an “I am on vacation” message).
(8) Network traffic should be minimized by batching addresses to a single host where possible,
without assistance from the user.
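Goal (4) can be illustrated with a minimal sketch of routing on the contents of an address rather than on its syntax alone. The host names, gateway names, and the table itself are hypothetical; in sendmail this decision is driven by the configuration file, not by code like this.

```python
# Hypothetical table recording which protocol each ARPANET host speaks.
HOST_PROTOCOL = {
    "host-a": "ncp",
    "host-b": "tcp",
}

# Hypothetical names of the local machines acting as gateways.
GATEWAY_FOR_PROTOCOL = {
    "ncp": "ncp-gateway",
    "tcp": "tcp-gateway",
}


def pick_gateway(host, default_protocol="tcp"):
    """Choose a gateway by looking at the host name, not just the syntax."""
    protocol = HOST_PROTOCOL.get(host.lower(), default_protocol)
    return GATEWAY_FOR_PROTOCOL[protocol]


print(pick_gateway("host-a"))
print(pick_gateway("HOST-B"))
```

Two addresses with identical syntax can therefore be routed through different gateways, which is why syntax alone is not enough.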
These goals motivated the architecture illustrated in figure 1. The user interacts with a mail gen-
erating and sending program. When the mail is created, the generator calls sendmail, which routes the
message to the correct mailer(s). Since some of the senders may be network servers and some of the
mailers may be network clients, sendmail may be used as an internet mail gateway.
SENDMAIL — An Internetwork Mail Router SMM:9-3
[Figure 1: senders, sendmail, and mailers. Figure not reproduced.]
2. OVERVIEW
1 Except when mailing to a file, when sendmail does the delivery directly.
3.1. Arguments
Arguments may be flags and addresses. Flags set various processing options. Following flag
arguments, address arguments may be given, unless we are running in SMTP mode. Addresses fol-
low the syntax in RFC822 [Crocker82] for ARPANET address formats. In brief, the format is:
(1) Anything in parentheses is thrown away (as a comment).
(2) Anything in angle brackets (“< >”) is preferred over anything else. This rule implements the
ARPANET standard that addresses of the form
user name <machine-address>
will send to the electronic “machine-address” rather than the human “user name.”
(3) Double quotes ( " ) quote phrases; backslashes quote characters. Backslashes are more
powerful in that they will cause otherwise equivalent phrases to compare differently — for
example, user and "user" are equivalent, but \user is different from either of them.
Parentheses, angle brackets, and double quotes must be properly balanced and nested. The
rewriting rules control the remaining parsing (see footnote 3).
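Rules (1) through (3) can be sketched as a small scanner. This is an illustration under simplifying assumptions, not sendmail's parser: the function name is hypothetical, at most one level of angle brackets is accepted, and runs of white space are collapsed in the result.

```python
def extract_address(text):
    """Apply rules (1)-(3) to one address and return the part to route on."""
    outside, inside = [], []   # text outside and inside angle brackets
    target = outside
    depth = 0                  # nesting depth of parenthesized comments
    in_quote = False
    in_angle = False
    saw_angle = False
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            # A backslash quotes one character. Both are kept so that
            # \user stays distinct from user and "user".
            if depth == 0:
                target.append(text[i:i + 2])
            i += 2
            continue
        if depth > 0:                      # rule (1): comments are dropped
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
        elif in_quote:                     # rule (3): quotes group a phrase
            if ch == '"':
                in_quote = False
            else:
                target.append(ch)
        elif ch == '"':
            in_quote = True
        elif ch == "(":
            depth = 1
        elif ch == ")":
            raise ValueError("unbalanced ')'")
        elif ch == "<":                    # rule (2): brackets take priority
            if in_angle:
                raise ValueError("nested '<'")
            in_angle = saw_angle = True
            target = inside
        elif ch == ">":
            if not in_angle:
                raise ValueError("unbalanced '>'")
            in_angle = False
            target = outside
        else:
            target.append(ch)
        i += 1
    if depth or in_quote or in_angle:
        raise ValueError("unbalanced delimiters")
    chosen = inside if saw_angle else outside
    return " ".join("".join(chosen).split())
```

For example, an argument of the form `Eric Allman <eric@berkeley> (the author)` reduces to `eric@berkeley`; everything after that point is left to the rewriting rules.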
2 Obviously, if the site giving the error is not the originating site, the only reasonable option is to mail back to the sender. Also,
there are many more error disposition options, but they only affect the error message; the “return to sender” function is always han-
dled in one of these two ways.
3 Disclaimer: Some special processing is done after rewriting local names; see below.
3.3.1. Aliasing
Aliasing maps names to address lists using a system-wide file. This file is indexed to
speed access. Only names that parse as local are allowed as aliases; this guarantees a unique
key (since there are no nicknames for the local host).
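A minimal sketch of alias lookup follows. The alias table, the names in it, and the test for "local" are hypothetical, and re-expanding the results of an alias (with a guard against loops) is an assumption of this example rather than something stated above.

```python
# Hypothetical contents of the system-wide alias file, already loaded.
ALIASES = {
    "staff": ["eric", "kurt@hostb"],
    "postmaster": ["staff"],
}


def is_local(name):
    # Assumption: a name containing neither '@' nor '!' parses as local.
    return "@" not in name and "!" not in name


def expand_alias(name, aliases=ALIASES, seen=None):
    """Map a name to its address list; only local names are looked up."""
    seen = set() if seen is None else seen
    if not is_local(name) or name not in aliases or name in seen:
        return [name]
    seen.add(name)
    result = []
    for target in aliases[name]:
        for address in expand_alias(target, aliases, seen):
            if address not in result:
                result.append(address)
    return result


print(expand_alias("postmaster"))
```

Because only local names are keys, a lookup never has to decide whether two spellings of a remote address denote the same recipient.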
3.3.2. Forwarding
After aliasing, recipients that are local and valid are checked for the existence of a “.forward”
file in their home directory. If it exists, the message is not sent to that user, but rather to
the list of users in that file. Often this list will contain only one address, and the feature will be
used for network mail forwarding.
Forwarding also permits a user to specify a private incoming mailer. For example, for-
warding to:
"| /usr/local/newmail myname"
will use a different incoming mailer.
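The forwarding check can be sketched as follows. The function names are hypothetical, and two details are assumptions of this example: the file is read as one entry per line, and an empty file is treated as if no file existed.

```python
import os


def parse_forward_entry(entry):
    """Classify one entry; a leading '|' names a private incoming mailer."""
    text = entry.strip()
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        text = text[1:-1].strip()
    if text.startswith("|"):
        return ("program", text[1:].strip())
    return ("address", text)


def forward_targets(user, home_dir):
    """Return where mail for a valid local user should actually go."""
    path = os.path.join(home_dir, ".forward")
    if not os.path.isfile(path):
        return [("address", user)]
    with open(path) as handle:
        entries = [line for line in handle if line.strip()]
    if not entries:
        # Assumption: an empty file means ordinary local delivery.
        return [("address", user)]
    return [parse_forward_entry(entry) for entry in entries]
```

With the entry shown above, `parse_forward_entry` yields a program target whose command is `/usr/local/newmail myname`, while an ordinary entry such as a network address is returned unchanged.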
3.3.3. Inclusion
Inclusion is specified in RFC 733 [Crocker77a] syntax:
:Include: pathname
An address of this form reads the file specified by pathname and sends to all users listed in that
file.
The intent is not to support direct use of this feature, but rather to use this as a subset of
aliasing. For example, an alias of the form:
project: :include:/usr/project/userlist
is a method of letting a project maintain a mailing list without interaction with the system
administration, even if the alias file is protected.
It is not necessary to rebuild the index on the alias database when a :include: list is
changed.
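A minimal sketch of inclusion shows why no rebuild is needed: the list file is read when the address is expanded, not when the alias index is built. The function name and the one-address-per-line file format are assumptions of this example.

```python
def expand_include(address):
    """Expand ':include:pathname' into the addresses listed in that file."""
    prefix = ":include:"
    if not address.lower().startswith(prefix):
        return [address]
    pathname = address[len(prefix):].strip()
    # The file is opened on every expansion, so edits take effect at once.
    with open(pathname) as handle:
        return [line.strip() for line in handle if line.strip()]
```

A project can thus edit its own list file freely while the alias entry that names the file stays unchanged in the protected alias file.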
3.7. Configuration
Configuration is controlled primarily by a configuration file read at startup. Sendmail should
not need to be recompiled except:
(1) To change operating systems (V6, V7/32V, 4BSD).
(2) To remove or insert the DBM (UNIX database) library.
(3) To change ARPANET reply codes.
(4) To add header fields requiring special processing.
Adding mailers or changing parsing (i.e., rewriting) or routing information does not require recom-
pilation.
If the mail is being sent by a local user, and the file “.mailcf” exists in the sender’s home
directory, that file is read as a configuration file after the system configuration file. The primary use
of this feature is to add header lines.
The configuration file encodes macro definitions, header definitions, mailer definitions,
rewriting rules, and options.
3.7.1. Macros
Macros can be used in three ways. Certain macros transmit unstructured textual informa-
tion into the mail system, such as the name sendmail will use to identify itself in error messages.
Other macros transmit information from sendmail to the configuration file for use in creating
other fields (such as argument vectors to mailers); e.g., the name of the sender, and the host and
user of the recipient. Other macros are unused internally, and can be used as shorthand in the
configuration file.
4.1. Delivermail
Sendmail is an outgrowth of delivermail. The primary differences are:
(1) Configuration information is not compiled in. This change simplifies many of the problems
of moving to other machines. It also allows easy debugging of new mailers.
(2) Address parsing is more flexible. For example, delivermail only supported one gateway to
any network, whereas sendmail can be sensitive to host names and reroute to different gate-
ways.
(3) Forwarding and :include: features eliminate the requirement that the system alias file be
writable by any user (or that an update program be written, or that the system administration
make all changes).
(4) Sendmail supports message batching across networks when a message is being sent to mul-
tiple recipients.
(5) A mail queue is provided in sendmail. Mail that cannot be delivered immediately but can
potentially be delivered later is stored in this queue for a later retry. The queue also pro-
vides a buffer against system crashes; after the message has been collected it may be reli-
ably redelivered even if the system crashes during the initial delivery.
(6) Sendmail uses the networking support provided by 4.2BSD to provide a direct interface to net-
works such as the ARPANET and/or Ethernet using SMTP (the Simple Mail Transfer Proto-
col) over a TCP/IP connection.
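The batching in item (4) can be sketched as grouping recipients by destination host, so that one copy of the message carries every recipient on that host. The function name, the host names, and the "(local)" label for addresses without a host are hypothetical.

```python
def batch_by_host(recipients, local_label="(local)"):
    """Group recipient addresses by host, preserving first-seen order."""
    batches = {}
    for address in recipients:
        user, sep, host = address.rpartition("@")
        if not sep:
            # No '@': treat the whole address as a local user name.
            user, host = address, local_label
        batches.setdefault(host.lower(), []).append(user)
    return batches


print(batch_by_host(["eric@hosta", "kurt@hostb", "mark@hosta", "bill"]))
```

Each key in the result then corresponds to a single transfer, regardless of how many recipients it serves.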
4.2. MMDF
MMDF [Crocker79] spans a wider problem set than sendmail. For example, the domain of
MMDF includes a “phone network” mailer, whereas sendmail calls on preexisting mailers in most
cases.
MMDF and sendmail both support aliasing, customized mailers, message batching, automatic
forwarding to gateways, queueing, and retransmission. MMDF supports two-stage timeout, which
sendmail does not support.
The configuration for MMDF is compiled into the code (see footnote 4).
Since MMDF does not consider backwards compatibility as a design goal, the address parsing
is simpler but much less flexible.
It is somewhat harder to integrate a new channel (see footnote 5) into MMDF. In particular, MMDF must
know the location and format of host tables for all channels, and the channel must speak a special
protocol. This allows MMDF to do additional verification (such as verifying host names) at submis-
sion time.
MMDF strictly separates the submission and delivery phases. Although sendmail has the
concept of each of these stages, they are integrated into one program, whereas in MMDF they are
split into two programs.
4 Dynamic configuration tables are currently being considered for MMDF, allowing the installer to select either compiled or dy-
namic tables.
5 The MMDF equivalent of a sendmail “mailer.”
6 This is similar to the NBS standard.
ACKNOWLEDGEMENTS
Thanks are due to Kurt Shoens for his continual cheerful assistance and good advice, Bill Joy for
pointing me in the correct direction (over and over), and Mark Horton for more advice, prodding, and many
of the good ideas. Kurt and Eric Schmidt are to be credited for using delivermail as a server for their pro-
grams (Mail and BerkNet respectively) before any sane person should have, and making the necessary
modifications promptly and happily. Eric gave me considerable advice about the perils of network software
which saved me an unknown amount of work and grief. Mark did the original implementation of the DBM
version of aliasing, installed the VFORK code, wrote the current version of rmail, and was the person who
really convinced me to put the work into delivermail to turn it into sendmail. Kurt deserves accolades for
using sendmail when I was myself afraid to take the risk; how a person can continue to be so enthusiastic in
the face of so much bitter reality is beyond me.
Kurt, Mark, Kirk McKusick, Marvin Solomon, and many others have reviewed this paper, giving
considerable useful advice.
Special thanks are reserved for Mike Stonebraker at Berkeley and Bob Epstein at Britton Lee, who
both knowingly allowed me to put so much work into this project when there were so many other things I
really should have been working on.
REFERENCES
[Birrell82] Birrell, A. D., Levin, R., Needham, R. M., and Schroeder, M. D., “Grapevine:
An Exercise in Distributed Computing.” In Comm. A.C.M. 25, 4, April 1982.
[Borden79] Borden, S., Gaines, R. S., and Shapiro, N. Z., The MH Message Handling Sys-
tem: Users’ Manual. R-2367-PAF. Rand Corporation. October 1979.
[Crocker77a] Crocker, D. H., Vittal, J. J., Pogran, K. T., and Henderson, D. A. Jr., Standard for
the Format of ARPA Network Text Messages. RFC 733, NIC 41952. In [Fein-
ler78]. November 1977.
[Crocker77b] Crocker, D. H., Framework and Functions of the MS Personal Message System.
R-2134-ARPA, Rand Corporation, Santa Monica, California. 1977.
[Crocker79] Crocker, D. H., Szurkowski, E. S., and Farber, D. J., An Internetwork Memo Dis-
tribution Facility — MMDF. 6th Data Communication Symposium, Asilomar.
November 1979.
[Crocker82] Crocker, D. H., Standard for the Format of ARPA Internet Text Messages. RFC
822. Network Information Center, SRI International, Menlo Park, California.
August 1982.
[Metcalfe76] Metcalfe, R., and Boggs, D., “Ethernet: Distributed Packet Switching for Local
Computer Networks”, Communications of the ACM 19, 7. July 1976.
[Feinler78] Feinler, E., and Postel, J. (eds.), ARPANET Protocol Handbook. NIC 7104,
Network Information Center, SRI International, Menlo Park, California. 1978.
[NBS80] National Bureau of Standards, Specification of a Draft Message Format Stan-
dard. Report No. ICST/CBOS 80-2. October 1980.
[Neigus73] Neigus, N., File Transfer Protocol for the ARPA Network. RFC 542, NIC
17759. In [Feinler78]. August, 1973.
[Nowitz78a] Nowitz, D. A., and Lesk, M. E., A Dial-Up Network of UNIX Systems. Bell
Laboratories. In UNIX Programmer’s Manual, Seventh Edition, Volume 2.
August, 1978.
[Nowitz78b] Nowitz, D. A., Uucp Implementation Description. Bell Laboratories. In UNIX
Programmer’s Manual, Seventh Edition, Volume 2. October, 1978.
[Postel74] Postel, J., and Neigus, N., Revised FTP Reply Codes. RFC 640, NIC 30843. In
[Feinler78]. June, 1974.
[Postel77] Postel, J., Mail Protocol. NIC 29588. In [Feinler78]. November 1977.
[Postel79a] Postel, J., Internet Message Protocol. RFC 753, IEN 85. Network Information
Center, SRI International, Menlo Park, California. March 1979.
[Postel79b] Postel, J. B., An Internetwork Message Structure. In Proceedings of the Sixth
Data Communications Symposium, IEEE. New York. November 1979.
[Postel80] Postel, J. B., A Structured Format for Transmission of Multi-Media Documents.
RFC 767. Network Information Center, SRI International, Menlo Park, Califor-
nia. August 1980.
[Postel82] Postel, J. B., Simple Mail Transfer Protocol. RFC 821 (obsoleting RFC 788).
Network Information Center, SRI International, Menlo Park, California. August
1982.