UNIX NETWORK PROGRAMMING
Networking APIs: Sockets and XTI
The only guide to Unix network programming APIs you'll ever need!
Whether you write Web servers, client/server applications, or any other network
software, you need to understand networking APIs, especially sockets, in greater
detail than ever before. You need UNIX Network Programming, Volume 1,
Second Edition.
In this book, leading Unix networking expert W. Richard Stevens offers unprecedented,
start-to-finish guidance on making the most of sockets, the de facto standard for
Unix network programming, as well as extensive coverage of the X/Open Transport
Interface (XTI).
Learn how to choose among today's leading client/server design approaches, including
TCP iterative, concurrent, preforked, and prethreaded servers. Master the X/Open
Transport Interface, including XTI TCP clients and servers, name and address functions,
options, streams, and additional functions.
The Internet/intranet revolution has dramatically increased the demand for developers
with a sophisticated understanding of network programming APIs, especially
sockets. One book contains all you need to know: UNIX Network Programming,
Volume 1, Second Edition.
ABOUT THE AUTHOR
W. RICHARD STEVENS is author of UNIX Network Programming, First Edition,
widely recognized as the classic text in Unix networking. He is also the author of
Advanced Programming in the UNIX Environment and the TCP/IP Illustrated series.
He is an acknowledged Unix and networking expert, sought-after instructor, and
occasional consultant.
PRENTICE HALL
Upper Saddle River, NJ 07458
UNIX Network Programming
Volume 1
Second Edition
Networking APIs:
Sockets and XTI
by W. Richard Stevens
To join a Prentice Hall PTR Internet mailing list, point to
http://www.prenhall.com/mail_lists/
ISBN 0-13-490012-X
Prentice Hall PTR
Upper Saddle River, NJ 07458
Library of Congress Cataloging-in-Publication Data
Stevens, W. Richard.
UNIX network programming / by W. Richard Stevens. -- 2nd ed.
p. cm.
Includes index.
ISBN 0-13-490012-X
1. UNIX (Computer file) 2. Computer networks. 3. Internet
programming. I. Title.
GA76. 76.06 iss?
005.7°527768--deat sr-aa763
cae
Editorial/Production Supervision: Eileen Clark
Acquisitions Editor: Mary Franz
Marketing Manager: Miles Williams
Buyer: Alexis R. Heydt
Cover Design: Scott Weiss
Cover Design Direction: Jerry Votta
Editorial Assistant: Noreen Regina
© 1998 Prentice Hall PTR
Prentice-Hall, Inc.
A Simon & Schuster Company
Upper Saddle River, NJ 07458
Prentice Hall books are widely used by corporations and government agencies for training, marketing, and
resale. The publisher offers discounts on this book when ordered in bulk quantities.
For more information, contact:
Corporate Sales Department,
Phone: 800-382-3419; FAX: 201-236-7141
E-mail (Internet):
[email protected]
Or write: Prentice Hall PTR
Corp. Sales Department
One Lake Street
Upper Saddle River, NJ 07458
All rights reserved. No part of this book may be
reproduced, in any form or by any means, without
permission in writing from the publisher.
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
ISBN 0-13-490012-X
Prentice-Hall International (UK) Limited, London
Prentice-Hall of Australia Pty. Limited, Sydney
Prentice-Hall Canada Inc., Toronto
Prentice-Hall Hispanoamericana, S.A., Mexico
Prentice-Hall of India Private Limited, New Delhi
Prentice-Hall of Japan, Inc., Tokyo
Simon & Schuster Asia Pte. Ltd., Singapore
Editora Prentice-Hall do Brasil Ltda., Rio de Janeiro
To Sally, Bill, Ellen, and David.
Aloha nui loa.
Contents
Preface
Part 1. Introduction and TCP/IP
Chapter 1. Introduction
1.1 Introduction
1.2 A Simple Daytime Client
1.3 Protocol Independence
1.4 Error Handling: Wrapper Functions
1.5 A Simple Daytime Server
1.6 Road Map to Client-Server Examples in the Text
1.7 OSI Model
1.8 BSD Networking History
1.9 Test Networks and Hosts
1.10 Unix Standards
1.11 64-bit Architectures
1.12 Summary
Chapter 2. The Transport Layer: TCP and UDP
2.1 Introduction
2.2 The Big Picture
2.3 UDP: User Datagram Protocol
2.4 TCP: Transmission Control Protocol
2.5 TCP Connection Establishment and Termination
2.6 TIME_WAIT State
2.7 Port Numbers
2.8 TCP Port Numbers and Concurrent Servers
2.9 Buffer Sizes and Limitations
2.10 Standard Internet Services
2.11 Protocol Usage by Common Internet Applications
2.12 Summary
Part 2. Elementary Sockets
Chapter 3. Sockets Introduction
3.1 Introduction
3.2 Socket Address Structures
3.3 Value-Result Arguments
3.4 Byte Ordering Functions
3.5 Byte Manipulation Functions
3.6 inet_aton, inet_addr, and inet_ntoa Functions
3.7 inet_pton and inet_ntop Functions
3.8 sock_ntop and Related Functions
3.9 readn, writen, and readline Functions
3.10 isfdtype Function
3.11 Summary
Chapter 4. Elementary TCP Sockets
4.1 Introduction
4.2 socket Function
4.3 connect Function
4.4 bind Function
4.5 listen Function
4.6 accept Function
4.7 fork and exec Functions
4.8 Concurrent Servers
4.9 close Function
4.10 getsockname and getpeername Functions
4.11 Summary
Chapter 5. TCP Client-Server Example
5.1 Introduction
5.2 TCP Echo Server: main Function
5.3 TCP Echo Server: str_echo Function
5.4 TCP Echo Client: main Function
5.5 TCP Echo Client: str_cli Function
5.6 Normal Startup
5.7 Normal Termination
5.8 Posix Signal Handling
5.9 Handling SIGCHLD Signals
5.10 wait and waitpid Functions
5.11 Connection Abort before accept Returns
5.12 Termination of Server Process
5.13 SIGPIPE Signal
5.14 Crashing of Server Host
5.15 Crashing and Rebooting of Server Host
5.16 Shutdown of Server Host
5.17 Summary of TCP Example
5.18 Data Format
5.19 Summary
Chapter 6. I/O Multiplexing: The select and poll Functions
6.1 Introduction
6.2 I/O Models
6.3 select Function
6.4 str_cli Function (Revisited)
6.5 Batch Input
6.6 shutdown Function
6.7 str_cli Function (Revisited Again)
6.8 TCP Echo Server (Revisited)
6.9 pselect Function
6.10 poll Function
6.11 TCP Echo Server (Revisited Again)
6.12 Summary
Chapter 7. Socket Options
7.1 Introduction
7.2 getsockopt and setsockopt Functions
7.3 Checking If an Option Is Supported and Obtaining the Default
7.4 Socket States
7.5 Generic Socket Options
7.6 IPv4 Socket Options
7.7 ICMPv6 Socket Option
7.8 IPv6 Socket Options
7.9 TCP Socket Options
7.10 fcntl Function
7.11 Summary
Chapter 8. Elementary UDP Sockets
8.1 Introduction
8.2 recvfrom and sendto Functions
8.3 UDP Echo Server: main Function
8.4 UDP Echo Server: dg_echo Function
8.5 UDP Echo Client: main Function
8.6 UDP Echo Client: dg_cli Function
8.7 Lost Datagrams
8.8 Verifying Received Response
8.9 Server Not Running
8.10 Summary of UDP Example
8.11 connect Function with UDP
8.12 dg_cli Function (Revisited)
8.13 Lack of Flow Control with UDP
8.14 Determining Outgoing Interface with UDP
8.15 TCP and UDP Echo Server Using select
8.16 Summary
Chapter 9. Elementary Name and Address Conversions
9.1 Introduction
9.2 Domain Name System
9.3 gethostbyname Function
9.4 RES_USE_INET6 Resolver Option
9.5 gethostbyname2 Function and IPv6 Support
9.6 gethostbyaddr Function
9.7 uname Function
9.8 gethostname Function
9.9 getservbyname and getservbyport Functions
9.10 Other Networking Information
9.11 Summary
Part 3. Advanced Sockets
Chapter 10. IPv4 and IPv6 Interoperability
10.1 Introduction
10.2 IPv4 Client, IPv6 Server
10.3 IPv6 Client, IPv4 Server
10.4 IPv6 Address Testing Macros
10.5 IPV6_ADDRFORM Socket Option
10.6 Source Code Portability
10.7 Summary
Chapter 11. Advanced Name and Address Conversions
11.1 Introduction
11.2 getaddrinfo Function
11.3 gai_strerror Function
11.4 freeaddrinfo Function
11.5 getaddrinfo Function: IPv6 and Unix Domain
11.6 getaddrinfo Function: Examples
11.7 host_serv Function
11.8 tcp_connect Function
11.9 tcp_listen Function
11.10 udp_client Function
11.11 udp_connect Function
11.12 udp_server Function
11.13 getnameinfo Function
11.14 Reentrant Functions
11.15 gethostbyname_r and gethostbyaddr_r Functions
11.16 Implementation of getaddrinfo and getnameinfo Functions
11.17 Summary
Chapter 12. Daemon Processes and inetd Superserver
12.1 Introduction
12.2 syslogd Daemon
12.3 syslog Function
12.4 daemon_init Function
12.5 inetd Daemon
12.6 daemon_inetd Function
12.7 Summary
Chapter 13. Advanced I/O Functions
13.1 Introduction
13.2 Socket Timeouts
13.3 recv and send Functions
13.4 readv and writev Functions
13.5 recvmsg and sendmsg Functions
13.6 Ancillary Data
13.7 How Much Data Is Queued?
13.8 Sockets and Standard I/O
13.9 T/TCP: TCP for Transactions
13.10 Summary
Chapter 14. Unix Domain Protocols
14.1 Introduction
14.2 Unix Domain Socket Address Structure
14.3 socketpair Function
14.4 Socket Functions
14.5 Unix Domain Stream Client-Server
14.6 Unix Domain Datagram Client-Server
14.7 Passing Descriptors
14.8 Receiving Sender Credentials
14.9 Summary
Chapter 15. Nonblocking I/O
15.1 Introduction
15.2 Nonblocking Reads and Writes: str_cli Function (Revisited)
15.3 Nonblocking connect
15.4 Nonblocking connect: Daytime Client
15.5 Nonblocking connect: Web Client
15.6 Nonblocking accept
15.7 Summary
Chapter 16. ioctl Operations
16.1 Introduction
16.2 ioctl Function
16.3 Socket Operations
16.4 File Operations
16.5 Interface Configuration
16.6 get_ifi_info Function
16.7 Interface Operations
16.8 ARP Cache Operations
16.9 Routing Table Operations
16.10 Summary
Chapter 17. Routing Sockets
17.1 Introduction
17.2 Datalink Socket Address Structure
17.3 Reading and Writing
17.4 sysctl Operations
17.5 get_ifi_info Function
17.6 Interface Name and Index Functions
17.7 Summary
Chapter 18. Broadcasting
18.1 Introduction
18.2 Broadcast Addresses
18.3 Unicast versus Broadcast
18.4 dg_cli Function Using Broadcasting
18.5 Race Conditions
18.6 Summary
Chapter 19. Multicasting
19.1 Introduction
19.2 Multicast Addresses
19.3 Multicasting versus Broadcasting on a LAN
19.4 Multicasting on a WAN
19.5 Multicast Socket Options
19.6 mcast_join and Related Functions
19.7 dg_cli Function Using Multicasting
19.8 Receiving MBone Session Announcements
19.9 Sending and Receiving
19.10 SNTP: Simple Network Time Protocol
19.11 SNTP (Continued)
19.12 Summary
Chapter 20. Advanced UDP Sockets
20.1 Introduction
20.2 Receiving Flags, Destination IP Address, and Interface Index
20.3 Datagram Truncation
20.4 When to Use UDP Instead Of TCP
20.5 Adding Reliability to a UDP Application
20.6 Binding Interface Addresses
20.7 Concurrent UDP Servers
20.8 IPv6 Packet Information
20.9 Summary
Chapter 21. Out-of-Band Data
21.1 Introduction
21.2 TCP Out-of-Band Data
21.3 sockatmark Function
21.4 TCP Out-of-Band Data Summary
21.5 Client-Server Heartbeat Functions
21.6 Summary
Chapter 22. Signal-Driven I/O
22.1 Introduction
22.2 Signal-Driven I/O for Sockets
22.3 UDP Echo Server Using SIGIO
22.4 Summary
Chapter 23. Threads
23.1 Introduction
23.2 Basic Thread Functions: Creation and Termination
23.3 str_cli Function Using Threads
23.4 TCP Echo Server Using Threads
23.5 Thread-Specific Data
23.6 Web Client and Simultaneous Connections (Continued)
23.7 Mutexes: Mutual Exclusion
23.8 Condition Variables
23.9 Web Client and Simultaneous Connections (Continued)
23.10 Summary
Chapter 24. IP Options
24.1 Introduction
24.2 IPv4 Options
24.3 IPv4 Source Route Options
24.4 IPv6 Extension Headers
24.5 IPv6 Hop-by-Hop Options and Destination Options
24.6 IPv6 Routing Header
24.7 IPv6 Sticky Options
24.8 Summary
Chapter 25. Raw Sockets
25.1 Introduction
25.2 Raw Socket Creation
25.3 Raw Socket Output
25.4 Raw Socket Input
25.5 Ping Program
25.6 Traceroute Program
25.7 An ICMP Message Daemon
25.8 Summary
Chapter 26. Datalink Access
26.1 Introduction
26.2 BPF: BSD Packet Filter
26.3 DLPI: Data Link Provider Interface
26.4 Linux: SOCK_PACKET
26.5 libpcap: Packet Capture Library
26.6 Examining the UDP Checksum Field
26.7 Summary
Chapter 27. Client-Server Design Alternatives
27.1 Introduction
27.2 TCP Client Alternatives
27.3 TCP Test Client
27.4 TCP Iterative Server
27.5 TCP Concurrent Server, One Child per Client
27.6 TCP Preforked Server, No Locking around accept
27.7 TCP Preforked Server, File Locking around accept
27.8 TCP Preforked Server, Thread Locking around accept
27.9 TCP Preforked Server, Descriptor Passing
27.10 TCP Concurrent Server, One Thread per Client
27.11 TCP Prethreaded Server, per-Thread accept
27.12 TCP Prethreaded Server, Main Thread accept
27.13 Summary
Part 4. XTI: X/Open Transport Interface
Chapter 28. XTI: TCP Clients
28.1 Introduction
28.2 t_open Function
28.3 t_error and t_strerror Functions
28.4 netbuf Structures and XTI Structures
28.5 t_bind Function
28.6 t_connect Function
28.7 t_rcv and t_snd Functions
28.8 t_look Function
28.9 t_sndrel and t_rcvrel Functions
28.10 t_snddis and t_rcvdis Functions
28.11 XTI TCP Daytime Client
28.12 xti_rdwr Function
28.13 Summary
Chapter 29. XTI: Name and Address Functions
29.1 Introduction
29.2 /etc/netconfig File and netconfig Functions
29.3 NETPATH Variable and netpath Functions
29.4 netdir Functions
29.5 t_alloc and t_free Functions
29.6 t_getprotaddr Functions
29.7 xti_ntop Function
29.8 tcp_connect Function
29.9 Summary
Chapter 30. XTI: TCP Servers
30.1 Introduction
30.2 t_listen Function
30.3 tcp_listen Function
30.4 t_accept Function
30.5 xti_accept Function
30.6 Simple Daytime Server
30.7 Multiple Pending Connections
30.8 xti_accept Function (Revisited)
30.9 Summary
Chapter 31. XTI: UDP Clients and Servers
31.1 Introduction
31.2 t_rcvudata and t_sndudata Functions
31.3 udp_client Function
31.4 t_rcvuderr Function: Asynchronous Errors
31.5 udp_server Function
31.6 Reading a Datagram in Pieces
31.7 Summary
Chapter 32. XTI Options
32.1 Introduction
32.2 t_opthdr Structure
32.3 XTI Options
32.4 t_optmgmt Function
32.5 Checking If an Option Is Supported and Obtaining the Default
32.6 Getting and Setting XTI Options
32.7 Summary
Chapter 33. Streams
33.1 Introduction
33.2 Overview
33.3 getmsg and putmsg Functions
33.4 getpmsg and putpmsg Functions
33.5 ioctl Function
33.6 TPI: Transport Provider Interface
33.7 Summary
Chapter 34. XTI: Additional Functions
34.1 Introduction
34.2 Nonblocking I/O
34.3 t_rcvconnect Function
34.4 t_getinfo Function
34.5 t_getstate Function
34.6 t_sync Function
34.7 t_unbind Function
34.8 t_rcvv and t_rcvvudata Functions
34.9 t_sndv and t_sndvudata Functions
34.10 t_rcvreldata and t_sndreldata Functions
34.11 Signal-Driven I/O
34.12 Out-of-Band Data
34.13 Loopback Transport Providers
34.14 Summary
Appendix A. IPv4, IPv6, ICMPv4, and ICMPv6
A.1 Introduction
A.2 IPv4 Header
A.3 IPv6 Header
A.4 IPv4 Addresses
A.5 IPv6 Addresses
A.6 ICMPv4 and ICMPv6: Internet Control Message Protocol
Appendix B. Virtual Networks
B.1 Introduction
B.2 The MBone
B.3 The 6bone
Appendix C. Debugging Techniques
C.1 System Call Tracing
C.2 Standard Internet Services
C.3 sock Program
C.4 Small Test Programs
C.5 tcpdump Program
C.6 netstat Program
C.7 lsof Program
Appendix D. Miscellaneous Source Code
D.1 unp.h Header
D.2 config.h Header
D.3 unpxti.h Header
D.4 Standard Error Functions
Appendix E. Solutions to Selected Exercises
Bibliography
Index

Preface
Introduction
Network programming involves writing programs that communicate with other programs
across a computer network. One program is normally called the client and the
other the server. Most operating systems provide precompiled programs that communicate
across a network; common examples in the TCP/IP world are Web clients
(browsers) and Web servers, and the FTP and Telnet clients and servers. This book,
however, describes how to write our own network programs.
We write network programs using an application program interface or API. We
describe two APIs for network programming:
1. sockets, sometimes called “Berkeley sockets” acknowledging their heritage from
Berkeley Unix, and
2. XTI (X/Open Transport Interface), a slight modification of the Transport Layer
Interface (TLI) developed by AT&T.
All the examples in the text are from the Unix operating system, although the foundation
and concepts required for network programming are, to a large degree, operating
system independent. The examples are also based on the TCP/IP protocol suite, both IP
versions 4 and 6.
To write network programs one must understand the underlying operating system
and the underlying networking protocols. This book builds on the foundation of
my other four books in these two areas, and these books are abbreviated throughout
this text as follows:
* APUE: Advanced Programming in the UNIX Environment [Stevens 1992],
* TCPv1: TCP/IP Illustrated, Volume 1 [Stevens 1994],
* TCPv2: TCP/IP Illustrated, Volume 2 [Wright and Stevens 1995], and
* TCPv3: TCP/IP Illustrated, Volume 3 [Stevens 1996].
This second edition of UNIX Network Programming still contains information on both
Unix and the TCP/IP protocols, but many references are made to these other four texts
to allow interested readers to obtain more detailed information on various topics. This
is especially the case for TCPv2, which describes and presents the actual 4.4BSD imple-
mentation of the network programming functions for the sockets API (socket, bind,
connect, and so on). If one understands the implementation of a feature, the use of
that feature in an application makes more sense.
Changes from the First Edition
This second edition is a complete rewrite of the first edition. These changes have been
driven by the feedback I have received teaching this material about once a month during
1990-1996, and by following certain Usenet newsgroups during this same time,
which lets one see the topics that are continually misunderstood. The following are the
major changes with this new edition:
* This new edition uses ANSI C for all examples.
* The old Chapters 6 (“Berkeley Sockets”) and 8 (“Library Routines”) have been
expanded into 25 chapters. Indeed this sevenfold expansion (based on a word
count) of this material is probably the most significant change from the first to
the second edition. Most of the individual sections in the old Chapter 6 have
been expanded into an entire chapter with more examples added.
* The TCP and UDP portions from the old Chapter 6 have been separated and we
now cover the TCP functions and a complete TCP client-server, followed by the
UDP functions and a complete UDP client-server. This is easier for newcomers
to understand than describing all the details of the connect function, for example,
with its different semantics for TCP versus UDP.
* The old Chapter 7 (“System V Transport Layer Interface”) has been expanded
into seven chapters. We also cover the newer XTI instead of the TLI that it
replaces.
* The old Chapter 2 (“The Unix Model”) is gone. This chapter provided an
overview of the Unix system in about 75 pages. In 1990 this chapter was needed
because few books existed that adequately described the basic Unix program-
ming interface, especially the differences between the Berkeley and System V
implementations that existed in 1990. Today, however, more readers have a fun-
damental understanding of Unix, so concepts such as a process ID, password
files, directories, and group IDs, need not be repeated. (My APUE book is a
700-page expansion of this material for readers desiring additional Unix programming
details.)
Some of the advanced topics from the old Chapter 2 are covered in this new edi-
tion, but their coverage is moved to where the feature is used. For example,
when showing our first concurrent server (Section 4.8) we cover the fork func-
tion. When we describe how to handle the SIGCHLD signal with our concurrent
server (Section 5.9), we describe many additional features of Posix signal han-
dling (zombies, interrupted system calls, etc.).
* Whenever possible this text describes the Posix interface. (We say more about
the Posix family of standards in Section 1.10.) This includes not only the Posix.1
standard for the basic Unix functions (process control, signals, etc.), but also the
forthcoming Posix.1g standard for the sockets and XTI networking APIs, and the
1996 Posix.1 standard for threads.
* The term “system call” has been changed to “function” when describing functions
such as socket and connect. This follows the Posix convention that the
distinction between a system call and a library function is an implementation
detail that is often irrelevant for a programmer.
* The old Chapters 4 (“A Network Primer”) and 5 (“Communication Protocols”)
have been replaced with Appendix A covering IP versions 4 (IPv4) and 6 (IPv6),
and Chapter 2 covering TCP and UDP. This new material focuses on the protocol
issues that network programmers are certain to encounter. The coverage of
IPv6 was included, even though IPv6 implementations are just starting to
appear, since during the lifetime of this text IPv6 will probably become the pre-
dominant networking protocol.
I have found when teaching network programming that about 80% of all network
programming problems have nothing to do with network programming,
per se. That is, the problems are not with the API functions such as accept and
select, but the problems arise from a lack of understanding of the underlying
network protocols. For example, I have found that once a student understands
TCP’s three-way handshake and four-packet connection termination, many network
programming problems are immediately understood.
* The old sections on XNS, SNA, NetBIOS, the OSI protocols, and UUCP have
been removed, since it has become obvious during the early 1990s that these
proprietary protocols have been eclipsed by the TCP/IP protocols. (UUCP is
still popular and is not proprietary, but there is little we can show from a net-
work programming perspective using UUCP.)
* The following new topics are covered in this second edition:
* IPv4/IPv6 interoperability (Chapter 10),
* protocol-independent name translation (Chapter 11),
* routing sockets (Chapter 17),
* multicasting (Chapter 19),
* threads (Chapter 23),
* IP options (Chapter 24),
* datalink access (Chapter 26),
* client-server design alternatives (Chapter 27),
* virtual networks and tunneling (Appendix B), and
* network program debugging techniques (Appendix C).
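The list of changes above mentions that handling the SIGCHLD signal for a concurrent server brings in Posix signal-handling details such as zombies. As a rough illustration only, and not the code from Section 5.9, the following sketch forks a few children and reaps them from a SIGCHLD handler. The names sig_chld and fork_and_reap are hypothetical, chosen for this example, and the sketch assumes a Posix system that provides sigaction, sigprocmask, sigsuspend, and waitpid.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Number of children reaped so far by the handler (hypothetical name). */
static volatile sig_atomic_t reaped;

/* SIGCHLD handler. Several children can terminate before the handler runs
 * and Unix signals are normally not queued, so we loop with WNOHANG until
 * no terminated child is left. This is what prevents zombies. */
static void sig_chld(int signo)
{
    int saved_errno = errno;    /* waitpid may change errno */

    (void)signo;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    errno = saved_errno;
}

/* Fork nchildren processes that exit at once. Returns the number of
 * children reaped by the handler, or -1 on error. */
int fork_and_reap(int nchildren)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, wait_mask;
    int i, forked = 0, result;

    assert(nchildren >= 0);
    reaped = 0;

    sa.sa_handler = sig_chld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   /* restart interrupted system calls */
    if (sigaction(SIGCHLD, &sa, &old_sa) < 0)
        return -1;

    /* Block SIGCHLD while we fork and test the counter, so that the test
     * and the wait in sigsuspend cannot race with the handler. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &old_mask) < 0) {
        sigaction(SIGCHLD, &old_sa, NULL);
        return -1;
    }
    wait_mask = old_mask;
    sigdelset(&wait_mask, SIGCHLD);

    for (i = 0; i < nchildren; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(0);           /* child: terminate immediately */
        if (pid > 0)
            forked++;
    }

    /* sigsuspend atomically unblocks SIGCHLD and waits for a signal. */
    while (reaped < forked)
        sigsuspend(&wait_mask);

    result = reaped;
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGCHLD, &old_sa, NULL);
    return result;
}
```

Calling fork_and_reap(3) from a test program should report three reaped children. The choice to loop on waitpid with WNOHANG, rather than call wait once, is what keeps the count correct when several children terminate at nearly the same time.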
Unfortunately, the coverage of the material from the first edition has been expanded
so much that it no longer fits into a single book. Therefore at least two additional volumes
are planned in the UNIX Network Programming series.
* Volume 2 will probably be subtitled IPC: Interprocess Communication and will be
an expansion of the old Chapter 3, along with coverage of the 1996 Posix.1 real-time
IPC mechanisms.
* Volume 3 will probably be subtitled Applications and will be an expansion of
Chapters 9-18 of the first edition.
Even though most of the networking applications will be covered in Volume 3, a few
special applications are covered in this volume: Ping, Traceroute, and inetd.
Readers
This text can be used as either a tutorial on network programming, or as a reference for
experienced programmers. When used as a tutorial or for an introductory class on network
programming, the emphasis should be on Part 2 (“Elementary Sockets,” Chapters
3 through 9) followed by whatever additional topics are of interest. Part 2 covers the
basic socket functions, for both TCP and UDP, along with I/O multiplexing, socket
options, and basic name and address conversions. Chapter 1 should be read by all readers,
especially Section 1.4, which describes some wrapper functions used throughout the
text. Chapter 2 and perhaps Appendix A should be referred to as necessary, depending
on the reader’s background. Most of the chapters in Part 3 (“Advanced Sockets”) can
be read independently of the others in that part.
To aid in the use as a reference, a thorough index is provided, along with summaries
on the end papers of where to find detailed descriptions of all the functions and
structures. To help those reading topics in a random order, numerous references to
related topics are provided throughout the text.
Although the sockets API has become the de facto standard for network programming,
XTI is still used, sometimes with protocol suites other than TCP/IP. While the
coverage of XTI in Part 4 is smaller than the coverage of sockets in Parts 2 and 3, much
of the sockets coverage describes concepts that apply to XTI as well as sockets. For
example, all of the concepts regarding the use of nonblocking I/O, broadcasting, multicasting,
signal-driven I/O, out-of-band data, and threads, are the same, regardless of
which API (sockets or XTI) is used. Indeed, many network programming problems are
fundamentally similar, independent of whether the program is written using sockets or
XTI, and there is hardly anything that can be done with one API that cannot be done
with the other. The concepts are the same—just the function names and arguments
change.
Source Code and Errata Availability
The source code for all the examples that appear in the book is available from
ftp://ftp.kohala.com/pub/rstevens/unpv12e.tar.gz. The best way to
learn network programming is to take these programs, modify them, and enhance
them. Actually writing code of this form is the only way to reinforce the concepts and
techniques. Numerous exercises are also provided at the end of each chapter, and most
answers are provided in Appendix E.
A current errata for the book is also available from my home page, listed at the end
of the Preface.
Acknowledgments
Supporting every author is an understanding family, or nothing would ever get written!
I am grateful to my family, Sally, Bill, Ellen, and David, first for their support and
understanding when I wrote my first book (the first edition of this book), and for endur-
ing this “small” revision. Their love, support, and encouragement helped make this
book possible.
Numerous reviewers provided invaluable feedback (totaling 190 printed pages or
70,000 words), catching lots of errors, pointing out areas that needed more explanation,
and suggesting alternative presentations, wording, and coding: Ragnvald Blindheim,
Jim Bound, Gavin Bowe, Allen Briggs, Joe Doupnik, Wu-chang Feng, Bill Fenner, Bob
Friesenhahn, Andrew Gierth, Wayne Hathaway, Kent Hofer, Sugih Jamin, Scott Johnson,
Rick Jones, Mukesh Kacker, Marc Lampo, Marty Leisner, Jack McCann, Craig Metz,
Bob Nelson, Evi Nemeth, John C. Noble, Steve Rago, Jim Reid, Chung-Shang Shao, Ian
Lance Taylor, Ron Taylor, Andreas Terzis, and Dave Thaler. A special thanks to Sugih
Jamin and his students in EECS 489 (“Computer Networks”) at the University of Michigan
who beta tested an early draft of the manuscript during the spring of 1997.
The following people answered email questions of mine, sometimes lots of ques-
tions, which improved the accuracy and presentation of the text: Dave Butenhof, Dave
Hanson, Jim Hogue, Mukesh Kacker, Brian Kernighan, Vern Paxson, Steve Rago, Dennis
Ritchie, Steve Summit, Paul Vixie, John Wait, Steve Wise, and Gary Wright.
A special thanks to Larry Rafsky and the wonderful team at Gari Software for handling
lots of details and for many interesting technical discussions. Thank you, Larry,
for everything.
Numerous individuals and their organizations went beyond the normal call of duty
to provide either a loaner system, software, or access to a system, all of which were used
to test some of the examples in the text.
* Meg McRoberts of SCO provided the latest releases of UnixWare, and Dion
Johnson, Yasmin Kureshi, Michael Townsend, and Brian Ziel provided support
and answered questions.
* Mukesh Kacker of SunSoft provided access to a beta version of Solaris 2.6 and
answered many questions about the Solaris TCP/IP implementation.
* Jim Bound, Matt Thomas, Mary Clouter, and Barb Glover of Digital Equipment
Corp. provided an Alpha system and access to the latest IPv6 kits for Digital
Unix.
* Michael Johnson of Red Hat Software provided the latest releases of Red Hat
Linux.
* Steve Wise and Jessie Haug of IBM Austin provided an RS/6000 system and
access to the latest IPv6 for AIX.
* Rick Jones of Hewlett-Packard provided access to a beta version of HP-UX 10.30
and he and William Gilliam answered many questions about it.
Many people helped with the Internet connectivity used throughout the text. My
thanks once again to the National Optical Astronomy Observatories (NOAO), Sidney
Wolff, Richard Wolff, and Steve Grandi, for providing access to their networks and
hosts. Dave Siegel, Justis Addis, and Paul Lucchina answered many questions, Phil
Kaslo and Jim Davis provided an MBone connection, Ran Atkinson and Pedro Marques
provided a 6bone connection, and Craig Metz provided lots of DNS help.
The staff at Prentice Hall, especially my editor Mary Franz, along with Noreen
Regina, Sophie Papanikolaou, and Eileen Clark, have been a wonderful asset to a writer.
Many thanks for letting me do so many things “my way.”
As usual, but contrary to popular fads, I produced camera-ready copy of the book
using the wonderful groff package written by James Clark. I typed in all 291,972 words
using the vi editor, created the 201 illustrations using the gpic program (using many
of Gary Wright's macros), produced the 81 tables using the gtbl program, performed
all the indexing, and did the final page layout. Dave Hanson's loom program and some
scripts by Gary Wright were used to include the source code in the book. A set of awk
scripts written by Jon Bentley and Brian Kernighan helped in producing the final index.
I welcome electronic mail from any readers with comments, suggestions, or bug
fixes.
Tucson, Arizona W. Richard Stevens
September 1997
[email protected]
http://www.kohala.com/~rstevens

Part 1. Introduction and TCP/IP
Chapter 1. Introduction
1.1 Introduction
Most network applications can be divided into two pieces: a client and a server. We can
draw the communication link between the two as shown in Figure 1.1.
Figure 1.1 Network application: client and server.
There are numerous examples of clients and servers that most readers are probably
familiar with: a Web browser (a client) communicating with a Web server; an FTP client
fetching a file from an FTP server; a Telnet client that we use to log in to a remote host
through a Telnet server on that remote host.
Clients normally communicate with one server at a time, although using the Web
browser as an example, we might communicate with many different Web servers over,
say, a 10-minute time period. But from the server’s perspective at any given point in
time it is not unusual for a server to be communicating with multiple clients. We show
this in Figure 1.2. Later in this text we will cover several different ways for a server to
handle multiple clients at the same time.
Although we think of the client application communicating with the server application,
networking protocols are involved. In this text we focus on the TCP/IP protocol
suite, also called the Internet protocol suite. For example, Web clients and servers
communicate using the TCP protocol. TCP, in turn, uses the IP protocol, and IP
communicates with a datalink layer of some form. For example, if the client and server are on the
same Ethernet, we would have the arrangement shown in Figure 1.3.
Figure 1.2 Server handling multiple clients at the same time.
[Figure labels: application layer, transport layer, network layer, datalink layer (Ethernet); protocol stack within kernel.]
Figure 1.3 Client and server on the same Ethernet communicating using TCP.
Even though the client and server communicate using an application protocol, the
transport layers communicate using TCP, and so on, we note that the actual flow of
information between the client and server goes down the protocol stack on one side,
across the network, and up the protocol stack on the other side.
We also note that the client and server are typically user processes, while the TCP
and IP protocols are normally part of the protocol stack within the kernel. We have
labeled the four layers on the right side of Figure 1.3.
TCP and IP are not the only protocols that we discuss. Some clients and servers use
the UDP protocol instead of TCP, and we discuss both protocols in more detail in Chapter 2.
Furthermore, we have used the term “IP,” but the protocol, which has been in use
since the early 1980s, is officially called IP version 4 (IPv4). A new version, IP version 6
(IPv6), was developed during the mid-1990s and will probably replace IPv4 in the years
to come. Initial implementations of IPv6 were available at the time of this writing, and
this text covers the development of network applications using both IPv4 and IPv6.
Appendix A provides a comparison of IPv4 and IPv6, along with other protocols that
we will encounter.
The client and server need not be attached to the same local area network (LAN) as
we show in Figure 1.3. Instead, in Figure 1.4 we show the client and server on different
LANs, with both LANs connected to a wide area network (WAN) using routers.
Figure 1.4 Client and server on different LANs connected through a WAN.
Routers are the building blocks of WANs. The largest WAN today is the Internet,
although many companies build their own WANs and these private WANs may or may
not be connected to the Internet.
The remainder of this chapter provides an introduction to and overview of the various
topics that are covered in detail later in the text. We start with a complete example of a
TCP client, albeit a simple one, that demonstrates many of the function calls and concepts
that we encounter throughout the text. This client works with IP version 4 only,
and we show the changes required to work with IP version 6. A better solution is to
write protocol-independent clients and servers, and we discuss this in Chapter 11. This
chapter also shows a complete TCP server that works with our client.
To simplify all the code that we write, we define our own wrapper functions for
most of the system functions that we call throughout the text. We can use these most of
the time to check for an error, print an appropriate message, and terminate when an
error occurs. We also show the test network, hosts, and routers used for most examples
in the text, along with their hostnames, IP addresses, and operating systems.
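The wrapper idea described above can be sketched as follows. This is only an illustration, not the code from unp.h: the names checked_socket and checked_close are hypothetical, and perror is used as a simple stand-in for the error functions that the text defines later.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical wrapper: takes the same arguments as socket(), but never
 * returns an error. On failure it prints our message followed by the
 * system's description of the error, then terminates the process. */
int checked_socket(int family, int type, int protocol)
{
    int fd = socket(family, type, protocol);

    if (fd < 0) {
        perror("socket error");
        exit(EXIT_FAILURE);
    }
    return fd;
}

/* The same pattern applied to close(): check, report, terminate. */
void checked_close(int fd)
{
    if (close(fd) < 0) {
        perror("close error");
        exit(EXIT_FAILURE);
    }
}
```

With wrappers like these, a caller writes a single line such as fd = checked_socket(AF_INET, SOCK_STREAM, 0) and needs no error test of its own, at the cost of always terminating on failure.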
Most discussions of Unix these days include the term Posix, which is the standard
that most vendors have adopted. We describe the history of Posix and how it affects the
APIs that we describe in this text, along with the other players in the standards area.
1.2 A Simple Daytime Client
Let us consider a specific example to introduce many of the concepts and terms that we
will encounter throughout the book. Figure 1.5 is an implementation of a TCP time-of-day
client. This client establishes a TCP connection with a server and the server simply
sends back the current time and date in a human-readable format.
intro/daytimetcpcli.c
 1 #include "unp.h"
 2 int
 3 main(int argc, char **argv)
 4 {
 5     int     sockfd, n;
 6     char    recvline[MAXLINE + 1];
 7     struct sockaddr_in servaddr;
 8     if (argc != 2)
 9         err_quit("usage: a.out <IPaddress>");
10     if ( (sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
11         err_sys("socket error");
12     bzero(&servaddr, sizeof(servaddr));
13     servaddr.sin_family = AF_INET;
14     servaddr.sin_port = htons(13);      /* daytime server */
15     if (inet_pton(AF_INET, argv[1], &servaddr.sin_addr) <= 0)
16         err_quit("inet_pton error for %s", argv[1]);
17     if (connect(sockfd, (SA *) &servaddr, sizeof(servaddr)) < 0)
18         err_sys("connect error");
19     while ( (n = read(sockfd, recvline, MAXLINE)) > 0) {
20         recvline[n] = 0;                /* null terminate */
21         if (fputs(recvline, stdout) == EOF)
22             err_sys("fputs error");
23     }
24     if (n < 0)
25         err_sys("read error");
26     exit(0);
27 }
intro/daytimetcpcli.c
Figure 1.5 TCP daytime client.
This is the format that we use for all the source code in the text. Each nonblank line is
numbered. The text describing portions of the code begins with the starting and ending line
numbers in the left margin, as shown shortly. Sometimes the paragraph is preceded by a short
descriptive bold heading, providing a summary statement of the code being described.
The horizontal rules at the beginning and end of the code fragment specify the source code
filename: the file daytimetcpcli.c in the directory intro for this example. Since the
source code for all the examples in the text is freely available (see the Preface), this lets you
locate the appropriate source file. Compiling, running, and especially modifying these
programs while reading this text is an excellent way to learn the concepts of network
programming.
Throughout the text we will use indented, parenthetical notes such as this to describe
implementation details and historical points.
If we compile the program into the default a.out file and execute it, we have the
following output.
solaris % a.out 206.62.226.35          our input
Fri Jan 12 14:27:52 1996               the program's output
Whenever we display interactive input and output we show our typed input in a bold font,
and the computer output like this. Comments are added on the right side in italics. We always
include the name of the system as part of the shell prompt (solaris in this example) to show
on which host the command was run. Figure 1.16 shows the systems used to run most of the
examples in this book. The hostnames usually describe the operating system.
There are many details to now consider in this 27-line program. We mention them
briefly here, in case this is your first encounter with a network program, and provide
more information on these topics later in the text.
Include our own header
1 We include our own header, unp.h, which we show in Section D.1. This header
includes numerous system headers that are needed by most network programs and
defines various constants that we use (e.g., MAXLINE).
Command-line arguments
2-3 This is the definition of the main function along with the command-line arguments.
We have written the code in this text assuming an ANSI (American National Standards
Institute) C compiler.
Create a TCP socket
10-11 The socket function creates an Internet (AF_INET) stream (SOCK_STREAM) socket,
which is a fancy name for a TCP socket. The function returns a small integer descriptor
that we use to identify the socket in all future function calls (e.g., the calls to connect
and read that follow).
The if statement contains a call to the socket function, an assignment of the return value to
the variable named sockfd, and then a test of whether this assigned value is less than 0.
While we could break this into two C statements,
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0)
it is a common C idiom to combine the two lines. The set of parentheses around the function
call and assignment are required, given the precedence rules of C (the less-than operator has a
higher precedence than assignment). As a personal style issue, the author always places a
space between the two opening parentheses, as a visual indicator that the left-hand side of the
comparison is also an assignment. (The author first saw this style in the Minix source code
[Tanenbaum 1987] and has copied it ever since.) We use this same style in the while statement
later in the program.
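The effect of the inner parentheses can be seen in a small sketch. The function failing_call is a hypothetical stand-in for a system call that fails by returning -1; the other two function names are also invented for this illustration.

```c
/* Hypothetical stand-in for a system call that fails: always returns -1. */
static int failing_call(void)
{
    return -1;
}

/* The idiom from the text: the inner parentheses force the assignment
 * to happen first, so fd holds the call's return value and the
 * comparison tests that value. */
int assign_then_compare(int *failed)
{
    int fd;

    *failed = 0;
    if ( (fd = failing_call()) < 0)
        *failed = 1;
    return fd;                      /* the real return value, -1 */
}

/* Without the inner parentheses, < binds more tightly than =, so the
 * expression is parsed as fd = (failing_call() < 0). The variable then
 * holds the result of the comparison, not the return value. */
int compare_then_assign(void)
{
    int fd;

    fd = failing_call() < 0;
    return fd;                      /* 1, because the comparison was true */
}
```

In the second function the descriptor would be lost: every later call that used fd would operate on descriptor 1 instead of the socket.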
We will encounter many different uses of the term socket. First, the application programming
interface, or API, that we are using is called the sockets API. In the preceding
paragraph we refer to a function named socket that is part of the sockets API. In the
preceding paragraph we also refer to a “TCP socket,” which is synonymous with a
“TCP endpoint.”
If the call to socket fails, we abort the program by calling our own err_sys function.
It prints our error message along with a description of the system error that
occurred (e.g., “Protocol not supported” is one possible error from socket) and terminates
the process. This function, and a few others of our own that begin with err_, are
called throughout the text. We describe them in Section D.4.
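A function with the behavior just described might look roughly like the following. This is a sketch under stated assumptions, not the version shown in Section D.4: the names format_sys_error and die_sys are hypothetical, and the buffer sizes are arbitrary.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build "<our message>: <system error text>" in buf. */
void format_sys_error(char *buf, size_t len, const char *msg, int errnum)
{
    snprintf(buf, len, "%s: %s", msg, strerror(errnum));
}

/* Hypothetical err_sys-style function: formats the caller's message
 * printf-style, appends the description of the current errno value,
 * writes the line to standard error, and terminates the process. */
void die_sys(const char *fmt, ...)
{
    int     saved = errno;          /* save before any call can change it */
    char    msg[256];
    char    line[512];
    va_list ap;

    va_start(ap, fmt);
    vsnprintf(msg, sizeof(msg), fmt, ap);
    va_end(ap);

    format_sys_error(line, sizeof(line), msg, saved);
    fprintf(stderr, "%s\n", line);
    exit(EXIT_FAILURE);
}
```

A call such as die_sys("socket error") made right after a failed socket call would print the message followed by the system's text for that error. The formatting step is kept in its own function only so that it can be exercised without terminating the process.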
Specify server's IP address and port
12-16 We fill in an Internet socket address structure (a sockaddr_in structure named
servaddr) with the server's IP address and port number. We set the entire structure to
0 using bzero, set the address family to AF_INET, set the port number to 13 (which is
the well-known port of the daytime server on any TCP/IP host that supports this service,
as shown in Figure 2.13), and set the IP address to the value specified as the first
command-line argument (argv[1]). The IP address and port number fields in this
structure must be in specific formats: we call the library function htons (“host to network
short”) to convert the binary port number, and we call the library function
inet_pton (“presentation to numeric”) to convert the ASCII command-line argument
(such as 206.62.226.35 when we ran this example) into the proper format.
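The steps in this paragraph can be collected into one small function to make the resulting byte layout visible. The function name fill_ipv4_addr is hypothetical, and memset (with its arguments in the correct order) is used here in place of bzero so that the sketch relies only on standard C and the sockets headers.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill *sa for a dotted-decimal IPv4 address and
 * a port given in host byte order. Returns what inet_pton returns:
 * 1 on success, 0 if ip is not a valid address string, -1 on error. */
int fill_ipv4_addr(struct sockaddr_in *sa, const char *ip, unsigned short port)
{
    memset(sa, 0, sizeof(*sa));         /* clear the entire structure */
    sa->sin_family = AF_INET;
    sa->sin_port   = htons(port);       /* host to network byte order */
    return inet_pton(AF_INET, ip, &sa->sin_addr);
}
```

After a call with the address 206.62.226.35 and port 13, the two bytes of sin_port are 0 and 13 in memory regardless of the host's own byte order, and the four bytes of sin_addr are 206, 62, 226, and 35.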
bzero is not an ANSI C function. It is derived from early Berkeley networking code.
Nevertheless, we use it throughout the text, instead of the ANSI C memset function, because bzero
is easier to remember (with only two arguments) than memset (with three arguments).
Almost every vendor that supports the sockets API also provides bzero, and if not, we
provide a macro definition of it in our unp.h header.
Indeed, the author made the mistake of swapping the second and third arguments to memset
in 10 occurrences in the first printing of TCPv3. A C compiler cannot catch this error because
both arguments are of the same type. (Actually, the second argument is an int and the third
argument is size_t, which is typically an unsigned int, but the values specified, 0 and 16,
respectively, are still OK for the other type of argument.) The call to memset still worked but
did nothing: the number of bytes to initialize was specified as 0. The programs still worked,
because only a few of the socket functions actually require that the final 8 bytes of an Internet
socket address structure be set to 0. Nevertheless, it was an error, and one that can be avoided
by using bzero, because swapping the two arguments to bzero will always be caught by the
C compiler if function prototypes are used.
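The swapped-argument mistake is easy to reproduce. The sketch below uses a 16-byte buffer to match the values 0 and 16 mentioned above; the function names are hypothetical, and the arguments are passed through variables so the two calls differ only in argument order.

```c
#include <stddef.h>
#include <string.h>

/* Count the bytes in a buffer that are not zero. */
static size_t count_nonzero(const unsigned char *p, size_t n)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
        if (p[i] != 0)
            count++;
    return count;
}

/* Correct order: fill value second, byte count third. */
size_t nonzero_after_correct_call(void)
{
    unsigned char buf[16];
    int           value  = 0;
    size_t        nbytes = sizeof(buf);

    memset(buf, 0xAA, sizeof(buf));     /* start from a nonzero pattern */
    memset(buf, value, nbytes);         /* clears all 16 bytes */
    return count_nonzero(buf, sizeof(buf));
}

/* Swapped order: this asks memset to store the value 16 into 0 bytes,
 * so the call compiles, runs, and changes nothing. */
size_t nonzero_after_swapped_call(void)
{
    unsigned char buf[16];
    int           value  = 0;
    size_t        nbytes = sizeof(buf);

    memset(buf, 0xAA, sizeof(buf));     /* start from a nonzero pattern */
    memset(buf, (int) nbytes, (size_t) value);
    return count_nonzero(buf, sizeof(buf));
}
```

The first function reports no nonzero bytes; the second reports that all 16 bytes still hold the original pattern.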
This may be your first encounter with the inet_pton function. It is new with IP version 6
(which we talk more about in Appendix A). Older code uses the inet_addr function to convert
an ASCII dotted-decimal string into the correct format, but this function has numerous
limitations that inet_pton corrects. Do not worry if your system does not (yet) support this
function; we provide an implementation of it in Section 3.7.
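The note does not list the limitations at this point. One that is well known, offered here as an illustration and not as the author's list, is that inet_addr signals an error with the value INADDR_NONE, which has the same bit pattern as the valid address 255.255.255.255. The function names below are hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* inet_addr returns the converted address itself, and returns
 * INADDR_NONE (all one bits) on error. A valid string and an invalid
 * string can therefore produce the same return value. */
int inet_addr_is_ambiguous(void)
{
    return inet_addr("255.255.255.255") == INADDR_NONE &&
           inet_addr("not-an-address")  == INADDR_NONE;
}

/* inet_pton stores the address through a pointer and uses its return
 * value only for status: 1 for a valid string, 0 for an invalid one.
 * It also takes the address family, so one function serves both IPv4
 * and IPv6. */
int pton_status(int family, const char *text)
{
    unsigned char buf[sizeof(struct in6_addr)];  /* fits either family */

    return inet_pton(family, text, buf);
}
```

Here pton_status distinguishes the two strings that inet_addr cannot tell apart, and it accepts an IPv6 string such as ::1 when called with AF_INET6.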
Establish connection with server
17-18 The connect function, when applied to a TCP socket, establishes a TCP connection
with the server specified by the socket address structure pointed to by the second argument.
We must also specify the length of the socket address structure as the third argument
to connect, and for Internet socket address structures we always let the compiler
calculate the length using C’s sizeof operator.
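The three arguments to connect can be seen together in a short sketch that uses only the standard headers. The helper name connect_ipv4 is hypothetical, the pointer cast is spelled out in full as struct sockaddr *, and errors are reported by returning -1 instead of terminating, which differs from the style of Figure 1.5.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns a connected TCP socket for the given
 * dotted-decimal address and port, or -1 on any failure. */
int connect_ipv4(const char *ip, unsigned short port)
{
    struct sockaddr_in servaddr;
    int                sockfd;

    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port   = htons(port);
    if (inet_pton(AF_INET, ip, &servaddr.sin_addr) <= 0)
        return -1;                      /* not a valid IPv4 address string */

    if ( (sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        return -1;

    /* Second argument: pointer to the address structure, cast to the
     * generic socket address type. Third argument: the structure's
     * length, computed by the compiler with sizeof. */
    if (connect(sockfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0) {
        close(sockfd);
        return -1;
    }
    return sockfd;
}
```

A caller would pass the server's address and port 13 to reach a daytime server, then read from the returned descriptor and close it when done.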