Unit 4


MC9241 NETWORK PROGRAMMING UNIT - IV

UNIT IV - SOCKET OPTIONS, ELEMENTARY UDP SOCKETS

(A) SOCKET OPTIONS
    (i) INTRODUCTION
    (ii) GETSOCKOPT AND SETSOCKOPT FUNCTIONS
    (iii) GENERIC SOCKET OPTIONS
    (iv) IPv4 SOCKET OPTIONS
    (v) ICMP SOCKET OPTIONS
    (vi) IPv6 SOCKET OPTIONS
    (vii) TCP SOCKET OPTIONS
(B) ELEMENTARY UDP SOCKETS
    (i) INTRODUCTION
    (ii) RECVFROM AND SENDTO FUNCTIONS
    (iii) UDP ECHO SERVER
    (iv) UDP ECHO CLIENT
    (v) TCP AND UDP USING SELECT (MULTIPLEXING)
(C) NAME AND ADDRESS CONVERSIONS
    (i) DOMAIN NAME SYSTEM (DNS)
    (ii) GETHOSTBYNAME FUNCTION
    (iii) IPv6 SUPPORT IN DNS
    (iv) GETHOSTBYADDR FUNCTION
    (v) GETSERVBYNAME FUNCTION
    (vi) GETSERVBYPORT FUNCTION

(A) SOCKET OPTIONS

(i) INTRODUCTION

There are various ways to get and set the options that affect a socket:

1. The getsockopt and setsockopt functions
2. The fcntl function
3. The ioctl function

(ii) getsockopt and setsockopt FUNCTIONS

These two functions apply only to sockets.

    #include <sys/socket.h>

    int getsockopt(int sockfd, int level, int optname, void *optval, socklen_t *optlen);
    int setsockopt(int sockfd, int level, int optname, const void *optval, socklen_t optlen);

    Both return: 0 if OK, -1 on error
Mrs.D.Anow Fanny, MCA Dept, RMDEC Page 75

sockfd must refer to an open socket descriptor.

level specifies the code in the system that interprets the option: the general socket code or some protocol-specific code (e.g., IPv4, IPv6, TCP, or SCTP).

optval is a pointer to a variable from which the new value of the option is fetched by setsockopt, or into which the current value of the option is stored by getsockopt. The size of this variable is specified by the final argument, optlen, as a value for setsockopt and as a value-result for getsockopt.

(iii) GENERIC SOCKET OPTIONS

These options are protocol-independent (that is, they are handled by the protocol-independent code within the kernel, not by one particular protocol module such as IPv4), but some of the options apply to only certain types of sockets. For example, even though the SO_BROADCAST socket option is called "generic," it applies only to datagram sockets.

SO_BROADCAST Socket Option

This option enables or disables the ability of the process to send broadcast messages. Broadcasting is supported only for datagram sockets and only on networks that support the concept of a broadcast message (e.g., Ethernet, token ring, etc.). You cannot broadcast on a point-to-point link or over any connection-based transport protocol such as SCTP or TCP. Since an application must set this socket option before sending a broadcast datagram, the option prevents a process from sending a broadcast when the application was never designed to broadcast.

SO_DEBUG Socket Option

This option is supported only by TCP. When enabled for a TCP socket, the kernel keeps track of detailed information about all the packets sent or received by TCP for the socket. These are kept in a circular buffer within the kernel that can be examined with the trpt program.

SO_DONTROUTE Socket Option

This option specifies that outgoing packets are to bypass the normal routing mechanisms of the underlying protocol. For example, with IPv4, the packet is directed to the appropriate local interface, as specified by the network and subnet portions of the destination address. If the local interface cannot be determined from the destination address (e.g., the destination is not on the other end of a point-to-point link, or is not on a shared network), ENETUNREACH is returned.

The equivalent of this option can also be applied to individual datagrams using the MSG_DONTROUTE flag with the send, sendto, or sendmsg functions.

This option is often used by routing daemons (e.g., routed and gated) to bypass the routing table and force a packet to be sent out a particular interface.
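All of the generic options above go through the same two calls, with the level set to SOL_SOCKET. The following is a minimal sketch using SO_BROADCAST as the example flag; the helper names set_broadcast and get_broadcast are hypothetical and not part of any library. Note how the length is passed by value to setsockopt and as a value-result pointer to getsockopt.

```c
#include <sys/socket.h>

/* Hypothetical helper: enable (on != 0) or disable (on == 0) SO_BROADCAST.
 * Returns 0 if OK, -1 on error, exactly like setsockopt. */
int set_broadcast(int sockfd, int on)
{
    /* optlen is a plain value for setsockopt */
    return setsockopt(sockfd, SOL_SOCKET, SO_BROADCAST, &on, sizeof(on));
}

/* Hypothetical helper: returns 1 if SO_BROADCAST is enabled,
 * 0 if it is disabled, -1 if getsockopt fails. */
int get_broadcast(int sockfd)
{
    int val = 0;
    socklen_t len = sizeof(val);    /* value-result: size in, size out */

    if (getsockopt(sockfd, SOL_SOCKET, SO_BROADCAST, &val, &len) < 0)
        return -1;
    return val != 0;
}
```

A datagram socket would call set_broadcast(sockfd, 1) once, before its first sendto to a broadcast address.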


SO_ERROR Socket Option

When an error occurs on a socket, the protocol module in a Berkeley-derived kernel sets a variable named so_error for that socket to one of the standard Unix Exxx values. This is called the pending error for the socket. The process can be immediately notified of the error in one of two ways:

1. If the process is blocked in a call to select on the socket, for either readability or writability, select returns with either or both conditions set.
2. If the process is using signal-driven I/O, the SIGIO signal is generated for either the process or the process group.

The process can then obtain the value of so_error by fetching the SO_ERROR socket option. The integer value returned by getsockopt is the pending error for the socket. The value of so_error is then reset to 0 by the kernel.

If so_error is nonzero when the process calls read and there is no data to return, read returns -1 with errno set to the value of so_error. The value of so_error is then reset to 0. If there is data queued for the socket, that data is returned by read instead of the error condition. If so_error is nonzero when the process calls write, -1 is returned with errno set to the value of so_error (p. 495 of TCPv2) and so_error is reset to 0.
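Fetching the pending error is an ordinary getsockopt call. A minimal sketch follows; the helper name fetch_pending_error is hypothetical. It returns the pending Exxx value, or 0 when no error is pending, and the fetch itself clears so_error as described above.

```c
#include <sys/socket.h>

/* Hypothetical helper: return the socket's pending error (an Exxx value),
 * 0 if no error is pending, or -1 if getsockopt itself fails. */
int fetch_pending_error(int sockfd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(sockfd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    return err;    /* the kernel has now reset so_error to 0 */
}
```

A typical caller would invoke this after select reports the socket readable or writable, and compare the result against values such as ECONNRESET.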

SO_KEEPALIVE Socket Option

When the keep-alive option is set for a TCP socket and no data has been exchanged across the socket in either direction for two hours, TCP automatically sends a keep-alive probe to the peer. This probe is a TCP segment to which the peer must respond. One of three scenarios results:

1. The peer responds with the expected ACK. The application is not notified (since everything is okay). TCP will send another probe following another two hours of inactivity.
2. The peer responds with an RST, which tells the local TCP that the peer host has crashed and rebooted. The socket's pending error is set to ECONNRESET and the socket is closed.
3. There is no response from the peer to the keep-alive probe. Berkeley-derived TCPs send 8 additional probes, 75 seconds apart, trying to elicit a response. TCP will give up if there is no response within 11 minutes and 15 seconds after sending the first probe.

The figure summarizes the various methods that we can use to detect when something happens on the other end of a TCP connection.
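As a sketch of the keep-alive behavior just described, the code below enables the option and reproduces the give-up arithmetic: the first probe plus 8 additional probes is 9 probes, each allowed 75 seconds, which gives 675 seconds, or 11 minutes 15 seconds. Both function names are hypothetical, and the timing values are the Berkeley-derived figures quoted in the text, not values read from the running system.

```c
#include <sys/socket.h>

/* Hypothetical helper: turn on SO_KEEPALIVE for a TCP socket.
 * Returns 0 if OK, -1 on error. */
int enable_keepalive(int sockfd)
{
    int on = 1;

    return setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
}

/* Seconds between sending the first unanswered probe and giving up:
 * (first probe + additional probes) * interval between probes. */
int keepalive_giveup_seconds(int additional_probes, int interval_seconds)
{
    return (1 + additional_probes) * interval_seconds;
}
```

With the values from the text, keepalive_giveup_seconds(8, 75) evaluates to 675.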


[Figure: Ways to detect various TCP conditions.]

SO_LINGER Socket Option

This option specifies how the close function operates for a connection-oriented protocol (e.g., for TCP and SCTP, but not for UDP). By default, close returns immediately, but if there is any data still remaining in the socket send buffer, the system will try to deliver the data to the peer. The SO_LINGER socket option lets us change this default.

This option requires the following structure to be passed between the user process and the kernel. It is defined by including <sys/socket.h>.

    struct linger {
        int l_onoff;     /* 0=off, nonzero=on */
        int l_linger;    /* linger time, POSIX specifies units as seconds */
    };

Calling setsockopt leads to one of the following three scenarios, depending on the values of the two structure members:

1. If l_onoff is 0, the option is turned off. The value of l_linger is ignored and the previously discussed TCP default applies: close returns immediately.
2. If l_onoff is nonzero and l_linger is zero, TCP aborts the connection when it is closed. That is, TCP discards any data still remaining in the socket send buffer and sends an RST to the peer, not the normal four-packet connection termination.
3. If l_onoff is nonzero and l_linger is nonzero, then the kernel will linger when the socket is closed. That is, if there is any data still remaining in the socket send buffer, the process is put to sleep until either: (i) all the data is sent and acknowledged by the peer TCP, or (ii) the linger time expires.
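The three scenarios differ only in the two structure members passed to setsockopt. A minimal sketch, with a hypothetical helper name set_linger:

```c
#include <sys/socket.h>

/* Hypothetical helper: configure SO_LINGER on a connection-oriented socket.
 *   set_linger(fd, 0, 0)  scenario 1: option off, close returns immediately
 *   set_linger(fd, 1, 0)  scenario 2: close aborts the connection with an RST
 *   set_linger(fd, 1, n)  scenario 3: close lingers for up to n seconds
 * Returns 0 if OK, -1 on error. */
int set_linger(int sockfd, int onoff, int seconds)
{
    struct linger lg;

    lg.l_onoff = onoff;
    lg.l_linger = seconds;
    return setsockopt(sockfd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}
```

Note that the option value here is a struct linger rather than an int, so the size passed to setsockopt is sizeof(struct linger).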

[Figure: Summary of shutdown and SO_LINGER scenarios.]

SO_OOBINLINE Socket Option

When this option is set, out-of-band data will be placed in the normal input queue (i.e., inline). When this occurs, the MSG_OOB flag to the receive functions cannot be used to read the out-of-band data.

SO_RCVBUF and SO_SNDBUF Socket Options

Every socket has a send buffer and a receive buffer. The receive buffers are used by TCP, UDP, and SCTP to hold received data until it is read by the application. With TCP, the available room in the socket receive buffer limits the window that TCP can advertise to the other end. The TCP socket receive buffer cannot overflow because the peer is not allowed to send data beyond the advertised window. This is TCP's flow control, and if the peer ignores the advertised window and sends data beyond the window, the receiving TCP discards it.

With UDP, however, when a datagram arrives that will not fit in the socket receive buffer, that datagram is discarded. Recall that UDP has no flow control: it is easy for a fast sender to overwhelm a slower receiver, causing datagrams to be discarded by the receiver's UDP. In fact, a fast sender can overwhelm its own network interface, causing datagrams to be discarded by the sender itself.

These two socket options let us change the default sizes. The default values differ widely between implementations. Older Berkeley-derived implementations would default the TCP send and receive buffers to 4,096 bytes, but newer systems use larger values, anywhere from 8,192 to 61,440 bytes. The UDP send buffer size often defaults to a value around 9,000 bytes if the host supports NFS, and the UDP receive buffer size often defaults to a value around 40,000 bytes.
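Because the defaults differ so widely, a program that cares about buffer sizes usually reads them back rather than assuming them. A minimal sketch with hypothetical helper names follows; the kernel may adjust the size that was requested, so the value returned by a later get is not guaranteed to equal the value passed to set.

```c
#include <sys/socket.h>

/* Hypothetical helper: optname is SO_RCVBUF or SO_SNDBUF.
 * Returns the current buffer size in bytes, or -1 on error. */
int get_buffer_size(int sockfd, int optname)
{
    int size = 0;
    socklen_t len = sizeof(size);

    if (getsockopt(sockfd, SOL_SOCKET, optname, &size, &len) < 0)
        return -1;
    return size;
}

/* Hypothetical helper: request a new buffer size in bytes.
 * Returns 0 if OK, -1 on error. */
int set_buffer_size(int sockfd, int optname, int bytes)
{
    return setsockopt(sockfd, SOL_SOCKET, optname, &bytes, sizeof(bytes));
}
```

Calling get_buffer_size on a fresh TCP socket and on a fresh UDP socket is a quick way to see the defaults of the system at hand.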


When setting the size of the TCP socket receive buffer, the ordering of the function calls is important. This is because of TCP's window scale option, which is exchanged with the peer on the SYN segments when the connection is established. For a client, this means the SO_RCVBUF socket option must be set before calling connect. For a server, this means the socket option must be set for the listening socket before calling listen. Setting this option for the connected socket will have no effect whatsoever on the possible window scale option because accept does not return with the connected socket until TCP's three-way handshake is complete. That is why this option must be set for the listening socket.

SO_RCVLOWAT and SO_SNDLOWAT Socket Options

Every socket also has a receive low-water mark and a send low-water mark. These are used by the select function. These two socket options, SO_RCVLOWAT and SO_SNDLOWAT, let us change these two low-water marks.

The receive low-water mark is the amount of data that must be in the socket receive buffer for select to return "readable." It defaults to 1 for TCP, UDP, and SCTP sockets. The send low-water mark is the amount of available space that must exist in the socket send buffer for select to return "writable." This low-water mark normally defaults to 2,048 for TCP sockets.

SO_RCVTIMEO and SO_SNDTIMEO Socket Options

These two socket options allow us to place a timeout on socket receives and sends. The timeouts are specified in seconds and microseconds. The receive timeout affects the five input functions: read, readv, recv, recvfrom, and recvmsg. The send timeout affects the five output functions: write, writev, send, sendto, and sendmsg.

SO_REUSEADDR and SO_REUSEPORT Socket Options

The SO_REUSEADDR socket option serves four different purposes:

1. SO_REUSEADDR allows a listening server to start and bind its well-known port, even if previously established connections exist that use this port as their local port. This condition is typically encountered as follows:
   a. A listening server is started.
   b. A connection request arrives and a child process is spawned to handle that client.
   c. The listening server terminates, but the child continues to service the client on the existing connection.
   d. The listening server is restarted.
2. SO_REUSEADDR allows a new server to be started on the same port as an existing server that is bound to the wildcard address, as long as each instance binds a different local IP address. This is common for a site hosting multiple HTTP servers using the IP alias technique.
3. SO_REUSEADDR allows a single process to bind the same port to multiple sockets, as long as each bind specifies a different local IP address. This is common for UDP servers that need to know the destination IP address of client requests on systems that do not provide the IP_RECVDSTADDR socket option. This technique is normally not used with TCP servers since a TCP server can always determine the destination IP address by calling getsockname after the connection is established. However, a TCP server wishing to serve connections to some, but not all, addresses belonging to a multihomed host should use this technique.
4. SO_REUSEADDR allows completely duplicate bindings: a bind of an IP address and port, when that same IP address and port are already bound to another socket, if the transport protocol supports it. Normally this feature is supported only for UDP sockets.

4.4BSD introduced the SO_REUSEPORT socket option when support for multicasting was added. Instead of overloading SO_REUSEADDR with the desired multicast semantics that allow completely duplicate bindings, this new socket option was introduced with the following semantics:

1. This option allows completely duplicate bindings, but only if each socket that wants to bind the same IP address and port specifies this socket option.
2. SO_REUSEADDR is considered equivalent to SO_REUSEPORT if the IP address being bound is a multicast address (p. 731 of TCPv2).

The problem with this socket option is that not all systems support it. On systems that do not support the option but do support multicasting, SO_REUSEADDR is used instead of SO_REUSEPORT to allow completely duplicate bindings when it makes sense (i.e., a UDP server that can be run multiple times on the same host at the same time and that expects to receive either broadcast or multicast datagrams).

We can summarize our discussion of these socket options with the following recommendations:

1. Set the SO_REUSEADDR socket option before calling bind in all TCP servers.
2. When writing a multicast application that can be run multiple times on the same host at the same time, set the SO_REUSEADDR socket option and bind the group's multicast address as the local IP address.

SO_TYPE Socket Option

This option returns the socket type. The integer value returned is a value such as SOCK_STREAM or SOCK_DGRAM. This option is typically used by a process that inherits a socket when it is started.

SO_USELOOPBACK Socket Option

This option applies only to sockets in the routing domain (AF_ROUTE). This option defaults to ON for these sockets (the only one of the SO_xxx socket options that defaults to ON instead of OFF). When this option is enabled, the socket receives a copy of everything sent on the socket.
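Two of the points above lend themselves to a short sketch: the recommendation to set SO_REUSEADDR before calling bind in every TCP server, and the use of SO_TYPE to ask a descriptor what kind of socket it is. The helper names tcp_listen_reuse and socket_type are hypothetical, and the backlog of 5 is an arbitrary illustrative value.

```c
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket, set SO_REUSEADDR, then bind the
 * wildcard address and the given port and start listening.
 * Returns the listening descriptor, or -1 on error. */
int tcp_listen_reuse(unsigned short port)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    /* The option must be set before bind for it to affect the bind. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0 ||
        bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: return SOCK_STREAM, SOCK_DGRAM, etc. for a socket,
 * or -1 on error. */
int socket_type(int sockfd)
{
    int type = 0;
    socklen_t len = sizeof(type);

    if (getsockopt(sockfd, SOL_SOCKET, SO_TYPE, &type, &len) < 0)
        return -1;
    return type;
}
```

A process that inherits a descriptor at startup could call socket_type on it to decide whether to treat it as a stream or a datagram socket.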


(iv) IPv4 SOCKET OPTIONS

These socket options are processed by IPv4 and have a level of IPPROTO_IP.
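The only change from the generic options is the level argument. As a sketch, the code below sets and fetches the IP_TTL option, which is described later in this section; the helper names set_ttl and get_ttl are hypothetical.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: set the default TTL for unicast packets sent on
 * this socket. Returns 0 if OK, -1 on error. */
int set_ttl(int sockfd, int ttl)
{
    return setsockopt(sockfd, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl));
}

/* Hypothetical helper: fetch the TTL the system will use for unicast
 * packets sent on this socket, or -1 on error. */
int get_ttl(int sockfd)
{
    int ttl = 0;
    socklen_t len = sizeof(ttl);

    if (getsockopt(sockfd, IPPROTO_IP, IP_TTL, &ttl, &len) < 0)
        return -1;
    return ttl;
}
```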

IP_HDRINCL Socket Option

If this option is set for a raw IP socket, we must build our own IP header for all the datagrams we send on the raw socket. Normally, the kernel builds the IP header for datagrams sent on a raw socket, but there are some applications (notably traceroute) that build their own IP header to override values that IP would place into certain header fields.

When this option is set, we build a complete IP header, with the following exceptions:

1. IP always calculates and stores the IP header checksum.
2. If we set the IP identification field to 0, the kernel will set the field.
3. If the source IP address is INADDR_ANY, IP sets it to the primary IP address of the outgoing interface.
4. Setting IP options is implementation-dependent. Some implementations take any IP options that were set using the IP_OPTIONS socket option.
5. Some fields must be in host byte order, and some in network byte order. This is implementation-dependent, which makes writing raw packets with IP_HDRINCL not as portable as we'd like.

IP_OPTIONS Socket Option

Setting this option allows us to set IP options in the IPv4 header. This requires intimate knowledge of the format of the IP options in the IP header.

IP_RECVDSTADDR Socket Option

This socket option causes the destination IP address of a received UDP datagram to be returned as ancillary data by recvmsg.

IP_RECVIF Socket Option

This socket option causes the index of the interface on which a UDP datagram is received to be returned as ancillary data by recvmsg.

IP_TOS Socket Option

This option lets us set the type-of-service (TOS) field (which contains the DSCP and ECN fields) in the IP header for a TCP, UDP, or SCTP socket. When the getsockopt function is called for this option, the current value that would be placed into the DSCP and ECN fields in the IP header (which defaults to 0) is returned. There is no way to fetch the value from a received IP datagram.

IP_TTL Socket Option

With this option, we can set and fetch the default TTL that the system will use for unicast packets sent on a given socket. The multicast TTL is set using the IP_MULTICAST_TTL socket option.

(v) ICMPv6 SOCKET OPTIONS

This socket option is processed by ICMPv6 and has a level of IPPROTO_ICMPV6.

ICMP6_FILTER Socket Option

This option lets us fetch and set an icmp6_filter structure that specifies which of the 256 possible ICMPv6 message types will be passed to the process on a raw socket.

(vi) IPv6 SOCKET OPTIONS

These socket options are processed by IPv6 and have a level of IPPROTO_IPV6.

IPV6_CHECKSUM Socket Option

This socket option specifies the byte offset into the user data where the checksum field is located. If this value is non-negative, the kernel will:

(i) compute and store a checksum for all outgoing packets, and
(ii) verify the received checksum on input, discarding packets with an invalid checksum.

This option affects all IPv6 raw sockets, except ICMPv6 raw sockets. If a value of -1 is specified (the default), the kernel will not calculate and store the checksum for outgoing packets on this raw socket and will not verify the checksum for received packets.

IPV6_DONTFRAG Socket Option

Setting this option disables the automatic insertion of a fragment header for UDP and raw sockets. When this option is set, output packets larger than the MTU of the outgoing interface will be dropped. No error needs to be returned from the system call that sends the packet, since the packet might exceed the path MTU en route. Instead, the application should enable the IPV6_RECVPATHMTU option to learn about path MTU changes.

IPV6_NEXTHOP Socket Option

This option specifies the next-hop address for a datagram as a socket address structure, and is a privileged operation.

IPV6_PATHMTU Socket Option

This option cannot be set, only retrieved. When this option is retrieved, the current MTU as determined by path-MTU discovery is returned.
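The simpler integer-valued IPv6 options are handled with the same two calls and a level of IPPROTO_IPV6. The sketch below uses IPV6_UNICAST_HOPS and IPV6_V6ONLY, both described later in this section, because they need neither a raw socket nor special privileges. The helper names are hypothetical, and the code assumes the host can create AF_INET6 sockets.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: set the default hop limit for outgoing datagrams.
 * Returns 0 if OK, -1 on error. */
int set_hop_limit(int sockfd, int hops)
{
    return setsockopt(sockfd, IPPROTO_IPV6, IPV6_UNICAST_HOPS,
                      &hops, sizeof(hops));
}

/* Hypothetical helper: fetch the hop limit the kernel will use for this
 * socket, or -1 on error. */
int get_hop_limit(int sockfd)
{
    int hops = 0;
    socklen_t len = sizeof(hops);

    if (getsockopt(sockfd, IPPROTO_IPV6, IPV6_UNICAST_HOPS, &hops, &len) < 0)
        return -1;
    return hops;
}

/* Hypothetical helper: restrict an AF_INET6 socket to IPv6 communication
 * only (on != 0) or lift the restriction (on == 0).
 * Returns 0 if OK, -1 on error. */
int set_v6only(int sockfd, int on)
{
    return setsockopt(sockfd, IPPROTO_IPV6, IPV6_V6ONLY, &on, sizeof(on));
}
```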

IPV6_RECVDSTOPTS Socket Option

Setting this option specifies that any received IPv6 destination options are to be returned as ancillary data by recvmsg. This option defaults to OFF.

IPV6_RECVHOPLIMIT Socket Option

Setting this option specifies that the received hop limit field is to be returned as ancillary data by recvmsg. This option defaults to OFF. There is no way with IPv4 to obtain the received TTL field.

IPV6_RECVHOPOPTS Socket Option

Setting this option specifies that any received IPv6 hop-by-hop options are to be returned as ancillary data by recvmsg. This option defaults to OFF.

IPV6_RECVPATHMTU Socket Option

Setting this option specifies that the path MTU of a path is to be returned as ancillary data by recvmsg (without any accompanying data) when it changes.

IPV6_RECVPKTINFO Socket Option

Setting this option specifies that the following two pieces of information about a received IPv6 datagram are to be returned as ancillary data by recvmsg: the destination IPv6 address and the arriving interface index.

IPV6_RECVRTHDR Socket Option

Setting this option specifies that a received IPv6 routing header is to be returned as ancillary data by recvmsg. This option defaults to OFF.

IPV6_RECVTCLASS Socket Option

Setting this option specifies that the received traffic class (containing the DSCP and ECN fields) is to be returned as ancillary data by recvmsg. This option defaults to OFF.

IPV6_UNICAST_HOPS Socket Option

This IPv6 option is similar to the IPv4 IP_TTL socket option. Setting the socket option specifies the default hop limit for outgoing datagrams sent on the socket, while fetching the socket option returns the value for the hop limit that the kernel will use for the socket.

IPV6_USE_MIN_MTU Socket Option

Setting this option to 1 specifies that path MTU discovery is not to be performed and that packets are sent using the minimum IPv6 MTU to avoid fragmentation. Setting it to 0 causes path MTU discovery to occur for all destinations. Setting it to -1 specifies that path MTU discovery is performed for unicast destinations but the minimum MTU is used when sending to multicast destinations. This option defaults to -1.

IPV6_V6ONLY Socket Option

Setting this option on an AF_INET6 socket restricts it to IPv6 communication only. This option defaults to OFF, although some systems have an option to turn it ON by default.

IPV6_XXX Socket Options

Most of the IPv6 options for header modification assume a UDP socket with information being passed between the kernel and the application using ancillary data with recvmsg and sendmsg. A TCP socket fetches and stores these values using getsockopt and setsockopt instead.

(vii) TCP SOCKET OPTIONS

There are two socket options for TCP. We specify the level as IPPROTO_TCP.

TCP_MAXSEG Socket Option

This socket option allows us to fetch or set the MSS for a TCP connection. The value returned is the maximum amount of data that our TCP will send to the other end; often, it is the MSS announced by the other end with its SYN, unless our TCP chooses to use a smaller value than the peer's announced MSS. If this value is fetched before the socket is connected, the value returned is the default value that will be used if an MSS option is not received from the other end. Also be aware that a value smaller than the returned value can actually be used for the connection if the timestamp option, for example, is in use, because this option occupies 12 bytes of TCP options in each segment.

The maximum amount of data that our TCP will send per segment can also change during the life of a connection if TCP supports path MTU discovery. If the route to the peer changes, this value can go up or down.

TCP_NODELAY Socket Option

If set, this option disables TCP's Nagle algorithm. By default, this algorithm is enabled.

The purpose of the Nagle algorithm is to reduce the number of small packets on a WAN. The algorithm states that if a given connection has outstanding data (i.e., data that our TCP has sent, and for which it is currently awaiting an acknowledgment), then no small packets will be sent on the connection in response to a user write operation until the existing data is acknowledged. The definition of a "small" packet is any packet smaller than the MSS. TCP will always send a full-sized packet if possible; the purpose of the Nagle algorithm is to prevent a connection from having multiple small packets outstanding at any time.
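Both TCP options are integer-valued and use the level IPPROTO_TCP. A minimal sketch with hypothetical helper names; the constants TCP_NODELAY and TCP_MAXSEG come from <netinet/tcp.h>.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: disable the Nagle algorithm on a TCP socket.
 * Returns 0 if OK, -1 on error. */
int disable_nagle(int sockfd)
{
    int on = 1;    /* nonzero turns TCP_NODELAY on, i.e. Nagle off */

    return setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Hypothetical helper: fetch the MSS for a TCP socket, or -1 on error.
 * Before the socket is connected this is the default value described
 * in the text. */
int get_mss(int sockfd)
{
    int mss = 0;
    socklen_t len = sizeof(mss);

    if (getsockopt(sockfd, IPPROTO_TCP, TCP_MAXSEG, &mss, &len) < 0)
        return -1;
    return mss;
}
```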


The two common generators of small packets are the Rlogin and Telnet clients, since they normally send each keystroke as a separate packet. On a fast LAN, we normally do not notice the Nagle algorithm with these clients, because the time required for a small packet to be acknowledged is typically a few milliseconds, far less than the time between two successive characters that we type. But on a WAN, where it can take a second for a small packet to be acknowledged, we can notice a delay in the character echoing, and this delay is often exaggerated by the Nagle algorithm.

Consider the following example: we type the six-character string "hello!" to either an Rlogin or Telnet client, with exactly 250 ms between each character. The RTT to the server is 600 ms and the server immediately sends back the echo of each character. We assume that the ACK of the client's character is sent back to the client along with the character echo, and we ignore the ACKs that the client sends for the server's echo. Assuming the Nagle algorithm is disabled, we have the 12 packets shown in the figure.

[Figure: Six characters echoed by server with Nagle algorithm disabled.]

Each character is sent in a packet by itself: the data segments from left to right, and the ACKs from right to left.

If the Nagle algorithm is enabled (the default), we have the eight packets shown in the figure. The first character is sent as a packet by itself, but the next two characters are not sent, since the connection has a small packet outstanding. At time 600, when the ACK of the first packet is received, along with the echo of the first character, these two characters are sent. Until this packet is ACKed at time 1200, no more small packets are sent.
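The packet counts in this example can be checked with a small model. This is a simplified sketch, not TCP itself: it assumes every segment is "small", that characters are typed at a fixed interval starting at time 0, that the ACK arrives exactly one RTT after a segment is sent, and, as in the text, that each data segment is answered by exactly one echo packet carrying the ACK. The function name nagle_data_segments is hypothetical.

```c
/* Count the data segments the client sends for nchars keystrokes typed
 * gap_ms apart, with a round-trip time of rtt_ms.
 * nagle == 0: every keystroke is sent at once, one segment per character.
 * nagle != 0: at most one small segment may be outstanding; keystrokes
 *             typed in the meantime are held and sent together when the
 *             ACK arrives. */
int nagle_data_segments(int nchars, int gap_ms, int rtt_ms, int nagle)
{
    int segments = 0;
    int next = 0;      /* index of the first character not yet sent */
    long now = 0;      /* time at which the connection is next idle */

    if (!nagle)
        return nchars;

    while (next < nchars) {
        long typed_at = (long) next * gap_ms;
        int typed;

        /* If nothing is waiting, stay idle until the next keystroke. */
        if (typed_at > now)
            now = typed_at;

        /* Send everything typed up to and including time 'now'. */
        typed = (int) (now / gap_ms) + 1;
        if (typed > nchars)
            typed = nchars;
        next = typed;
        segments++;

        /* Nothing more may be sent until this segment is ACKed. */
        now += rtt_ms;
    }
    return segments;
}
```

For the "hello!" example, nagle_data_segments(6, 250, 600, 0) gives 6 data segments and therefore 12 packets including the echoes, while nagle_data_segments(6, 250, 600, 1) gives 4 data segments (sent at times 0, 600, 1200, and 1800) and therefore 8 packets.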


[Figure: Six characters echoed by server with Nagle algorithm enabled.]

The Nagle algorithm often interacts with another TCP algorithm: the delayed ACK algorithm. This algorithm causes TCP to not send an ACK immediately when it receives data; instead, TCP will wait some small amount of time (typically 50-200 ms) and only then send the ACK. The hope is that in this small amount of time, there will be data to send back to the peer, and the ACK can piggyback with the data, saving one TCP segment. This is normally the case with the Rlogin and Telnet clients, because the servers typically echo each character sent by the client, so the ACK of the client's character piggybacks with the server's echo of that character.

The problem is with other clients whose servers do not generate traffic in the reverse direction on which ACKs can piggyback. These clients can experience noticeable delays because the client TCP will not send any data to the server until the server's delayed ACK timer expires. These clients need a way to disable the Nagle algorithm, hence the TCP_NODELAY option.

(B) ELEMENTARY UDP SOCKETS

(i) INTRODUCTION

There are some fundamental differences between applications written using TCP versus those that use UDP. These are because of the differences in the two transport layers: UDP is a connectionless, unreliable, datagram protocol, quite unlike the connection-oriented, reliable byte stream provided by TCP.

The diagram shows the function calls for a typical UDP client/server. The client does not establish a connection with the server. Instead, the client just sends a datagram to the server using the sendto function, which requires the address of the destination (the server) as a parameter. Similarly, the server does not accept a connection from a client. Instead, the server just calls the recvfrom function, which waits until data arrives from some client. recvfrom returns the protocol address of the client, along with the datagram, so the server can send a response to the correct client.

[Figure: Socket functions for UDP client/server.]

(ii) recvfrom and sendto FUNCTIONS

These two functions are similar to the standard read and write functions, but three additional arguments are required.

    #include <sys/socket.h>

    ssize_t recvfrom(int sockfd, void *buff, size_t nbytes, int flags, struct sockaddr *from, socklen_t *addrlen);
    ssize_t sendto(int sockfd, const void *buff, size_t nbytes, int flags, const struct sockaddr *to, socklen_t addrlen);

    Both return: number of bytes read or written if OK, -1 on error

The first three arguments, sockfd, buff, and nbytes, are identical to the first three arguments for read and write: descriptor, pointer to buffer to read into or write from, and number of bytes to read or write.

The to argument for sendto is a socket address structure containing the protocol address (e.g., IP address and port number) of where the data is to be sent. The size of this socket address structure is specified by addrlen.

The recvfrom function fills in the socket address structure pointed to by from with the protocol address of who sent the datagram. The number of bytes stored in this socket address structure is also returned to the caller in the integer pointed to by addrlen. The final argument to sendto is an integer value, while the final argument to recvfrom is a pointer to an integer value (a value-result argument).

The final two arguments to recvfrom are similar to the final two arguments to accept: the contents of the socket address structure upon return tell us who sent the datagram (in the case of UDP) or who initiated the connection (in the case of TCP). The final two arguments to sendto are similar to the final two arguments to connect: we fill in the socket address structure with the protocol address of where to send the datagram (in the case of UDP) or with whom to establish a connection (in the case of TCP).

(iii) UDP ECHO SERVER

UDP client and server programs follow the function call flow depicted in the diagram.

[Figure: Simple echo client/server using UDP.]
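Before the full echo programs, the sendto and recvfrom calls of section (ii) can be tried inside a single process over the loopback interface. The sketch below is an illustration only: both helper names are hypothetical, it uses the plain system calls rather than any wrapper library, and it binds both sockets to 127.0.0.1 with kernel-chosen ports. It shows addrlen being passed by value to sendto and as a value-result argument to recvfrom, and it checks that the address filled in by recvfrom is that of the sender.

```c
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a UDP socket bound to 127.0.0.1 and a
 * kernel-chosen port; the address actually bound is stored in *addr.
 * Returns the descriptor, or -1 on error. */
int udp_loopback_socket(struct sockaddr_in *addr)
{
    socklen_t len = sizeof(*addr);
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = htons(0);          /* let the kernel pick a port */
    if (bind(fd, (struct sockaddr *) addr, sizeof(*addr)) < 0 ||
        getsockname(fd, (struct sockaddr *) addr, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: send msg from one loopback socket to another and
 * receive it into buf. Returns the number of bytes received or -1 on
 * error; *sender_matches is set to 1 if recvfrom reported the sending
 * socket's address and port, else 0. */
ssize_t loopback_roundtrip(const char *msg, char *buf, size_t buflen,
                           int *sender_matches)
{
    struct sockaddr_in a_addr, b_addr, from;
    socklen_t fromlen = sizeof(from);   /* value-result for recvfrom */
    ssize_t n = -1;
    int a = udp_loopback_socket(&a_addr);
    int b = udp_loopback_socket(&b_addr);

    *sender_matches = 0;
    if (a >= 0 && b >= 0 &&
        sendto(a, msg, strlen(msg), 0,
               (struct sockaddr *) &b_addr, sizeof(b_addr)) >= 0) {
        n = recvfrom(b, buf, buflen, 0, (struct sockaddr *) &from, &fromlen);
        if (n >= 0 &&
            from.sin_port == a_addr.sin_port &&
            from.sin_addr.s_addr == a_addr.sin_addr.s_addr)
            *sender_matches = 1;
    }
    if (a >= 0)
        close(a);
    if (b >= 0)
        close(b);
    return n;
}
```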

UDP echo server program.

    #include "unp.h"    /* SA, SERV_PORT, MAXLINE; capitalized calls are its error-checking wrappers */

    int
    main(int argc, char **argv)
    {
        int sockfd;
        struct sockaddr_in servaddr, cliaddr;

        sockfd = Socket(AF_INET, SOCK_DGRAM, 0);

        bzero(&servaddr, sizeof(servaddr));
        servaddr.sin_family = AF_INET;
        servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
        servaddr.sin_port = htons(SERV_PORT);

        Bind(sockfd, (SA *) &servaddr, sizeof(servaddr));

        dg_echo(sockfd, (SA *) &cliaddr, sizeof(cliaddr));
    }

Create UDP socket, bind server's well-known port

A UDP socket is created by specifying the second argument to socket as SOCK_DGRAM (a datagram socket in the IPv4 protocol). As with the TCP server example, the IPv4 address for the bind is specified as INADDR_ANY and the server's well-known port is the constant SERV_PORT from the unp.h header. The function dg_echo is called to perform server processing.

dg_echo function: echo lines on a datagram socket.

    #include "unp.h"

    void
    dg_echo(int sockfd, SA *pcliaddr, socklen_t clilen)
    {
        int n;
        socklen_t len;
        char mesg[MAXLINE];

        for ( ; ; ) {
            len = clilen;    /* value-result: reset before every call */
            n = Recvfrom(sockfd, mesg, MAXLINE, 0, pcliaddr, &len);

            Sendto(sockfd, mesg, n, 0, pcliaddr, len);
        }
    }

Read datagram, echo back to sender

This function is a simple loop that reads the next datagram arriving at the server's port using recvfrom and sends it back using sendto.

Despite the simplicity of this function, there are numerous details to consider. First, this function never terminates. Since UDP is a connectionless protocol, there is nothing like an EOF as we have with TCP. Next, this function provides an iterative server, not a concurrent server as we had with TCP. There is no call to fork, so a single server process handles any and all clients. In general, most TCP servers are concurrent and most UDP servers are iterative.

(iv) UDP ECHO CLIENT

The UDP client main function is shown below.

UDP echo client.

    #include "unp.h"

    int
    main(int argc, char **argv)
    {
        int sockfd;
        struct sockaddr_in servaddr;

        if (argc != 2)
            err_quit("usage: udpcli <IPaddress>");

        bzero(&servaddr, sizeof(servaddr));
        servaddr.sin_family = AF_INET;
        servaddr.sin_port = htons(SERV_PORT);
        inet_pton(AF_INET, argv[1], &servaddr.sin_addr);

        sockfd = Socket(AF_INET, SOCK_DGRAM, 0);

        dg_cli(stdin, sockfd, (SA *) &servaddr, sizeof(servaddr));

        exit(0);
    }

Fill in socket address structure with server's address

An IPv4 socket address structure is filled in with the IP address and port number of the server. This structure will be passed to dg_cli, specifying where to send datagrams. A UDP socket is created and the function dg_cli is called.

The function dg_cli, shown next, performs most of the client processing.

dg_cli function: client processing loop.

    #include "unp.h"

    void
    dg_cli(FILE *fp, int sockfd, const SA *pservaddr, socklen_t servlen)
    {
        int n;
        char sendline[MAXLINE], recvline[MAXLINE + 1];

        while (fgets(sendline, MAXLINE, fp) != NULL) {
            Sendto(sockfd, sendline, strlen(sendline), 0, pservaddr, servlen);

            /* wrapper terminates on error, so n is never negative here */
            n = Recvfrom(sockfd, recvline, MAXLINE, 0, NULL, NULL);

            recvline[n] = 0;    /* null terminate */
            fputs(recvline, stdout);
        }
    }

There are four steps in the client processing loop: read a line from standard input using fgets, send the line to the server using sendto, read back the server's echo using recvfrom, and print the echoed line to standard output using fputs.

Our client has not asked the kernel to assign an ephemeral port to its socket. With a TCP client, we said the call to connect is where this takes place. With a UDP socket, the first time the process calls sendto, if the socket has not yet had a local port bound to it, that is when an ephemeral port is chosen by the kernel for the socket. As with TCP, the client can call bind explicitly, but this is rarely done.

Notice that the call to recvfrom specifies a null pointer as the fifth and sixth arguments. This tells the kernel that we are not interested in knowing who sent the reply. There is a risk that any process, on either the same host or some other host, can send a datagram to the client's IP address and port, and that datagram will be read by the client, who will think it is the server's reply.
As with the server function dg_echo, the client function dg_cli is protocol-independent, but the client main function is protocol-dependent. The main function allocates and initializes a socket address structure of some protocol type and then passes a pointer to this structure, along with its size, to dg_cli
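The exchange described above can be tried without unp.h. The following is a minimal sketch under stated assumptions: it uses the plain system calls on the loopback interface, and the helper names udp_bind_loopback, echo_once and recv_from_server are hypothetical, not part of the original programs. Unlike dg_cli, the client side asks recvfrom for the sender's address and rejects a reply that did not come from the server, which is one way to address the risk noted above.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a UDP socket bound to 127.0.0.1 on a kernel-chosen port.
   Stores the bound address in *addr; returns the descriptor or -1. */
int udp_bind_loopback(struct sockaddr_in *addr)
{
    socklen_t len = sizeof(*addr);
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = htons(0);                /* 0 = let the kernel pick */
    if (bind(fd, (struct sockaddr *) addr, sizeof(*addr)) < 0 ||
        getsockname(fd, (struct sockaddr *) addr, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Server side: read one datagram and send it back to its sender. */
ssize_t echo_once(int fd)
{
    char mesg[512];
    struct sockaddr_in cli;
    socklen_t len = sizeof(cli);
    ssize_t n = recvfrom(fd, mesg, sizeof(mesg), 0,
                         (struct sockaddr *) &cli, &len);

    if (n < 0)
        return -1;
    return sendto(fd, mesg, (size_t) n, 0, (struct sockaddr *) &cli, len);
}

/* Client side: read one reply and accept it only if it came from *serv.
   Returns the reply length, or -1 on error or on a reply from elsewhere. */
ssize_t recv_from_server(int fd, const struct sockaddr_in *serv,
                         char *buf, size_t cap)
{
    struct sockaddr_in from;
    socklen_t len = sizeof(from);
    ssize_t n;

    assert(cap > 0);
    n = recvfrom(fd, buf, cap - 1, 0, (struct sockaddr *) &from, &len);
    if (n < 0)
        return -1;
    if (from.sin_port != serv->sin_port ||
        from.sin_addr.s_addr != serv->sin_addr.s_addr)
        return -1;                            /* not the server's reply */
    buf[n] = '\0';                            /* null terminate */
    return n;
}
```

Because UDP queues datagrams in the socket receive buffer, a single process can send from the client socket, call echo_once on the server socket, and then read the reply, so no fork is needed to try it out.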


(v) TCP AND UDP USING SELECT (MULTIPLEXING)

We now combine the concurrent TCP echo server with the iterative UDP echo server into a single server that uses select to multiplex a TCP socket and a UDP socket.

Create listening TCP socket
A listening TCP socket is created and bound to the server's well-known port. We set the SO_REUSEADDR socket option in case connections exist on this port.

Create UDP socket
A UDP socket is also created and bound to the same port. Even though the same port is used for the TCP and UDP sockets, there is no need to set the SO_REUSEADDR socket option before this call to bind, because TCP ports are independent of UDP ports.

Establish signal handler for SIGCHLD
A signal handler is established for SIGCHLD because TCP connections will be handled by a child process.

Prepare for select
We initialize a descriptor set for select and calculate the maximum of the two descriptors on which we will wait.

Call select
We call select, waiting only for readability on the listening TCP socket or readability on the UDP socket. Since our sig_chld handler can interrupt our call to select, we handle an error of EINTR.

Handle new client connection
We accept a new client connection when the listening TCP socket is readable, fork a child, and call our str_echo function in the child.

Handle arrival of datagram
If the UDP socket is readable, a datagram has arrived. We read it with recvfrom and send it back to the client with sendto.

First half of echo server that handles TCP and UDP using select.

    #include "unp.h"    /* also declares the error-checking wrappers (Socket, Bind, ...) */

    int main(int argc, char **argv)
    {
        int listenfd, connfd, udpfd, nready, maxfdp1;
        char mesg[MAXLINE];
        pid_t childpid;
        fd_set rset;
        ssize_t n;
        socklen_t len;
        const int on = 1;
        struct sockaddr_in cliaddr, servaddr;
        void sig_chld(int);

        /* create listening TCP socket */
        listenfd = Socket(AF_INET, SOCK_STREAM, 0);

        bzero(&servaddr, sizeof(servaddr));
        servaddr.sin_family = AF_INET;
        servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
        servaddr.sin_port = htons(SERV_PORT);

        Setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
        Bind(listenfd, (SA *) &servaddr, sizeof(servaddr));

        Listen(listenfd, LISTENQ);

        /* create UDP socket */
        udpfd = Socket(AF_INET, SOCK_DGRAM, 0);

        bzero(&servaddr, sizeof(servaddr));
        servaddr.sin_family = AF_INET;
        servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
        servaddr.sin_port = htons(SERV_PORT);

        Bind(udpfd, (SA *) &servaddr, sizeof(servaddr));

Second half of echo server that handles TCP and UDP using select.

        Signal(SIGCHLD, sig_chld);    /* must call waitpid() */

        FD_ZERO(&rset);
        maxfdp1 = max(listenfd, udpfd) + 1;
        for ( ; ; ) {
            /* select modifies rset, so both bits are set again on every pass */
            FD_SET(listenfd, &rset);
            FD_SET(udpfd, &rset);
            if ( (nready = select(maxfdp1, &rset, NULL, NULL, NULL)) < 0) {
                if (errno == EINTR)
                    continue;    /* back to for() */
                else
                    err_sys("select error");
            }

            if (FD_ISSET(listenfd, &rset)) {
                len = sizeof(cliaddr);
                connfd = Accept(listenfd, (SA *) &cliaddr, &len);

                if ( (childpid = Fork()) == 0) {    /* child process */
                    Close(listenfd);    /* close listening socket */
                    str_echo(connfd);   /* process the request */
                    exit(0);
                }
                Close(connfd);    /* parent closes connected socket */
            }

            if (FD_ISSET(udpfd, &rset)) {
                len = sizeof(cliaddr);
                n = Recvfrom(udpfd, mesg, MAXLINE, 0, (SA *) &cliaddr, &len);
                Sendto(udpfd, mesg, n, 0, (SA *) &cliaddr, len);
            }
        }
    }

(C) NAME AND ADDRESS CONVERSIONS
(i) DOMAIN NAME SYSTEM (DNS)

The DNS is used primarily to map between hostnames and IP addresses. A hostname can be either a simple name, such as solaris or freebsd, or a fully qualified domain name (FQDN). Technically, an FQDN is also called an absolute name and must end with a period, but users often omit the ending period. The trailing period tells the resolver that this name is fully qualified and that it doesn't need to search its list of possible domains.

Resource Records
Entries in the DNS are known as resource records (RRs).

A

An A record maps a hostname into a 32-bit IPv4 address. For example, here are the four DNS records for the host freebsd in the unpbook.com domain, the first of which is an A record:

    freebsd  IN  A     12.106.32.254
             IN  AAAA  3ffe:b80:1f8d:1:a00:20ff:fea7:686b
             IN  MX    5 freebsd.unpbook.com.
             IN  MX    10 mailhost.unpbook.com.

AAAA

A AAAA record, called a "quad A" record, maps a hostname into a 128-bit IPv6 address. The term "quad A" was chosen because a 128-bit address is four times larger than a 32-bit address.

PTR

PTR records (called "pointer records") map IP addresses into hostnames. For an IPv4 address, the 4 bytes of the 32-bit address are reversed, each byte is converted to its decimal ASCII value (0-255), and in-addr.arpa is then appended. The resulting string is used in the PTR query. For an IPv6 address, the 32 4-bit nibbles of the 128-bit address are reversed, each nibble is converted to its corresponding hexadecimal ASCII value (0-9, a-f), and ip6.arpa is appended.
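The construction of a PTR query name can be sketched in a few lines. This is an illustration only, assuming the standard in-addr.arpa and ip6.arpa suffixes; the helper names ptr_name_v4 and ptr_name_v6 are hypothetical and are not resolver functions.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* IPv4: reverse the 4 bytes, print each in decimal, append in-addr.arpa.
   Returns 0 on success, -1 on a bad address or a too-small buffer. */
int ptr_name_v4(const char *dotted, char *out, size_t cap)
{
    unsigned char b[4];    /* address bytes in network order */
    int n;

    assert(out != NULL);
    if (inet_pton(AF_INET, dotted, b) != 1)
        return -1;
    n = snprintf(out, cap, "%d.%d.%d.%d.in-addr.arpa",
                 b[3], b[2], b[1], b[0]);
    return (n > 0 && (size_t) n < cap) ? 0 : -1;
}

/* IPv6: reverse the 32 nibbles, print each in hex, append ip6.arpa.
   Returns 0 on success, -1 on a bad address or a too-small buffer. */
int ptr_name_v6(const char *text, char *out, size_t cap)
{
    static const char hex[] = "0123456789abcdef";
    unsigned char b[16];
    size_t pos = 0;
    int i;

    /* 32 nibbles, each followed by a dot, plus "ip6.arpa" and the NUL */
    if (inet_pton(AF_INET6, text, b) != 1 ||
        cap < 32 * 2 + sizeof("ip6.arpa"))
        return -1;
    for (i = 15; i >= 0; i--) {           /* last byte first */
        out[pos++] = hex[b[i] & 0x0f];    /* low nibble, then high nibble */
        out[pos++] = '.';
        out[pos++] = hex[b[i] >> 4];
        out[pos++] = '.';
    }
    strcpy(out + pos, "ip6.arpa");
    return 0;
}
```

For the host freebsd used as the example in these notes, ptr_name_v4("12.106.32.254", ...) produces 254.32.106.12.in-addr.arpa.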

MX

An MX record specifies a host to act as a "mail exchanger" for the specified host. In the example for the host freebsd above, two MX records are provided: The first has a preference value of 5 and the second has a preference value of 10. When multiple MX records exist, they are used in order of preference, starting with the smallest value.
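The ordering rule can be illustrated with a short sort. This is only a sketch: struct mx_record and mx_sort are hypothetical names introduced here, and a real mail transfer agent obtains the records from the resolver rather than from a hard-coded array.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct mx_record {             /* hypothetical holder for one MX record */
    int preference;            /* smaller value = tried first */
    const char *exchanger;     /* host that accepts the mail */
};

/* qsort comparison: ascending preference value */
static int mx_compare(const void *a, const void *b)
{
    const struct mx_record *x = a;
    const struct mx_record *y = b;

    return (x->preference > y->preference) - (x->preference < y->preference);
}

/* Put the records in the order in which they should be tried. */
void mx_sort(struct mx_record *recs, size_t n)
{
    assert(recs != NULL || n == 0);
    qsort(recs, n, sizeof(recs[0]), mx_compare);
}
```

Sorting the two records from the freebsd example places freebsd.unpbook.com. (preference 5) ahead of mailhost.unpbook.com. (preference 10).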

CNAME

CNAME stands for "canonical name." A common use is to assign CNAME records for common services, such as ftp and www. If people use these service names instead of the actual hostnames, it is transparent when a service is moved to another host. For example, the following could be CNAMEs for our host linux:

    ftp  IN  CNAME  linux.unpbook.com.
    www  IN  CNAME  linux.unpbook.com.

Resolvers and Name Servers
Organizations run one or more name servers, often the program known as BIND (Berkeley Internet Name Domain). Applications such as the clients and servers that we are writing in this text contact a DNS server by calling functions in a library known as the resolver. The common resolver functions are gethostbyname and gethostbyaddr, both of which are described in this chapter. The former maps a hostname into its IPv4 addresses, and the latter does the reverse mapping.

The figure shows a typical arrangement of applications, resolvers, and name servers. On some systems, the resolver code is contained in a system library and is link-edited into the application when the application is built. On others, there is a centralized resolver daemon that all applications share, and the system library code performs RPCs to this daemon. In either case, application code calls the resolver code using normal function calls, typically calling the functions gethostbyname and gethostbyaddr.


[Figure: Typical arrangement of clients, resolvers, and name servers.]

The resolver code reads its system-dependent configuration files to determine the location of the organization's name servers. The file /etc/resolv.conf normally contains the IP addresses of the local name servers. It might be nice to use the names of the name servers in the /etc/resolv.conf file, since the names are easier to remember and configure, but this introduces a chicken-and-egg problem of where to go to do the name-to-address conversion for the server that will do the name and address conversion!

The resolver sends the query to the local name server using UDP. If the local name server does not know the answer, it will normally query other name servers across the Internet, also using UDP. If the answers are too large to fit in a UDP packet, the resolver will automatically switch to TCP.

DNS Alternatives
It is possible to obtain name and address information without using the DNS. Common alternatives are static host files (normally the file /etc/hosts), the Network Information System (NIS), or the Lightweight Directory Access Protocol (LDAP). Unfortunately, it is implementation-dependent how an administrator configures a host to use the different types of name services. Solaris 2.x, HP-UX 10 and later, and FreeBSD 5.x and later use the file /etc/nsswitch.conf, and AIX uses the file /etc/netsvc.conf. BIND 9.2.2 supplies its own version named the Information Retrieval Service (IRS), which uses the file /etc/irs.conf. If a name server is to be used for hostname lookups, then all these systems use the file /etc/resolv.conf to specify the IP addresses of the name servers.


(ii) GETHOSTBYNAME FUNCTION

Host computers are normally known by human-readable names. All the examples so far have used IP addresses instead of names, so that it is known exactly what goes into the socket address structures for functions such as connect and sendto, and what is returned by functions such as accept and recvfrom. Most applications, however, should deal with names, not addresses.

The most basic function that looks up a hostname is gethostbyname. If successful, it returns a pointer to a hostent structure that contains all the IPv4 addresses for the host. It is limited in that it can return only IPv4 addresses. The POSIX specification cautions that gethostbyname may be withdrawn in a future version of the spec.

    #include <netdb.h>

    struct hostent *gethostbyname(const char *hostname);

    Returns: non-null pointer if OK, NULL on error with h_errno set

The non-null pointer returned by this function points to the following hostent structure:

    struct hostent {
        char  *h_name;         /* official (canonical) name of host */
        char **h_aliases;      /* pointer to array of pointers to alias names */
        int    h_addrtype;     /* host address type: AF_INET */
        int    h_length;       /* length of address: 4 */
        char **h_addr_list;    /* ptr to array of ptrs with IPv4 addrs */
    };

[Figure: hostent structure and the information it contains.]


The returned h_name is called the canonical name of the host. Some versions of gethostbyname allow the hostname argument to be a dotted-decimal string. That is, a call of the form

    hptr = gethostbyname("192.168.42.2");

will work. This code was added because the Rlogin client accepts only a hostname, calling gethostbyname, and will not accept a dotted-decimal string.

gethostbyname differs from the other socket functions in that it does not set errno when an error occurs. Instead, it sets the global integer h_errno to one of the following constants defined by including <netdb.h>:

    HOST_NOT_FOUND
    TRY_AGAIN
    NO_RECOVERY
    NO_DATA (identical to NO_ADDRESS)
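A call to gethostbyname, together with its h_errno-based error reporting, might look like the following minimal sketch. The helper name lookup_ipv4 is hypothetical; hstrerror is used to turn h_errno into text, and the _DEFAULT_SOURCE definition is an assumption about what glibc needs in order to declare it.

```c
#define _DEFAULT_SOURCE 1    /* assumed to expose hstrerror on glibc */
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Resolve host with gethostbyname, copy the first IPv4 address (as a
   dotted-decimal string) into first, and return the number of addresses
   found. Returns -1 on failure after printing the h_errno description. */
int lookup_ipv4(const char *host, char *first, size_t cap)
{
    struct hostent *hptr;
    char **pptr;
    int count = 0;

    assert(host != NULL && first != NULL && cap > 0);
    hptr = gethostbyname(host);
    if (hptr == NULL) {
        /* errors are reported through h_errno, not errno */
        fprintf(stderr, "gethostbyname(%s): %s\n", host, hstrerror(h_errno));
        return -1;
    }
    if (hptr->h_addrtype != AF_INET)
        return -1;

    /* h_addr_list is a NULL-terminated array of pointers to binary addresses */
    for (pptr = hptr->h_addr_list; *pptr != NULL; pptr++) {
        char text[INET_ADDRSTRLEN];

        if (inet_ntop(AF_INET, *pptr, text, sizeof(text)) == NULL)
            continue;
        if (count++ == 0)
            snprintf(first, cap, "%s", text);
    }
    return count;
}
```

On a resolver that accepts dotted-decimal strings, as described above, lookup_ipv4("127.0.0.1", buf, sizeof(buf)) returns 1 without contacting a name server.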

The NO_DATA error means the specified name is valid, but it does not have an A record. An example of this is a hostname with only an MX record. Most modern resolvers provide the function hstrerror, which takes an h_errno value as its only argument and returns a const char * pointer to a description of the error.

(iii) IPv6 SUPPORT IN DNS

The RES_USE_INET6 Constant
Since gethostbyname doesn't have an argument to specify what address family is of interest (like getaddrinfo's hints.ai_family struct entry), the first revision of the API used the RES_USE_INET6 constant, which had to be added to the resolver flags using a private, internal interface. This API was not very portable, since systems that used a different internal resolver interface had to mimic the BIND resolver interface to provide it. Enabling RES_USE_INET6 caused gethostbyname to look up AAAA records first, and to look up A records only if a name had no AAAA records. Since the hostent structure has only one address length field, gethostbyname could return either IPv6 or IPv4 addresses, but not both. Enabling RES_USE_INET6 also caused gethostbyname2 to return IPv4 addresses as IPv4-mapped IPv6 addresses. We will describe gethostbyname2 next.

The gethostbyname2 Function
The gethostbyname2 function adds an address family argument to gethostbyname.

    #include <sys/socket.h>
    #include <netdb.h>

    struct hostent *gethostbyname2(const char *name, int af);

    Returns: non-null pointer if OK, NULL on error with h_errno set

When the af argument is AF_INET, gethostbyname2 behaves just like gethostbyname, looking up and returning IPv4 addresses. When the af argument is AF_INET6, gethostbyname2 looks up and returns only AAAA records for IPv6 addresses.

The getipnodebyname Function
RFC 2553 deprecated RES_USE_INET6 and gethostbyname2 because of the global nature of the RES_USE_INET6 flag and the wish to provide more control over the returned information. It introduced the getipnodebyname function to solve some of these problems.

    #include <sys/socket.h>
    #include <netdb.h>

    struct hostent *getipnodebyname(const char *name, int af, int flags, int *error_num);

    Returns: non-null pointer if OK, NULL on error with error_num set

This function returns a pointer to the same hostent structure that was described with gethostbyname. The af and flags arguments map directly to getaddrinfo's hints.ai_family and hints.ai_flags arguments. For thread safety, the return value is dynamically allocated, so it must be freed with the freehostent function.

    #include <netdb.h>

    void freehostent(struct hostent *ptr);

(iv) GETHOSTBYADDR FUNCTION

The function gethostbyaddr takes a binary IPv4 address and tries to find the hostname corresponding to that address. This is the reverse of gethostbyname.

    #include <netdb.h>

    struct hostent *gethostbyaddr(const char *addr, socklen_t len, int family);

    Returns: non-null pointer if OK, NULL on error with h_errno set

This function returns a pointer to the same hostent structure that was described with gethostbyname. The field of interest in this structure is normally h_name, the canonical hostname. The addr argument is not really a char *; it is a pointer to an in_addr structure containing the IPv4 address. len is the size of this structure: 4 for an IPv4 address. The family argument is AF_INET. In terms of the DNS, gethostbyaddr queries a name server for a PTR record in the in-addr.arpa domain.
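A reverse lookup might be written as in the following minimal sketch. The helper name reverse_lookup is hypothetical, and whether a given address resolves depends on the host's configuration (for example, the contents of /etc/hosts), so the caller must be prepared for failure.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* Convert a dotted-decimal string to a binary address, then ask the
   resolver for the canonical hostname. Returns 0 and fills name on
   success, or -1 on a bad address or a failed lookup. */
int reverse_lookup(const char *dotted, char *name, size_t cap)
{
    struct in_addr addr;
    struct hostent *hptr;

    assert(name != NULL && cap > 0);
    if (inet_pton(AF_INET, dotted, &addr) != 1)
        return -1;

    /* The first argument really points to an in_addr structure; the cast
       satisfies both the const char * and const void * prototypes. */
    hptr = gethostbyaddr((const char *) &addr, sizeof(addr), AF_INET);
    if (hptr == NULL)
        return -1;    /* h_errno holds the reason */

    snprintf(name, cap, "%s", hptr->h_name);
    return 0;
}
```

A typical use is reverse_lookup("127.0.0.1", buf, sizeof(buf)), which on many systems yields localhost from the static host file.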

(v) GETSERVBYNAME FUNCTION

Services, like hosts, are often known by names, too. If the code refers to a service by its name instead of by its port number, and the mapping from the name to the port number is contained in a file (normally /etc/services), then a change of port number requires modifying only one line of the /etc/services file instead of recompiling the applications. The getservbyname function looks up a service given its name. The canonical list of port numbers assigned to services is maintained by the IANA. A given /etc/services file is likely to contain a subset of the IANA assignments.

    #include <netdb.h>

    struct servent *getservbyname(const char *servname, const char *protoname);

    Returns: non-null pointer if OK, NULL on error

This function returns a pointer to the following structure:

    struct servent {
        char  *s_name;       /* official service name */
        char **s_aliases;    /* alias list */
        int    s_port;       /* port number, network byte order */
        char  *s_proto;      /* protocol to use */
    };

The service name servname must be specified. If a protocol is also specified (protoname is a non-null pointer), then the entry must also have a matching protocol. Some Internet services are provided using either TCP or UDP, while others support only a single protocol (e.g., FTP requires TCP). If protoname is not specified and the service supports multiple protocols, it is implementation-dependent which port number is returned. Normally this does not matter, because services that support multiple protocols often use the same TCP and UDP port number, but this is not guaranteed.

The main field of interest in the servent structure is the port number. Since the port number is returned in network byte order, we must not call htons when storing it into a socket address structure.
Typical calls to this function could be as follows:

    struct servent *sptr;

    sptr = getservbyname("domain", "udp");    /* DNS using UDP */
    sptr = getservbyname("ftp", "tcp");       /* FTP using TCP */
    sptr = getservbyname("ftp", NULL);        /* FTP using TCP */
    sptr = getservbyname("ftp", "udp");       /* this call will fail */
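The byte-order point is easy to get wrong, so here is a minimal sketch of both uses of the result. The helper names service_port and fill_service_addr are hypothetical, and the outcome depends on the local services database: on a system without an /etc/services file, both helpers simply report failure.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Return the port for service/proto in HOST byte order (for printing or
   comparing), or -1 if the services database has no matching entry. */
int service_port(const char *service, const char *proto)
{
    struct servent *sptr;

    assert(service != NULL);
    sptr = getservbyname(service, proto);
    if (sptr == NULL)
        return -1;
    return ntohs((uint16_t) sptr->s_port);
}

/* Fill an IPv4 socket address for the given service on the given host
   address. Returns 0 on success, -1 on an unknown service or bad address. */
int fill_service_addr(struct sockaddr_in *sa, const char *dotted,
                      const char *service, const char *proto)
{
    struct servent *sptr = getservbyname(service, proto);

    if (sptr == NULL)
        return -1;
    memset(sa, 0, sizeof(*sa));
    sa->sin_family = AF_INET;
    /* s_port is already in network byte order: store it without htons */
    sa->sin_port = (in_port_t) sptr->s_port;
    return inet_pton(AF_INET, dotted, &sa->sin_addr) == 1 ? 0 : -1;
}
```

Calling ntohs only when the port is shown to a person, and copying s_port unchanged into sin_port, keeps the two representations from being mixed up.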


(vi) GETSERVBYPORT FUNCTION

The getservbyport function looks up a service given its port number and an optional protocol.

    #include <netdb.h>

    struct servent *getservbyport(int port, const char *protoname);

    Returns: non-null pointer if OK, NULL on error

The port value must be in network byte order. Typical calls to this function could be as follows:

    struct servent *sptr;

    sptr = getservbyport(htons(53), "udp");    /* DNS using UDP */
    sptr = getservbyport(htons(21), "tcp");    /* FTP using TCP */
    sptr = getservbyport(htons(21), NULL);     /* FTP using TCP */
    sptr = getservbyport(htons(21), "udp");    /* this call will fail */

The last call fails because there is no service that uses port 21 with UDP. Be aware that a few port numbers are used with TCP for one service, but the same port number is used with UDP for a totally different service.
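A small wrapper can hide the byte-order conversion from callers. The following is a minimal sketch; the helper name service_name is hypothetical, and the result again depends on the contents of the local services database.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Look up the official service name for a port given in HOST byte order.
   Returns 0 and fills name on success, -1 if no entry matches. */
int service_name(int port, const char *proto, char *name, size_t cap)
{
    struct servent *sptr;

    assert(name != NULL && cap > 0);
    /* getservbyport expects the port in network byte order */
    sptr = getservbyport(htons((uint16_t) port), proto);
    if (sptr == NULL)
        return -1;
    snprintf(name, cap, "%s", sptr->s_name);
    return 0;
}
```

With a typical /etc/services file, service_name(53, "udp", buf, sizeof(buf)) stores domain in buf.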
