Performance Issues in WWW Servers

(Extended Abstract)
Erich Nahum, Tsipora Barzilai, and Dilip Kandlur
IBM T.J. Watson Research Center
Hawthorne, NY 10532
{nahum,tsipora,kandlur}@watson.ibm.com

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SIGMETRICS '99 5/99 Atlanta, Georgia, USA
© 1999 ACM 1-58113-083-X/99/0004...$5.00

Abstract

This paper evaluates performance issues in WWW servers on UNIX-style platforms. While other work has focused on reducing the use of kernel primitives, we consider ways in which the operating system and the network protocol stack can improve support for high-performance WWW servers. We study techniques in 3 categories: new socket functions, per-byte optimizations, and per-connection optimizations. We examine two proposed socket functions, acceptex() and send_file(), comparing send_file()'s effectiveness with an mmap()/writev() combination. We show how send_file() provides the necessary semantic support to eliminate copies and checksums in the kernel, and quantify the utility of the function's header and close options. We also present mechanisms to reduce the number of packets exchanged in an HTTP transaction, both increasing server performance and reducing network utilization, without compromising interoperability. We evaluate these issues with a high-performance WWW server, using IBM AIX workstations connected over 100 Mbps Ethernet, driven by the WebStone and SURGE WWW server workload generators. Microbenchmark results using WebStone show that our combination of mechanisms can improve server throughput by up to 53 percent, and can eliminate up to 33 percent of the packets in an HTTP exchange. Macrobenchmark results with SURGE show an aggregate increase in server throughput of 25 percent.

1 Introduction

The phenomenal growth of the World-Wide Web, in both the volume of information on it and the numbers of users desiring access to it, is dramatically increasing the performance requirements for large scale information servers. WWW server performance is thus a central issue in providing ubiquitous, reliable, and efficient information access. This paper evaluates issues in WWW server performance on UNIX-style platforms. While other work has focused on reducing the use of kernel primitives, we explore ways in which the operating system and the network protocol stack can improve support for high-performance WWW servers. Issues we consider include:

• new socket functions. Microsoft has added two new socket functions to NT [4], acceptex() and transmitfile(), and HP has a similar function send_file() in HP-UX. These APIs streamline the programming interface used by a web server in a typical HTTP transaction, but do they provide any performance benefit? Does transmitfile() or send_file() show any improvement over the already available mmap() and writev() system calls?

• per-byte optimizations. It is well-known that data touching operations, such as copying and checksumming, are expensive. BSD-derived Unix operating systems use different buffering mechanisms in the file system and the networking code, forcing data to be copied when it is moved from one subsystem to another. How well can we approach a zero-copy integrated I/O architecture [7], while continuing to exploit the benefits of existing file systems? What sort of performance impact will eliminating data touching operations have for WWW servers?

• per-connection optimizations. TCP connection management was not designed for client-server traffic, exchanging more packets than is semantically necessary. While the transition to persistent connections in HTTP 1.1 will improve this, most machines are still using HTTP 1.0. How can the per-connection overhead be reduced, without violating the TCP protocol specification?

We study these issues using a testbed of several IBM RS/6000 AIX workstations connected over 100 Mbps Ethernet. We use Rice University's Flash WWW server, which exploits most currently-known user-level optimizations, and utilize the WWW workload generators WebStone and SURGE [3] to drive the system with HTTP 1.0 requests.

Our experience confirms previous work showing that WWW servers spend most of their time in the kernel [1, 5]. We build upon previous work by studying ways in which the operating system and protocol stack can improve support for high-performance WWW servers. We examine the benefits of two proposed socket functions, acceptex() and send_file(), comparing the effectiveness of send_file() with a combination of mmap() and writev(). We show how send_file() provides the necessary semantic support for further optimizations, such as eliminating copies and reducing packet exchanges, and quantify the utility of the function's header and close options. We present mechanisms to reduce the number of packets exchanged in an HTTP transaction, both increasing server performance and reducing network utilization.
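The two ways of sending a file that the paper compares can be sketched roughly as follows. This is an illustration in Python, not the AIX interface studied here: Python's `socket.sendfile()` stands in for `send_file()` (it uses the kernel's sendfile where available and falls back to ordinary sends otherwise), and because it has no header or close option the HTTP header is sent in a separate call. The names `HEADER`, `send_with_mmap_writev`, `send_with_sendfile`, and `roundtrip` are hypothetical helpers introduced only for this sketch.

```python
import mmap
import os
import socket
import tempfile

# Hypothetical minimal HTTP/1.0 response header used by both senders.
HEADER = b"HTTP/1.0 200 OK\r\nContent-Length: %d\r\n\r\n"


def send_with_mmap_writev(sock, path):
    """Map the file into memory and send header + body in one gather write."""
    size = os.path.getsize(path)
    header = HEADER % size
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as body:
        # writev() may write fewer bytes than requested; a real server
        # would loop until everything is sent.
        return os.writev(sock.fileno(), [header, body])


def send_with_sendfile(sock, path):
    """Send the header, then let the kernel move the file to the socket."""
    size = os.path.getsize(path)
    header = HEADER % size
    sock.sendall(header)
    with open(path, "rb") as f:
        # Stand-in for send_file(); no header/close options exist here.
        return len(header) + sock.sendfile(f)


def roundtrip(sender, payload):
    """Send payload through `sender` over a local socket pair; return bytes received.

    There is no concurrent reader, so payload must fit in the socket buffer.
    """
    server, client = socket.socketpair()
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
    try:
        sender(server, tmp.name)
        server.close()  # end-of-stream for the reader
        chunks = []
        while True:
            chunk = client.recv(65536)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        server.close()
        client.close()
        os.unlink(tmp.name)
```

Both senders put the same bytes on the wire, for example `roundtrip(send_with_sendfile, b"hello")` and `roundtrip(send_with_mmap_writev, b"hello")` return identical responses. The difference the paper is interested in lies underneath: how many copies and checksum passes the kernel performs, and whether the header and connection close can be folded into the same call.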

[Columns: File Size, Flash, Flash, Diff. Table body not recoverable.]

Table 1: HTTP Throughput in ops/sec (WebStone)

Configuration              SURGE ops/sec   Diff (%)
Flash-Poll                        437.72
+ send_file()                     418.05        -5
+ Mbuf Caching                    519.83       +20
+ Checksum Offload                555.14        +6
+ FIN Piggyback                   560.66        +1
+ Delayed Ack of FIN              571.60        +2
+ Delayed Ack of SYN              581.56        +2
Total Improvement:                             +25

Table 2: HTTP Throughput in ops/sec (SURGE)
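The Diff column of Table 2 can be reproduced from the throughput column. The source does not say how Diff is computed; the sketch below assumes it is the change expressed as a fraction of the new value, (new - old) / new, because that convention matches every reported entry after rounding to a whole percent. The names `THROUGHPUT`, `diff_percent`, `step_diffs`, and `total_diff` are hypothetical.

```python
# Throughput column of Table 2 (SURGE, ops/sec), in the order optimizations are added.
THROUGHPUT = [
    ("Flash-Poll", 437.72),
    ("+ send_file()", 418.05),
    ("+ Mbuf Caching", 519.83),
    ("+ Checksum Offload", 555.14),
    ("+ FIN Piggyback", 560.66),
    ("+ Delayed Ack of FIN", 571.60),
    ("+ Delayed Ack of SYN", 581.56),
]


def diff_percent(old, new):
    """Change from old to new as a percentage of the NEW value (assumed convention)."""
    return 100.0 * (new - old) / new


def step_diffs(rows):
    """Rounded Diff for each row relative to the row above it."""
    values = [value for _, value in rows]
    return [round(diff_percent(a, b)) for a, b in zip(values, values[1:])]


def total_diff(rows):
    """Rounded Diff between the first and the last configuration."""
    return round(diff_percent(rows[0][1], rows[-1][1]))
```

Under this assumption `step_diffs(THROUGHPUT)` gives -5, 20, 6, 1, 2, 2 and `total_diff(THROUGHPUT)` gives 25, matching the table. With the more common (new - old) / old convention the same endpoints work out to about 33 percent, so the two conventions should not be mixed when comparing these figures with other results.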

2 Overview of Results

Space constraints prevent us from describing our results fully, thus we can only provide an overview of our findings. Table 1 shows how our optimizations improve server throughput across requests for different file sizes as measured by WebStone. Table 2 shows the aggregate increase in server throughput as our optimizations are incrementally added as measured by SURGE. Interested readers should consult the IBM research report [6] for more details. We summarize our findings as follows:

• new socket functions. We evaluate the proposed socket functions acceptex() and send_file(). We find little or no increase in performance using the acceptex() function, on either process-based or thread-based WWW servers. In addition, kernel profiling shows that servers spend a relatively small amount of time in the accept(), getsockname(), and read() system calls. A send_file() implementation that incurs a single copy provides no advantage over a combination of mmap() and writev().

• per-byte optimizations. Per-byte optimizations that we examine include eliminating a data copy on the fast path by caching mbufs within the kernel and offloading the TCP checksum to the adaptor. A send_file() implementation tied to an integrated I/O system which does not copy data provides substantially better performance. In our testbed, we observe an increase in throughput of up to 53 percent. We find that offloading the checksum to the network device can improve WWW server performance by up to 7 percent. Our mbuf cache mechanism can also be enhanced to allow caching of the checksum values in the mbufs, for network interfaces that do not support the checksum offload.

• per-connection optimizations. Our per-connection optimizations reduce overhead by eliminating redundant packets in the TCP connection setup and teardown. We show how the close option to send_file() provides the semantic support to enable piggybacking the FIN on the last data segment, eliminating one packet in small transfers and improving throughput by 6 percent in those cases. We also show how delaying acknowledgments for the FIN and SYN-ACK packets can eliminate 2 more packets, increasing performance an additional 14 percent for small transfers. In total, we reduce the packets in a small HTTP exchange from 9 to 6, reducing network utilization and raising server throughput by up to 20 percent in those scenarios, all without violating the TCP protocol specification. Our changes are more easily deployed incrementally since they do not require both hosts in a conversation to adopt them, unlike T/TCP or SACK.

• aggregate benefits. Using SURGE as a macrobenchmark, we show that the combination of techniques improves aggregate server performance by 25 percent.

While we have evaluated these optimizations in the context of a WWW server, they have utility for other programs as well. Reducing packet exchanges should help other TCP-based applications, and send_file() is a general function that can be used by other network servers, such as NFS, FTP, or SMB. As a consequence of our findings, IBM's AIX division has released these features in AIX 4.3.2.

References

[1] Jussara M. Almeida, Virgilio Almeida, and David J. Yates. Measuring the behavior of a world-wide web server. In Seventh IFIP Conference on High Performance Networking (HPN), White Plains, NY, April 1997.
[2] Martin F. Arlitt and Carey L. Williamson. Internet web servers: Workload characterization and performance implications. IEEE/ACM Transactions on Networking, 5(5):631-646, Oct 1997.
[3] Paul Barford and Mark Crovella. Generating representative web workloads for network and server performance evaluation. In Proceedings of the ACM Sigmetrics Conference on Measurement and Modeling of Computer Systems, Madison, WI, June 1998.
[4] James C. Hu, Irfan Pyarali, and Douglas C. Schmidt. Measuring the impact of event dispatching and concurrency models on web server performance over high-speed networks. In Proceedings of the 2nd Global Internet Conference (held as part of GLOBECOM '97), Phoenix, AZ, Nov 1997.
[5] Yiming Hu, Ashwini Nanda, and Qing Yang. Measurement, analysis, and performance improvement of the Apache web server. Technical Report 1097-0001, University of Rhode Island Department of Electrical and Computer Engineering, Oct 1997.
[6] Erich M. Nahum, Tsipora Barzilai, and Dilip Kandlur. Performance issues in WWW servers. IBM Research Report, May 1999.
[7] Vivek S. Pai, Peter Druschel, and Willy Zwaenepoel. I/O Lite: A copy-free UNIX I/O system. In 3rd USENIX Symposium on Operating Systems Design and Implementation, New Orleans, LA, February 1999.
