Hardware Support For A Hash-Based IP Traceback¹
Luis A. Sanchez², Walter C. Milliken, Alex C. Snoeren, Fabrice Tchakountio, Christine E. Jones, Stephen T. Kent, Craig Partridge, and W. Timothy Strayer
Abstract
The Source Path Isolation Engine (SPIE) is a system capable of tracing a single IP packet to its point of origin or point of ingress into a network. SPIE supports tracing by storing a few bits of unique information about each packet for a period of time as the packets traverse the network. Software implementations of SPIE can trace packets through networks composed of slow to medium speed routers (up to OC-12), but higher speed routers (OC-48 and faster) require hardware support. In this paper, we discuss the hardware design aspects of SPIE. Most of the hardware resides in a self-contained SPIE processing unit, which may be implemented in a line card form factor for insertion into the router itself, or as a stand-alone unit that connects to the router through an external interface.
1. Introduction
Today's Internet infrastructure is vulnerable to motivated and well-equipped attackers. Much work is being done to safeguard resources, detect attacks, and, where possible, thwart them. A more difficult problem is determining the origin of an attack. Accurate and reliable identification of attackers is currently extremely difficult because the network routing infrastructure is stateless and based largely on destination addresses: no records are kept in the routers, and the source address is not trustworthy. The attacker can generate offending IP packets masquerading as having originated almost anywhere, including from IP
¹ This work was performed under DARPA contract N66001-00-C-8038.
² At the time the work described in this paper was completed, Mr. Sanchez was an employee of BBN Technologies.
The obvious approach to single packet traceback is simply to log packets at various points throughout the network, and then use appropriate extraction techniques to discover the packet's path. Logging requires no computation on the router's fast path, and hence can be implemented efficiently in today's router architecture. However, the effectiveness of the logs is limited by the amount of space available to store them and the internal data bandwidth available to copy the packets as they pass through the router. Given today's link speeds, packet logs quickly grow to intractable sizes, even over relatively short time frames. An OC-192 link is capable of transferring 1.25GB per second. If one allows 60 seconds to discover an attack and conduct a query, a router with 16 links would require 1.2TB of high-speed storage (1.25GB/s x 60s x 16 links). Sampling techniques can lessen these requirements but also reduce the probability of detecting small flows.
Alternatively, routers can be tasked to perform more sophisticated auditing in real time, extracting a smaller amount of information as packets are forwarded. Many currently available routers support input debugging, a feature that identifies on which incoming port a particular outgoing packet (or set of packets) of interest arrived. Since no history is stored, however, this process must be activated while the flow of interest is still passing through the router. Furthermore, due to the high overhead of this operation on many popular router architectures, activating it may have adverse effects on the traffic currently being serviced by the router.
Both Sager [8] and Stone [9] have proposed sophisticated logging of router events for attack analysis, but both schemes introduce significant processing and storage overhead in the routers. Schnackenberg et al. propose a special Intruder Detection and Isolation Protocol (IDIP) to facilitate interaction between routers involved in a traceback effort [10]. IDIP does not specify how participating entities should track packet traffic; it simply requires that they be able to determine whether or not they have seen a component of an attack matching a certain description.
SPIE, the Source Path Isolation Engine, is a system that provides traceback capability on a per-packet basis. SPIE-enabled routers record tiny digests of each packet as it passes. These digests are kept for a period of time long enough to allow after-the-fact traces of the packet. At medium to high speeds, however, calculating and storing the packet digests requires hardware assistance. SPIE's design allows this support to be provided through a combination of simple modifications to current router line cards and a limited amount of additional hardware. A full description of SPIE and an analysis of its effectiveness are given in [11]. In this paper, we present in detail the hardware components of the SPIE system, which provides hash-based single packet traceback.
3. Hardware Design
Data Generation Agents (DGAs) are the primary hardware components of the SPIE system. The entire system depends on their functionality and feasibility. One particular concern is the ability to implement DGAs for very high data rate interfaces. Here we present the issues of implementing DGAs in hardware, with a particular focus on high performance implementations. Figure 2 illustrates the key components of the SPIE hardware. Most of the hardware exists in a self-contained SPIE processing unit, which may be implemented in a line card form factor for insertion into the router itself, or as a stand-alone unit that connects to the router through an external interface. In either case, all interfaces on the router must be extended to support data collection through the implementation of signature taps.
The signature tap is a relatively simple piece of hardware that computes n independent 32-bit digests (S32 in the diagram) of each packet that arrives on a SPIE-capable interface. These digests are then passed to the SPIE processing unit on a separate signature bus. The signature aggregation stage of the SPIE card produces a periodic digest table of size 2^k bits (where k is at most 32), covering a time interval R, which is then stored in a large bulk history memory organized as a time-slotted ring buffer containing the past digest tables. Entire digest tables in the history memory, indexed by collection interval, can be transferred to the control processor on demand for transmission to a SCAR. Generally, due to network latencies and timing uncertainties in the arrival time of a packet of interest at a given router, several tables surrounding the estimated time of packet arrival will need to be fetched.
Note that different systems can choose different values of n, k, and R, as long as they report how each of the n digests is computed, which k bits of the 32 digest bits are used, and the time span covered by each digest table sent in response to a query. Selection of appropriate values of n, k, and R is discussed after a more detailed description of each individual stage of the hardware design.
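The relationship between the n digests and the 2^k-bit digest table can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design: the names `packet_digests` and `DigestTable` are hypothetical, salted BLAKE2b stands in for whatever n independent digest functions a real signature tap computes, and the low-order k bits of each digest are assumed to be the ones used as the table index. The table is used in the manner of a Bloom filter [12]: a packet is recorded by setting n bits, and a query reports a possible match only if all n bits are set.

```python
import hashlib


def packet_digests(packet: bytes, n: int = 3) -> list:
    """Return n 32-bit digests of a packet.

    Assumption: salted BLAKE2b is a software stand-in for the n
    independent hardware digest functions.
    """
    digests = []
    for i in range(n):
        h = hashlib.blake2b(packet, digest_size=4, salt=i.to_bytes(2, "big"))
        digests.append(int.from_bytes(h.digest(), "big"))
    return digests


class DigestTable:
    """A 2**k-bit table covering one collection interval."""

    def __init__(self, k: int = 16, n: int = 3) -> None:
        if not 1 <= k <= 32:
            raise ValueError("k must be between 1 and 32")
        self.k = k
        self.n = n
        self.bits = bytearray(((1 << k) + 7) // 8)

    def _indices(self, packet: bytes) -> list:
        # Assumption: the low-order k bits of each digest index the table.
        mask = (1 << self.k) - 1
        return [d & mask for d in packet_digests(packet, self.n)]

    def record(self, packet: bytes) -> None:
        """Set one bit per digest."""
        for idx in self._indices(packet):
            self.bits[idx // 8] |= 1 << (idx % 8)

    def contains(self, packet: bytes) -> bool:
        """True if every digest bit is set (false positives are possible)."""
        return all(self.bits[idx // 8] & (1 << (idx % 8))
                   for idx in self._indices(packet))
```

A query that returns False is definitive, while True only means the packet may have been seen during that interval, which is why the choice of n and k matters.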
checksum, and TOS header fields are zeroed out since they change frequently in transit. Router line cards can be easily modified to include signature taps; in some routers, the IP forwarding engine may be able to compute signatures during the forwarding process. In other architectures, it may make more sense to place the signature tap on the output of the ingress layer-2 packet framer. In any event, the amount of logic required to compute a CRC-32 digest is small, and easily added to most line cards. Signature taps can be connected to the SPIE processing unit in several ways. For lower-speed routers, a simple serial interface with specialized framing will suffice. Depending on the form factor, the serial link could be routed over the backplane to an in-chassis SPIE card, or run to an auxiliary external connector on the line card bulkhead for connection to a separate SPIE box. (For OC-192 line cards, a Gigabit Ethernet interface may be a relatively simple way to transmit digests, since the signature bit rate is about 10% of the link rate.) With an external SPIE implementation, it is also possible to put the signature taps in the SPIE box itself, with a set of feed-through connectors passing the input signals on to the actual router input ports. This requires the taps to include sufficient logic to extract IP packets from the link. This may be a viable approach for Ethernet interfaces and lower-speed single-channel packet over SONET (POS) interfaces, for which one-chip framers are available.
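The masking step performed by a signature tap can be sketched in software as follows. This is a minimal illustration, assuming an IPv4 packet without link-layer framing; the function names are hypothetical. Only the TOS and header checksum fields named in the surviving text are zeroed here, and the full list of masked fields may be longer than what this excerpt shows. A single CRC-32 digest is computed with Python's `zlib.crc32`; how the n independent digests are derived is not modeled.

```python
import zlib

TOS_OFFSET = 1        # IPv4 type-of-service byte
CHECKSUM_OFFSET = 10  # IPv4 header checksum (2 bytes)
MIN_IPV4_HEADER = 20


def mask_header(packet: bytes) -> bytes:
    """Zero the header fields that may change in transit."""
    if len(packet) < MIN_IPV4_HEADER:
        raise ValueError("packet shorter than a minimal IPv4 header")
    masked = bytearray(packet)
    masked[TOS_OFFSET] = 0
    masked[CHECKSUM_OFFSET:CHECKSUM_OFFSET + 2] = b"\x00\x00"
    return bytes(masked)


def crc32_digest(packet: bytes) -> int:
    """32-bit CRC of the masked packet."""
    return zlib.crc32(mask_header(packet)) & 0xFFFFFFFF
```

With this masking, two copies of a packet that differ only in TOS or header checksum produce the same digest, so a router downstream of a hop that rewrote those fields still recognizes the packet.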
[Figure 2: Key components of the SPIE hardware (line cards, signature taps, signature aggregation, history memory).]
3.2. Transforms
Signature taps do not encompass all packet recording at the router. IP packets may undergo valid transformation (e.g., fragmentation, IPsec tunneling, network address translation) while traversing the network, in which case packet digests alone are not sufficient to trace a packet. Packet transforms must be recorded so that the original packet can be reconstructed during the traceback process. Therefore, DGAs record within a transform lookup table (TLT) the hash of a transformed packet, the type of transformation, and sufficient packet data to enable packet reconstruction. Transform information is provided to the DGA by the control processor because packets that undergo transformation are generally routed through the control path of the router. DGAs record and store within the history memory a corresponding TLT for every packet digest table. Accordingly, digest table/TLT pairs are read out of the history memory to service a SPIE traceback query. For a more in-depth discussion of SPIE's handling of packet transformation, refer to [11].
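The three items a TLT entry holds (the digest of the transformed packet, the transformation type, and packet data for reconstruction) suggest a simple record structure. The sketch below is illustrative only: the class names, the enumeration values, and the choice of a dictionary keyed by digest are assumptions, and the actual entry layout is specified in [11], not here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class TransformType(Enum):
    # Transform kinds named in the text; the numeric codes are arbitrary.
    FRAGMENTATION = 1
    IPSEC_TUNNEL = 2
    NAT = 3


@dataclass(frozen=True)
class TransformEntry:
    digest: int            # digest of the packet after transformation
    kind: TransformType    # which transformation was applied
    packet_data: bytes     # data needed to rebuild the original packet


class TransformLookupTable:
    """One TLT per digest table, i.e. per collection interval."""

    def __init__(self) -> None:
        self._entries: Dict[int, TransformEntry] = {}

    def record(self, digest: int, kind: TransformType, packet_data: bytes) -> None:
        # Simplification: a later entry with the same digest replaces the earlier one.
        self._entries[digest] = TransformEntry(digest, kind, packet_data)

    def lookup(self, digest: int) -> Optional[TransformEntry]:
        """Return the entry for a transformed packet's digest, if any."""
        return self._entries.get(digest)
```

During a trace, a hit in the TLT tells the tracer to continue with the reconstructed original packet instead of the transformed one.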
readout process can again be assumed to be negligible. Assuming somewhat optimistic 5ns cycle times per read or write, up to 100 Msignatures/sec can be accommodated. This is still well below the rate required for the next-generation core routers described above, but might suffice for routers from the current OC-48 generation. Therefore, the requirements of high-end routers lead to the design of Figure 2, using multiple signature SRAMs and additional aggregation at the digest table level. Each aggregation SRAM is shared by as many SPIE ports as will comfortably fit within its performance limit. If we choose to use relatively cheap and dense SRAMs, a 10ns access time is reasonable, supporting about 50 Msignatures/sec. This is sufficient for one OC-192 link, using a simple time-multiplexing scheme. It may also be feasible to use DRAMs for the first-level signature memory for a single OC-192 link, but this requires the highest speed grades available, and an SRAM solution may actually be more cost-effective. A 1.92 Gsignatures/sec SPIE unit supporting a 32-port OC-192 router requires about 12 to 16 of the SRAM-based first-level aggregation memories. All of these memories are then read out in parallel, with the outputs ORed together, to produce a single global digest table to pass to the history memory.
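The throughput figures and the final OR step can be checked with a short sketch. It assumes that each signature costs one read and one write of the aggregation memory, which is an inference from the quoted numbers (5ns per access giving 100 Msignatures/sec, 10ns giving 50 Msignatures/sec) and not a statement of the actual memory controller design. The function names are hypothetical.

```python
def signatures_per_second(cycle_ns: float, accesses_per_signature: int = 2) -> float:
    """Signature rate of one memory, assuming a read plus a write per signature."""
    return 1e9 / (cycle_ns * accesses_per_signature)


def merge_tables(tables: list) -> bytes:
    """OR equally sized first-level tables into one global digest table."""
    if not tables:
        raise ValueError("at least one table is required")
    merged = bytearray(len(tables[0]))
    for table in tables:
        if len(table) != len(merged):
            raise ValueError("all tables must have the same size")
        for i, byte in enumerate(table):
            merged[i] |= byte
    return bytes(merged)
```

Because a digest table only ever has bits set, ORing the per-memory tables yields the same result as if every signature had been written into one shared table.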
The history memory cost will likely dominate the cost of the SPIE board or box. Assuming the 32-port OC-192 router example, with a 16Mb digest table every 5ms, a single current-generation 256Mb DRAM can store about 80ms of history. (Note that the input data bandwidth is about 3Gb/s, which is within the range of feasibility for DRAMs performing sequential memory accesses.) A buffer of 30 seconds requires 375 of these devices, which is barely within the range of feasibility; if new 1Gb DRAMs were used, the memory array would require a much more reasonable 94 chips. With current DRAM memory prices below $1/MByte, this is less than $12k worth of memory. This seems a very reasonable cost to support a 32-port OC-192 router.
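The sizing figures in this paragraph follow from simple arithmetic, reproduced below as a sketch. The helper names are hypothetical, sizes are expressed in megabits with 1Gb taken as 1024Mb, and integer arithmetic is used to avoid rounding surprises.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


def history_per_chip_ms(table_mbit: int, interval_ms: int, chip_mbit: int) -> int:
    """Milliseconds of history that fit in one DRAM chip."""
    return (chip_mbit // table_mbit) * interval_ms


def chips_needed(table_mbit: int, interval_ms: int,
                 history_ms: int, chip_mbit: int) -> int:
    """Number of DRAM chips needed to hold history_ms of digest tables."""
    tables = ceil_div(history_ms, interval_ms)
    return ceil_div(tables * table_mbit, chip_mbit)


def total_mbytes(table_mbit: int, interval_ms: int, history_ms: int) -> int:
    """Total history memory in megabytes."""
    return ceil_div(history_ms, interval_ms) * table_mbit // 8


# The 32-port OC-192 example: 16Mb table every 5ms, 30s of history.
print(history_per_chip_ms(16, 5, 256))    # 80 (ms per 256Mb chip)
print(chips_needed(16, 5, 30_000, 256))   # 375 chips
print(chips_needed(16, 5, 30_000, 1024))  # 94 chips
print(total_mbytes(16, 5, 30_000))        # 12000 MBytes
```

At under $1/MByte, 12000 MBytes corresponds to the quoted bound of less than $12k.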
4. Conclusion
It is worth stepping back from the details of the hardware design presented to look at the larger message. The architecture of a SPIE DGA requires the computation of n digest values, each k bits long. Given an average Internet packet size of between 1K and 2K bits, an n of 3, and a k of at most 32, the data rate of digests being produced is between 5% and 10% of the link bandwidth. When link bandwidths are large (e.g., tens of gigabits), those portions become challenging numbers. Yet, the final amount of data that must be stored is fairly small. SPIE stores the digest values in a bitmap, thus reducing the digest cost from k to one or two bits. So, in the end, the cost of keeping SPIE data is only about 6 bits per packet, or between 0.3% and 0.6% of the link bandwidth. This is a very manageable data rate, even at high link bandwidths. The challenge, therefore, in high performance SPIE systems is to find a method to reduce the volume of SPIE data being handled from k bits to one or two bits as swiftly as possible. In this paper we achieved this goal using the signature aggregation module described in sections 3.3 and 3.4, along with an efficient implementation of history memory.
5. Acknowledgment
We thank Charles Lynn for providing great ideas during the early stages of this work. We also thank John Lowry for his encouragement and many helpful discussions.
6. References
[1] Y. Rekhter, B. Moskowitz, D. Karrenberg, G. J. de Groot, and E. Lear, Address Allocation for Private Internets, IETF Network Working Group, RFC 1918, Naval Research Laboratory, February 1996.
[2] H. Burch, B. Cheswick, Tracing Anonymous Packets to Their Approximate Source, Proc. USENIX LISA '00, December 2000.
[3] S. Bellovin, ICMP Traceback, Message to the IETF ICMP Traceback WG, http://www.research.att.com/~smb
[4] S. Savage, D. Wetherall, A. Karlin, and T. Anderson, Technical Report UW-CSE-00-02-01, Practical Network Support for IP Traceback, Proc. ACM SIGCOMM '00, August 2000.
[5] D. Song and A. Perrig, Advanced and Authenticated Marking Schemes for IP Traceback, Proc. IEEE INFOCOM 2001, April 2001.
[6] Computer Emergency Response Team, CERT Advisory CA-2000-01: Denial of Service Developments, http://www.cert.org/advisories/CA-2000-01.html, 2000.
[7] Microsoft Corporation, Stop 0A in Tcpip.sys When Receiving Out Of Band (OOB) Data, http://support.microsoft.com/support/kb/articles/Q143/4/78.asp
[8] G. Sager, Security Fun with OCxmon and cflowd, Presentation at the Internet 2 Working Group, November 1998. http://www.caida.org/projects/NGI/content/security/1198.
[9] R. Stone, CenterTrack: An IP Overlay Network for Tracking DoS Floods, Proc. of 9th USENIX Security Symposium, August 2000.
[10] D. Schnackenberg, K. Djahandari, and D. Sterne, Infrastructure for Intrusion Detection and Response, Proc.
January 2000.
[11] Alex C. Snoeren, Craig Partridge, Luis A. Sanchez, W. Timothy Strayer, Christine E. Jones, Fabrice Tchakountio, and Stephen T. Kent, Hash-Based IP Traceback, BBN Technical Memo 1284, February 12, 2001.
[12] B. H. Bloom, Space/time trade-offs in hash coding with allowable errors, Communications of the ACM, Vol. 13, No. 7, July 1970, pp. 422-426.
[13] D. Mills, Network Time Protocol Version 3 Specification, Implementation and Analysis, RFC 1305, UDEL, March 1992.
[14] J. Stone, M. Greenwald, C. Partridge, and J. Hughes, Performance of Checksums and CRCs over Real Data, IEEE/ACM Trans. on Networking, Vol. 6, No. 5, October 1998, pp. 529-543.
[15] L. Fan, P. Cao, J. Almeida, and A. Broder, Summary cache: a scalable wide-area web cache sharing protocol, IEEE/ACM Trans. on Networking, Vol. 8, No. 3, June 2000.