Performance Study of the GSM Circuit-Switched Data Channel

Research Project
Submitted to the Department of Electrical Engineering and Computer Sciences, University of California at Berkeley, in partial satisfaction of the requirements for the degree of Master of Science, Plan II. Approval for the Report and Comprehensive Examination:
Committee:
Professor Anthony D. Joseph, Research Advisor, 16 December 1999
Professor Steven McCanne, Second Reader, 16 December 1999
1 Abstract
Wireless communication is the fastest growing area in communications today. Many studies are being performed to understand and improve performance over wireless connections, a high bit error rate and lossy environment. One of the Iceberg [7] project's research areas focuses on the performance of multi-layered protocols in a wireless environment. In particular, we are studying the interactions between the Transmission Control Protocol (TCP), a reliable end-to-end transport layer protocol, and the Radio Link Protocol (RLP) [4], [5], a reliable link layer protocol for the wireless connection in the GSM (Global System for Mobile communications) network [2]. Each protocol has its own error recovery mechanisms, and we believe that by studying the interactions of these protocols we can improve the performance of the wireless GSM system. We have developed a multi-layer tracing tool to analyze the protocol interactions between the layers. Initially, we hypothesized that delay introduced by RLP would unnecessarily trigger TCP's congestion control algorithm, thus degrading performance. However, our studies show this to be false [2]. Using our multi-layer analysis tool, we have identified some of the causes that degrade performance: (1) inefficient interaction with TCP/IP header compression, and (2) excessive queuing caused by overbuffered links. Furthermore, in a medium where the wireless error rate is low, it is natural to think that a large frame size would increase performance. However, if the wireless medium has a high error rate, which causes numerous retransmissions, performance decreases as the frame size increases. In this master's report, we present an analysis of how the frame size in the RLP layer affects performance and determine the optimal frame size.
(e.g., stationary indoors, walking, and driving). By analyzing these traces, we have drawn conclusions about the interaction between the protocols. We were interested in determining whether there are many inefficient interactions between TCP and RLP during bulk data transfer operations. Our analysis shows that competing error recovery is not a problem; however, we have discovered other issues that degrade the overall performance of the data transfer. Furthermore, in this work we show that the throughput on the GSM wireless channel can be improved by using a larger RLP frame size. We show that we can increase performance by 25 percent when the channel quality is good, and by 18 percent when the channel quality is bad. Thus, we determine the optimal fixed frame size in terms of throughput.
3 Background
In this section, we first provide background on the radio interface in GSM. We describe the Radio Link Protocol (RLP) used by GSM in the radio interface to provide reliability. During a data call, RLP runs between the Terminal Adaptation Function (TAF) of the mobile host and the Interworking Function (IWF) of the Mobile Switching Center [1]. We conclude the section by providing an overview of the functionality of the Transmission Control Protocol (TCP).
GSM implements several error control mechanisms, including adaptive power control, frequency hopping, Forward Error Correction (FEC), and interleaving. FEC (channel coding) in GSM is performed using a convolutional code, which adds redundancy bits to the data to detect and correct errors. Besides FEC, GSM performs interleaving to combat error burstiness. Instead of transmitting an RLP frame as a whole, the frame is divided into small blocks, which are then interleaved over 22 time slots for transmission. A time slot occurs every 5 milliseconds and carries 114 bits, producing a gross data rate of 22.8 kbits/sec. The encoded frame carries 456 bits (sent in 4 time slots), while the unencoded frame carries 240 bits, resulting in a data rate of 240 bits every 20 ms (12 kbits/sec). Out of the 240 bits, 48 bits are RLP overhead and 192 bits are RLP user data, yielding a user data rate of 9.6 kbits/sec.
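The rate arithmetic above can be checked directly. The sketch below only restates the numbers given in the text (114 bits per 5 ms slot, a 240-bit RLP frame encoded to 456 bits, 48 bits of RLP overhead); the variable names are ours.

```python
# Rate arithmetic for the GSM full-rate data channel, using the
# figures quoted in the text.

SLOT_INTERVAL_S = 0.005       # one time slot every 5 ms
BITS_PER_SLOT = 114           # bits carried per time slot

gross_rate = BITS_PER_SLOT / SLOT_INTERVAL_S          # 22800.0 bits/s
print(gross_rate)

FRAME_BITS = 240              # unencoded RLP frame (encoded to 456 bits)
FRAME_INTERVAL_S = 0.020      # one RLP frame every 20 ms
RLP_HEADER_BITS = 48          # per-frame RLP overhead

coded_rate = FRAME_BITS / FRAME_INTERVAL_S            # 12000.0 bits/s
user_rate = (FRAME_BITS - RLP_HEADER_BITS) / FRAME_INTERVAL_S  # 9600.0 bits/s
print(coded_rate, user_rate)
```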
There are two error recovery mechanisms in RLP: (1) Selective Reject (SREJ) and (2) checkpointing. In Figure 1 we illustrate the SREJ mechanism. SREJ is initiated by the receiver, which explicitly requests retransmission of missing I-frames. Whenever an I-frame is received out of order, the receiver sends an S-frame with SREJ in the control field and N(R) equal to the sequence number of the missing frame. The sender is not required to "roll back"; instead, it retransmits the I-frame with sequence number equal to the N(R) sent by the receiver. The sender then returns to its state before the retransmission and continues to transmit frames. It should be noted that every SREJ frame is protected by a single retransmission timer, and the sender is limited to retransmitting the same frame no more than N2 times.
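The receiver-side SREJ rule described above can be sketched as follows. This is a hedged illustration, not RLP code: the function name and the simplified bookkeeping (one SREJ per missing frame, no timers) are ours.

```python
# Minimal model of the SREJ rule: on an out-of-sequence I-frame,
# request exactly the first missing frame (no Go-Back-N).

def srej_receiver(arrivals):
    """Given the order in which I-frame sequence numbers arrive,
    return the N(R) values the receiver would send in SREJ S-frames."""
    expected = 1          # next in-sequence frame the receiver expects
    received = set()
    srejs = []
    for seq in arrivals:
        received.add(seq)
        if seq > expected and expected not in srejs:
            srejs.append(expected)   # S-frame with SREJ, N(R) = missing seq
        while expected in received:  # advance past in-sequence frames
            expected += 1
    return srejs

# Frame 3 is delayed; the receiver asks for exactly frame 3 once.
print(srej_receiver([1, 2, 4, 3, 5]))  # [3]
```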
Figure 1: SREJ illustration.

Figure 2 exemplifies the checkpointing mechanism. Checkpointing is a phase initiated by the sender whenever an acknowledgment times out. The sender sends an S-frame, sets the poll bit of the control field to one, and waits for the receiver's response. The receiver responds with an S-frame indicating the receive sequence number (N(R)) of the frame that it is expecting. Finally, the sender retransmits all the frames in sequence, starting from N(R) (Go-Back-N).
Figure 2: Checkpointing illustration.
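The Go-Back-N step of checkpointing is simple enough to state as code. A hedged sketch, with illustrative names; it captures only the retransmission set, not the poll/final handshake.

```python
# After a poll, the receiver reports N(R), and the sender goes back
# and retransmits every frame from N(R) up to the highest frame sent.

def checkpoint_retransmit(highest_sent, receiver_nr):
    """Sequence numbers the sender retransmits after checkpointing,
    given the highest frame sent so far and the receiver's N(R)."""
    return list(range(receiver_nr, highest_sent + 1))

# As in Figure 2: frames 1..5 were sent, the receiver still expects 4.
print(checkpoint_retransmit(5, 4))  # [4, 5]
```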
4 Methodology
Our study is based upon measurement-based trace analysis. This section describes the testbed and tools we used to collect traces. Then, we explain the different environments in which we performed measurements and how we collected block erasure traces. We conclude with an explanation of the target metrics that we used to perform our analysis.

Figure 3: TCP congestion control. (The receiver assumes that a segment has been lost whenever it receives a segment out of sequence.)
In the future we will use a testbed which is being developed in the ICEBERG project [7]. The ICEBERG testbed will have a stand-alone GSM base station together with a gateway that "translates" between circuit-switched and IP-based packet-switched voice and data traffic. For this purpose, we are currently implementing the network side of RLP, which terminates RLP in the gateway. For this work, we used the commercial GSM network, for which the network side of RLP was not accessible.
[Figure 4 diagram: a traffic source/sink (e.g., sock) over TCP/IP/PPP and RLP, across the GSM network and base station, to a traffic source/sink on the fixed side; instrumented with tcpdump, tcpstats, rlpdump, and the BONES simulator.]
Figure 4: Measurement platform and tools.

In order to study performance in terms of throughput, we performed bulk data transmissions using the sock [3] program at the application layer. Going end-to-end, we had TCP running at the transport layer, IP at the network layer, and PPP (Point-to-Point Protocol) [11] at the data link layer. We used several diagnostic tools to collect information at the TCP and RLP layers. To trace information at the TCP sender and receiver, we used tcpdump [8] and tcpstats [9]. tcpdump monitors a host's interface I/O buffers and generates a single log file containing the packet header and a timestamp specifying when each packet was placed in the sender's buffer. From the data generated by tcpdump, we can graph time/sequence plots. We used tcpstats to
generate information at the sender TCP layer, such as the congestion window, the slow start threshold, and the retransmission timeout value. To collect and correlate TCP and RLP measurements, we ported the RLP protocol of a data PC-Card DC23 to BSDi3.0 UNIX. We also instrumented the RLP code to log connection-related information in the fashion of tcpdump/tcpstats. Thus, rlpdump logs time/sequence information and also events, like SREJs, retransmissions, flow control signals (XON/XOFF), and RLP link resets.
We collected traces at both the TCP layer and the RLP layer. The traces collected at the RLP layer provide information down to the level of whether an FEC (Forward Error Correction) encoded radio block was decoded successfully or had to be retransmitted. Our block erasure traces consist of binary time series where each element represents the state of an RLP frame. A corrupted block (unsuccessfully decoded) has the value "1", while a non-corrupted block (successfully decoded) has the value "0". Our analysis is based on 500 minutes of "air-time" traces: 258 minutes of stationary good, 215 minutes of stationary bad, and 44 minutes of mobile. We used three different block erasure traces for our analysis. One, which we call trace-A, is a concatenation of all block erasure traces collected in environment A. Likewise, trace-B and trace-C are the concatenations of the block erasure traces collected in environments B and C, respectively.
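A block erasure trace of the kind described above is just a 0/1 time series, so basic channel statistics fall out directly. The sketch below is illustrative; the function name and the choice of statistics (error rate, longest burst) are ours, not from the tracing tools.

```python
# A block erasure trace: "1" = corrupted (undecodable) block,
# "0" = successfully decoded block.

def erasure_stats(trace):
    """Return the block error rate and the longest error burst."""
    error_rate = sum(trace) / len(trace)
    longest = run = 0
    for bit in trace:
        run = run + 1 if bit else 0   # extend or reset the current burst
        longest = max(longest, run)
    return error_rate, longest

trace = [0, 0, 1, 1, 1, 0, 0, 0, 1, 0]
print(erasure_stats(trace))  # (0.4, 3): 40% erasures, longest burst of 3
```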
[Figure 5 plot: bytes (10000-30000) vs. time of day (sec), showing TcpSnd_data, TcpSnd_ack, the RTT, the bytes unacked, the MSS as the difference between two "dots", and a fast retransmit on the 3rd DUPACK.]
Figure 5: A TCP sender-side time/sequence plot. TcpSnd-cwnd: size of the congestion window each time it changes value. TcpSnd-srtt: value of the smoothed round trip time (SRTT). TcpSnd-vrtt: value of the round trip time variance (VRTT). TcpSnd-sstrsh: value of the slow-start threshold. At the receiver TCP layer: TcpRcv-data: time and sequence number at which segments are received. TcpRcv-ack: time and sequence number at which acknowledgments are sent.
At the sender RLP layer: RlpSnd-data: time and sequence number at which frames are sent. RlpSnd-ack: time and sequence number at which acknowledgments are received. RlpSnd-data-rexmt: time and sequence number at which frames are retransmitted. RlpSnd-poll-snd: time at which the sender sends a supervisory frame with the poll bit set to "1" (this corresponds to the checkpointing mechanism of Section 3.1.1). RlpSnd-rpoll-rcv: time at which the sender receives a supervisory frame with the xxx bit set to "1", responding to the checkpointing mechanism. RlpSnd-rst: time at which RLP resets the link. RlpSnd-srej-rcv: time at which the sender receives a selective reject (SREJ) from the receiver. RlpSnd-xoff: time at which the control message XOFF arrives at the sender, asking the sender to stop sending. RlpSnd-xon: time at which the control message XON arrives at the sender, asking the sender to continue sending. By using a plotting tool (e.g., xgraph), we can graphically plot the files we want.
6 Measurement Results
In this section, we present the results obtained by using MultiTracer [2] on the collected data. We were first interested in finding spurious timeouts and identifying their causes. Spurious timeouts would point to an RLP inefficiency and indicate a poor TCP/RLP interaction. However, we show in Section 6.1 that this is not the case. By doing a detailed analysis of the plots generated by MultiTracer, we identified other interesting events. In Section 6.2, we explain the problem of local buffer overflow. Section 6.3 provides a detailed analysis of the impact of link resets on TCP, which is a cause of poor protocol interaction. The last section, Section 6.4, explains the XON/XOFF effect found in RLP.
[Figure 6 plot: histogram of TCP channel utilization (92%, 93%, 95%, 99%, 100%) with corresponding throughputs of 5.7-6.2, 5.8-6.2, 6.4-6.6, and 7.0-8.8 Kb/s.]
Figure 6: TCP channel utilization.

The default interface buffer size in BSD-derived UNIX [3] is 50 packets. Obviously, this is an inappropriate size for a mobile device, which usually does not have a large number of simultaneous connections. We purposefully compiled a kernel with an interface buffer that was smaller than 8 KBytes, the default socket buffer size used by BSDi3.0, to provoke a local packet drop as shown in Figure 7. This triggers the "tcp-quench" (source quench [3]) function call to the TCP sender, which in response resets the congestion window back to one. After about one half of the current RTT, the sender can again send additional segments until the "dupacks" for the dropped packet trigger the fast retransmit algorithm (see Section 3.2). This leads to setting the congestion window to one half of its value before the local drop occurred. At this point, the sender has reached the advertised window and cannot send any additional segments (which it could have otherwise) while further "dupacks" return. Thus, when the retransmission is acknowledged, a burst of half the size of the sender-side socket buffer (8 segments) is sent out by the TCP sender at once. As can be seen from the TCP receiver trace in Figure 7, excessive queuing, the ups and downs of the congestion window at the TCP sender, and even retransmissions do not degrade throughput performance. But excessive queuing has a number of other negative effects:
[Figure 7 plot: bytes (5000-45000) vs. time, showing TcpSnd_ack, TcpSnd_cwnd, the 8 KByte socket buffer, a fast retransmit on the 3rd DUPACK, and a packet dropped due to interface buffer overflow.]
Figure 7: Local buffer overflow.

- It inflates the RTT. In fact, a second TCP connection established over the same link is likely to suffer from a timeout on the initial connect request. This timeout occurs because it takes longer to drain the pipe queue (here up to 14 x MTU, or 7 KBytes) on a 960 bytes/s link than the commonly used initial setting for TCP's retransmission timer (6 seconds).
- If the timestamp option is not used, the RTT sampling rate is reduced, leading to an inaccurate retransmission timer value [14]. An inflated RTT inevitably leads to an inflated retransmission timer value, which can have a significant negative impact on TCP's performance, e.g., in the case of multiple losses of the same packet. The negative impact results from the exponential back-off of the retransmission timer and can be seen in Figure 9.
- For downlink transmissions (e.g., web browsing), where no appropriate limit is imposed on the outbound interface buffer of the bottleneck router, the data in the pipe queue may become obsolete (e.g., when a user aborts the download of a web page in favor of another one). The "stale data" must first drain from the queue, which, in the case of a low-bandwidth link, may take on the order of several seconds.
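The drain-time argument above is a one-line calculation; the sketch below just replays the numbers given in the text (14 x MTU = 7 KBytes queued on a 960 bytes/s link against a 6-second initial retransmission timer).

```python
# Back-of-the-envelope check: the queued data takes longer to drain
# than TCP's common 6-second initial retransmission timeout.

MTU = 512                   # bytes (14 x MTU = 7 KBytes, as in the text)
QUEUE_BYTES = 14 * MTU      # 7168 bytes queued in the pipe
LINK_RATE = 960             # bytes/s user data rate of the GSM channel
INITIAL_RTO = 6.0           # seconds, common initial retransmission timer

drain_time = QUEUE_BYTES / LINK_RATE   # ~7.47 s
print(drain_time > INITIAL_RTO)        # True: the connect request times out
```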
A simple solution to these problems is to statically adjust the interface buffer size to the order of the interface's bit rate. A more advanced solution is to deploy active queue management [15] at both sides of the bottleneck link. The goal is to adapt the buffer size available for queueing to the bit rate of the interface, a given worst-case RTT, and the number of connections actively sharing the link. Combining active queue management with an explicit congestion notification mechanism [16] would further improve network performance, as fewer packets would have to be dropped and retransmitted (in the case of TCP). In fact, we regard it as imperative that these mechanisms be implemented at both ends of wide-area wireless links, which we believe will be the bottleneck in a future Internet.
[Figure 8 plot: bytes (398000-416000) vs. time of day (480-520 sec), showing TcpRcv_data, TcpRcv_ack, an RlpSnd_rst, 18 segments in flight, 5 segments lost due to the RLP reset, and 13 segments dropped at the TCP receiver.]
Figure 8: Impact of link reset on TCP layer (example 1).

Acknowledgments get lost, including the one for the first retransmission, again due to an RLP link reset. This loss leads to an exponential back-off of the retransmission timer. Since the retransmission timer value is significantly inflated (see Section 6.2), this has a particularly bad effect. We want to point out, though, that RLP link resets are very rare events. We captured 14 resets, all of which occurred when the receiver signal strength was extremely low. In all cases, the link reset was triggered because a particular RLP frame had to be retransmitted more than 6 times (the default value of the RLP parameter N2, "maximum number of retransmissions"). Our results suggest that this default value is too low and needs to be increased. TCP connections before and after the link reset usually progress without problems, and there is no apparent reason why the link should be reset. Increasing N2 is also supported by the fact that we did not find any sign of competing error recovery between TCP and RLP during bulk data transfers (see Section 6.4). Initial results indicate that TCP can tolerate a fairly high N2 without causing competing error recovery. This initial result and the negative interactions with header compression suggest that link layer retransmissions should be more persistent when transmitting fully reliable flows, e.g., TCP-based flows. This not only pertains to RLP [4], [5], but also to comparable protocols which are intentionally designed to operate in a semi-reliable mode [17]. Recent studies of TCP over WLAN
(Wireless Local Area Network) links report similar results [18]. On the other hand, persistent link layer retransmissions are not tolerable for delay-sensitive flows.
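The N2 reset rule discussed above reduces to a simple count. A hedged sketch, with an illustrative function name; it models only the "more than N2 retransmissions triggers a reset" condition.

```python
# RLP resets the link when the same frame has been retransmitted
# more than N2 times (default N2 = 6 in the traces we analyzed).

def frame_survives(n2, attempts_needed):
    """True if a frame needing `attempts_needed` total transmissions
    gets through before exceeding N2 retransmissions (first send is
    not counted as a retransmission)."""
    return attempts_needed - 1 <= n2

print(frame_survives(6, 7))   # True: exactly 6 retransmissions, no reset
print(frame_survives(6, 8))   # False: a 7th retransmission forces a reset
```

Raising N2, as the text argues, simply widens the range of `attempts_needed` that a deep fade can consume before the link is reset.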
[Figure 9 plot: bytes (57000-72000) vs. time, showing TcpSnd_data, TcpSnd_ack, TcpRcv_data, TcpRcv_ack, an RlpSnd_rst, a 1st and 2nd retransmission, and the exponential back-off of the retransmission timer (1st RTO: 7 s, 2nd RTO: 14 s).]
Figure 10 shows a burst of retransmissions on the RLP layer of 1325 ms leading to a "hole" of 2260 ms at the TCP receiver. One reason for the difference between these values is that the end of a segment could have been affected by the retransmissions, which would require a full round-trip time on the RLP layer (about 400 ms, see [6]). It cannot be the case that the returning acknowledgments were delayed in addition to the segment, as the plot shows no sign of acknowledgment compression [21]. We were curious to understand why [13] did find spurious timeouts in their study, which used almost the same network setup as ours. The authors of that study believed that these spurious timeouts were caused by excessive RLP retransmissions (i.e., by competing error recovery between TCP and RLP). While it appears as if our results contradict the results of [13], our in-progress work indicates that this is not the case. The reason apparently lies in differences between the implementations of TCP that were used in the two studies. Some implementations of TCP seem to maintain a more aggressive retransmission timer than others. Moreover, the TCP implementation we used (BSDi 3.0) uses the timestamp option [14], yielding a more accurate estimation of the RTT and consequently also a more accurate retransmission timer. Timing every segment instead of only one segment per RTT (which is done when the timestamp option is not used) enables a TCP sender to more quickly adapt the retransmission timer to sudden delay increases. Thus, we believe that timing every segment is an attractive enhancement for TCP in a wireless environment. However, we are not convinced that this requires the overhead of 12 bytes for the timestamp option field in the TCP header.
[Figure 10 plot: bytes (5000-30000) vs. time, showing TcpSnd_data and TcpRcv_data, the 8 KByte socket buffer, and the 1325 ms burst of RLP retransmissions.]
Figure 10: TCP spurious timeout.

This is, however, an unacceptable alternative. Not only will the user in many cases have to re-initiate the data transfer (e.g., a file transfer), but he or she will also be charged for air time that yielded an unsuccessful transmission.
[Figure 11 plot: bytes (12000-27000) vs. time, showing RlpSnd_data progressing at 1198 bytes/sec and an RlpSnd_XOFF event.]
Figure 11: RLP/L2R flow control.

Our retrace analysis computes the throughput for different frame sizes (from 30 to 1500 bytes) for a given block erasure trace. The optimal frame size corresponds to the frame size which yields the maximum throughput value. We used trace-C and trace-A (see Section 4.3) to perform the retrace analysis. We also wanted to understand the impact of error burstiness. For this purpose, we generated artificial traces with evenly distributed errors. Trace-C-even is an artificially generated trace with the same error rate as trace-C, but with the errors evenly distributed across the trace. These artificial traces show that error burstiness allows for larger frame sizes. Figure 12 shows the results of our retrace analysis on trace-A, trace-C, and trace-C-even. For trace-A the optimal frame size is 1410 bytes, yielding a 25 percent increase in throughput. For trace-C the optimal frame size is 210 bytes, with an 18 percent increase in throughput. From this, we concluded that the frame size chosen for RLP was too conservative. Even in the worst-case scenario (trace-C), performance can be improved by 18 percent. It is interesting to compare the plot for trace-C to the plot for trace-C-even. Trace-C-even yields an optimal frame size of 60 bytes. These results show that error burstiness
in the channel allows for larger frame sizes. Assuming that errors in the channel are evenly distributed leads to the wrong frame size choice. Given that the frame size chosen for the next generation of GSM is 60 bytes, this decision raises the interesting question of what error model was used to make it.
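The retrace idea above can be sketched as a replay of a block erasure trace. This is a simplified model under stated assumptions, not the actual analysis tool: we assume a frame spanning several radio blocks is wasted whenever any of its blocks is erased (rather than replaying explicit retransmissions), that each block carries a fixed 24-byte payload (192 bits) every 20 ms, and a 6-byte per-frame header; all names are ours.

```python
# Replay a 0/1 block erasure trace for a given RLP frame size and
# estimate user throughput: larger frames amortize the header better,
# but any erased block wastes the whole frame.

BLOCK_PAYLOAD = 24          # user bytes per radio block (192 bits)
BLOCK_TIME = 0.02           # seconds per block (one block every 20 ms)
HEADER = 6                  # assumed per-frame overhead in bytes

def retrace_throughput(trace, frame_bytes):
    """User bytes/s delivered when frames of `frame_bytes` payload are
    sent over the given block erasure trace (1 = erased block)."""
    blocks_per_frame = -(-(frame_bytes + HEADER) // BLOCK_PAYLOAD)  # ceil
    delivered, i = 0, 0
    while i + blocks_per_frame <= len(trace):
        chunk = trace[i:i + blocks_per_frame]
        i += blocks_per_frame
        if not any(chunk):              # every block decoded -> frame ok
            delivered += frame_bytes
    return delivered / (i * BLOCK_TIME)

# On an error-free channel, a larger frame wins (less relative header).
clean = [0] * 10000
print(retrace_throughput(clean, 240) > retrace_throughput(clean, 30))  # True

# With frequent erasures, a smaller frame wins (less wasted on each loss).
noisy = [0, 0, 0, 0, 1] * 2000
print(retrace_throughput(noisy, 30) > retrace_throughput(noisy, 240))  # True
```

This reproduces the qualitative result of the section: the optimal frame size grows as the channel improves, and bursty (rather than evenly spread) errors shift the optimum toward larger frames.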
[Figure 12 plot: throughput (bytes/s, 0-1600) vs. RLP frame size (0-1500 bytes) for trace_A, trace_C, and trace_C_even. For trace_A the optimal frame size is 1410 bytes (throughput ~1423 bytes/s, a 25% improvement); for trace_C it is 210 bytes (throughput ~1295 bytes/s, an 18% improvement).]
8 Conclusion
In this project we have presented an analysis tool, MultiTracer, which allows protocol developers to study the complex interactions between TCP and RLP. We have used MultiTracer to analyze 500 minutes of collected traces and drawn conclusions on the efficiency of the interaction. The main result of the study of the two protocols is that competing error recovery is a rare event. We instead found poor protocol interaction whenever RLP link resets occur and TCP uses VJ header compression. We also demonstrated the negative impact of overbuffered links. The second part of this project focused on optimizing the RLP frame size to increase throughput. We showed that throughput can be improved by up to 25 percent by increasing the RLP frame size. We also argue that the errors on the GSM channel are bursty in nature, which allows larger frame sizes.
9 Acknowledgments
I would like to thank Prof. Anthony Joseph and Reiner Ludwig for their support and valuable contributions to the work presented in this thesis. Special thanks to Keith Sklower for all his help and time. Thanks to Kimberly Oden for her help in developing MultiTracer. I also would like to thank Bela Rathonyi for implementing and instrumenting the RLP code for BSDi. And many thanks to all the members of the ICEBERG group.
References
[1] M. Mouly, M. B. Pautet, The GSM System for Mobile Communications, Cell & Sys, France, 1992.
[2] R. Ludwig, B. Rathonyi, A. Konrad, K. Oden, A. Joseph, Multi-Layer Tracing of TCP over a Reliable Wireless Link, In Proceedings of ACM SIGMETRICS 99.
[3] W. R. Stevens, TCP/IP Illustrated, Volume 1 (The Protocols), Addison-Wesley, November 1994.
[4] ETSI, Radio Link Protocol for data and telematic services on the Mobile Station - Base Station System (MS-BSS) interface and the Base Station System - Mobile Switching Center (BSS-MSC) interface, GSM Specification 04.22, Version 5.0.0, December 1995.
[5] ETSI, Digital cellular communications system (Phase 2+); Radio Link Protocol for data and telematic services on the Mobile Station - Base Station System (MS-BSS) interface and the Base Station System - Mobile Switching Center (BSS-MSC) interface, GSM Specification 04.22, Version 6.1.0, November 1998.
[6] R. Ludwig, B. Rathonyi, Link Layer Enhancements for TCP/IP over GSM, In Proceedings of IEEE INFOCOM 99.
[7] The ICEBERG project, CS Division, EECS Department, University of California at Berkeley, http://iceberg.cs.berkeley.edu/.
[8] V. Jacobson, C. Leres, S. McCanne, tcpdump. Available at http://ee.lbl.gov/.
[9] V. Padmanabhan, tcpstats, Appendix A of Ph.D. dissertation, University of California, Berkeley, September 1998.
[10] Xgraph, available at http://jean-luc.ncsa.uiuc.edu/Codes/xgraph/index.html.
[11] W. Simpson, The Point-to-Point Protocol, RFC 1661, July 1994.
[12] V. Jacobson, Compressing TCP/IP Headers for Low-Speed Serial Links, RFC 1144, February 1990.
[13] M. Kojo, K. Raatikainen, M. Liljeberg, J. Kiiskinen, T. Alanko, An Efficient Transport Service for Slow Wireless Telephone Links, IEEE JSAC, Vol. 15, No. 7, pp. 1337-1348, September 1997.
[14] R. C. Durst, G. J. Miller, E. J. Travis, TCP Extensions for Space Communications, In Proceedings of ACM MOBICOM 96.
[15] B. Braden, et al., Recommendations on Queue Management and Congestion Avoidance in the Internet, RFC 2309, April 1998.
[16] K. K. Ramakrishnan, S. Floyd, A Proposal to add Explicit Congestion Notification (ECN) to IP, RFC 2481, January 1999.
[17] P. Karn, The Qualcomm CDMA Digital Cellular System, In Proceedings of the USENIX Mobile and Location-Independent Computing Symposium, USENIX Association, August 1993.
[18] D. A. Eckhardt, P. Steenkiste, Improving Wireless LAN Performance via Adaptive Local Error Control, In Proceedings of IEEE ICNP 98.
[19] H. Balakrishnan, V. Padmanabhan, S. Seshan, R. H. Katz, A Comparison of Mechanisms for Improving TCP Performance over Wireless Links, In Proceedings of ACM SIGCOMM 96.
[20] A. DeSimone, M. C. Chuah, O.-C. Yue, Throughput Performance of Transport-Layer Protocols over Wireless LANs, In Proceedings of IEEE GLOBECOM 93.
[21] V. Paxson, End-To-End Routing Behavior in the Internet, IEEE/ACM Transactions on Networking, Vol. 5, No. 5, pp. 601-615, October 1997.
[22] R. Ludwig, A. Konrad, A. Joseph, Optimizing the End-To-End Performance of Reliable Flows over Wireless Links, In Proceedings of ACM/IEEE MobiCom 99.