How To Enable Better Service Assurance Using The PCRF: Kerstin Bengtsson
KERSTIN
BENGTSSON
Master's Thesis in Computer Science (20 credits) at the School of Computer Science and Engineering, Royal Institute of Technology, 2006. Supervisor at CSC: Henrik Eriksson. Examiner: Stefan Arnborg. TRITA-CSC-E 2006:154, ISRN-KTH/CSC/E--06/154--SE, ISSN-1653-5715
Royal Institute of Technology School of Computer Science and Communication KTH CSC SE-100 44 Stockholm, Sweden URL: www.csc.kth.se
Preface
This thesis is the result of a master's project at Nada, Royal Institute of Technology, carried out at Ericsson Research. The supervisor at Nada has been Henrik Eriksson. I would like to thank the people of department KI/EAB/TGF/F at Ericsson Research for their support and involvement throughout the project: Niklas Björk, Mona Matti, Tor Kvernvik, Tony Larsson and Mattias Lidström. Special thanks go to my supervisor at Ericsson, Mattias Lidström, who was always willing to answer my questions. I also want to thank my family and Ken for their support during this period.
Table of Contents
1 Introduction
    1.1 Assignment and objectives
2 Service Assurance
    2.1 Advantages with Implementing Service Assurance
    2.2 Quality of Service
        2.2.1 Key Performance Indicators
        2.2.2 Classes
        2.2.3 Service Level Agreement
4 Network
    4.1 The 3GPP
    4.2 Radio Access Network
    4.3 Core Network
    4.4 PDP Context
    4.5 Policy and Charging Control
        4.5.1 Release 6
        4.5.2 Release 7
    4.6 The Diameter Protocol
        4.6.1 Diameter Header
        4.6.2 Attribute Value Pairs
5 Prestudy
    5.1 Method
    5.2 The Flow of Information through the PCRF
        5.2.1 Use Case: IMS Voice Telephony
            5.2.1.1 Description
            5.2.1.2 Actors
            5.2.1.3 Assumptions
            5.2.1.4 Sequence diagram
            5.2.1.5 KPIs
            5.2.1.6 Loss of Bearer scenario
        5.2.2 The General IMS Case
        5.2.3 User Location AVP
6 The Prototype
    6.1 The Traffic generator
    6.2 The PCRF
    6.3 Service Tracing
        6.3.1 The implementation
    6.4 Extraction of data to an XML file
7 Conclusions
Literature
Appendix A: Messages over Rx
Appendix B: Messages over Gx
Appendix C: UML of Service Tracing
Appendix D: Abbreviations
1 Introduction
Do you get annoyed when you cannot reach your friend on your mobile phone? And what about that time when you sent a message to your colleague that you were running late and it turned out that she never got it? Today your mobile phone can handle more services than just voice telephony. A lot of effort is put into introducing services such as video calls, mobile TV, streaming and Push to talk. As a user you want to be certain that a service you pay for will work properly, whether because you want to be sure of seeing that particular episode of your favourite programme or because you want to be certain that you can reach your business partner at any time.

As an operator, there are a number of advantages to measuring performance so that a certain quality of the services can be guaranteed. One of these advantages is the possibility to prioritize traffic flows of customers that generate high revenue streams.

More and more people are starting to use IP telephony according to Computer Sweden [45], but at the same time the sound quality is steadily degrading. The share of calls where the quality is unacceptable has increased from 15 to 20 percent in only one year. Still, 20 million people use the Internet for telephony today. Considering the fast development in this area it is highly likely that the number of users will increase significantly. As I hope you have now realized, service assurance is a highly interesting area that will most likely only become more important. Additional benefits are discussed in chapter 2.1.
2 Service Assurance
One of the main challenges today is the convergence of a large number of networks, mobile as well as fixed, into one single packet based IP network [18] [35]. The network of today is optimized for one service, namely voice telephony and the network is circuit switched. With the new focus on delivering a number of different services to an end-user, the network will become a multiservice packet switched network [40]. Service assurance (SA) will be an important factor, where the term is defined as a guarantee that a specific service will be quantitatively and qualitatively provided as expected by the end user.
Jitter (variations in delay) [s]
Jitter refers to the variation in the delay of successive packets. Jitter can seriously affect streaming audio and video, causing jerky motion or even loss of video. A jitter buffer can be used to smooth the transmission. The design rule in Cisco's QoS in IP-Networks [32] states that the jitter should be no more than 30 ms.
In case of congestion it is easy to believe that simply increasing the bandwidth will solve the problem. This is however not the case. The protocols that caused the congestion might merely use up the added bandwidth, causing the same congestion problems as before. It is also impossible to foresee what demands might be placed upon bandwidth in the future.

The best known IP network today is of course the Internet, where traffic is carried on a best-effort basis. This means that there is no guarantee of when the network will deliver the packets, nor that they will be delivered at all. Packets can be dropped on the way, for example in case of congestion.

Different services may require different quality. File Transfer Protocol (FTP), for example, is not that sensitive to network delay or to bandwidth limitations. Of course, the download will take longer but the end result will not be affected. This can be compared with services such as voice and video, which are very sensitive to network delay and to variations in delay. It is almost impossible to keep up a normal conversation over the phone if the voice packets are delayed more than 150-200 ms. The dialogue will sound choppy and distorted. In these cases QoS can be used to provide assured services.

There is also the case of different users needing different quality. Corporate customers in the banking or publishing industry cannot tolerate outages and failures in transmission [4]. They are obviously willing to pay more to be assured of good quality. Other businesses might not be as sensitive and prefer to pay a lower price. These corporate customers should set up an agreement with the service provider, a Service Level Agreement (SLA), defining what quality the customer pays for. This concept is further described in chapter 2.2.3. A private customer, on the other hand, is often cost conscious and satisfied with somewhat poorer quality.
2.2.2 Classes
In order to be able to assure a certain quality of service, the traffic of the network can be divided into classes. In the so-called Olympic Service, packets are assigned to three service classes: gold, silver and bronze. In the network the gold class should then experience a lighter load than silver, and silver accordingly a lighter load than bronze. In other words, the packets belonging to the best class have a greater probability of being delivered on time [42] [21] [31]. If the link is congested, the packets with gold service will get the largest part of the link. If there are no gold or silver packets, packets assigned to bronze can use up the whole link. A possible scenario would be an end-user putting voice traffic in the gold class and less critical services, e.g. FTP, in the bronze class. This also enables the operator to bill the user according to the chosen service class.
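The sharing behaviour described above can be illustrated with a small sketch. It assumes a weighted, work-conserving split of the link; the weights 3, 2 and 1 and the function name `share_link` are hypothetical, since the text only states that gold is favoured over silver and silver over bronze.

```python
# Hypothetical class weights: only the ordering gold > silver > bronze
# comes from the text, the numbers are made up for illustration.
WEIGHTS = {"gold": 3, "silver": 2, "bronze": 1}


def share_link(capacity, demand):
    """Split link capacity between classes in proportion to their weights.

    Capacity that a class cannot use is handed on to the classes that
    still have unmet demand, so the link is never left idle on purpose.
    """
    alloc = {cls: 0.0 for cls in demand}
    unmet = {cls: float(d) for cls, d in demand.items() if d > 0}
    remaining = float(capacity)
    while unmet and remaining > 1e-9:
        total_weight = sum(WEIGHTS[cls] for cls in unmet)
        offers = {cls: remaining * WEIGHTS[cls] / total_weight for cls in unmet}
        for cls, offer in offers.items():
            grant = min(offer, unmet[cls])
            alloc[cls] += grant
            unmet[cls] -= grant
        remaining = capacity - sum(alloc.values())
        unmet = {cls: d for cls, d in unmet.items() if d > 1e-9}
    return alloc


congested = share_link(600, {"gold": 1000, "silver": 1000, "bronze": 1000})
bronze_only = share_link(600, {"gold": 0, "silver": 0, "bronze": 1000})
```

With all three classes competing, gold receives the largest share; with only bronze traffic present, bronze is granted the whole link.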
[Figure 3.1: diagram. Labels: SLO Reporter, KPI summary reports, Fulfillment, Data refinement, report tools, alarm tools, raw data, Infrastructure QoS, Service Network.]
Fig. 3.1: Data about the performance is collected from the network. This information is then aggregated into KPIs and monitored in the Service Level Manager. Source: Service Assurance for Communication Networks: Solution Description [40].
Quality of Service data is gathered from the access, core and service network through performance monitoring processes [40]. The data can be classified as belonging to either infrastructure QoS, traffic stream QoS or end user QoS. It is then aggregated in the Data Refinement layer and KPIs are defined. Thresholds are set up in order to be able to issue service alarms and reports. The status of the service components is presented to the Service Level Manager in the form of events and statistical reports. The Service Level Manager, in turn, gathers the information and correlates it if necessary in order to create either a general or an individual performance view. The KPIs are then compared with the pre-defined Service Level Objectives (SLOs) agreed upon in the SLA. While the Service Level Manager provides a real-time view, the SLO reporting function provides a historical view. It is used for follow-up and final analysis of the service performance and the SLAs.
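As a rough illustration of the refinement and comparison steps, the sketch below aggregates raw samples into KPIs and checks them against SLO limits. The class `Slo`, the functions `aggregate` and `check`, the use of a plain mean and the sample values are all assumptions made for the example; only the 30 ms jitter limit and the 150 ms delay limit echo figures mentioned earlier in the text.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Slo:
    """One Service Level Objective: a KPI name and the limit agreed in the SLA."""
    kpi: str
    limit: float
    higher_is_better: bool = False


def aggregate(samples):
    """Data refinement: reduce raw measurements to one value per KPI (mean)."""
    return {name: mean(values) for name, values in samples.items()}


def check(kpis, slos):
    """Return (kpi, value, limit) for every SLO that the KPIs violate."""
    violations = []
    for slo in slos:
        value = kpis[slo.kpi]
        ok = value >= slo.limit if slo.higher_is_better else value <= slo.limit
        if not ok:
            violations.append((slo.kpi, value, slo.limit))
    return violations


# Made-up raw data for one service.
raw = {"jitter_ms": [20, 50, 35], "delay_ms": [100, 120]}
slos = [Slo("jitter_ms", 30.0), Slo("delay_ms", 150.0)]
alarms = check(aggregate(raw), slos)
```

Each entry in `alarms` would correspond to a service alarm raised towards the Service Level Manager; a historical SLO report could be built by storing the aggregated KPIs over time.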
4 Network
The challenge today is the connection of all the different networks: the Internet, networks for mobile phones as well as for fixed telephones, etc. [5]. The complexity is best expressed by figure 4.1. I will only explain the most important elements present in this picture. The intention of the picture is to give an insight into the many nodes and versions of network standards that have to be considered when connecting the many networks. A more detailed description of the interesting parts of the networks will follow in later chapters.
Fig. 4.1: The figure shows the packet switched and circuit switched networks as well as the IP Multimedia Subsystem (IMS). Source: UMTS Networks and Internet Telephony [14].
GSM Radio, GERAN and UTRAN are access networks where the user connects to the network. Depending on whether users are using GSM, GPRS or 3G, they are then redirected through the network. The IP Multimedia Subsystem (IMS) is a standardised architecture for providing multimedia services via mobile and fixed telephone networks. Poikselkä et al. [5] propose the following definition of the IMS: "IMS is a global, access-independent and standard-based IP connectivity and service control architecture that enables various types of multimedia services to end-users using common Internet-based protocols."
The relevant parts of the network can be divided into Radio Access Network (RAN) and Core Network (CN) which will be further described in chapter 4.2 and 4.3.
[Figure 4.2: diagram. Labels: Core Network, RNC, RBS (four), UE (six).]
Fig. 4.2: The access network consisting of the radio network controller and the radio base stations.
The RBS receives signals from the User Equipment (UE), converts them and forwards them to the RNC [3]. It provides the physical radio link to the network [12]. The RNC, in turn, is responsible for the control of the radio resources. Each RBS is connected to one RNC.
Core Network
Fig. 4.3: The SGSN, the GGSN and the HLR are the main components of the core network.
The task of the SGSN is to keep track of the location of the UE. It also carries out security functions and access control [17]. Two types of subscriber data are stored in the location register function of the SGSN [1]:
Subscription information:
  o IMSI
  o One or more temporary identities
  o Zero or more PDP addresses
Location information:
  o Cell/routing area of the UE
  o The GGSN address of each GGSN for which an active PDP context exists
The SGSN receives information from the HLR about each subscriber who is allowed to use the network. The HLR is a central database that keeps details of every SIM card, such as the IMSI (International Mobile Subscriber Identity), the telephone number (known as MSISDN), information on allowed services, forbidden roaming areas, the current location of the subscriber, etc. [3] [11]. The GGSN is connected to the SGSN and detunnels user data from the GPRS Tunnelling Protocol (GTP). The information is then sent out as normal user data IP packets. Towards external packet networks such as the Internet, the GGSN works as an IP router. In order to protect the integrity of the core network it also performs firewall and filtering functions [9].
[Figure 4.4: diagram. Labels: AF, Gq, Rx, CRF, Go, Gx, GGSN.]
Fig. 4.4: The Policy Decision Function (PDF) and Charging Rules Function (CRF) communicate with the Application Function (AF) and the GGSN via the interfaces Gq, Go, Rx and Gx.
The PDF provides Service-Based Local Policy (SBLP) control [37]. This is done by authorizing required QoS resources and storing the SBLP for the session. The decisions are based on session and media related information obtained from the AF via the Gq interface. When a bearer authorization request is sent from the GGSN to the PDF, the PDF authorizes the request based on the stored service based local policy information for this particular session. The GGSN is then informed to open the gate if the session was authorized. The CRF provides Charging Rules and informs the AF about bearer session events. Charging rule decisions are based on session and media related information obtained from the AF, the bearer and subscriber related information obtained via the Gx interface as well as any other subscriber and service related data the CRF is aware of. The AF hereby indicates to the CRF whether the IP flows should be enabled or disabled at the bearer level. The CRF also informs the AF via Rx regarding bearer events such as bearer release and bearer establishment. How these nodes interact with the user equipment will be further described in the next chapter.
4.5.2 Release 7
In 3GPP R7 the two nodes PDF and CRF are merged into one single node, the Policy and Charging Rules Function, PCRF [36]. The intention is to simplify the architecture. This will decrease the required signalling and enable a unified filter handling. The resulting network can be viewed in figure 4.5 below.
The Subscription Profile Repository (SPR) is a database for storing information about the user's policies. The user might for example be a gold subscriber and thus entitled to more bandwidth than the average subscriber.
[Figure 4.5: diagram. Labels: AF, SPR, Rx+, GGSN.]
Fig. 4.5: The Policy and Charging Rules Function (PCRF) is connected to the Application Function (AF) and the Gateway GPRS Support Node (GGSN) via the interfaces Rx+ and Gx+. The Subscription Profile Repository (SPR) contains information about the subscriber and its policies.
The network has a layered architecture which aims to separate the media and control signalling [28] [3] [23]. All entities in the control plane control media streams and signalling links between other entities. Some examples of tasks are routing call signalling and telling the media plane what traffic to allow. The media plane on the other hand, handles all actual information sent and received by the user. This can be for example coded voice from a voice call or packets sent in an Internet connection. In figure 4.6 the two layers are marked as the control plane and the media plane (also called traffic plane, user plane, data plane, bearer plane).
[Figure 4.6: diagram. Labels: Control Plane (AF, PCRF), Media Plane (GGSN).]
Fig. 4.6: All control signalling between the user equipment and the PCRF is sent on the control plane. The media plane in turn handles the transfer of media information.
Similar to R6 and the tasks of the PDF and the CRF, the PCRF provides the GGSN with network control regarding service data flow detection, gating, QoS and flow based charging [36]. The AF provides the PCRF with session and media related information and in turn receives information about traffic plane events. The reference points Gx+ and Rx+ are based on the Diameter protocol [27] [24] which is further described in the next chapter.
A Diameter message consists of a Diameter Header followed by Attribute Value Pairs (AVPs) [5].
Table 4.2: Some AVPs with name, code and data type.

Attribute Name              Code   Data Type
Session-Id                  263    UTF8String
Termination-Cause           295    Enumerated
Origin-Host                 264    DiamIdent
Origin-Realm                296    DiamIdent
AF-Application-Identifier   504    OctetString
As mentioned in the previous chapter, some AVPs are connected to a certain message. Session-Id, for example, is present in all messages, and each session has two unique identifiers, one for the Rx+ interface and one for the Gx+ interface. The AF-Application-Identifier attribute identifies which service the session belongs to [38].
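To make the message layout concrete, the following minimal sketch packs a Diameter header followed by two of the AVPs from table 4.2, using the wire format of RFC 3588: a 20-byte header, and AVPs consisting of a 4-byte code, 1 byte of flags, a 3-byte length and data padded to a multiple of four bytes. The function names, the host and session identifiers and the application id 0 are placeholders chosen for the example, and vendor-specific AVPs are only handled on decoding.

```python
import struct


def encode_avp(code, data, mandatory=True):
    """Encode one AVP without Vendor-Id: code(4) flags(1) length(3) data pad."""
    flags = 0x40 if mandatory else 0x00          # M bit
    length = 8 + len(data)                       # padding is not counted
    header = struct.pack("!IB3s", code, flags, length.to_bytes(3, "big"))
    return header + data + b"\x00" * (-len(data) % 4)


def encode_message(command_code, application_id, avps,
                   request=True, hop_by_hop=1, end_to_end=1):
    """Prefix the AVPs with the 20-byte Diameter header."""
    body = b"".join(avps)
    total = 20 + len(body)
    flags = 0x80 if request else 0x00            # R bit
    header = struct.pack(
        "!B3sB3sIII",
        1,                                       # version
        total.to_bytes(3, "big"),
        flags,
        command_code.to_bytes(3, "big"),
        application_id,
        hop_by_hop,
        end_to_end,
    )
    return header + body


def decode_avps(body):
    """Yield (code, data) for each AVP in a message body."""
    offset = 0
    while offset < len(body):
        code, flags, raw_len = struct.unpack_from("!IB3s", body, offset)
        length = int.from_bytes(raw_len, "big")
        header_len = 12 if flags & 0x80 else 8   # V bit adds a Vendor-Id
        yield code, body[offset + header_len:offset + length]
        offset += length + (-length % 4)


# Command code 272 is Credit-Control (CCR/CCA); the values are placeholders.
message = encode_message(
    272, 0,
    [encode_avp(263, b"gx.example;1;1"),         # Session-Id
     encode_avp(264, b"ggsn.example")],          # Origin-Host
)
avps = list(decode_avps(message[20:]))
```

Decoding the body again returns the same codes and values, which is a quick way to check that lengths and padding are handled consistently.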
5 Prestudy
5.1 Method
The approach taken to solve the task is briefly described here:
1. Discussions were held with, among others, Dick Bergström, Strategic Solution Architect for SA at Ericsson, and Per Sundin, Operational Product Manager of NTM (see chapter 6.4), who are working in the area of Service Assurance. The intention was to understand what kind of information could be interesting to derive from the network in order to enable Service Assurance.
2. We examined the flow through the PCRF by analysing papers from 3GPP and use cases, and by discussing with people familiar with the area. We were especially interested in what information was transported to the PCRF. Use cases were made for the services voice telephony (called MMTel at Ericsson), WeShare and MMS.
3. The next step was to compose relevant KPIs.
4. Studies were performed on the existing prototype, consisting of a traffic generator and the PCRF node.
5. We implemented an improved version of the traffic generator.
6. We examined what further information had to be stored in the database of the PCRF in order to derive the requested KPIs.
7. The design of the Service Tracing program was presented to relevant persons for approval. The GUI was further discussed with Didier Chincholle, senior specialist in interaction design for mobile services at Ericsson.
8. I implemented the service tracing program.
9. The KPIs were extracted to an XML file.
[Figure 5.1: sequence diagram between UE, GGSN and PCRF. Message labels: 2. Activate PDP Context, 3. CCA, 4. CCA, 5. Activate PDP Context, 6. Accept Primary PDP, ONLINE, 7. Initiate Primary PDP, 8. Delete PDP Context Request, 9. CCA, 10. CCA, 11. Delete PDP Context, 12. Accept Primary PDP.]
Fig. 5.1: The flow at IP-CAN session setup and disconnect.
When the user has established an IP-CAN session, it is possible to start a session, such as voice telephony or streaming. We made a use case covering the flow during voice telephony which can be reviewed in more detail in chapter 5.2.1. The idea was to find how to calculate KPIs as well as to look at how loss of bearer could be handled by the network.
[Figure 5.2: use case diagram. Actors: Sender (MO), Recipient (MT).]
Fig. 5.2: Use case diagram showing the IMS Voice Telephony service.
5.2.1.1 Description
An end-user calls another person and expects the recipient to be notified about the connection attempt. If the recipient accepts the call, the two parties expect to be able to talk to each other without distortion.
5.2.1.2 Actors
Sender: The actor that initiates the phone call (Mobile Originator).
Recipient: The person who receives the phone call (Mobile Terminator).
5.2.1.3 Assumptions
For MMTel session setup a large number of different session alternatives exist. This use case covers only the general approach, with two users who are both IP-CAN connected.
[Figure 5.3: sequence diagram between MO, SGSN, GGSN, PCRF, P-CSCF and MT. Message labels: 1. SIP, 2. AAR, 3. AAA, 4. SIP, 5. Initiate 2ry PDP Context, 6. Activate PDP Context, 7. CCR, 8. CCA, 9. Activate PDP Context, 10. Accept 2ry PDP Context, 11. SIP, 12. Media, 13. SIP, 14. STR, 15. RAR, 16. RAA, 17. STA.]
Fig. 5.3: Sequence diagram of IMS Multi Media Telephony with simplified SIP signalling.
5.2.1.5 KPIs
[The text under this heading has been removed for confidentiality reasons.]
5.2.1.6.1 Loss of bearer leads to session termination
This scenario is mapped from Enhanced Policy Decision Function and Gx+ [38] to match this MMTel use case. The network recognizes a change in bit rate to 0 Kbit/s [39] and a timer starts in the PCRF. If the timer expires before the bearer has been recovered, the session is ended.
Preconditions
The preconditions are that all signalling before step 10 in figure 5.4 was performed without failure and that two notifications, enabling loss-of-bearer and recovery-of-bearer respectively, were included in the AAR message sent from the AF to the PCRF. These notifications are defined in TS 29.209 [38] and are carried in the Specific-Action AVP. They should be transmitted in step 2 to make the PCRF apply these features. The SIP BYE signalling can be initiated from the network. This is done in steps 17 and 18 in figure 5.4, which violates a basic idea of SIP signalling, where messages are only initiated by the end-points, in this case represented by MO and MT.
Sequence Diagram
The sequence diagram for the loss of bearer in a dedicated MMTel session is shown in figure 5.4.
[Figure 5.4: sequence diagram between UE, GGSN, PCRF and IMS on the MO side and IMS, PCRF, GGSN and UE on the MT side. Steps 1 to 9 carry no labels. Remaining labels: 10. Media, 11. Update PDP, 12. CCR, 13. CCA, 14. Update PDP, Timer starts, Timer expires. SA registration!, 15. RAR, 16. RAA, 18. SIP BYE, 21. STR, 22. STA, 23. RAR.]
5.2.1.6.2 Loss of bearer followed by bearer recovery
This scenario is mapped from [29] to match this MMTel use case. Here the network recognizes a change in bit rate to 0 Kbit/s [38] and the timer starts in the PCRF. But before the timer expires, the network recognizes a change in bit rate from 0 Kbit/s and the session continues.
Preconditions
The preconditions are that all signalling before step 10 in figure 5.4 was performed correctly and that a notification to enable this information was included in the AAR message sent from the AF to the PCRF. The notification is defined in TS 29.209 [38] and is carried in the Specific-Action AVP. It should be sent in step 2 to make the PCRF apply the loss-of-bearer and recovery-of-bearer features.
Sequence Diagram
The sequence diagram for the loss-of-bearer and recovery-of-bearer in a dedicated MMTel session is shown in figure 5.5.
[Figure 5.5: sequence diagram between UE, GGSN, PCRF and IMS on the MO side and IMS, PCRF, GGSN and UE on the MT side. Steps 1 to 9 carry no labels. Remaining labels: 10. Media, 11. Update PDP, 12. CCR, 13. CCA, 14. Update PDP, Timer starts, No bearer, 15. Update PDP, 16. CCR, 17. CCA, 18. Update PDP, Timer stops, Bearer.]
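The timer logic shared by the two scenarios can be sketched as a small state machine. This is only an illustration under assumptions: the class name `BearerWatch`, the state names, the explicit `now` argument (used instead of a real clock so the example is deterministic) and the timeout of 5 seconds are all hypothetical and are not taken from the prototype.

```python
class BearerWatch:
    """Tracks loss and recovery of the bearer for one session."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s   # assumed value, not given in the text
        self.lost_at = None          # time at which the bit rate dropped to 0
        self.state = "ACTIVE"

    def tick(self, now):
        """Expire the timer if the bearer has been gone for too long."""
        if self.lost_at is not None and now - self.lost_at >= self.timeout_s:
            self.state = "TERMINATED"    # session ended, failure registered
            self.lost_at = None
        return self.state

    def on_bit_rate(self, kbit_s, now):
        """Handle a reported bit rate for the bearer at time `now`."""
        self.tick(now)
        if self.state == "TERMINATED":
            return self.state
        if kbit_s == 0 and self.lost_at is None:
            self.lost_at = now           # loss of bearer: timer starts
            self.state = "BEARER_LOST"
        elif kbit_s > 0 and self.lost_at is not None:
            self.lost_at = None          # recovery of bearer: timer stops
            self.state = "ACTIVE"
        return self.state


watch = BearerWatch(timeout_s=5)
trace = [
    watch.on_bit_rate(64, now=0),    # media flowing
    watch.on_bit_rate(0, now=1),     # bearer lost
    watch.on_bit_rate(64, now=3),    # recovered before the timer expired
    watch.on_bit_rate(0, now=10),    # bearer lost again
    watch.tick(now=16),              # timer expired
]
```

In the first half of the trace the session survives because the bearer returns in time; in the second half the timer expires and the session would be terminated and registered for service assurance purposes.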
5.2.1.6.3 Conclusions/Comments
There are both advantages and disadvantages with this feature. The two main advantages are:
  o It enables the operator to register which cells are near, or include, bad radio access areas. Creating an indicator of this kind makes it possible to dynamically offer the operator recommendations on which cells should be upgraded or where handovers between cells regularly fail. By enabling this feature, the KPIs cut-off call ratio and call completion ratio will increase in accuracy.
  o The feature enables the operator to acknowledge a subscriber during loss of bearer. This helps the operator to increase its service assurance. Churn can be reduced since the subscriber can get an operator notification that his/her session failure has been registered, and the trust in the operator can thereby remain. An operator also has the ability to compensate the affected subscriber.
There are also some disadvantages connected to these scenarios:
  o The introduction of timers in the PCRF is both time and resource consuming.
  o There is no standardized way to trigger the loss of bearer in any of the nodes listed in the scenarios. Neither the UE, the SGSN nor the GGSN has any ability to trigger this event today. According to Kvernvik [44] there is a possibility that the RNC node will record the bit rate in the dedicated bearers in the near future, but no decisions have been made in this area.
  o There will be increased signalling on the Rx+ and Gx+ interfaces.
[Figure 5.6: sequence diagram between MO, GGSN, PCRF, P-CSCF and MT. Message labels: 1. SIP, 2. AAR, 3. AAA, 4. CCR, 5. CCA, 6. SIP, 7. Media, 8. SIP, 9. CCR, 10. CCA, 11. ASR, 12. ASA, 13. STR, 14. STA, 15. SIP.]
Fig. 5.6: Flow 1, the flow when session establishment and termination is initiated by the UE.
[Figure 5.7: sequence diagram between MO, GGSN, PCRF, P-CSCF and MT. Message labels: 1. SIP, 2. AAR, 3. RAR, 4. RAA, 5. AAA, 6. SIP, 7. Media, 8. SIP, 9. STR, 10. RAR, 11. RAA, 12. STA, 13. SIP.]
Fig. 5.7: Flow 2, the flow when session establishment and termination is initiated by the PCRF.
The KPIs were then taken from the IMS Voice Telephony use case and slightly adjusted to fit the general flows. We also added three new KPIs. The calculations of the first four KPIs were made in the same manner as when calculating the IMS Voice Telephony KPIs but with the triggers at different messages.
location of the UE. Below, the new AVP is presented in the form of an ABNF grammar according to the specifications in RFC 3588 [24].

    User-Location ::= < AVP Header: 1100 >
                      { Local-Area-Code }
                      { Cell-Identity }

Both the Local-Area-Code AVP (AVP code 1101) and the Cell-Identity AVP (AVP code 1102) are of type Integer32. The idea is to send this AVP in CCR and RAA to the PCRF.

Our first thought was to store the last and the second last visited cell in the PCRF. This might help in finding weak spots where handover fails between two cells. However, this turned out to be a bit troublesome since the user location information is not updated as frequently as would be necessary. During a session a modify message containing the user location information might be sent [30], but this does not happen often enough to be considered reliable. The result would be that what is marked as the second last cell can in fact be a much older cell. We finally decided to store only the last position of the UE. This eliminates the confusion that the problem described above might cause, but it still gives a good indication of where the problem occurred.
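A grouped AVP of this kind could be encoded and decoded roughly as follows. The sketch follows the RFC 3588 AVP layout (4-byte code, 1 byte of flags, 3-byte length, data) and nests two Integer32 AVPs inside the User-Location AVP. The function names and the example values are hypothetical, and the flags are left at zero for simplicity; a real vendor-specific AVP would additionally carry the V flag and a Vendor-Id, which are omitted here.

```python
import struct

USER_LOCATION = 1100
LOCAL_AREA_CODE = 1101
CELL_IDENTITY = 1102


def encode_avp(code, data):
    """Encode one AVP with all flags cleared and no Vendor-Id."""
    length = 8 + len(data)
    header = struct.pack("!IB3s", code, 0, length.to_bytes(3, "big"))
    return header + data + b"\x00" * (-len(data) % 4)


def encode_user_location(lac, cell_id):
    """Build the grouped User-Location AVP from its two Integer32 members."""
    members = (encode_avp(LOCAL_AREA_CODE, struct.pack("!i", lac))
               + encode_avp(CELL_IDENTITY, struct.pack("!i", cell_id)))
    return encode_avp(USER_LOCATION, members)


def decode_user_location(avp):
    """Return (lac, cell_id) from an encoded User-Location AVP."""
    code, _flags, raw_len = struct.unpack_from("!IB3s", avp, 0)
    if code != USER_LOCATION:
        raise ValueError("not a User-Location AVP")
    end = int.from_bytes(raw_len, "big")
    fields = {}
    offset = 8
    while offset < end:
        sub_code, _sub_flags, sub_len = struct.unpack_from("!IB3s", avp, offset)
        (fields[sub_code],) = struct.unpack_from("!i", avp, offset + 8)
        offset += int.from_bytes(sub_len, "big")   # 12 bytes, so no padding
    return fields[LOCAL_AREA_CODE], fields[CELL_IDENTITY]


# Only the last known position is kept per session, as decided above.
last_position = {}
encoded = encode_user_location(lac=42, cell_id=1337)
last_position["gx.example;1;1"] = decode_user_location(encoded)
```

Overwriting the dictionary entry on every received AVP mirrors the decision to keep only the last position of the UE.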
6 The prototype
A prototype existed, consisting of the PCRF node and a traffic generator. There was however a need to improve it. The structure of the traffic generator had to be updated in order to support new flows as well as different flows depending on the type of service. Changes also had to be made to the PCRF part. The traffic generator simulates the traffic from a UE and implements both the Rx+ and the Gx+ interfaces. A descriptive figure can be found below where blue marks the PCRF part and green the parts of the traffic generator.
Fig. 6.1: The prototype consists of two parts; the PCRF including the databases (blue) and the traffic generator (green).
A number of assumptions have been made for the implementation of the prototype:
  o Only one session per user is active at the same time.
  o The users involved in the communication have the same service provider.
  o There exists only one PCRF.
  o Two users are involved in a session.
stated as the percentage of the active sessions in that LA. The time of a certain session can also be changed. The graphical interface can be found in figure 6.2 and 6.3.
Fig. 6.2: The graphical interface of the traffic generator. The user chooses to start a new simulation and gets default values of the number of sessions and local areas. These figures can be edited as well as the distribution of services and their uptime.
Fig. 6.3: The default values can be changed for each local area.
When the user pushes the button Generate Traffic, an IP-CAN session is set up per user, simulating that the user turns on her mobile phone. One service is thereafter started per user, i.e. only one session at a time per user is active. Half of the sessions represent the MOs who make a connection to the MTs. Depending on the value of the Radio Access Technology (RAT) Type, one of the two flows is selected. Some access technologies support the network initiated flow, i.e. flow 2. The RAT Type is an AVP using integer as data type. We have chosen to state that all sessions with a RAT Type equal to or smaller than 3 are to use flow 2. We also assume that RAT Types smaller than 3 are R7+ compliant.
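The selection rule can be written down in a few lines. The constant and function names are hypothetical; the threshold of 3 is the one stated above.

```python
FLOW_UE_INITIATED = 1        # flow 1, figure 5.6
FLOW_NETWORK_INITIATED = 2   # flow 2, figure 5.7
RAT_TYPE_LIMIT = 3           # RAT Types up to and including this use flow 2


def select_flow(rat_type: int) -> int:
    """Pick the signalling flow for a session from its RAT Type value."""
    if rat_type <= RAT_TYPE_LIMIT:
        return FLOW_NETWORK_INITIATED
    return FLOW_UE_INITIATED


flows = [select_flow(rat_type) for rat_type in range(6)]
```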
The table policy contains information about what performance a customer should have when using a certain service, depending on which service class the customer is associated with. The table spr holds records about the subscriber, such as the service class and which services are enabled for that particular user (stored under the attribute services). The table ipcan stores all information that can be obtained when the IP-CAN is established. As the name implies, sessiondata consists of all data gathered during a session. Timestamps are set, making it possible to calculate the KPIs discussed in chapter 5.2.2. For flow 1, the timestamps were placed at steps 1, 4, 6 and 11 in figure 5.6. For flow 2, they were placed at steps 1, 4, 6 and 9 in figure 5.7. The attribute errormessage is OK by default, but if the network is out of resources or the bearer is lost, this is indicated by this attribute.

Table sessiondata: source, sourceport, destination, destinationport, maxul, maxdl, serviceclass, serviceid, state, rxsessionid, gxsessionid, qci, timestamp1, timestamp2, timestamp3, timestamp4, lac, cellid, errormessage, sgsn, ggsn
Table ipcan: sessionid, subscriptionidtype, subscriptioniddata, ipv4address, ipv6address, sessionstart, sessionend, ipcanstatus, ccrequestnumber, rattype, lac, cellid
Table policy: serviceclass, service, maxdl, maxul, qci
Table spr: number, imsi, serviceclass, username, services
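How the policy table might be consulted can be sketched in Java as follows. This is only an illustration: the class names, the in-memory map and the example row ("gold", "video", 384, 64, 2) are hypothetical, and the units of maxdl and maxul are not specified here. Only the attribute names serviceclass, service, maxdl, maxul and qci come from the table description.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class PolicyTable {
    /** One row of the policy table: the performance a customer is entitled to. */
    public static final class Policy {
        public final int maxDl;
        public final int maxUl;
        public final int qci;

        public Policy(int maxDl, int maxUl, int qci) {
            this.maxDl = maxDl;
            this.maxUl = maxUl;
            this.qci = qci;
        }
    }

    // Rows keyed by the combination of service class and service.
    private final Map<String, Policy> rows = new HashMap<>();

    private static String key(String serviceClass, String service) {
        return serviceClass + "/" + service;
    }

    public void put(String serviceClass, String service, Policy policy) {
        rows.put(key(serviceClass, service), policy);
    }

    /** Returns the policy for a service class and service, if such a row exists. */
    public Optional<Policy> lookup(String serviceClass, String service) {
        return Optional.ofNullable(rows.get(key(serviceClass, service)));
    }

    public static void main(String[] args) {
        PolicyTable table = new PolicyTable();
        table.put("gold", "video", new Policy(384, 64, 2)); // hypothetical values
        System.out.println(table.lookup("gold", "video")
                .map(p -> "maxdl=" + p.maxDl + " maxul=" + p.maxUl + " qci=" + p.qci)
                .orElse("no policy row"));
    }
}
```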
In order to calculate the KPIs, timestamps were set in the database of the PCRF at the positions pointed out in chapter 5.2.2. A better overview of the timestamps can be found in figure 6.5 and figure 6.6.
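Time-based KPIs can be derived as differences between the stored timestamps. The Java sketch below shows the idea under stated assumptions: the class name and the method elapsed are hypothetical, the example times are invented, and which pair of timestamps corresponds to which KPI is governed by the definitions in chapter 5.2.2, not by this example.

```java
import java.time.Duration;
import java.time.Instant;

public class SessionTimestamps {
    // timestamp1..timestamp4 as stored for a session, in that order.
    private final Instant[] stamps;

    public SessionTimestamps(Instant t1, Instant t2, Instant t3, Instant t4) {
        this.stamps = new Instant[] {t1, t2, t3, t4};
    }

    /** Elapsed time between two timestamps, addressed 1..4 as in the attribute names. */
    public Duration elapsed(int from, int to) {
        if (from < 1 || to > 4 || from >= to) {
            throw new IllegalArgumentException("expected 1 <= from < to <= 4");
        }
        Instant start = stamps[from - 1];
        Instant end = stamps[to - 1];
        if (start == null || end == null) {
            // A session that failed early may never reach the later timestamps.
            throw new IllegalStateException("timestamp not set");
        }
        return Duration.between(start, end);
    }

    public static void main(String[] args) {
        Instant t1 = Instant.parse("2006-09-17T10:00:00Z"); // invented example time
        SessionTimestamps s = new SessionTimestamps(
                t1, t1.plusMillis(120), t1.plusMillis(450), t1.plusSeconds(65));
        System.out.println("timestamp1 to timestamp3: " + s.elapsed(1, 3).toMillis() + " ms");
        System.out.println("timestamp3 to timestamp4: " + s.elapsed(3, 4).toMillis() + " ms");
    }
}
```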
[Sequence diagram between MO, GGSN, PCRF, P-CSCF and MT. Messages in order: 1. SIP, 2. AAR, 3. AAA, 4. CCR, 5. CCA, 6. SIP, 7. Media, 8. SIP, 9. CCR, 10. CCA, 11. ASR, 12. ASA, 13. STR, 14. STA, 15. SIP]
Fig. 6.5: The T marks where timestamps in flow 1 are taken to be stored in the database of the PCRF.
[Sequence diagram between MO, GGSN, PCRF, P-CSCF and MT. Messages in order: 1. SIP, 2. AAR, 3. RAR, 4. RAA, 5. AAA, 6. SIP, 7. Media, 8. SIP, 9. STR, 10. RAR, 11. RAA, 12. STA, 13. SIP]
Fig. 6.6: The T marks where timestamps in flow 2 are taken to be stored in the database of the PCRF.
Storing information for a week, as in the first alternative, might still result in too much information. It is however likely that gathering data during an even shorter time span will not give enough facts to analyse the problem accurately. The second alternative will significantly reduce the number of sessions stored. On the other hand, the sessions that worked without problems might be a good source of information when analysing why only certain sessions failed. In the third alternative the figures are updated with every new session, but the amount of stored data is constant. For example, the number of failed sessions is stored for user A and the service MMS. The obvious downside of this solution is that it is not possible to track the session that failed to see what caused the failure.
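The third alternative can be sketched in Java as a set of running totals per user and service. The class and method names are hypothetical, and the in-memory map stands in for whatever storage the PCRF would actually use. The sketch also makes the downside visible: once a session has been counted, nothing about the individual session remains.

```java
import java.util.HashMap;
import java.util.Map;

public class SessionCounters {
    /** Running totals for one user and one service. */
    public static final class Totals {
        private long sessions;
        private long failed;

        public long sessions() { return sessions; }
        public long failed() { return failed; }

        /** Share of sessions that failed; 0.0 when no session has been recorded. */
        public double failureRatio() {
            return sessions == 0 ? 0.0 : (double) failed / sessions;
        }
    }

    // One entry per (user, service) pair, so storage grows with the number
    // of pairs and not with the number of sessions.
    private final Map<String, Totals> totals = new HashMap<>();

    private static String key(String user, String service) {
        return user + "|" + service;
    }

    /** Updates the totals with the outcome of one finished session. */
    public void record(String user, String service, boolean ok) {
        Totals t = totals.computeIfAbsent(key(user, service), k -> new Totals());
        t.sessions++;
        if (!ok) {
            t.failed++;
        }
    }

    public Totals get(String user, String service) {
        return totals.getOrDefault(key(user, service), new Totals());
    }

    public static void main(String[] args) {
        SessionCounters counters = new SessionCounters();
        counters.record("A", "MMS", true);
        counters.record("A", "MMS", false);
        counters.record("A", "MMS", true);
        Totals t = counters.get("A", "MMS");
        System.out.println(t.failed() + " of " + t.sessions() + " MMS sessions failed for user A");
    }
}
```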
Fig. 6.7: The search is made by entering the IMSI of the user.
A window with two tabs is then opened (see figure 6.8). Personal details about the user, taken from the table spr, are presented. Below this area, key performance indicators are shown for the different services. In figure 6.8 the KPIs are greyed out for confidentiality reasons.
Fig. 6.8: Personal information as well as key performance indicators are presented under the tab GENERAL INFO.
Under the next tab, all sessions since the troubleshooting started can be found (see figure 6.9). A list of the sessions together with date, time and kind of service is found to the left. When a session is selected, all information about that particular session is presented. If there was a problem, the type of problem is shown as in figure 6.10. It is possible to see the other IMSI that was involved in the communication and who initiated it. Since the quality of the session might depend on what quality the other user is entitled to, that information is also presented. Additionally, the last route the communication took can be followed from one end point to the other.
Fig. 6.9: Information about each session can be found under the tab SESSION INFO.
Fig. 6.10: If an error occurred, the type is presented. It is also possible to see if the failure happened in the present user's part of the network or in the partner's.
The most likely scenario is that these features are only offered to corporate customers since these are the customers who have an SLA.
7 Conclusions
The PCRF clearly receives a lot of information about the active sessions. Much work is however still in progress, which might open up even more possibilities. The PCRF can be a good complement to other sources of information, together forming a complex collection of performance measurements. Synthetic traffic will still have to be created to get an end-to-end view of the service quality. Service assurance is however clearly a highly important area in which more studies have to be made.

Sending location information about the UE to the PCRF is an interesting possibility that could be worth looking into more. Further investigations then have to be made regarding how often the location can be updated. As of today, the location is not sent at delete PDP context, and modify PDP context is not sent often enough to present any reliable information in the PCRF. I think that the most important issue to investigate is the possibility of getting information about the location also at delete PDP context. This would enable us to locate where the failure occurred, which I consider to be the number one reason for wanting location information. It could also be interesting to examine whether more AVPs of interest are sent to the PCRF.

The new node also opens up other opportunities, e.g. statistics on how the different services are used. Together with a customer database, it would be possible to tell which services are preferred by certain age groups, which services are most often used together with others, etc.

As pointed out in chapter 5.2.2, the KPIs calculated from the data in the PCRF do not perfectly correspond to the definitions. Most of these are however impossible to calculate without simulating a session, e.g. a telephone call. The time when the end user pushes the Send button can quite clearly not be observed at any point in the core network.
On the other hand, some of the KPIs are actually more accurately calculated from the PCRF. We have used the proposed KPIs as general performance parameters, since we have assumed that the flow is the same for all these services and that the same KPIs are therefore of interest. It is nevertheless important to remember that different services require different measurements to be made. Some parameters can be service specific. More effort has to be put into finding appropriate KPIs for the services. For this task I would recommend Telecommunications Quality of Service Management: from legacy to emerging services [4].

It would also be interesting to analyse how much memory would be required in order to store all information about the sessions. Is it at all possible? Should the stored information be exclusively for corporate customers?

Another thing worth mentioning is that not all services will have a flow of information through the PCRF. Best effort services such as SMS might pass through the network without going through the new node. On the other hand, the quality of most of these services is not subject to any SLA. Discussions have also been held regarding the possibility of using deep packet inspection in the cases where it could be interesting to measure the quality.
Literature
BOOKS
[1] CASTRO, J., All IP in 3G CDMA Networks: the UMTS Infrastructure and Service Platforms for Future Mobile Systems. Chichester: John Wiley & Sons Ltd 2004. ISBN: 0470853220.
[2] GOZDECKI, J., JAJSZCZYK, A., STANKIEWICZ, R., Quality of Service Terminology in IP Networks. IEEE Communications Magazine, March 2003. ISBN: 0470019069.
[3] HOLMA, H., TOSKALA, A., WCDMA for UMTS: Radio Access for Third Generation Mobile Communications. Chichester: John Wiley & Sons Ltd 2004. ISBN: 0-470-87096-6.
[4] OODAN, A., WARD, K., SAVOLAINE, C., DANESHMAND, M., HOATH, P., Telecommunications Quality of Service Management: from legacy to emerging services. London: The Institution of Electrical Engineers 2003. ISBN: 0852964242.
[5] POIKSELKÄ, M., MAYER, G., KHARTABIL, H., NIEMI, A., The IMS: IP Multimedia Concepts. Chichester: John Wiley & Sons Ltd 2006.

INTERNET SOURCES
[6] 3GPP, Wikipedia. Last visited: 2006-09-17. URL: http://en.wikipedia.org/wiki/3gpp
[7] CARLE, G., ZANDER, S., ZSEBY, T., Evaluation of Building Blocks for Pure Passive One-way-delay measurements. Last visited: 2006-08-22. URL: http://www.ripe.net/pam2001/Abstracts/poster_04.html
[8] DIAMETER, Wikipedia. Last visited: 2006-09-17. URL: http://en.wikipedia.org/wiki/DIAMETER
[9] GGSN, mpirical. Last visited: 2006-09-17. URL: http://www.mpirical.com/companion/mpirical_companion.html
[10] GPRS Core Network, Wikipedia. Last visited: 2006-09-17. URL: http://en.wikipedia.org/wiki/PDP_Context
[11] Network Switching Subsystem, Wikipedia. Last visited: 2006-09-17. URL: http://en.wikipedia.org/wiki/GSM_core_network
[12] Node B, mpirical. Last visited: 2006-09-17. URL: http://www.mpirical.com/companion/mpirical_companion.html
[13] PDP context, mpirical. Last visited: 2006-09-17. URL: http://www.mpirical.com/companion/mpirical_companion.html
[14] PETRAK, L., HOENE, C., UMTS Networks and Internet Telephony. 2006. Last visited: 2006-09-17. URL: http://net.informatik.uni-tuebingen.de/fileadmin/RI/teaching/umtsvoip/ss2006/UMTS_Abbreviations.pdf#search=%22ggsn%20sgsn%20%22node%20b%22%20ims%22
[15] Quality of service, Wikipedia. Last visited: 2006-09-16. URL: http://en.wikipedia.org/wiki/Quality_of_service
[16] Service Level Agreement, Wikipedia. Last visited: 2006-09-17. URL: http://en.wikipedia.org/wiki/Service_level_agreement
[17] SGSN, mpirical. Last visited: 2006-09-17. URL: http://www.mpirical.com/companion/mpirical_companion.html
[18] VEGESNA, S., IP Quality of Service. Indianapolis: Cisco Press 2003. URL: http://books.google.com/books?vid=ISBN1578701163&id=p1H9wVJJujAC&pg=PA5&lpg=PA5&dq=quality+of+service&sig=oGJc-YQXl1fP06EK_lml1Kiq_EI
[19] YAN, Z., 2.5G/3G-compatible PS-domain Core Network Solution. Last visited: 2006-09-17. URL: http://www.huawei.com/publications/view.do?id=216&cid=90&pid=61

PAPERS / REPORTS
[20] Basic Concepts of WCDMA Radio Access Network. Ericsson Radio Systems AB 2001. URL: http://www.ericsson.com/technology/whitepapers/e207_whitepaper_ny_k1.pdf
[21] BAUMGARTNER, F., BRAUN, T., HABEGGER, P., Differentiated Services: A New Approach for Quality of Service in the Internet. Chapman & Hall. Last visited: 2006-09-17. URL: http://www.iam.unibe.ch/~baumgart/pubs/hpn98.pdf#search=%22
[22] BHUSHAN, B., Measurement and Analysis of End-to-end Service Quality of 3G Networks and Services. Fraunhofer FOKUS 2004. URL: http://www.ist-albatross.org/QoSWhitePaper.pdf
[23] BJÖRK, N., Service Control and Service Delivery Control. Internal PowerPoint presentation: SC_20060124_overview.ppt.
[24] CALHOUN, P., LOUGHNEY, J., GUTTMAN, E., ZORN, G., ARKKO, J., Diameter Base Protocol. IETF, RFC 3588, September 2003.
[25] HAKALA, H., MATTILA, L., KOSKINEN, J-P., STURA, M., LOUGHNEY, J., Diameter Credit-Control Application. IETF, RFC 4006, August 2005.
[26] CALHOUN, P., ZORN, G., SPENCE, D., MITTON, D., Diameter Network Access Server Application. IETF, RFC 4005, August 2005.
[27] Charging rule provisioning over Gx interface, 2006-03. 3GPP TS 29.210 v6.5.0.
[28] CUMMING, J., Session Border Control in IMS. Data Connection Limited 2005. URL: http://www.dataconnection.com/network/download/whitepapers/SBCinIMS.pdf
[29] Enhanced Policy Decision Function and Gx+, 1/155 17-HSC 113 05/2 Uen.
[30] GPRS Tunnelling Protocol (GTP) across the Gn and Gp interface, 2006-06. 3GPP TS 29.060 v7.2.0.
[31] HEINANEN, J., BAKER, F., WEISS, W., WROCLAWSKI, J., Assured Forwarding PHB Group. IETF, RFC 2597, June 1999. Last visited: 2006-09-17. URL: http://www.ietf.org/rfc/rfc2597
[32] LIDSTRÖM, M., 2005-06-14. Cisco's QoS in IP-Networks. Ericsson, Uen.
[33] LINDERBÄCK, F., PCRF Information Contributions to Service Assurance and Dimensioning. Master's thesis, 2006-10.
[34] MULLER, N., Managing Service Level Agreements. International Journal of Network Management. John Wiley & Sons, Ltd 1999. URL: http://delivery.acm.org.focus.lib.kth.se/10.1145/340000/336747/p155-muller.pdf?key1=336747&key2=6364158511&coll=ACM&dl=ACM&CFID=71260489&CFTOKEN=50734145
[35] NILSSON, T., Toward third-generation mobile multimedia communication. Ericsson Review no. 3, 1999. URL: http://www.cms.livjm.ac.uk/library/cd3014/2002/Reference%205%20-%20Nilsson.pdf
[36] Policy and Charging Control over Gx reference point, 2006-05. 3GPP TS 29.212 v0.3.0.
[37] Policy control over Go interface, 2005-09. 3GPP TS 29.207 v6.5.0.
[38] Policy control over Gq interface, 2006-06. 3GPP TS 29.209 v6.5.0.
[39] Policy and Charging Control signalling flows and QoS, 2006-05. 3GPP TS 29.213 v0.1.0.
[40] Service Assurance for Communication Networks: Solution Description, 2004-11-11. Ericsson, 221 01-FGB 101 241 Uen.
[41] Service Delivery Control (SDC) Use Cases. TGF/F, 2005-08-19.
[42] Using Quality of Service to Deliver Customer-Focused Metro Ethernet Services. Cisco Systems 2004. URL: http://www.cisco.com/application/pdf/en/us/guest/netsol/ns223/c654/cdccont_0900aecd8020a0ff.pdf

ORAL SOURCES
[43] BERGSTRÖM, D., Strategic Solution Architect, Ericsson BUGS, 2006-05-04.
[44] KVERNVIK, T., Ericsson Research, System Management, 2006-06-22.

MAGAZINES
[45] SBLOM, J., Allt sämre kvalitet på ip-telefoni. Computer Sweden, 2006-07-28, no. 73, p. 5.
[Class diagram: classes and their methods]
UserFrame: UserFrame(), initGUI(), fillList(DefaultListModel), createJList(DefaultListModel), valueChanged(ListSelectionEvent)
DetailsPanel: DetailsPanel(), initGUI()
SessionTextPane: SessionTextPane(String l), addStylesToDocument(StyledDocument), updateInfo(String, String, String), getError(String, String)
KPITextPane: KPITextPane(), insertText(), addStylesToDocument(StyledDocument)
KPICalculator: KPICalculator(String, int), getAccessRatio(), getCutoffCommunicationRatio(), getCommunicationCompletionRatio(), getSetupTime(), getTeardownTime(), getMeanHoldingTime(), getAvgCommunicationTime()
SQLHandler: SQLHandler(), getSessionsWithIP(String), getIP(String), getSessionInfo(String, String), getContactedIMSI(String, String), getPartnerInfo(String, String), getPartnerInfoIfFailed(String, String), getAccessNumbers(String, int), getCutoffNumbers(String, int), getCompletionNumbers(String, int), getSetupTime(String, int), getTeardownTime(String, int), getMeanHoldingTime(String, int), getAverageCommunicationTime(String, int), getDetailsImsi(String), userPresentInDB(String)
Appendix D: Abbreviations
3GPP    3rd Generation Partnership Project
AAA     Authentication, Authorization and Accounting
AF      Application Function
AVP     Attribute Value Pair
CN      Core Network
CR      Charging Rule
CRF     Charging Rule Function
FTP     File Transfer Protocol
GGSN    Gateway GPRS Support Node
GTP     GPRS Tunnelling Protocol
HLR     Home Location Register
IMS     IP Multimedia Subsystem
IMSI    International Mobile Subscriber Identity
KPI     Key Performance Indicator
LA      Local Area
MO      Mobile Originator
MT      Mobile Terminator
NTM     Network Traffic Management
OPEX    Operational Expenditure
PCRF    Policy and Charging Rules Function
PDF     Policy Decision Function
QoS     Quality of Service
RAN     Radio Access Network
RBS     Radio Base Station
RNC     Radio Network Controller
SA      Service Assurance
SBLP    Service-Based Local Policy
SGSN    Serving GPRS Support Node
SLA     Service Level Agreement
SLO     Service Level Objective
SPR     Subscription Profile Repository
UE      User Equipment