Handbook On Session Initiation Protocol Networked Multimedia Communications For IP Telephony
Engineering - Electrical
Roy
Session Initiation Protocol (SIP), standardized by the Internet Engineering Task Force (IETF), has
emulated the simplicity of the protocol architecture of hypertext transfer protocol (HTTP) and is being
popularized for VoIP over the Internet because of the ease with which it can be meshed with web services.
However, it is difficult to know exactly how many requests for comments (RFCs) have been published
useful.
The text of each RFC from the IETF has been reviewed by all members of a given working group made up
of world-renowned experts, and a rough consensus made on which parts of the drafts need to be mandatory
and optional, including whether an RFC needs to be Standards Track, Informational, or Experimental.
Texts, ABNF syntaxes, figures, tables, and references are included in their original form. All RFCs, along
with their authors, are provided as references. The book is organized into twenty chapters based on the
major functionalities, features, and capabilities of SIP.
K27057
ISBN: 978-1-4987-4770-7
6000 Broken Sound Parkway, NW, Suite 300, Boca Raton, FL 33487
711 Third Avenue, New York, NY 10017
2 Park Square, Milton Park
an informa business
TO ORDER
Call: 1-800-272-7737 • Fax: 1-800-374-3401 • E-mail: [email protected]
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and
information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and
publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission
to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any
future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic,
mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or
retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact
the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides
licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment
has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation
without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
To our dearest son, Debasri Roy, Medicinae Doctoris (MD) (January 20, 1988–October 31, 2014), who had a brilliant career
(summa cum laude, undergraduate) and had so much more to contribute to this country and to the world as a whole. He
had been so eager to see this book published, saying, “Daddy, you are my hero.” Observing the deep relationship between
Debasri and his fiancée, their friends exclaimed with wonder: “Love is forever.” He wrote as his lifelong wishes: “I am ready
for my life, I’d like to see the whole world, I will read spiritual scriptures to find the mystery of life, and I want to make a
difference in the world.” He loved all of us from the deepest part of his kindest heart, including his longtime fiancée, who is
also an MD and was his classmate, and to whom he was engaged to be married on October 30, 2015, in front of relatives,
friends, colleagues, and neighbors who, until his last breath, he had inspired in so many ways to love God, along with his
prophetic words: “Mom, I am extremely happy that you are with me and I want to be happy in life.” May God let his soul
live in peace in His abode.
To my grandma for her causeless love; my parents Rakesh Chandra Roy and Sneholota Roy whose spiritual inspiration
remains vividly alive within all of us; my late sisters GitaSree Roy, Anjali Roy, and Aparna Roy and their spouses; my brother
Raghunath Roy and his wife Nupur for their inspiration; my daughter Elora and my son-in-law Nick; my son Ajanta; and
finally my beloved wife, Jharna. I thank them all for their love.
And my heartfelt thanks go to my son Ajanta, for inspiring me with his wonderful creativity since his first major thoughtful
invention of a multimedia telephony model using cupboard papers along with a vivid description written on a piece of paper
detailing how the phone would work. I had been the technical lead of AT&T Vision 2001 Multimedia Architecture project
at AT&T Bell Laboratories in 1993, when Ajanta was only in the fifth grade. Immediately I took the model to AT&T Bell
Labs, and all of my colleagues were surprised to see his wonderful creativity in multimedia telephony. I wish I could patent
his wonderful idea. Since then, I have been so inspired that I have worked on VoIP/multimedia telephony. Today, Ajanta
is an energetic electrical engineer progressing toward a bright future in his own right, keeping his colleagues and all of us
amazed. This book is a culmination of the vivid inspiration that he has embedded in me.
Contents
List of Figures
List of Tables
Preface
Author
1 Networked Multimedia Services
1.1 Introduction
1.2 Functional Characteristics
1.3 Performance Characteristics
1.4 Summary
References
2 Basic Session Initiation Protocol
2.1 Introduction
2.2 Terminology
2.3 Multimedia Session
2.4 Session Initiation Protocol
2.4.1 Augmented Backus–Naur Form for the SIP
2.4.2 SIP Messages
2.4.3 SIP Message Structure
2.4.4 SIP Network Functional Elements
2.5 SIP Request Messages
2.6 SIP Response Messages
2.7 SIP Call and Media Trapezoid Operation
2.8 SIP Header Fields
2.8.1 Overview
2.8.2 Header-Field Descriptions
2.9 SIP Tags
2.10 SIP Option Tags
2.11 SIP Media Feature Tags
2.11.1 Contact Header Field
2.11.2 Feature Tag Name, Description, and Usage
2.11.3 Conveying Feature Tags with REFER
2.12 Summary
References
3 SIP Message Elements
3.1 Introduction
3.1.1 SIP UA General Behavior
3.1.2 UAC General Behavior
3.1.3 UAS General Behavior
3.1.4 Redirect Server General Behavior
16.3.1 Overview
16.3.2 Diversion and History-Info Header Interworking in SIP
16.4 Call Services Using Session Border Controller
16.4.1 Overview
16.4.2 Distributed SBC Architecture
16.4.3 Conclusion
16.5 Referring Call to Multiple Resources
16.5.1 Overview
16.5.2 Operation
16.5.3 Multiple-Refer SIP Option Tag
16.5.4 Suppressing REFER’s Implicit Subscription
16.5.5 URI-List Format
16.5.6 Behavior of SIP REFER-Issuers
16.5.7 Behavior of REFER-Recipients
16.5.8 Example
16.6 Call Services with Content Indirection
16.6.1 Overview
16.6.2 Use-Case Examples
16.6.3 Requirements
16.6.4 Application of MIME-URI Standard to Content Indirection
16.6.5 Examples
16.7 Transcoding Call Services
16.7.1 Transcoding Services Framework
16.7.2 Third-Party Transcoding Services
16.7.3 Conference Bridging Transcoding Call Control Flows
16.8 INFO Method—Mid-Call Information Transfer
16.8.1 Overview
16.8.2 Motivation
16.8.3 UAs Are Allowed to Enable Both Legacy INFO Usages and Info
16.8.4 INFO Method
16.8.5 INFO Packages
16.8.6 Formal INFO Method Definition and Header Fields
16.8.7 INFO Package Considerations
16.8.8 Alternative Mechanisms
16.8.9 INFO Package Requirements
16.8.10 Examples
16.9 SIP Call Control UUI Transfer Services
16.9.1 Overview
16.9.2 Requirements for UUI Transport
16.9.3 Possible Approaches for UUI Transport in SIP
16.9.4 SIP Extensions for UUI Transport
16.9.5 Normative Definition
16.9.6 Guidelines for UUI Packages
16.9.7 Use Cases
16.10 Call Services Using DTMF
16.11 Emergency Call Services in SIP
16.11.1 Overview
16.11.2 Emergency Services Uniform Resource Name
16.11.3 Multilevel Precedence and Preemption
16.12 Summary
References
List of Figures
Figure 3.20 SIP transaction: (a) SIP network, (b) SIP transaction relationships, (c) INVITE client transaction state machine (including updates from RFC 6026), and (d) non-INVITE client transaction state machine
Figure 3.21 State machines for server transactions: (a) INVITE server transaction (including updates from RFC 6026) and (b) non-INVITE server transaction
Figure 4.1 SIP trapezoid with last-hop exception
Figure 4.2 SIP trapezoid without last-hop exception
Figure 4.3 GRUU usage in a SIP network: (a) GRUU, AOR, Contact, and Instances; (b) SIP network; and (c) call flows using GRUU
Figure 5.1 Typical operation of event subscription and notification
Figure 5.2 Logical view of SUBSCRIBE/NOTIFY event model implementation
Figure 5.3 Typical operation of event publication, subscription, and notification
Figure 5.4 Logical view of PUBLISH event model implementation
Figure 6.1 Publication, subscription, and notifications of presence/presentity for reachability in real time
Figure 6.2 Example message flows for presence operations
Figure 6.3 Example message flow
Figure 6.4 Basic IM session example
Figure 6.5 Multiparty chat overview shown with MSRP relays and a conference focus UA
Figure 6.6 Multiparty chat in a centralized chat room
Figure 7.1 RTP packet header
Figure 7.2 Example of an RTCP compound packet
Figure 7.3 SRTP packet format
Figure 7.4 Translation or proxying between IPv6 and IPv4 addresses
Figure 7.5 Tunneling through an IPv4 domain
Figure 7.6 Application’s screen
Figure 8.1 DNS naming system
Figure 8.2 DNS address lookup
Figure 8.3 SIP trapezoid with two administrative domains
Figure 8.4 ENUM’s DNS-based tiered architecture
Figure 8.5 ENUM functional architecture for implementation in North America
Figure 8.6 DDDS algorithm flow chart
Figure 8.7 High-level example of ENUM operations
Figure 8.8 Point-to-point call flows from IP subscriber to the PSTN subscriber when both subscribers have the same home network
Figure 8.9 Point-to-point call flows between IP subscribers when both subscribers have separate home networks
Figure 9.1 SIP network, functional entities, and message flows
Figure 9.2 Example of double Route-Record: (a) SIP and IP network configuration, (b) call flows with IPv4–IPv6 multihomed proxy, and (c) call flows with TCP/UDP transport protocol switching
Figure 14.15 STUN protocol: (a) address discovery by STUN clients residing behind NATs and (b) communications between hosts residing behind NATs
Figure 14.16 PSTN origination–PSTN termination (SIP bridging)
Figure 14.17 Elementary call flows
Figure 15.1 Simple call flows
Figure 15.2 Call flows where receiver does not understand namespace
Figure 15.3 Access preemption with obscure reason
Figure 15.4 Network diagram scenario A
Figure 15.5 Network preemption with obscure reason
Figure 15.6 TDM/IP preemption event
Figure 15.7 Access preemption with reason: UA preemption
Figure 15.8 Network preemption with Reserved Resources Preempted
Figure 15.9 Non-IP preemption flow
Figure 15.10 Basic session establishment using preconditions
Figure 15.11 Example using the end-to-end status type
Figure 15.12 Session modification with preconditions
Figure 15.13 Example using the segmented status type
Figure 15.14 Example of an initial offer in a 1xx response
Figure 15.15 Session mobility using 3PCC
Figure 16.1 Basic transfer call flow
Figure 16.2 Transfer with dialog reuse
Figure 16.3 Failed transfer—target busy
Figure 16.4 Failed transfer—target does not answer
Figure 16.5 Transfer with consultation hold—exposing Transfer Target
Figure 16.6 Transfer protecting Transfer Target
Figure 16.7 Attended transfer call flow
Figure 16.8 Recovery when one party does not support REFER
Figure 16.9 Attended transfer call flow with a contact URI not known to be globally routable
Figure 16.10 Attended transfer call flow with nonroutable contact URI and AOR failure
Figure 16.11 Recommended semi-attended transfer call flow
Figure 16.12 Semi-attended transfer as blind transfer call flow (not recommended)
Figure 16.13 Semi-attended transfer as attended transfer call flow (not recommended)
Figure 16.14 Attended transfer fallback to basic transfer using Require: replaces
Figure 16.15 Attended transfer fallback to basic transfer
Figure 16.16 Attended transfer call flow with Referred-By
Figure 16.17 Attended transfer as an ad hoc conference
Preface

I have worked on networked multimedia communications, including in my present position at the US Army Research, Development and Engineering Command (RDECOM), Communications–Electronics Research, Development, and Engineering Center (CERDEC), Space and Terrestrial Communications Directorate (S&TCD) Laboratories, for large-scale global Session Initiation Protocol (SIP)-based Voice-over-Internet Protocol (VoIP)/multimedia networks since 1993 when I was at AT&T Bell Laboratories. I was the editor of the Multimedia Communications Forum (MMCF) when it was created by many participating companies worldwide, including AT&T, to promote technical standards for networked multimedia communications to fill an important gap at a time when no standards bodies came forward to do so. Later, I had the opportunity to participate in International Telecommunication Union–Telecommunication (ITU-T) on behalf of AT&T for the standardization of H.323. H.323 was the first successful technical standard for VoIP/multimedia telephony. However, SIP, which was standardized by the Internet Engineering Task Force (IETF) much later than H.323, and emulated the simplicity of the protocol architecture of Hypertext Transfer Protocol (HTTP), has been popularized for VoIP over the Internet because of the ease with which it can be meshed with web services.

After so many years of working on SIP to build large-scale VoIP networks, I found that it was an urgent requirement to have a complete book that integrates all SIP-related Requests for Comments (RFCs) in a systematic way, a book that network designers, software developers, product manufacturers, implementers, interoperability testers, professionals, professors, and researchers can use like a super-SIP RFC since the publication of SIP RFC 3261 in 2002. No one knows exactly how many RFCs, or even just those related to the base SIP specification, have been published over the last two decades or how they are interrelated after so many extensions and enhancements with new features and capabilities, corrections, and modifications with the latest consensus based on implementation and interoperability test experiences. Future studies are expected to break new ground in the current knowledge of SIP. This book on SIP is the first of its kind in an attempt to put together all SIP-related RFCs, with their mandatory and optional texts, in a chronological and systematic way for use as a single super-SIP RFC with an almost one-to-one integrity from beginning to end. It aims to show the big picture of SIP for the basic SIP functionalities.

It should be noted that the text of each RFC from the IETF has been reviewed by all members of a given working group composed of worldwide experts, and a rough consensus was made on which parts of the drafts needed to be mandatory or optional, including whether an RFC needed to be Standards Track, Informational, or Experimental. Trying to put all SIP-related RFCs together to make a textbook has serious challenges because the key point is not simply putting one RFC after another chronologically. The text of each RFC needs to be put together for each particular functionality, capability, and feature while retaining its integrity. Since this book is planned to serve as a single-SIP RFC specification, I had very limited freedom to change the text of the original RFCs aside from some editorial changes. I have used texts, figures, tables, and references from the original RFCs as much as necessary so that readers can use them in their original form. All RFCs, along with their authors, are provided as references, and all credit goes primarily to the authors of these RFCs and the many IETF working group members who shaped the final RFCs with their invaluable comments and input. In this connection, I also extend my sincere thanks to Paul Brigner, IETF Secretariat, for his kind consent to reproduce text, figures, and tables with IETF copyright notification. My only credit, as I mentioned earlier, is to put all those RFCs together in such a way that will make one complete SIP RFC.

I have organized this book into 20 chapters based on the major functionalities, features, and capabilities of SIP, as follows:

◾◾ Chapter 1: Networked Multimedia Services
◾◾ Chapter 2: Basic Session Initiation Protocol
◾◾ Chapter 3: SIP Message Elements
◾◾ Chapter 4: Addressing in SIP
◾◾ Chapter 5: SIP Event Framework and Packages
◾◾ Chapter 6: Presence and Instant Messaging in SIP
◾◾ Chapter 7: Media Transport Protocol and Media Negotiation
◾◾ Chapter 8: DNS and ENUM in SIP
◾◾ Chapter 9: Routing in SIP
◾◾ Chapter 10: User and Network-Asserted Identity in SIP
◾◾ Chapter 11: Early Media in SIP
◾◾ Chapter 12: Service and Served-User Identity in SIP
◾◾ Chapter 13: Connections Management and Overload Control in SIP
◾◾ Chapter 14: Interworking Services in SIP
◾◾ Chapter 15: Resource Priority and Quality of Service in SIP
◾◾ Chapter 16: Call Services in SIP
◾◾ Chapter 17: Media Server Interfaces in SIP
◾◾ Chapter 18: Multiparty Conferencing in SIP
◾◾ Chapter 19: Security Mechanisms in SIP
◾◾ Chapter 20: Privacy and Anonymity in SIP

However, presenting the SIP RFCs chronologically is not the only way to group them. My best intellectual instinct has guided me in arranging these RFCs according to basic SIP functionalities; however, the much more complex intelligent capabilities of SIP are yet to be included. I am looking forward to seeing whether readers will validate my judgment. In addition, I am providing a general statement for the IETF copyright information, as follows:

    IETF RFCs have texts that are mandatory and optional including the use of words like shall, must, may, should, and recommended. These texts are very critical for providing interoperability for implementation in using products from different vendors as well as for intercarrier communications. The main objective of this book is, as explained earlier, to create a single integrated SIP RFC. The text has been reproduced from the IETF RFCs for providing interoperability, with permission from the IETF, in Chapters 2 through 20. The copyright for the text that is being reproduced (with permission) in the different sections and subsections of this book belongs to the IETF. It is recommended that readers consult the original RFCs posted on the IETF website.

I am greatly indebted to many researchers, professionals, software and product developers, network designers, professors, intellectuals, and individual authors and contributors of technical standard documents, drafts, and RFCs worldwide for learning from their high-quality technical papers and discussions in group meetings, conferences, and e-mails in working groups for more than two decades. In addition, I had the privilege to meet many of those great souls in person during the MMCF, ITU-T, IETF, and other technical standard conferences held in different countries of the world. Their unforgettable personal touch has enriched my heart very deeply as well.

I admire Richard O’Hanley, publisher, ICT, Business, and Security, CRC Press, for his appreciative approach in publishing this book. I am thankful to Adel Rosario, project manager, for her sincere proofing of this book and helping in a variety of ways, and to Tara Nieuwesteeg, CRC project editor, for overseeing the production process.
Author
Radhika Ranjan Roy has been an electronics engineer, US Army Research, Development and Engineering Command (RDECOM), Communications–Electronics Research, Development, and Engineering Center (CERDEC), Space and Terrestrial Communications Directorate (S&TCD) Laboratories, Aberdeen Proving Ground (APG), Maryland, since 2009. Dr. Roy leads his research and development efforts in the development of scalable large-scale SIP-based VoIP/multimedia networks and services, mobile ad hoc networks (MANETs), peer-to-peer (P2P) networks, cyber security detection application software and network vulnerability, jamming detection, and supporting array of US Army/Department of Defense’s Nationwide and Worldwide Warfighter Networking Architectures and participating in technical standards development in multimedia/real-time services collaboration, IPv6, radio communications, enterprise services management, and information transfer of Department of Defense (DoD) technical working groups. He earned his PhD in electrical engineering with a major in computer communications from the City University of New York, New York, in 1984, and his MS in electrical engineering from Northeastern University, Boston, Massachusetts, in 1978. He earned his BS in electrical engineering from the Bangladesh University of Engineering and Technology, Dhaka, Bangladesh, in 1967.

Prior to joining CERDEC, Dr. Roy worked as the lead systems engineer at CACI, Eatontown, New Jersey, from 2007 to 2009, and developed the Army Technical Resource Model (TRM), Army Enterprise Architecture (AEA), DoD Architecture Framework (DoDAF), and Army LandWarNet (LWN) Capability Sets, as well as technical standards for the Joint Tactical Radio System (JTRS), Mobile IPv6, MANET, and Session Initiation Protocol (SIP) supporting Army Chief Information Officer (CIO)/G-6. Dr. Roy worked as a senior systems engineer, SAIC, Abingdon, Maryland, from 2004 to 2007, supporting modeling, simulations, architectures, and system engineering of many Army projects: WIN-T, FCS, and JNN.

During his career, Dr. Roy worked at AT&T/Bell Laboratories, Middletown, New Jersey, as senior consultant from 1990 to 2004, and led a team of engineers in designing AT&T’s worldwide SIP-based VoIP/multimedia communications network architecture, which consisted of wired and wireless parts, from the preparation of Requests for Information (RFI), evaluation of vendor RFI responses, and interactions with all selected major vendors related to their products. He participated and contributed in the development of VoIP/H.323/SIP multimedia standards in ITU-T, IETF, ATM, and Frame Relay standard organizations.

Dr. Roy worked as a senior principal engineer in CSC, Falls Church, Virginia, from 1984 to 1990, and worked in the design and performance analysis of the US Treasury nationwide X.25 packet-switching network. In addition, he designed the network architectures of many proposed US government and commercial worldwide and nationwide networks: Department of State Telecommunications Network (DOSTN), US Secret Service Satellite Network, Veteran Communications Network, and Ford Company’s Dealership Network. Prior to CSC, he worked from 1967 to 1977 as deputy director (design) in PDP, Dhaka, Bangladesh.

Dr. Roy’s research interests include the areas of mobile ad hoc networks, multimedia communications, peer-to-peer networking, and quality of service. He has published more than 50 technical papers and either holds or has pending more than 30 patents. He is a life member of IEEE and is a member of the Eta Kappa Nu honor society. He is also a member of many IETF working groups. Dr. Roy authored a book, Handbook of Mobile Ad Hoc Networks for Mobility Models, Springer, in 2010.
Chapter 1
Networked Multimedia Services
videoconferencing (VC) are considered RT services because of real-time two-way, point-to-point/multipoint conversations between users, and the audio and video performance requirements can be stated as follows [1]:

◾◾ One-way end-to-end delay (including propagation, network, and equipment) for audio or video should be between 100 and 150 ms.
◾◾ Mean-opinion-score (MOS) level for audio should be between 4.0 and 5.0.
◾◾ MOS level for video should be between 3.5 and 5.0.
◾◾ End-to-end delay jitter should be very short, less than 250 μs in some cases.
◾◾ Bit error rate (BER) should be very low for good quality audio or video, although some BER can be tolerated.
◾◾ Intermedia and intramedia synchronization need to be maintained using suitable algorithms.
◾◾ Differential delay between audio and video transmission should be between −20 ms and +40 ms for maintaining proper intermedia synchronization.

One-way VOD [2], which is considered a near-RT communication, can have much less stringent performance requirements than those of TC or VTC. Text and graphics are non-RT applications, and their one-way delay requirement can be on the order of a few seconds; however, unlike audio or video, they cannot tolerate any bit errors.

The synchronization requirements between different media of multimedia applications impose a heavy burden on the multimedia transport networks, especially packet networks such as Internet Protocol (IP) networks. RT applications are also considered live multimedia applications with the generation of live audio, video, and/or data from live sources of microphones, video cameras, and/or application sharing by human/machine, while near-RT applications are usually retrieved from databases and can be considered as retrieval multimedia applications. Consequently, the synchronization requirements between RT and near-RT applications are also significantly different. The transmission side of the RT applications does not require much control, while near-RT applications must have some defined relationships between media and require some scheduling mechanisms for guaranteed synchronization between the retrieved and transmitted media. The end-to-end delay requirements for RT applications are more stringent than for near-RT applications. RT applications require synchronization between media generated by different live sources, although it may not be so common. However, near-RT applications are commonly retrieved from different multimedia servers and are presented to users synchronously.

In general, multimedia synchronization can deal with many aspects: temporal, spatial, or even a logical relationship between objects, data entities, or media streams [3]. In the context of multimedia computing and communications, the synchronization accuracy is critical for high-quality multimedia applications and can be measured by the following performance parameters: delay, delay jitter, intermedia skew, and tolerable error rate. In RT applications, delay is measured in end-to-end delay from the live source to the destination, while in near-RT applications, delay is measured in retrieval time, i.e., the delay from the time a request is made to the time the application is retrieved from the server and reaches the destination. This implies that excessive buffering should not be used in RT applications. Delay jitter measures the deviation of presentation time of the continuous media samples from their fixed or desired presentation time. Intermedia skew measures the time shift between related media from the desired temporal relationship. The acceptable value for intermedia skew is determined by the media types concerned. Table 1.1 shows some examples of intermedia skew tolerance [1–7].

However, the implementation of multimedia synchronization services can be done using a variety of techniques depending on the mode of application. Each synchronization implementation model can be quite complex [8]. As mentioned earlier, continuous media can tolerate some errors and error-free transmission is not essential to achieve acceptably good quality; moreover, the tolerable error rate measures the allowable BER and packet error rate (PER) for a particular medium in a specific application. Multimedia traffic can be very bursty, and the burstiness can vary from 0.1 to 1. If a constant bit rate (CBR) audio or video codec is used, there will be no variation in the bit rate of the codec and the burstiness will be 1. The multimedia call duration can vary from a few seconds to a few hours.

1.4 Summary

The signaling (e.g., session initiation protocol/session description protocol) and media (e.g., real-time transport protocol) protocols dealing with the networked multimedia services need to be carefully designed and developed to meet these performance requirements. In this context, the characteristics of RT, near-RT, and non-RT services, especially focusing on their performances, are described. In the subsequent chapters, we will describe all the multimedia signaling and media protocols, and how the technical challenges of multimedia communications services are met.

PROBLEMS

1. What are the key differences in performances between networked and non-networked multimedia applications?
2. What are intramedia and intermedia synchronization? Why are they important for each category of multimedia services?
3. What is lip synchronization? What are the key technical challenges in maintaining lip synchronization in VTC/VC?
4. What are the performance differences between RT, near-RT, and non-RT multimedia services?
5. What are the problems caused by packet losses in TC, VTC/VC, and file transfer services?
6. What are the problems caused by delay jitters in TC and VTC/VC applications?

References

1. Roy, R.R., “Networking constraints in multimedia conferencing and the role of ATM networks,” AT&T Technical Journal, vol. 73, no. 4, 1994.
2. Roy, R.R. et al., “An analysis of universal multimedia switching architectures,” AT&T Technical Journal, vol. 73, no. 6, 1994.
3. Georganas, N.D. et al., Editors, “Synchronization issues in multimedia communications,” IEEE SAC, vol. 14, no. 1, 1996.
4. Fluckiger, F., Understanding Networked Multimedia: Applications and Technology, Prentice Hall, Hertfordshire, UK.
5. Steinmetz, R. et al., “Multimedia synchronization techniques: Experience based on different system structures,” Computer Communication Review, vol. 22, no. 1, 1992.
6. Hehmann, D. et al., “Transport services for multimedia applications on broadband networks,” Computer Communications, vol. 13, no. 4, 1990.
7. Ghinea, G. et al., “Perceived synchronization of olfactory multimedia,” IEEE Transactions on Systems Man and Cybernetics, Part A: Systems and Humans, vol. 40, no. 4, 2010.
8. Blakowski, G. and Steinmetz, R., “A media synchronization survey: Reference model, specification, and case studies,” IEEE SAC, vol. 14, no. 1, 1996.
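
Worked example

The synchronization parameters described in this chapter (delay jitter, intermedia skew, and burstiness) can be illustrated with a short Python sketch. It is not taken from the cited references. The function names, the sample timestamps, the sign convention for skew (positive when audio is presented after the matching video), and the reading of burstiness as a mean-to-peak bit rate ratio are all assumptions made for illustration. The only figure taken from the text is the −20 ms to +40 ms differential delay window.

```python
from statistics import mean

# Differential audio/video delay window quoted in this chapter (milliseconds).
SKEW_WINDOW_MS = (-20.0, 40.0)


def delay_jitter_ms(desired_ms, actual_ms):
    """Deviation of each sample's presentation time from its desired time."""
    return [a - d for d, a in zip(desired_ms, actual_ms)]


def intermedia_skew_ms(audio_ms, video_ms):
    """Time shift between related audio and video samples.

    Assumed sign convention: positive means audio is presented after video.
    """
    return [a - v for a, v in zip(audio_ms, video_ms)]


def within_skew_window(skews_ms, window=SKEW_WINDOW_MS):
    """True when every skew value lies inside the tolerated window."""
    low, high = window
    return all(low <= s <= high for s in skews_ms)


def burstiness(bit_rates):
    """Mean-to-peak bit rate ratio (assumed definition; equals 1 for CBR)."""
    return mean(bit_rates) / max(bit_rates)


if __name__ == "__main__":
    desired = [0.0, 20.0, 40.0, 60.0]   # hypothetical 20 ms presentation schedule
    audio = [1.0, 20.5, 42.0, 60.0]     # hypothetical audio presentation times
    video = [0.0, 30.0, 45.0, 95.0]     # hypothetical video presentation times

    print("audio jitter (ms):", delay_jitter_ms(desired, audio))
    skews = intermedia_skew_ms(audio, video)
    print("skew (ms):", skews, "in window:", within_skew_window(skews))
    print("burstiness CBR:", burstiness([64, 64, 64]))
    print("burstiness VBR:", burstiness([10, 100, 40]))
```

In this hypothetical trace the last video sample lags its schedule by 35 ms, so the skew falls outside the assumed window and the check fails, while the constant-rate series yields a burstiness of 1, as the text states for CBR codecs.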
Chapter 2
Basic Session Initiation Protocol
Address of record (AOR) A Session Initiation Protocol (SIP) or SIP Security (SIPS) Uniform Resource Identifier (URI)
that points to a domain with a location service that can map the URI to another URI where
the user might be available. Typically, the location service is populated through
registrations. An AOR is frequently thought of as the public address of the user (RFC 3261:
Standards Track).
Advertised address The address that occurs in the Via header field’s sent-by production rule, including the port
number and transport (RFC 5923: Standards Track).
Agent The protocol implementation involved in the offer–answer exchange. There are two agents
involved in an offer–answer exchange (RFC 3264: Standards Track).
Alias Reusing an existing connection to send requests in the backwards direction; that is, A
opens a connection to B to send a request, and B uses that connection to send requests in
the backwards direction to A. This is also known as connection reuse (RFC 5923: Standards
Track).
Answer An SDP message sent by an answerer in response to an offer received from an offerer (RFC
3264: Standards Track).
Answerer An agent that receives a session description from another agent describing aspects of
desired media communication, and then responds to that with its own session
description.
Appearance number A positive integer associated with one or more dialogs of an AOR. Appearance numbers
are managed by an appearance agent, and displayed and rendered to the user by UAs that
support this specification. When an appearance number is assigned or requested,
generally the assigned number is the smallest positive integer that is not currently
assigned as an appearance number to a dialog for this AOR. This specification does not
define an upper limit on appearance numbers; however, using appearance numbers that
are not easily represented using common integer representations is likely to cause
failures (RFC 7463: Standards Track).
Authoritative proxy A proxy that handles non-REGISTER requests for a specific AOR, performs the logical
location server lookup described in RFC 3261, and forwards those requests to specific
Contact URIs. In RFC 3261, the role that is authoritative for REGISTER requests for a
specific AOR is a registration server (RFC 5626: Standards Track).
Back-to-back user agent (B2BUA) A logical entity that receives a request and processes it as a user agent server (UAS). To
determine how the request should be answered, it acts as a user agent client (UAC) and
generates requests. Unlike a proxy server, it maintains the dialog state and must participate in
all requests sent on the dialogs it has established. Since it is a concatenation of a UAC and
UAS, no explicit definitions are needed for its behavior (RFC 3261: Standards Track).
Call An informal term that refers to some communication between peers, generally set up for
the purposes of a multimedia conversation (RFC 3261: Standards Track).
Call leg Another name for a dialog (see below, this table), specified in RFC 2543 but no longer
used in the RFC 3261 specification (RFC 3261: Standards Track).
Call stateful A proxy is call stateful if it retains the state for a dialog from the initiating INVITE to the
terminating BYE request. A call stateful proxy is always transaction stateful, but the
converse is not necessarily true (RFC 3261: Standards Track).
Callee A destination of the original call, and a target of the Completion of Call (CC call) (RFC 6910:
Standards Track).
Callee’s monitor A logical component that implements the CC queue for destination user(s)/UA(s) and
performs the associated tasks, including sending CC recall events, analogous to the
destination local exchange’s role in Signaling System 7 (SS7) CC (RFC 6910: Standards
Track).
Caller Within the context of this specification, a caller refers to the user on whose behalf a UAC is
operating. It is not limited to a user whose UAC sends an INVITE request (RFC 3841:
Standards Track).
The initiator of the original call and the CC request. The user on whose behalf the CC call
is made (RFC 6910: Standards Track).
Caller’s agent A logical component that makes CC requests and responds to CC recall events on behalf of
originating user(s)/UA(s), analogous to the originating local exchange’s role in SS7 CC
(RFC 6910: Standards Track).
CC call A call from the caller to the callee, triggered by the CC service when it has determined that
the callee is available (RFC 6910: Standards Track).
Certificate A public key infrastructure using X.509 (PKIX) (RFC 5280) style certificate containing a public
key and a list of identities in the SubjectAltName that are bound to this key. The certificates
discussed in this document are generally self-signed and use the mechanisms in
the SIP Identity (RFC 4474, see Section 19.4.8) specification to vouch for their validity.
Certificates that are signed by a certification authority can also be used with all the
mechanisms in this document; however, they need not be validated by the receiver
(although the receiver can validate them for extra assurance) (RFC 6072: Standards Track).
Client Any network element that sends SIP requests and receives SIP responses. Clients may or
may not interact directly with a human user. UACs and proxies are clients (RFC 3261:
Standards Track).
Completion of Call (CC) activation The indication by the caller to the caller’s agent that the caller desires CC for a failed
original call; this implies an indication transmitted from the caller’s agent to the callee’s
monitor of the desire for CC processing (RFC 6910: Standards Track).
Completion of Call (CC) indicator An indication in the CC call INVITE used to prioritize the call at the destination (RFC 6910:
Standards Track).
Completion of Call (CC) possible indication The data in responses to the INVITE of the original call that indicate that CC is available for
the call (RFC 6910: Standards Track).
Completion of Call (CC) queue A buffer at the callee’s monitor that stores incoming calls that are targets for CC. Note: This
buffer may or may not be organized as a queue. The use of the term queue is analogous
to SS7 usage. CCE, or CC Entity: the representation of a CC request, or, equivalently, an
existing CC subscription within the queue of a callee’s monitor (RFC 6910: Standards
Track).
Completion of Call (CC) recall The action of the callee’s monitor selecting a particular CC request for initiation of a CC
call, resulting in an indication from the caller’s agent to the caller that it is now possible to
initiate a CC call (RFC 6910: Standards Track).
Completion of Call (CC) recall events Event notifications of event package call-completion, sent by the callee’s monitor to the
caller’s agent to inform it of the status of its CC request (RFC 6910: Standards Track).
Completion of Call (CC) request Recall timer: the maximum time the callee’s monitor will wait for the caller’s response to a
CC recall (RFC 6910: Standards Track).
The entry in the callee’s monitor queue representing the caller’s request for CC processing,
that is, the caller’s CC subscription (RFC 6910: Standards Track).
CC service duration timer: maximum time a CC request may remain active within the
network (RFC 6910: Standards Track).
Completion of Calls (CC) A service that allows a caller who failed to reach a desired callee to be notified when the
callee becomes available to receive a call (RFC 6910: Standards Track).
Completion of Calls on No Reply (CCNR) A CC service provided when the initial failure was that the destination UA did not answer
(RFC 6910: Standards Track).
Completion of Calls on Not Logged-in (CCNL) A CC service provided when the initial failure was that the destination UA was not
registered (RFC 6910: Standards Track).
Completion of Calls to Busy Subscriber (CCBS) A CC service provided when the initial failure was that the destination UA was busy (RFC
6910: Standards Track).
Conference A multimedia session (see below, this table) that contains multiple participants.
Core Designates the functions specific to a particular type of SIP entity, that is, specific to either
a stateful or a stateless proxy, a user agent (UA), or registrar. All cores, except those for the
stateless proxy, are transaction users (RFC 3261: Standards Track).
Credential The combination of a certificate and the associated private key. Password phrase: a
password used to encrypt and decrypt a Public Key Cryptographic System #8 (PKCS #8)
private key (RFC 6072: Standards Track).
Dialog A peer-to-peer SIP relationship between two UAs that persists for some time. A dialog is
established by SIP messages, such as a 2xx response to an INVITE request. A dialog is
identified by a call identifier, local tag, and a remote tag. A dialog was formerly known as a
call leg in RFC 2543 (RFC 3261: Standards Track).
Downstream A direction of message forwarding within a transaction that refers to the direction that
requests flow from the UAC to UAS (RFC 3261: Standards Track).
Edge proxy Any proxy that is located topologically between the registering UA and the authoritative
proxy. The first edge proxy refers to the first edge proxy encountered when a UA sends a
request (RFC 5626: Standards Track).
Event hard state The steady-state or default event state of a resource, which the ESC may use in the absence
of, or in addition to, soft-state publications (RFC 3903: Standards Track).
Event package An additional specification that defines a set of state information to be reported by a
notifier to a subscriber. Event packages also define further syntax and semantics that are
based on the framework defined by this document and are required to convey such state
information (RFC 6665: Standards Track).
Event Publication Agent (EPA) The UAC that issues PUBLISH requests to publish event state (RFC 3903: Standards Track).
Event Soft State Event state published by an EPA using the PUBLISH mechanism. A protocol element (i.e., an
entity-tag) is used to identify a specific soft-state entity at the event state compositor. Soft
state has a defined lifetime and will expire after a negotiated amount of time (RFC 3903:
Standards Track).
Event State State information for a resource, associated with an event package and an AOR (RFC 3903:
Standards Track).
Event State Compositor (ESC) The UAS that processes PUBLISH requests, and is responsible for compositing event state
into a complete, composite event state of a resource (RFC 3903: Standards Track).
Event Template-Package A special kind of event package that defines a set of states that may be applied to all
possible event packages, including itself (RFC 6665: Standards Track).
Explicit preference A caller preference indicated explicitly in the Accept-Contact or Reject-Contact header
fields (RFC 3841: Standards Track).
Failed call A call that does not reach a desired callee, from the caller’s point of view. Note that a failed
call may be successful from the SIP point of view, for example, if the call reached the
callee’s voice mail but the caller desired to speak to the callee in real time, the INVITE
receives a 200 response, but the caller considers the call to have failed (RFC 6910:
Standards Track).
Feature preferences Caller preferences that describe desired properties of a UA to which the request is to be
routed. Feature preferences can be made explicit with the Accept-Contact and Reject-
Contact header fields (RFC 3841: Standards Track).
Final response A response that terminates a SIP transaction, as opposed to a provisional response that
does not. All 2xx, 3xx, 4xx, 5xx, and 6xx responses are final (RFC 3261: Standards Track).
Flow A transport-layer association between two hosts that is represented by the network address
and port number of both ends and by the transport protocol. For Transmission Control
Protocol (TCP), a flow is equivalent to a TCP connection. For User Datagram Protocol (UDP), a
flow is a bidirectional stream of datagrams between a single pair of IP addresses and ports of
both peers. With TCP, a flow often has a one-to-one correspondence with a single file
descriptor in the operating system (RFC 5626: Standards Track).
Flow token An identifier that uniquely identifies a flow that can be included in a SIP URI defined in
RFC 3986 (RFC 5626: Standards Track).
Focus/conference focus The focus, as defined in RFC 4579, hosts a SIP conference and maintains a SIP signaling
relationship with each participant in the conference. RFC 4579 also states that an isfocus
feature tag (see Section 2.11) in a Contact header field will not cause interoperability
issues between a focus and a conference-unaware UA, since it will be treated as an
unknown header parameter and ignored, as per standard SIP behavior.
General
The main design guidelines for the development of SIP extensions and conventions for
conferencing are to define the minimum number of extensions and to have seamless
backward compatibility with conference-unaware SIP UAs. The minimal requirement for
SIP is being able to express that a dialog is a part of a certain conference referenced to by
a URI. As a result of these extensions, it is possible to do the following using SIP:
• Create a conference
• Join a conference
• Invite a user to a conference
• Expel a user by third party
• Discover if a URI is a conference URI
• Delete a conference
The approach taken is to use the feature parameter isfocus to express that a SIP dialog
belongs to a conference. The use of feature parameters in Contact header fields to
describe the characteristics and capabilities of a UA is described in the User Agent
Capabilities document in RFC 3840 (see Section 2.11), which includes the definition of the
isfocus feature parameter.
Session Establishment
In session establishment, a focus must include the isfocus feature parameter in the Contact
header field unless the focus wishes to hide the fact that it is a focus. To a participant, the
feature parameter will be associated with the remote target URI of the dialog. It is an
indication to a conference-aware UA that the resulting dialog belongs to a conference,
identified by the URI in the Contact header field, and that the call control conventions
defined in this document can be applied.
By their nature, the conferences supported by this specification are centralized. Therefore,
typically, a conferencing system needs to allocate a SIP conference URI such that SIP
requests to this URI are not forked and are routed to a dedicated conference focus. For
example, a globally accessible SIP conference could be well constructed with a conference
URI using a Globally Routable User Agent URI (GRUU) defined in RFC 5627 (see Section
4.3), because of its ability to support the nonforking and global routability requirements.
Discovery
Using the mechanism described in this section, it is possible, given an opaque URI, to
determine whether it belongs to a certain conference (i.e., whether it is a conference URI).
This discovery function can be implemented in SIP using an OPTIONS request, and
not. This discovery function can be implemented in SIP using an OPTIONS request, and
can be done either inside an active dialog or outside a dialog. A focus must include the
isfocus feature parameter in a 200 OK response to an OPTIONS unless the focus wishes to
hide the fact that it is a focus (RFC 4579: Best Current Practice).
Header A component of a SIP message that conveys information about the message. It is structured
as a sequence of header fields.
Header field A component of the SIP message header. A header field can appear as one or more header
field rows. Header field rows consist of a header field name and zero or more header field
values. Multiple header field values on a given header field row are separated by commas.
Some header fields can only have a single header field value, and as a result, always
appear as a single header field row (RFC 3261: Standards Track).
Header field value A single value; a header field consists of zero or more header field values (RFC 3261:
Standards Track).
Home domain The domain providing service to a SIP user. Typically, this is the domain present in the URI
in the AOR of a registration (RFC 3261: Standards Track).
Identity An Identity, for the purposes of this document, is a sip:, sips:, or tel: URI, and optionally a
Display Name. The URI must be meaningful to the domain identified in the URI (in the
case of sip: or sips: URIs) or the owner of the E.164 number (in the case of tel: URIs), in the
sense that when used as a SIP Request-URI in a request sent to that domain/number range
owner, it would cause the request to be routed to the user/line that is associated with the
identity, or to be processed by service logic running on that user’s behalf.
If the URI is a sip: or sips: URI, then depending on the local policy of the domain identified
in the URI, the URI may identify some specific entity, such as a person. If the URI is a tel:
URI, then depending on the local policy of the owner of the number range within which
the telephone number lies, the number may identify some specific entity, such as a
telephone line. However, it should be noted that identifying the owner of the number
range is a less straightforward process than identifying the domain that owns a sip: or sips:
URI (RFC 3324: Informational).
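The Header field entry earlier in this table says that a header field may appear as several rows and that multiple values on one row are separated by commas. The following is a minimal sketch of that rule, not a full SIP parser. The function names `split_header_values` and `collect_values` are hypothetical, and the sketch assumes that commas inside quoted strings or angle-bracketed URIs are not separators. Header fields whose grammar uses commas differently (the authentication header fields, for example) would need separate handling.

```python
def split_header_values(row_value: str) -> list[str]:
    """Split one header field row into its comma-separated values.

    Commas inside double-quoted strings or <angle-bracketed URIs> are kept.
    """
    values, current = [], []
    in_quotes = False
    in_angle = False
    escaped = False
    for ch in row_value:
        if escaped:
            # Character following a backslash inside a quoted string.
            current.append(ch)
            escaped = False
            continue
        if in_quotes and ch == "\\":
            current.append(ch)
            escaped = True
            continue
        if ch == '"' and not in_angle:
            in_quotes = not in_quotes
        elif ch == "<" and not in_quotes:
            in_angle = True
        elif ch == ">" and not in_quotes:
            in_angle = False
        if ch == "," and not in_quotes and not in_angle:
            values.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        values.append(tail)
    return values


def collect_values(rows: list[tuple[str, str]], name: str) -> list[str]:
    """Gather every value of one header field across all of its rows.

    Header field names are compared case-insensitively.
    """
    out = []
    for row_name, row_value in rows:
        if row_name.lower() == name.lower():
            out.extend(split_header_values(row_value))
    return out
```

Calling `collect_values` on two Route rows, one of which carries two values, yields a single list of three values in their original order, which is the logical view of the header field that the definition describes.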
Implicit preference A caller preference that is implied through the presence of other aspects of a request. For
example, if the request method is INVITE, it represents an implicit caller preference to
route the request to a UA that supports the INVITE method.
Informational response Same as a provisional response (see below, this table) (RFC 3261: Standards Track).
Initial session refresh The first session refresh request sent with a particular Call-ID value (RFC 4028: Standards
request Track).
Initiator, calling party, The party initiating a session (and dialog) with an INVITE request. A caller retains this role
caller from the time it sends the initial INVITE that established a dialog until the termination of
that dialog (RFC 3261: Standards Track).
Instance-id This specification uses the word instance-id to refer to the value of the sip.instance
media feature tag that appears as a +sip.instance Contact header field parameter. This is
a Uniform Resource Name (URN) that uniquely identifies this specific UA instance (RFC
5626: Standards Track).
Invitee, invited user, The party that receives an INVITE request for the purpose of establishing a new session. A
called party, callee callee retains this role from the time it receives the INVITE until the termination of the
dialog established by that INVITE (RFC 3261: Standards Track).
Location service A location service is used by a SIP redirect or proxy server to obtain information about a
callee’s possible location(s). It contains a list of bindings of AOR keys to zero or more
contact addresses. The bindings can be created and removed in many ways; this
specification defines a REGISTER method that updates the bindings (RFC 3261: Standards
Track).
Loop A request that arrives at a proxy, is forwarded, and later arrives back at the same proxy.
When it arrives the second time, its Request-URI is identical to the first time, and other
header fields that affect proxy operation are unchanged, so that the proxy would make the
same processing decision on the request it made the first time. Looped requests are
errors, and the procedures for detecting them and handling them are described by the
protocol (RFC 3261: Standards Track).
Loose routing A proxy is said to be loose routing if it follows the procedures defined in the RFC 3261
specification for processing of the Route header field. These procedures separate the
destination of the request (present in the Request-URI) from the set of proxies that need
to be visited along the way (present in the Route header field). A proxy compliant to these
mechanisms is also known as a loose router (RFC 3261: Standards Track).
Media stream From RTSP specified in RFC 2326 (see Section 7.5), a media stream is a single media
instance, for example, an audio stream or a video stream as well as a single whiteboard or
shared application group. In SDP, a media stream is described by an m= line and its
associated attributes (RFC 3264: Standards Track).
Message Data sent between SIP elements as part of the protocol. SIP messages are either requests
or responses (RFC 3261: Standards Track).
Method The method is the primary function that a request is meant to invoke on a server. The
method is carried in the request message itself. Example methods are INVITE and BYE
(RFC 3261: Standards Track).
Minimum Timer Because of the processing load of mid-dialog requests, all elements (proxy, UAC, UAS) can
have a configured minimum value for the session interval that they are willing to accept.
This value is called the minimum timer (RFC 4028: Standards Track).
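The Minimum Timer entry above can be illustrated with a short check. This is a sketch under stated assumptions: the names `Rejection` and `check_session_interval` are hypothetical, and the sketch models only the comparison against a configured minimum. The 422 (Session Interval Too Small) response and the Min-SE header field are taken from RFC 4028.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rejection:
    """What an element would send back when the interval is too small."""
    status_code: int
    reason: str
    min_se: int  # value to place in the Min-SE header field, in seconds


def check_session_interval(requested_interval: int, minimum_timer: int) -> Optional[Rejection]:
    """Compare a requested session interval against the configured minimum timer.

    Returns None when the interval is acceptable, otherwise a Rejection
    describing a 422 response that carries the element's minimum.
    """
    if requested_interval < minimum_timer:
        return Rejection(422, "Session Interval Too Small", minimum_timer)
    return None
```

A caller that receives such a rejection can retry with a session interval at least as large as the reported minimum.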
Network Asserted An identity derived by a SIP network entity as a result of an authentication process, which
Identity identifies the authenticated entity in the sense defined in Identity. In the case of a sip: or
sips: URI, the domain included in the URI must be within the Trust Domain. In the case of
a tel: URI, the owner of the E.164 number in the URI must be within the Trust Domain. The
authentication process used, or at least its reliability/strength, is a known feature of the
Trust Domain using the Network Asserted Identity mechanism; that is, in the language of
the Trust Domains entry (see below, this table), it is defined in Spec(T) (RFC 3324: Informational).
Notification The act of a notifier sending a NOTIFY request to a subscriber to inform the subscriber of
the state of a resource (RFC 6665: Standards Track).
Notifier A UA that generates NOTIFY requests for the purpose of notifying subscribers of the state
of a resource. Notifiers typically also accept SUBSCRIBE requests to create subscriptions
(RFC 6665: Standards Track).
The UA that generates NOTIFY requests for the purpose of notifying subscribers of the
callee’s availability; for the CC service, this is the task of the callee’s monitor (RFC 6910:
Standards Track).
ob Parameter A SIP URI parameter that has a different meaning depending on context. In a Path header
field value, it is used by the first edge proxy to indicate that a flow token was added to the
URI. In a Contact or Route header field value, it indicates that the UA would like other
requests in the same dialog to be routed over the same flow (RFC 5626: Standards Track).
Offerer An agent that generates a session description to create or modify a session (RFC 3264:
Standards Track).
Original call The initial call that failed to reach a desired destination (RFC 6910: Standards Track).
Outbound proxy A proxy that receives requests from a client, even though it may not be the server resolved
by the Request-URI. Typically, a UA is manually configured with an outbound proxy, or can
learn about one through autoconfiguration protocols (RFC 3261: Standards Track).
Outbound-proxy-set A set of SIP URIs that represent each of the outbound proxies (often edge proxies) with
which the UA will attempt to maintain a direct flow. The first URI in the set is often referred
to as the primary outbound proxy, and the second as the secondary outbound proxy. There
is no difference between any of the URIs in this set, nor does the primary/secondary
terminology imply that one is preferred over the other (RFC 5626: Standards Track).
Parallel search In a parallel search, a proxy issues several requests to possible user locations upon
receiving an incoming request. Rather than issuing one request and then waiting for the
final response before issuing the next request as in a sequential search, a parallel search
issues requests without waiting for the result of previous requests (RFC 3261: Standards
Track).
Persistent connection The process of sending multiple, possibly unrelated requests on the same connection, and
receiving responses on that connection as well. More succinctly, A opens a connection to
B to send a request, and later reuses the same connection to send other requests, possibly
unrelated to the dialog established by the first request. Responses will arrive over the
same connection. Persistent connection behavior is specified in RFC 3261 (see Section
3.13). Persistent connections do not imply connection reuse. The persistent connection is
also termed as the shared connection (RFC 5923: Standards Track).
Presence compositor A type of ESC that is responsible for compositing presence state for a presentity (RFC 3903:
Standards Track).
Provisional response A response used by the server to indicate progress, but that does not terminate a SIP
transaction. 1xx responses are provisional; other responses are considered final (RFC 3261:
Standards Track).
Proxy, proxy server An intermediary entity that acts as both a server and a client for the purpose of making
requests on behalf of other clients. A proxy server primarily plays the role of routing,
which means that its job is to ensure that a request is sent to another entity closer to the
targeted user. Proxies are also useful for enforcing policy (e.g., making sure a user is
allowed to make a call). A proxy interprets, and, if necessary, rewrites specific parts of a
request message before forwarding it (RFC 3261: Standards Track).
Public Service Identity A SIP URI that refers to a service instead of a user (RFC 5002: Informational).
Publication The act of an EPA sending a PUBLISH request to an ESC to publish event state (RFC 3903:
Standards Track).
Recipient URI The Request-URI of an outgoing request sent by an entity (e.g., a user agent or a proxy).
The sending of such a request may have been the result of a translation operation (RFC
5360: Standards Track).
Recursion A client recurses on a 3xx response when it generates a new request to one or more of the
URIs in the Contact header field in the response (RFC 3261: Standards Track).
Redirect server A UAS that generates 3xx responses to requests it receives, directing the client to contact
an alternate set of URIs (RFC 3261: Standards Track).
REFER-Issuer The UA issuing the REFER request (RFC 4488: Standards Track).
REFER-Recipient The UA receiving the REFER request (RFC 4488: Standards Track).
Reg-id The value of a new header field parameter for the Contact header field. When a UA
registers multiple times, each for a different flow, each concurrent registration gets a
unique reg-id value (RFC 5626: Standards Track).
Registrar A server that accepts REGISTER requests and places the information it receives in those
requests into the location service for the domain it handles (RFC 3261: Standards Track).
Regular transaction Any transaction with a method other than INVITE, ACK, or CANCEL (RFC 3261: Standards
Track).
Relay Any SIP server, be it a proxy, B2BUA, or some hybrid, that receives a request, translates its
Request-URI into one or more next-hop URIs (i.e., recipient URIs), and delivers the
request to those URIs (RFC 5360: Standards Track).
Request A SIP message sent from a client to a server, for the purpose of invoking a particular
operation (RFC 3261: Standards Track).
Request handling Caller preferences that describe desired request treatment at a server. These preferences
preferences are carried in the Request-Disposition header field (RFC 3841: Standards Track).
Resolved address The network identifiers (IP address, port, transport) associated with a UA as a result of
executing RFC 3263 (see Section 8.2.4) on a URI (RFC 5923: Standards Track).
Response A SIP message sent from a server to a client, for indicating the status of a request sent from
the client to the server (RFC 3261: Standards Track).
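Several entries on this page classify messages by simple rules: 1xx responses are provisional and all others are final, 3xx responses are what a client recurses on, and a regular transaction is any transaction whose method is not INVITE, ACK, or CANCEL. A minimal sketch of those rules follows. The function names are hypothetical, and the sketch assumes status codes in the range 100 to 699 and method tokens spelled exactly as they appear on the wire.

```python
NON_REGULAR_METHODS = frozenset({"INVITE", "ACK", "CANCEL"})


def response_kind(status_code: int) -> str:
    """Return 'provisional' for 1xx responses and 'final' for all others."""
    if not 100 <= status_code <= 699:
        raise ValueError(f"not a SIP status code: {status_code}")
    return "provisional" if status_code < 200 else "final"


def is_redirect(status_code: int) -> bool:
    """True for 3xx responses, the class a client may recurse on."""
    return 300 <= status_code <= 399


def is_regular_transaction(method: str) -> bool:
    """True when the method is anything other than INVITE, ACK, or CANCEL."""
    return method not in NON_REGULAR_METHODS
```

For instance, a 180 response is provisional and does not end the transaction, whereas a 302 is final and is also a redirect that a client may recurse on.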
Retain option A characteristic of the CC service; if supported, CC calls that again encounter a busy callee
will not be queued again, but the position of the caller’s entry in the queue is retained.
Note that SIP CC always operates with the retain option active; a failed CC call does not
cause the CC request to lose its position in the queue (RFC 6910: Standards Track).
Ringback The signaling tone produced by the calling party’s application indicating that a called party
is being alerted (ringing) (RFC 3261: Standards Track).
Route set A collection of ordered SIP or SIPS URIs that represent a list of proxies that must be
traversed when sending a particular request. A route set can be learned, through headers
like Record-Route, or it can be configured (RFC 3261: Standards Track).
Seizing An appearance can be reserved before a call is placed by seizing the appearance. An
appearance can be seized by communicating an artificial state of trying before actually
initiating a dialog (i.e., sending the INVITE), in order to appear as if it were already
initiating a dialog (RFC 7463: Standards Track).
Selecting (or not seizing) An appearance is merely selected (i.e., not seized) if there is no such communication of
artificial state of trying before initiating a dialog; that is, the state is communicated when
the dialog is actually initiated. The appearance number is learned after the INVITE is sent
(RFC 7463: Standards Track).
Sequential search In a sequential search, a proxy server attempts each contact address in sequence, proceeding
to the next only after the previous one has generated a final response. A 2xx or 6xx class final
response always terminates a sequential search (RFC 3261: Standards Track).
Server A network element that receives requests in order to service them and sends back
responses to those requests. Examples of servers are proxies, UASs, redirect servers, and
registrars (RFC 3261: Standards Track).
Session From the SDP specification (RFC 2327): “A multimedia session is a set of multimedia senders
and receivers and the data streams flowing from senders to receivers. A multimedia
conference is an example of a multimedia session.” (A session as defined for SDP can
comprise one or more RTP sessions.) As defined, a callee can be invited several times, by
different calls, to the same session. If SDP is used, a session is defined by the
concatenation of the SDP user name, session id, network type, address type, and address
elements in the origin field (RFC 3261: Standards Track).
Session expiration The time at which an element will consider the session timed out, if no successful session
refresh transaction occurs beforehand (RFC 4028: Standards Track).
Session interval The maximum amount of time that can occur between session refresh requests in a dialog
before the session will be considered timed out. The session interval is conveyed in the
Session-Expires header field, which is defined here. The UAS obtains this value from the
Session-Expires header field in a 2xx response to a session refresh request that it sends.
Proxies and UACs determine this value from the Session-Expires header field in a 2xx
response to a session refresh request that they receive (RFC 4028: Standards Track).
Session refresh request An INVITE or UPDATE request processed according to the rules of this specification. If the
request generates a 2xx response, the session expiration is increased to the current time
plus the session interval obtained from the response. A session refresh request is not to
be confused with a target refresh request defined in RFC 3261 (see below, this table), which is a
request that can update the remote target of a dialog (RFC 4028: Standards Track).
Signaling System 7 (SS7) The signaling protocol of the public switched telephone network, defined by ITU-T
Recommendations Q.700 through Q.849 (RFC 6910: Standards Track).
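The Session entry above states that, when SDP is used, a session is identified by the user name, session id, network type, address type, and address elements of the origin field. The sketch below extracts that identity from the `o=` line of an SDP body. It assumes the usual six-field layout of the origin line (user name, session id, session version, network type, address type, address). The names `SessionKey` and `session_key_from_sdp` are hypothetical.

```python
from typing import NamedTuple


class SessionKey(NamedTuple):
    """The origin-field elements that together identify a session."""
    username: str
    session_id: str
    net_type: str
    addr_type: str
    address: str


def session_key_from_sdp(sdp: str) -> SessionKey:
    """Build the session identity from the o= line of an SDP description.

    The session version is deliberately left out, because it changes when
    the description is modified while the session stays the same.
    """
    for line in sdp.splitlines():
        if line.startswith("o="):
            fields = line[2:].split()
            if len(fields) != 6:
                raise ValueError(f"malformed origin line: {line!r}")
            username, session_id, _version, net_type, addr_type, address = fields
            return SessionKey(username, session_id, net_type, addr_type, address)
    raise ValueError("SDP description has no o= line")
```

Two descriptions that differ only in the session version map to the same key, which matches the remark that a callee can be invited several times, by different calls, to the same session.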
SIP domain identity An identity of a domain (e.g., sip:example.com) that is contained in an X.509 certificate
bound to a subject that identifies the subject as an authoritative SIP server for a domain
(RFC 5922: Standards Track).
SIP transaction A SIP transaction occurs between a client and a server and comprises all messages from the
first request sent from the client to the server up to a final (non-1xx) response sent from
the server to the client. If the request is INVITE and the final response is a non-2xx, the
transaction also includes an ACK to the response. The ACK for a 2xx response to an INVITE
request is a separate transaction (RFC 3261: Standards Track).
Spec(T) An aspect of the definition of a Trust Domain is that all the elements in that domain are
compliant to a set of configurations and specifications generally referred to as Spec(T).
Spec(T) is not a specification in the sense of a written document; rather, it is an agreed-
upon set of information that all elements are aware of. Proper processing of the asserted
identities requires that the elements know what is actually being asserted, how it was
determined, and what the privacy policies are. All of that information is characterized by
Spec(T) (RFC 3324: Informational).
Spiral A spiral is a SIP request that is routed to a proxy, forwarded onwards, and arrives once
again at that proxy, but this time differs in a way that will result in a different processing
decision than the original request. Typically, this means that the request’s Request-URI
differs from its previous arrival. A spiral is not an error condition, unlike a loop. A typical
cause for this is call forwarding. A user calls joe@example.com. The example.com proxy
forwards it to Joe’s personal computer (PC), which in turn, forwards it to bob@example.com.
This request is proxied back to the example.com proxy. However, this is not a loop.
Since the request is targeted at a different user, it is considered a spiral, and is a valid
condition (RFC 3261: Standards Track).
Stateful proxy A logical entity that maintains the client and server transaction state machines defined by
this specification during the processing of a request, also known as a transaction stateful
proxy. A (transaction) stateful proxy is not the same as a call stateful proxy (RFC 3261:
Standards Track).
Stateless proxy A logical entity that does not maintain the client or server transaction state machines defined in
this specification when it processes requests. A stateless proxy forwards every request it
receives downstream and every response it receives upstream (RFC 3261: Standards Track).
Strict routing A proxy is said to be strict routing if it follows the Route processing rules of RFC 2543. That
rule caused proxies to destroy the contents of the Request-URI when a Route header field
was present. Strict routing behavior is not used in RFC 3261, in favor of a loose routing
behavior. Proxies that perform strict routing are also known as strict routers (RFC 3261:
Standards Track).
Subscriber A UA that receives NOTIFY requests from notifiers; these NOTIFY requests contain
information about the state of a resource in which the subscriber is interested.
Subscribers typically also generate SUBSCRIBE requests and send them to notifiers to
create subscriptions (RFC 6665: Standards Track).
The UA that receives NOTIFY requests with information about the callee’s availability; for the
Completion of Call (CC) service, this is the task of the caller’s agent (RFC 6910: Standards Track).
Suspended CC request A CC request that is temporarily not to be selected for CC recall (RFC 6910: Standards Track).
Subscription A set of application state associated with a dialog. This application state includes a pointer
to the associated dialog, the event package name, and possibly an identification token. Event
packages will define additional subscription state information. By definition, subscriptions
exist in both a subscriber and a notifier (RFC 6665: Standards Track).
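The Loop and Spiral entries in this table differ in one respect only: on the second arrival at the same proxy, a looped request is unchanged in everything that affects the proxy's decision, while a spiraled request differs (typically in its Request-URI). The sketch below illustrates just that definitional difference. It is not the loop detection procedure that RFC 3261 specifies, and the names `Arrival` and `classify_revisit` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arrival:
    """What a proxy saw when a request arrived."""
    request_uri: str
    # Header fields assumed (for this sketch) to affect the proxy's decision.
    routing_headers: tuple[tuple[str, str], ...] = ()


def classify_revisit(first: Arrival, second: Arrival) -> str:
    """Classify the second arrival of a request at the same proxy.

    Returns 'loop' when nothing relevant changed, so the proxy would repeat
    its earlier decision, and 'spiral' when the request now differs.
    """
    same_uri = first.request_uri == second.request_uri
    same_headers = dict(first.routing_headers) == dict(second.routing_headers)
    return "loop" if same_uri and same_headers else "spiral"
```

In the call-forwarding example from the Spiral entry, the first arrival targets `sip:joe@example.com` and the second targets `sip:bob@example.com`, so the revisit is classified as a spiral and is valid.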
Subscription migration The act of moving a subscription from one notifier to another notifier (RFC 6665: Standards
Track).
Subsequent session Any session refresh request sent with a particular Call-ID after the initial session refresh
refresh request request (RFC 4028: Standards Track).
Target refresh request A target refresh request sent within a dialog is defined as a request that can modify the
remote target of the dialog (RFC 3261: Standards Track).
Target set A set of candidate URIs to which a proxy or redirect server can send or redirect a request.
Frequently, target sets are obtained from a registration, but they need not be (RFC 3841:
Standards Track).
Target URI The Request-URI of an incoming request that arrives at a relay that will perform a
translation operation (RFC 5360: Standards Track).
Transaction user (TU) The layer of protocol processing that resides above the transaction layer. TUs include the
UAC core, UAS core, and proxy core (RFC 3261: Standards Track).
Translation logic The logic that defines a translation operation at a relay. This logic includes the translation’s
target and recipient URIs (RFC 5360: Standards Track).
Translation operation Operation by which a relay translates the Request-URI of an incoming request (i.e., the
target URI) into one or more URIs (i.e., recipient URIs) that are used as the Request-URIs
of one or more outgoing requests (RFC 5360: Standards Track).
Trust Domains A Trust Domain for the purposes of Network Asserted Identity is a set of SIP nodes (UAC,
UAS, proxies, or other network intermediaries) that are trusted to exchange Network
Asserted Identity information in the sense described below. A node can be a member of a
Trust Domain, T, only if the node is known to be compliant to a certain set of
specifications, Spec(T), which characterize the handling of Network Asserted Identity
within the Trust Domain, T.
Trust Domains are constructed by human beings who know the properties of the
equipment they are using/deploying. In the simplest case, a Trust Domain is a set of
devices with a single owner/operator who can accurately know the behavior of those
devices. Such simple Trust Domains may be joined into larger Trust Domains by bilateral
agreements between the owners/operators of the devices. A node is trusted (with respect
to a given Trust Domain) if and only if it is a member of that domain. We say that a node,
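A, in the domain is trusted by a node, B (or “B trusts A”), if and only if
• There is a secure connection between the nodes, and
• B has configuration information indicating that A is a member of the Trust Domain.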
Note that B may or may not be a member of the Trust Domain. For example, B may be a UA
that trusts a given network intermediary, A (e.g., its home proxy). A secure connection in
this context means that messages cannot be read by third parties, cannot be modified by
third parties without detection, and that B can be sure that the message really did come
from A. The level of security required is a feature of the Trust Domain; that is, it is defined
in Spec(T). Within this context, SIP signaling information received by one node from a
node that it trusts is known to have been generated and passed through the network
according to the procedures of the particular specification set Spec(T), and therefore can
be known to be valid, or at least as valid as specified in the specifications Spec(T).
Equally, a node can be sure that signaling information passed to a node that it trusts will be
handled according to the procedures of Spec(T). For these capabilities to be useful,
Spec(T) must contain requirements as to how the Network Asserted Identity is generated,
how its privacy is protected, and how its integrity is maintained as it is passed around the
network. A reader of Spec(T) can then make an informed judgment about the authenticity
and reliability of Network Asserted Information received from the Trust Domain T. The
term trusted (with respect to a given Trust Domain) can be applied to a given node in an
absolute sense—it is just equivalent to saying the node is a member of the Trust Domain.
However, the node itself does not know whether another arbitrary node is trusted, even
within the Trust Domain. It does know about certain nodes with which it has secure
connections as described above (RFC 3324: Informational).
UAC core The set of processing functions required of a UAC that reside above the transaction and
transport layers (RFC 3261: Standards Track).
UAS core The set of processing functions required at a UAS that reside above the transaction and
transport layers (RFC 3261: Standards Track).
Upstream A direction of message forwarding within a transaction that refers to the direction that
responses flow from the UAS back to the UAC (RFC 3261: Standards Track).
User agent (UA) A logical entity that can act as both a UAC and UAS (RFC 3261: Standards Track).
User agent client (UAC) A logical entity that creates a new request, and then uses the client transaction state
machinery to send it. The role of UAC lasts only for the duration of that transaction. In
other words, if a piece of software initiates a request, it acts as a UAC for the duration of
that transaction. If it receives a request later, it assumes the role of a user agent server for
the processing of that transaction (RFC 3261: Standards Track).
User agent server (UAS) A logical entity that generates a response to a SIP request. The response accepts, rejects, or
redirects the request. This role lasts only for the duration of that transaction. In other
words, if a piece of software responds to a request, it acts as a UAS for the duration of that
transaction. If it generates a request later, it assumes the role of a UAC for the processing
of that transaction (RFC 3261: Standards Track).
Wildcarded Public A set of Public Service Identities that match a regular expression and share the same profile
Service Identity (RFC 5002: Informational).
applications, as shown in Figure 2.1. The establishment of methods among the call participants. Even multimedia files
a real-time multimedia communications session, especially may be shared or created through a collaboration among
with humans, needs a lot of intelligence. conference participants. In addition, each media has its own
Even an automat can be a conference participant. Multiple quality of service (QOS) requirements, and the QOS for
users located in different geographical locations may partici- each media needs to be guaranteed during the call setup if
pate in the same session. Each participant may play a differ- that is what the conference participants expect per service
ent role in the conference based on the conference policy. level agreement (SLA). Multimedia session security, which
Multimedia communications may consist of different kinds includes authentication, integrity, confidentiality, nonre-
of media, and each media needs to be negotiated with each pudiation, and authorization, is paramount for conference
participant before the establishment of the call. Different participants in relation to both signaling and media (audio,
codecs may be used for audio or video by each participant, video, and data/application sharing).
and negotiations may require agreeing on a common codec, Early media (e.g., audio, video) is another feature that
or transcoding services may need to be offered for dissimilar is used to indicate the progress of the multimedia ses-
codecs. Application sharing may require a variety of control sion before the call is accepted by the called party. It may
18 ◾ Handbook on Session Initiation Protocol
Transport layer
Network layer
Physical layer
be unidirectional or bidirectional, and can be generated by a conference recording server to listen to and may see what
the calling party, the called party, or both. For example, already had happened while the participant was not in the
early media generated by the caller can be ringing tone and conference. Even high-quality video with an eye-contact fea-
announcements (e.g., queuing status). Early media typically ture during the meeting may be needed in case human per-
may consist of voice commands or dual-tone multifrequency sonal interactions need to be known among the conference
(DTMF) tones to drive interactive voice response (IVR) participants as if they are in a face-to-face meeting.
systems. Early media cannot be declined, modified, or iden- The multimedia session establishment and tear down also
tified. Consequently, it becomes very problematic to accom- needs to support both user and terminal mobility. In case of
modate all complex functionalities of the early media in the user mobility, the session establishment mechanisms will have
call setup. to deal with the recent address where a user has moved, and
If more than two participants join in the conference it is the user who will work proactively to update one’s recent
call, media bridging is required. Although audio bridging address. The conferencing system and the session establish-
is straightforward, video bridging can be from simple func- ment mechanisms should have in-built schemes to deal with
tionalities like video switching of the loudest speaker to the composition of very complex composite video of the conference participants maintaining audio and video intermedia synchronization so that lip synchronization can be maintained. It may so happen that the multipoint multimedia conference can be set up dynamically. For example, two participants are on a point-to-point conference call, and then a third party or more conference participants need to be added in the same conference call. In this situation, the call needs to be diverted to a conference bridge for bridging of audio, video, or data dynamically without tearing down the two-party conference call.

More complex functionalities like a virtual meeting room may be introduced to make the multimedia conferencing more resourceful and powerful. For example, there may be opportunities to join in a virtual meeting room for conferencing, while there may be many virtual meeting rooms available where many other conferences can be going on simultaneously, each separate and independent. A participant may even have the option to be a part of multiple conferences simultaneously while coming in and out of each conference at different instants of time. Even a conference participant joining late may also dial-in

this. In the case of terminal mobility, the terminal itself may break and reestablish the network point of attachment as it moves from one place to another while the session is going on. However, a session object resides in the application layer in Open Systems Interconnection (OSI) terminology and may not be aware of the lower network layer’s change in point of attachment. It is expected that it is the lower-layer protocols, such as the transport layer, network layer, and media access control layer, and the physical-layer protocols that will work transparently, mitigating any changes in the network point of attachment. However, multimedia service portability, which demands the end user’s ability to obtain subscribed services in a transparent manner regardless of the end user’s point of attachment to the network, needs transparent support in the application-layer session establishment.

A host of new service features related to the same call may also need to be satisfied even after the session establishment. Subconferencing, side bars, call transfer, call consultation transfer, call conference out of consultation, call diversion, call hold, call parking, call completion services for unsuccessful calls, prepaid call services, invoking of new applications, and many others are some examples of this category of services. The signaling protocol for the session
establishment needs to have all the intelligence to satisfy the variety of requirements of the conference participants, such as how each media/application will be sent, received, and shared. In addition, the multimedia signaling protocol needs to have the ability to offer many other services to the conference participants within the same session even long after the establishment of the call. Because of the complexity of the real-time multimedia conversational services, the session establishment signaling messages have to be separated from the media (audio, video, and data/application sharing). The paths that carry signaling messages, audio, video, and data/application sharing can be completely independent.

The emergence of a new kind of feature-rich web-based communications for application visualization and sharing has enriched users’ experiences over the Internet. It implies that the Internet Protocol (IP) has not only emerged as the universal communications protocol of choice, over which audio, video, and data will be transferred on a single network; users also want to keep the same communications experience over the Internet for audio- and video-conferencing services with application sharing. These criteria have made web service applications an integral part of application sharing for audio and video conferencing. A secondary consequence of this has been the popularity among developers of using text encoding for multimedia conferencing services, because text encoding has been used for web services for its simplicity to debug, modify, and integrate with many existing applications.

The SIP has been standardized in the IETF as the call control protocol for the establishment of multimedia conversational sessions between conference participants. The SIP has embraced the simplicity of web-based communications protocol architecture as well as of text encoding. The SIP is a very attractive protocol for multimedia session establishment for time- and mission-critical point-to-point conference calls because of its human-understandable text encoding of signaling messages and its use of a Hypertext Transfer Protocol (HTTP)/Hypertext Markup Language (HTML)-based, web-services-like protocol architecture that separates the signaling messages into two parts: header and body. This built-in capability of SIP has been used to create an enormous number of new application services, not only for time- and mission-critical conversational audio and video services but also for integration of the non-time-critical web services defined within the framework of service-oriented architecture (SOA), primarily as a part of the application sharing under the same audio- and video-conferencing session.

The building of multipoint multimedia conferencing services dynamically using SIP is very difficult, if not impossible, unless the present SIP architecture is drastically changed. One of the observations is that the SIP architecture is very weak in its conference negotiation capabilities in multipoint multimedia communications environments that require much more embedded intelligence in the signaling protocol architecture. For example, once a call is initially set up as a point-to-point two-party conference call, the SIP cannot be used to construct a multipoint conference call dynamically when a third party or more users join in the same conference call that needs media bridging. As a result, knowing the address of the conference server a priori, a centralized multipoint conference with a star-like connectivity architecture is set up, where the conference is established between each of the multipoint conference participants and the conference server in a point-to-point fashion. In this respect, the SIP is an application-layer protocol that has the capability for establishing and tearing down point-to-point and multipoint-to-multipoint sessions using unicast or multicast communication environments. Being an application-layer protocol, the SIP can support both user and terminal mobility because application-layer mobility does not require any changes to the operating system of any of the participants, and thus can be deployed widely over other lower-layer protocols, including mobile IP, and can take care of mobility-like changes in the network point of attachment transparently.

2.4 Session Initiation Protocol

The SIP, as described earlier, is an application-layer multimedia signaling protocol that supports the establishment, management, and teardown of multimedia sessions between the conference participants, but does not provide services. SIP only performs these specific functions and relies heavily on other protocols to describe the media sessions, transport the media, and provide the QoS. Figure 2.2 shows the relationship between SIP and other protocols.

The SIP message consists of two parts: header and message body. The header is primarily used to route the signaling messages from the caller to the called party and contains a request line composed of the request type, the SIP Uniform Resource Identifier (URI) of the destination or next hop, and the version of SIP being used. The message body is optional depending on the type of message and where it falls within the establishment process. A blank line separates the header from the message-body part. If SIP invitations used to create sessions carry session descriptions that allow participants to agree on a set of compatible media types and compatible codecs, the message-body part will include all this information as described in the SDP.

It should be noted that the SIP application-layer protocol provides services in setting up the sessions between conference participants as directed by multimedia applications like teleconferencing (TC), video teleconferencing (VTC), video conferencing (VC), application sharing, and web conferencing. Only audio is used in TC; both audio and video are used in VTC; while audio, video, and application(s) sharing are
[Figure 2.2: Relationship between SIP and other protocols]
used in VC. Sometimes, application sharing may be used as a part of web services integrated with (or decoupled from) audio/video, and can be termed web conferencing. Chat conferencing deals with real-time text messaging between two or more parties.

SIP signaling messages can be sent using any transport protocol such as Transmission Control Protocol (TCP), User Datagram Protocol (UDP), or Stream Control Transmission Protocol (SCTP). Of course, IP can run over Point-to-Point Protocol (PPP), Ethernet, asynchronous transfer mode (ATM), dense wavelength division multiplexing (DWDM), Wi-Fi, time division multiple access (TDMA), code division multiple access (CDMA), orthogonal frequency division multiple access (OFDMA), worldwide interoperability for microwave access (WiMAX), or long-term evolution (LTE) wireless networking protocols. It may be worthwhile to mention that ATM can run over a synchronous optical network (SONET)/time division multiplexing (TDM) network that runs over fiber, and DWDM running over fiber increases the bandwidth by combining and transmitting multiple signals simultaneously at different wavelengths on the same fiber.

Different audio (e.g., International Telecommunication Union Telecommunication Standardization Sector [ITU-T] G-series) and video (e.g., Moving Picture Experts Group [MPEG], Joint Photographic Experts Group [JPEG], and ITU-T H-series) codecs are used in multimedia sessions by conference participants. The bit streams of each codec are carried over the RTP, which runs over UDP/IP. However, the common codec type either for audio or for video is negotiated by the SIP signaling messages that contain the information for each codec type that is proposed by the conference participants. SIP does not mandate any codec for any media that should be used by each participant if there is no common codec supported among the conferees. In this situation, the conferees may use transcoding services to prevent failures of the session establishment. Real-Time Transport Control Protocol (RTCP) is based on the periodic transmission of control packets to all participants in the session and provides feedback on the quality of the data (e.g., RTP packets of audio/video) distribution. This is an integral part of RTP’s role as a transport protocol and is related to the flow and congestion control functions of other transport protocols.

The Domain Name System (DNS) and Dynamic Host Configuration Protocol (DHCP) are the integral tools for IP address resolution and allocation, respectively, for routing of the SIP messages between the conference participants over the IP network. For example, a host can discover and contact a DHCP server to provide it with an IP address as well as the addresses of the DNS server and default router that can be used to route SIP messages over the IP network.

2.4.1 Augmented Backus–Naur Form for the SIP

The SIP uses augmented Backus–Naur Form (ABNF) for its messages. The syntaxes described here cover the SIP messages from the base SIP RFC 3261 and the RFCs that extend and update it. Certain basic rules are in uppercase, such as SP (space), LWS (linear white space), HTAB (horizontal tab), CRLF (carriage return line feed), DIGIT, ALPHA, etc. Angle brackets are used within definitions to clarify the use of rule names. The use of square brackets is redundant syntactically. It is used as a semantic hint that the specific parameter is optional to use.
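To make the basic rules concrete, the following minimal Python sketch uses CRLF and SP to split a raw SIP request into its request line, header fields, and message body, following the header/blank-line/body layout outlined in Section 2.4. The function name split_sip_request and the address sip:bob@example.com are hypothetical and chosen only for illustration. The sketch assumes one header field per line and ignores header folding and repeated header fields, so it is not a complete parser.

```python
CRLF = "\r\n"  # ABNF basic rule: CRLF = CR LF
SP = " "       # ABNF basic rule: SP (single space)


def split_sip_request(raw):
    """Split a raw SIP request into (request line, headers, body).

    Hypothetical helper for illustration only: it assumes one header
    field per line and does not handle folded or repeated headers.
    """
    # The first empty line (CRLF CRLF) separates the header part from the body.
    head, _, body = raw.partition(CRLF + CRLF)
    lines = head.split(CRLF)
    # Request line: request type SP SIP URI SP SIP version.
    method, request_uri, version = lines[0].split(SP)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return (method, request_uri, version), headers, body


# Example message (all values are made up for the illustration).
raw = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Max-Forwards: 70\r\n"
    "Content-Length: 5\r\n"
    "\r\n"
    "v=0\r\n"
)
start_line, headers, body = split_sip_request(raw)
```

Because the body is taken verbatim after the first blank line, its length can be compared directly with the Content-Length value.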
; Registered URNs and components thereof must be transmitted as registered
; (including case).
disp-param = handling-param/generic-param
handling-param = "handling" EQUAL
    ("optional"/"required"/other-handling)
other-handling = token
disp-extension-token = token
Content-Encoding = ("Content-Encoding"/"e") HCOLON
    content-coding *(COMMA content-coding)
Content-Language = "Content-Language" HCOLON
    language-tag *(COMMA language-tag)
language-tag = primary-tag *("-" subtag)
primary-tag = 1*8ALPHA
subtag = 1*8ALPHA
Content-Length = ("Content-Length"/"l") HCOLON 1*DIGIT
Content-Type = ("Content-Type"/"c") HCOLON media-type
media-type = m-type SLASH m-subtype *(SEMI m-parameter)
m-type = discrete-type/composite-type/access-type
; access-type from RFC 4483
access-type = "URL" ; URL from RFC 3986
discrete-type = "text"/"image"/"audio"/"video"
    /"application"/extension-token
composite-type = "message"/"multipart"/extension-token
extension-token = ietf-token/x-token
ietf-token = token
x-token = "x-" token
m-subtype = extension-token/iana-token
iana-token = token
m-parameter = m-attribute EQUAL m-value
m-attribute = token
m-value = token/quoted-string
CSeq = "CSeq" HCOLON 1*DIGIT LWS Method
Date = "Date" HCOLON SIP-date
SIP-date = rfc1123-date
rfc1123-date = wkday "," SP date1 SP time SP "GMT"
date1 = 2DIGIT SP month SP 4DIGIT
    ; day month year (e.g., 02 Jun 1982)
time = 2DIGIT ":" 2DIGIT ":" 2DIGIT
    ; 00:00:00 - 23:59:59
wkday = "Mon"/"Tue"/"Wed"
    /"Thu"/"Fri"/"Sat"/"Sun"
month = "Jan"/"Feb"/"Mar"/"Apr"
    /"May"/"Jun"/"Jul"/"Aug"
    /"Sep"/"Oct"/"Nov"/"Dec"

Error-Info = "Error-Info" HCOLON error-uri *(COMMA error-uri)
error-uri = LAQUOT absoluteURI RAQUOT *(SEMI generic-param)
Expires = "Expires" HCOLON delta-seconds
CRLF = CR LF
; CRLF from RFC 5626
double-CRLF = CR LF CR LF
; double-CRLF from RFC 5626
CR = %x0D
LF = %x0A
Flow-Timer = "Flow-Timer" HCOLON 1*DIGIT
; Flow-Timer from RFC 5626
contact-params =/ c-p-reg/c-p-instance
c-p-reg = "reg-id" EQUAL 1*DIGIT ; 1 to (2^31 - 1)
; The value of the reg-id must not be 0 and must be less than 2^31 (from RFC 5626)
c-p-instance = "+sip.instance" EQUAL
    DQUOTE "<" instance-val ">" DQUOTE
instance-val = 1*uric ; defined in RFC 3261
From = ("From"/"f") HCOLON from-spec
from-spec = (name-addr/addr-spec) *(SEMI from-param)
from-param = tag-param/generic-param
tag-param = "tag" EQUAL token
In-Reply-To = "In-Reply-To" HCOLON callid *(COMMA callid)
Max-Forwards = "Max-Forwards" HCOLON 1*DIGIT
MIME-Version = "MIME-Version" HCOLON 1*DIGIT "." 1*DIGIT
Min-Expires = "Min-Expires" HCOLON delta-seconds
Organization = "Organization" HCOLON [TEXT-UTF8-TRIM]
Priority = "Priority" HCOLON priority-value
priority-value = "emergency"/"urgent"/"normal"
    /"non-urgent"/other-priority
other-priority = token
Proxy-Authenticate = "Proxy-Authenticate" HCOLON challenge
challenge = ("Digest" LWS digest-cln *(COMMA digest-cln))
    /other-challenge
other-challenge = auth-scheme LWS auth-param
    *(COMMA auth-param)
digest-cln = realm/domain/nonce
    /opaque/stale/algorithm
    /qop-options/auth-param
realm = "realm" EQUAL realm-value
realm-value = quoted-string
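As an illustration of how the SIP-date (rfc1123-date) rule listed above can be applied, the following Python sketch translates the wkday, date1, time, and month rules into a regular expression and converts a matching Date value into a UTC timestamp. The names parse_sip_date and SIP_DATE_RE are hypothetical. The sketch follows the grammar only, so it does not check that the weekday name agrees with the calendar date.

```python
import re
from datetime import datetime, timezone

WKDAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

# rfc1123-date = wkday "," SP date1 SP time SP "GMT"
SIP_DATE_RE = re.compile(
    r"(?P<wkday>%s), (?P<day>\d{2}) (?P<month>%s) (?P<year>\d{4}) "
    r"(?P<hour>\d{2}):(?P<minute>\d{2}):(?P<second>\d{2}) GMT"
    % ("|".join(WKDAYS), "|".join(MONTHS))
)


def parse_sip_date(value):
    """Parse a SIP-date string into a timezone-aware datetime (UTC).

    Hypothetical helper: raises ValueError when the value does not
    follow the grammar or names an impossible calendar date.
    """
    match = SIP_DATE_RE.fullmatch(value)
    if match is None:
        raise ValueError("not a valid SIP-date: %r" % value)
    return datetime(
        int(match["year"]),
        MONTHS.index(match["month"]) + 1,  # month names map to 1..12
        int(match["day"]),
        int(match["hour"]),
        int(match["minute"]),
        int(match["second"]),
        tzinfo=timezone.utc,  # the grammar only allows "GMT"
    )


parsed = parse_sip_date("Sat, 13 Nov 2010 23:29:00 GMT")
```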
WWW-Authenticate = "WWW-Authenticate" HCOLON challenge
extension-header = header-name HCOLON header-value
header-name = token
header-value = *(TEXT-UTF8char/UTF8-CONT/LWS)
message-body = *OCTET
Accept-Contact = ("Accept-Contact"/"a") HCOLON
    ac-value *(COMMA ac-value)
; Accept-Contact from RFC 3841 (see Sections 2.8 and 9.9)
Reject-Contact = ("Reject-Contact"/"j") HCOLON
    rc-value *(COMMA rc-value)
ac-value = "*" *(SEMI ac-params)
rc-value = "*" *(SEMI rc-params)
ac-params = feature-param/req-param/
    explicit-param/generic-param
feature-param = enc-feature-tag
    [EQUAL LDQUOT (tag-value-list/string-value) RDQUOT]
; feature-param from RFC 3840
enc-feature-tag = base-tags/other-tags
base-tags = "audio"/"automata"/"class"/
    "duplex"/"data"/
    "control"/"mobility"/"description"/
    "events"/"priority"/"methods"/
    "schemes"/"application"/"video"/
    "language"/"type"/"isfocus"/
    "actor"/"text"/"extensions"
other-tags = "+" ftag-name
ftag-name = ALPHA *(ALPHA/DIGIT/"!"/"'"/"."/"-"/"%")
tag-value-list = tag-value *("," tag-value)
tag-value = ["!"] (token-nobang/boolean/numeric)
token-nobang = 1*(alphanum/"-"/"."/"%"/"*"
    /"_"/"+"/"`"/"'"/"~")
boolean = "TRUE"/"FALSE"
numeric = "#" numeric-relation number
numeric-relation = ">="/"<="/"="/(number ":")
number = ["+"/"-"] 1*DIGIT ["." 0*DIGIT]
string-value = "<" *(qdtext-no-abkt/quoted-pair) ">"
qdtext-no-abkt = LWS/%x21/%x23-3B/%x3D
    /%x3F-5B/%x5D-7E/
    UTF8-NONASCII
rc-params = feature-param/generic-param
req-param = "require"
explicit-param = "explicit"
; Despite the ABNF, there must not be more than one req-param or
; explicit-param in an ac-params. Furthermore, there can only be
; one instance of any feature tag in feature-param.

Accept-Resource-Priority = "Accept-Resource-Priority"
    HCOLON [r-value *(COMMA r-value)]
; Accept-Resource-Priority from RFC 4412
Event = ("Event"/"o") HCOLON event-type
    *(SEMI event-param)
; Event from RFC 6665
event-type = event-package *("." event-template)
event-package = token-nodot
event-template = token-nodot
token-nodot = 1*(alphanum/"-"/"!"/"%"/"*"
    /"_"/"+"/"`"/"'"/"~")
; The use of the "id" parameter is deprecated; it is included for
; backwards-compatibility purposes only.
event-param = generic-param/("id" EQUAL token)
Allow-Events = ("Allow-Events"/"u") HCOLON
    event-type *(COMMA event-type)
; Allow-Events from RFC 6665
Subscription-State = "Subscription-State" HCOLON substate-value
    *(SEMI subexp-params)
; Subscription-State from RFC 6665
substate-value = "active"/"pending"/
    "terminated"/extension-substate
extension-substate = token
subexp-params = ("reason" EQUAL event-reason-value)
    /("expires" EQUAL delta-seconds)
    /("retry-after" EQUAL delta-seconds)
    /generic-param
event-reason-value = "deactivated"
    /"probation"/"rejected"
    /"timeout"/"giveup"
    /"noresource"/"invariant"
    /event-reason-extension
event-reason-extension = token
message-header =/ Geolocation-header
Geolocation-header = "Geolocation" HCOLON locationValue
    *(COMMA locationValue)
; Geolocation and locationValue parameter from RFC 6442
locationValue = LAQUOT locationURI RAQUOT *(SEMI geoloc-param)
locationURI = sip-URI/sips-URI/pres-URI/
    http-URI/https-URI/cid-url ; (from RFC 2392)
    /absoluteURI
geoloc-param = generic-param
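The Event grammar listed above (an event-package, optional dot-separated event-template values, then semicolon-separated parameters) can be read off mechanically. The Python sketch below shows one way to do that for the value part of an Event header field. The function name parse_event is hypothetical, and the sketch assumes token-only parameter values, so quoted strings containing ";" or "=" are not handled.

```python
def parse_event(value):
    """Split an Event header value into (package, templates, params).

    Hypothetical helper for illustration: follows
    event-type = event-package *("." event-template)
    followed by *(SEMI event-param), assuming token-only values.
    """
    event_type, *raw_params = [part.strip() for part in value.split(";")]
    # token-nodot forbids "." inside a package or template name,
    # so a plain split on "." recovers the package and its templates.
    package, *templates = event_type.split(".")
    params = {}
    for raw in raw_params:
        if not raw:
            continue  # tolerate a trailing ";"
        name, _, param_value = raw.partition("=")
        params[name.strip()] = param_value.strip()
    return package, templates, params


# Example values chosen only for the illustration.
result = parse_event("presence.winfo;id=abc")
```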
; The pres-URI is defined in RFC 3859. http-URI and https-URI are defined
; according to RFC 2616 and RFC 2818, respectively. The cid-url is defined in
; RFC 2392 to locate message-body parts. This URI type is present in a SIP request
; when location is conveyed as a MIME body in the SIP message. GEO-URIs
; defined in RFC 5870 are not appropriate for usage in the SIP Geolocation header
; because it does not include retention and retransmission flags as part of the
; location information. Other URI schemes used in the location URI must be
; reviewed against the criteria defined in RFC 3693 for a Using Protocol that uses
; the location object (LO).
message-header =/ Georouting-header
Georouting-header = "Geolocation-Routing" HCOLON
    ("yes"/"no"/generic-value)
generic-value = generic-param
message-header =/ Geolocation-Error
Geolocation-Error = "Geolocation-Error" HCOLON
    locationErrorValue
; Geolocation-Error from RFC 6442
locationErrorValue = location-error-code
    *(SEMI location-error-params)
location-error-code = 1*3DIGIT
location-error-params = location-error-code-text/
    generic-param
History-Info = "History-Info" HCOLON
    hi-entry *(COMMA hi-entry)
; History-Info from RFC 4244
hi-entry = hi-targeted-to-uri *(SEMI hi-param)
hi-targeted-to-uri = name-addr
hi-param = hi-index/hi-extension
hi-index = "index" EQUAL 1*DIGIT *(DOT 1*DIGIT)
hi-extension = generic-param
Identity = "Identity" HCOLON signed-identity-digest
; Identity from RFC 4474
signed-identity-digest = LDQUOT 32LHEX RDQUOT
Identity-Info = "Identity-Info" HCOLON
    ident-info *(SEMI ident-info-params)
ident-info = LAQUOT absoluteURI RAQUOT
ident-info-params = ident-info-alg/
    ident-info-extension
ident-info-alg = "alg" EQUAL token
ident-info-extension = generic-param

digest-string = addr-spec "|" addr-spec
    "|" callid "|" 1*DIGIT SP
    Method "|" SIP-date "|"
    [addr-spec] "|"
    message-body
Info-Package = "Info-Package" HCOLON Info-package-type
; Info-Package from RFC 6086
Recv-Info = "Recv-Info" HCOLON [Info-package-list]
; Recv-Info from RFC 6086
Info-package-list = Info-package-type
    *(COMMA Info-package-type)
Info-package-type = Info-package-name
    *(SEMI Info-package-param)
Info-package-name = token
Info-package-param = generic-param
Join = "Join" HCOLON callid *(SEMI join-param)
; Join from RFC 3911
join-param = to-tag/from-tag/generic-param
to-tag = "to-tag" EQUAL token
from-tag = "from-tag" EQUAL token
Max-Breadth = "Max-Breadth" HCOLON 1*DIGIT
; Max-Breadth from RFC 5393
Session-Expires = ("Session-Expires"/"x")
    HCOLON delta-seconds
    *(SEMI se-params)
; Session-Expires from RFC 4028
se-params = refresher-param/generic-param
refresher-param = "refresher" EQUAL ("uas"/"uac")
Min-SE = "Min-SE" HCOLON delta-seconds
    *(SEMI generic-param)
; Min-SE from RFC 4028
P-Access-Network-Info = "P-Access-Network-Info" HCOLON
    access-net-spec
    *(COMMA access-net-spec)
; P-Access-Network-Info from RFC 7315
access-net-spec = (access-type/access-class) *(SEMI access-info)
access-type =
    "IEEE-802.11"/"IEEE-802.11a"/
    "IEEE-802.11b"/"IEEE-802.11g"/
    "IEEE-802.11n"/
    "IEEE-802.3"/"IEEE-802.3a"/
    "IEEE-802.3ab"/"IEEE-802.3ae"/
    "IEEE-802.3ak"/"IEEE-802.3ah"/
    "IEEE-802.3aq"/"IEEE-802.3an"/
    "IEEE-802.3e"/"IEEE-802.3i"/
P-Associated-URI = "P-Associated-URI"
    HCOLON (p-aso-uri-spec)
    *(COMMA p-aso-uri-spec)
; P-Associated-URI from RFC 7315
p-aso-uri-spec = name-addr *(SEMI ai-param)
ai-param = generic-param
Path = "Path" HCOLON path-value *(COMMA path-value)
; Path from RFC 3327
path-value = name-addr *(SEMI rr-param)
P-Called-Party-ID = "P-Called-Party-ID"
    HCOLON called-pty-id-spec
; P-Called-Party-ID from RFC 7315
called-pty-id-spec = name-addr *(SEMI cpid-param)
cpid-param = generic-param
P-Charging-Addresses = "P-Charging-Function-Addresses"
    HCOLON
    charge-addr-params
    *(COMMA charge-addr-params)
; P-Charging-Addresses from RFC 7315
charge-addr-params = charge-addr-param
    *(SEMI charge-addr-param)
charge-addr-param = ccf/ecf/ccf-2/ecf-2/
    generic-param
ccf = "ccf" EQUAL gen-value
ecf = "ecf" EQUAL gen-value
ccf-2 = "ccf-2" EQUAL gen-value
ecf-2 = "ecf-2" EQUAL gen-value
; The P-Charging-Function-Addresses header field contains one or two
; addresses of the ECF (ecf and ecf-2) or CCF (ccf and ccf-2). The
; first address of the sequence is ccf or ecf. If the first address of
; the sequence is not available, then the next address (ccf-2 or ecf-2)
; must be used if available.
P-Charging-Vector = "P-Charging-Vector"
    HCOLON icid-value
    *(SEMI charge-params)
; P-Charging-Vector from RFC 7315
charge-params = icid-gen-addr/orig-ioi/
    term-ioi/
    transit-ioi/related-icid/
    related-icid-gen-addr/
    generic-param
icid-value = "icid-value" EQUAL gen-value
icid-gen-addr = "icid-generated-at" EQUAL host
orig-ioi = "orig-ioi" EQUAL gen-value
term-ioi = "term-ioi" EQUAL gen-value
transit-ioi = "transit-ioi" EQUAL transit-ioi-list
transit-ioi-list = DQUOTE transit-ioi-param
    *(COMMA transit-ioi-param) DQUOTE
transit-ioi-param = transit-ioi-indexed-value/
    transit-ioi-void-value
transit-ioi-indexed-value = transit-ioi-name "."
    transit-ioi-index
transit-ioi-name = ALPHA *(ALPHA/DIGIT)
transit-ioi-index = 1*DIGIT
transit-ioi-void-value = "void"
related-icid = "related-icid" EQUAL gen-value
related-icid-gen-addr = "related-icid-generated-at" EQUAL
    host
; The P-Charging-Vector header field contains icid-value as a mandatory
; parameter. The icid-value represents the IMS charging ID, and
; contains an identifier used for correlating charging records and
; events. The first proxy that receives the request generates this value.
; The icid-gen-addr parameter contains the host name or IP address of
; the proxy that generated the icid-value.
; The orig-ioi and term-ioi parameters contain originating and
; terminating interoperator identifiers. They are used to correlate
; charging records between different operators. The originating IOI
; represents the network responsible for the charging records in the
; originating part of the session or stand-alone request. Similarly,
; the terminating IOI represents the network responsible for the
; charging records in the terminating part of the session or stand-alone
; request. The transit-ioi parameter contains values with each of them,
; respectively, representing a transit interoperator identifier. It is
; used to correlate charging records between different networks. The
; transit-ioi represents the network responsible for the records in the
priv-value = "header"/"session"/"user"/"none"/"critical"/token
P-Visited-Network-ID = "P-Visited-Network-ID" HCOLON
    vnetwork-spec *(COMMA vnetwork-spec)
; P-Visited-Network-ID from RFC 7315
vnetwork-spec = (token/quoted-string)
    *(SEMI vnetwork-param)
vnetwork-param = generic-param
; The syntax for the rport parameter (RFC 3581) is
response-port = "rport" [EQUAL 1*DIGIT]
; RFC 3581
RSeq = "RSeq" HCOLON response-num
; RSeq from RFC 3262
security-client = "Security-Client" HCOLON
    sec-mechanism *(COMMA sec-mechanism)
; security-client and Security-Client from RFC 3329
security-server = "Security-Server" HCOLON
    sec-mechanism *(COMMA sec-mechanism)
; security-server and Security-Server from RFC 3329
security-verify = "Security-Verify" HCOLON
    sec-mechanism *(COMMA sec-mechanism)
; security-verify and Security-Verify from RFC 3329
sec-mechanism = mechanism-name *(SEMI mech-parameters)
mechanism-name = ("digest"/"tls"/"ipsec-ike"/"ipsec-man"/token)
mech-parameters = (preference/digest-algorithm/digest-qop/
    digest-verify/extension)
preference = "q" EQUAL qvalue
qvalue = ("0" ["." 0*3DIGIT])/("1" ["." 0*3("0")])
digest-algorithm = "d-alg" EQUAL token
digest-qop = "d-qop" EQUAL token
digest-verify = "d-ver" EQUAL LDQUOT 32LHEX RDQUOT
extension = generic-param
Service-Route = "Service-Route" HCOLON
    sr-value *(COMMA sr-value)
; Service-Route from RFC 3608
sr-value = name-addr *(SEMI rr-param)
Session-Expires = ("Session-Expires"/"x")
    HCOLON delta-seconds
    *(SEMI se-params)
; Session-Expires from RFC 4028

replaces-param = to-tag/from-tag/early-flag/generic-param
to-tag = "to-tag" EQUAL token
from-tag = "from-tag" EQUAL token
early-flag = "early-only"
; A Replaces header field must contain exactly one to-tag and exactly one
; from-tag, as they are required for unique dialog matching.
entity-tag = token ; SIP-If-Match from RFC 3903
Suppress-If-Match = "Suppress-If-Match" HCOLON
    (entity-tag/"*")
; Suppress-If-Match from RFC 5839
Target-Dialog = "Target-Dialog" HCOLON
    callid *(SEMI td-param)
; Target-Dialog from RFC 4538
td-param = remote-param/local-param/generic-param
remote-param = "remote-tag" EQUAL token
local-param = "local-tag" EQUAL token
current-status = "a=curr:" precondition-type SP
    status-type SP direction-tag
; from RFC 3312
desired-status = "a=des:" precondition-type SP strength-tag SP
    status-type SP direction-tag
confirm-status = "a=conf:" precondition-type SP status-type SP
    direction-tag
precondition-type = "qos"/token
strength-tag = ("mandatory" | "optional" | "none"
    | "failure" | "unknown")
status-type = ("e2e" | "local" | "remote")
direction-tag = ("none" | "send" | "recv" | "sendrecv")
UUI = "User-to-User" HCOLON uui-value
    *(COMMA uui-value)
; UUI/User-to-User from RFC 7433
uui-value = uui-data *(SEMI uui-param)
uui-data = token/quoted-string
uui-param = pkg-param/cont-param/enc-param/generic-param
pkg-param = "purpose" EQUAL pkg-param-value
pkg-param-value = token
cont-param = "content" EQUAL cont-param-value
cont-param-value = token
enc-param = "encoding" EQUAL enc-param-value
enc-param-value = token/"hex"
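As a small worked example of reading these rules, the qvalue rule used by the preference parameter (from RFC 3329) can be translated directly into a regular expression: "0" optionally followed by "." and up to three digits, or "1" optionally followed by "." and up to three zeros. The Python sketch below does this. The names QVALUE_RE and parse_qvalue are hypothetical and are not part of any SIP library.

```python
import re

# qvalue = ("0" ["." 0*3DIGIT]) / ("1" ["." 0*3("0")])
QVALUE_RE = re.compile(r"0(?:\.\d{0,3})?|1(?:\.0{0,3})?")


def parse_qvalue(text):
    """Return the q-value as a float, or raise ValueError if the text
    does not match the qvalue grammar (hypothetical helper)."""
    if QVALUE_RE.fullmatch(text) is None:
        raise ValueError("not a valid qvalue: %r" % text)
    return float(text)


preferred = parse_qvalue("0.5")
```

Using fullmatch rather than match keeps trailing characters, such as a fourth decimal digit, from being silently accepted.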
line indicating the end of the header fields, and an optional message body.

generic-message = start-line
    *message-header
    CRLF
    [message-body]
start-line = Request-Line/Status-Line

The request message is also known as a method. The start line, each message-header line, and the empty line must be terminated by a carriage return line feed (CRLF) sequence. However, the empty line must be present even if the message body is not.

2.4.2.1 Requests

SIP requests are sent for the purpose of invoking a particular operation by a client to a server. SIP requests are distinguished by having a Request-Line for a start line. A Request-Line contains a method name, a Request-URI, and the protocol version separated by a single space character. The Request-Line ends with CRLF. No CR or LF is allowed except in the end-of-line CRLF sequence. No LWS is allowed in any of the elements.

Request-Line = Method SP Request-URI SP SIP-Version CRLF

SIP has defined the following methods:

◾◾ REGISTER for registration of contact information of the user
◾◾ INVITE, ACK, and CANCEL for setting up sessions
◾◾ BYE for terminating sessions
◾◾ OPTIONS for querying servers about their capabilities
◾◾ MESSAGE for chat sessions
◾◾ REFER for call transfer
◾◾ SUBSCRIBE and NOTIFY for SIP session-related event management
◾◾ PUBLISH for publication of SIP-specific event state
◾◾ UPDATE for updating the session parameters
◾◾ INFO for mid-session information transfer
◾◾ PRACK for acknowledgement of provisional responses

The most important method in SIP is the INVITE method, which is used to establish a session between participants. A session is a collection of participants, and streams of media between them, for the purposes of communication. SIP extensions, documented in Standards Track RFCs, may define additional methods for accommodating more feature-rich multimedia sessions, especially for the multipoint multimedia conferencing services.

2.4.2.2 Responses

SIP responses sent from the server to the client indicate the status of the request and are distinguished from requests by having a Status-Line as their start line. A Status-Line consists of the protocol version followed by a numeric Status-Code and its associated textual phrase, with each element separated by a single SP character. No CR or LF is allowed except in the final CRLF sequence.

Status-Line = SIP-Version SP Status-Code SP Reason-Phrase CRLF

The Status-Code is a three-digit integer result code that indicates the outcome of an attempt to understand and satisfy a request. The Reason-Phrase is intended to give a short textual description of the Status-Code. The Status-Code is intended for use by automata, whereas the Reason-Phrase is intended for the human user. SIP response classes are similar to those of HTTP, but have been defined in the context of SIP. The first digit of the Status-Code defines the class of response. The last two digits do not have any categorization role. For this reason, any response with a status code between 100 and 199 is referred to as a 1xx response, any response with a status code between 200 and 299 as a 2xx response, and so on. SIP/2.0 allows six values for the first digit, shown in Table 2.2 (RFC 3261).

2.4.2.3 Headers

SIP header fields are similar to HTTP header fields in both syntax and semantics, each carrying its own well-defined information. Each header field is terminated by a CRLF at the end of the header. SIP specifies that multiple header fields of the same field name whose value is a comma-separated list can be combined into one header field whose grammar is of the form

header = "header-name" HCOLON header-value
    *(COMMA header-value)

It allows for combining header fields of the same name into a comma-separated list. The Contact header field allows a comma-separated list unless the header field value is “*.”

2.4.2.4 Message Body

The message body in SIP messages is an optional component. SIP request messages may contain a message body, unless otherwise noted, that can be read, created, processed, modified, or removed as necessary only by the SIP user agent (UA). The SIP message body shall always be opaque to the SIP proxy/redirect/registrar server. Requests, including new requests defined in extensions to this specification, may contain
message bodies unless otherwise noted. The interpretation of the body depends on the request method. For
response messages, the request method and the response status code determine the type and interpretation
of any message body. Regardless of the type of body that a request contains, certain header fields must be
formulated to characterize the contents of the body, such as (see Section 2.2) Allow, Allow-Events,
Content-Disposition, Content-Encoding, Content-Language, Content-Length, and Content-Type. All responses
may include a body.
2.4.2.5 Framing SIP Messages
Unlike HTTP, SIP implementations can use UDP or other unreliable datagram protocols. Each such datagram
carries one request or response. See Section 3.12 on constraints on usage of unreliable transports.
Implementations processing SIP messages over stream-oriented transports MUST ignore any CRLF appearing
before the start line. The Content-Length header field value is used to locate the end of each SIP message
in a stream. It will always be present when SIP messages are sent over stream-oriented transports.
2.4.3 SIP Message Structure
SIP is described with some independent processing stages with only a loose coupling between each stage.
This protocol is structured in a way to be compliant with a set of rules for operations in different stages
of the protocol that provides an appearance of a layered protocol. However, it does not dictate an
implementation in any way. Not every element specified by the protocol contains every layer. Furthermore,
the elements specified by SIP are logical elements, not physical ones. A physical realization can choose to
act as different logical elements, perhaps even on a transaction-by-transaction basis. The SIP layer can be
defined as follows: syntax and encoding, transport, transaction, and transaction user (TU). In addition,
SIP also defines dialog between the two UAs. In syntax and encoding layer, SIP’s encoding is specified
using ABNF described earlier. The transport layer defines how a client sends requests and receives
responses and how a server receives requests and sends responses over the network. All SIP elements
contain a transport layer. As explained, TCP, UDP, or SCTP can be used as the transport protocol in SIP,
while TLS can be used as the security transport protocol over TCP.
In the transaction layer, transactions are a fundamental component of SIP. A transaction is a request sent
by a client transaction (using the transport layer) to a server transaction, along with all responses to
that request sent from the server transaction back to the client. The transaction layer has a client
component (referred to as a client transaction) and a server component (referred to as a server
transaction), each of which are represented by a finite state machine that is constructed to process a
particular request. The transaction layer handles application-layer retransmissions, matching of responses
to requests, and application-layer timeouts. Any task that a UA client (UAC) accomplishes takes place using
a series of transactions with the user agent server (UAS) or the SIP server. UAs and stateful proxy servers
of SIP contain a transaction layer, while a stateless proxy does not contain one. The TU layer is above the
transaction layer. All SIP entities, except the stateless proxy, are transaction users (TUs). The TU
creates a client transaction instance and passes the request along with the destination IP address, port,
and transport to which to send the request, and the TU that creates a transaction can also cancel it by
sending a CANCEL request. At this, a server stops further processing the request and reverts to the state
that existed before the transaction was initiated, and generates a specific error response to that
transaction.
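The stream framing rule of Section 2.4.2.5 (ignore CRLFs before the start line, then use Content-Length to find the end of each message) can be illustrated with a small sketch. The function name `split_sip_stream` is hypothetical, and the sketch assumes unfolded header lines and a complete header section terminated by an empty line; it is not taken from any SIP stack.

```python
def split_sip_stream(buffer: bytes):
    """Split bytes read from a stream transport into complete SIP messages.

    Returns (messages, remainder); remainder holds the bytes of an
    incomplete trailing message that needs more data.
    """
    messages = []
    while True:
        # Ignore any CRLF appearing before the start line.
        while buffer.startswith(b"\r\n"):
            buffer = buffer[2:]
        head_end = buffer.find(b"\r\n\r\n")
        if head_end == -1:
            break  # header section not complete yet
        head = buffer[:head_end].decode("utf-8", errors="replace")
        length = None
        for line in head.split("\r\n")[1:]:  # skip the start line
            name, _, value = line.partition(":")
            # "l" is the compact form of Content-Length.
            if name.strip().lower() in ("content-length", "l"):
                length = int(value.strip())
        if length is None:
            raise ValueError("Content-Length is required on stream transports")
        body_start = head_end + 4
        if len(buffer) < body_start + length:
            break  # body not complete yet
        messages.append(buffer[:body_start + length])
        buffer = buffer[body_start + length:]
    return messages, buffer
```

A caller would append newly received bytes to the returned remainder and call the function again, which is why the remainder is handed back rather than discarded.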
The SIP network also defines the core functional elements that consist of UACs and UASs, stateless and
stateful proxies, and registrars. Cores, except for the stateless proxy, are TUs. Clearly, these processing
functions reside above the transaction and transport layer. A dialog represents a context in which a
peer-to-peer SIP relationship is established between two UAs that persists for some time to interpret SIP
messages. It is another important concept in SIP. The dialog facilitates sequencing of messages and proper
routing of requests between the UAs. The INVITE method is the only way to establish a dialog. When a UAC
sends a request that is within the context of a dialog, it follows the common UAC rules.
2.4.3.1 Request Message Format
The SIP request message format consists of three important parts: request line, header, and message body.
Figure 2.3 depicts an example of SIP INVITE method message format. The request line consists of the request
type, SIP URI of the destination or next hop, and SIP version being used.
The SIP header part contains a set of headers, and each header carries its own well-defined information.
However, each header is terminated by a CRLF at the end of the header. SIP message body of the request
method is optional depending on the type of message and where it falls within the call establishment
scheme. A blank line defines the boundary between the header part and the message body.
2.4.3.2 Response Message Format
SIP response message format has three major sections: status line, header, and message body. Of course, an
empty line is present to separate the headers and the message body. Figure 2.4 shows an example of SIP 200
OK response message format. The response message contains the reason header indicating why this response
has been sent. As a result, the response message can be quite large, explaining all the reasons.
The status line contains the protocol version, the status code, and the reason phrase. The reason phrase
makes it easy for the human users to understand it, while the protocol version and the status code are
processed by the SIP network. For example, status code 200 is a part of the 2xx response class (success),
and specifically 200 responses are sent. At this point, a dialog transitions to a confirmed state. When a
UAC does not want to continue with this dialog, it shall terminate the dialog by sending a BYE request.
Again, the header part will contain different headers and each header, as in the request message, will
have its own specific information and is terminated by a CRLF at the end of the header. Like the request
message, the message body is optional and is separated by a blank line from the header part. Figure 2.4
shows that the IP address and the audio codec type and its characteristics are provided in response to the
request message.
2.4.4 SIP Network Functional Elements
The networked multimedia services use SIP to establish, manage, and tear down multimedia sessions. As a
result, the capabilities of SIP need to be used in the context of multimedia service networking. SIP,
being in the session control layer, is also a part of the application layer in OSI terminology. SIP has
application-layer functional entities such as SIP UAs and SIP servers that are described
later. These entities communicate among themselves using SIP application-/session-layer protocol termed as
the SIP network. Figure 2.5 depicts the logical view of the SIP network and its functional entities.
The functional elements of the SIP network are as follows: SIP UAs and SIP servers. SIP has defined three
servers: SIP proxy, SIP registrar, and SIP redirect. However, the location servers and the different
categories of application servers, not shown in Figure 2.5, do not belong to SIP. In this context, SIP;
media sessions using RTP/RTCP controlled by SIP; SIP security protocols such as Transport Layer Security
(TLS) protocol; and SIP transport protocols such as TCP, UDP, and SCTP are carried over the IP network. IP
network and SIP application servers such as TC/VTC/VC server, location server, web conferencing server,
media bridging server, chat server, and application sharing server are not shown here but are addressed
later.
[Figure 2.5: Logical view of the SIP network and its functional entities (SIP registrar server, SIP proxy
server, SIP redirect server)]
2.4.4.1 SIP User Agent
SIP has defined some functional entities that can be categorized broadly into two categories: SIP UAs and
SIP servers. A UA works in a client-server mode on behalf of the user: UAS and UAC. The UAC generates the
SIP request and sends it to the UAS directly if the address is known, or sends it to the SIP server that
routes the request to the UAS. The UAS receives the requests, operates on them, and sends the responses
back to the UAC either directly if the address is known or via the SIP server that sends the responses to
the UAC. From a conferencing perspective, a number of possible different SIP components such as
conference-unaware participant, conference-aware participant, and focus are specified by RFC 4579. We
describe those kinds of SIP UAs in the next section.
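The request and response formats of Sections 2.4.3.1 and 2.4.3.2 share one layout: a start line (request line or status line), CRLF-terminated headers, a blank line, and an optional body. The sketch below shows one way to split a message along those boundaries. The function name `parse_sip_message` and the returned dictionary keys are hypothetical, and the sketch assumes unfolded headers and a non-empty reason phrase.

```python
def parse_sip_message(text: str):
    """Split a SIP message into (start_line, headers, body)."""
    # The first blank line separates the header part from the body.
    head, _, body = text.partition("\r\n\r\n")
    lines = head.split("\r\n")
    first = lines[0]
    if first.startswith("SIP/"):
        # Status line: protocol version, status code, reason phrase.
        version, code, reason = first.split(" ", 2)
        start = {"kind": "response", "version": version,
                 "status": int(code), "reason": reason}
    else:
        # Request line: method, Request-URI, protocol version.
        method, uri, version = first.split(" ", 2)
        start = {"kind": "request", "method": method,
                 "uri": uri, "version": version}
    headers = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return start, headers, body
```

Headers are kept as an ordered list of pairs rather than a dictionary because a SIP message may carry several header fields with the same name (for example, multiple Via entries).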
as part of normal SIP signaling by populating the Session Information, URI, Email Address, and Phone Number
SDP fields. In order to support advanced features, where a session established between two end points can
migrate to a centralized conference, a focus should support the Replaces header field. A UA with focus
capabilities could be implemented in end-user equipment and would be used for the creation of ad hoc
conferences. A dedicated conferencing server, whose primary task is to simultaneously host conferences of
arbitrary type and size, may allocate and publish a conference factory URI (as defined in Section 4.2) for
creating an arbitrary number of ad hoc conferences (and subsequently their focuses) using SIP call control
means.
2.4.4.1.2 Conference-Unaware UA
The simplest UA can participate in a conference ignoring all SIP conferencing-related information. The
simplest UA is able to dial in to a conference and to be invited to a conference. Any conferencing
information is optionally conveyed to/from it using non-SIP means. Such a UA would not usually host a
conference (at least, not using SIP explicitly). A conference-unaware UA needs only to support basic SIP
capabilities specified in RFC 3261. Call flows for conference-unaware UAs would be identical to those in
the SIP call flows per specifications of RFC 3261. Note that the presence of an isfocus feature tag in a
Contact header field will not cause interoperability issues between a focus and a conference-unaware UA
since it will be treated as an unknown header parameter and ignored, as per standard SIP behavior.
2.4.4.1.3 Conference-Aware UA
A conference-aware UA supports SIP conferencing call control conventions defined in this document as a
conference participant, in addition to support of RFC 3261. A conference-aware UA should be able to process
SIP redirections such as described in RFC 3261 (see Section 3.1.2.3). A conference-aware UA must recognize
the isfocus feature parameter. A conference-aware UA should support SIP REFER method (see Section 2.5), SIP
events (see Section 5.2), and the conferencing package (RFC 4575). A conference-aware UA should subscribe
to the conference package if the isfocus parameter is in the remote target URI of a dialog and if the
conference package is listed by a focus in an Allow-Events header field. The SUBSCRIBE to the conference
package should be sent outside any INVITE-initiated dialog. A termination of the INVITE dialog with a BYE
does not necessarily terminate the SUBSCRIBE dialog. A conference-aware UA may render to the user any
information about the conference obtained from the SIP header fields and SDP fields from the focus. A
conference-aware UA should render to the user any information about the conference obtained from the SIP
conference package.
2.4.4.2 SIP Back-to-Back User Agent
A back-to-back UA (B2BUA) is a concatenation of a UAC and UAS functional entity in SIP. That is, it is a
logical entity that receives a request and processes it as a UAS. It also acts as a UAC and generates
requests in order to determine how the request should be answered. Unlike a proxy server, it maintains
dialog state and must participate in all requests sent on the dialogs it has established.
However, SIP is being designed as the end-to-end model following principles of the Internet, and the use of
B2BUA in SIP breaks this model, making it less scalable as SIP intermediaries implementing B2BUA need to
keep track of the transactional states for the duration of the transaction as well as for the entire
duration of the call. The integrity of SIP messages is also being protected using encryption, and these
messages are subject to rejection because of integrity failures.
In many situations, the B2BUA capability is being deployed over the SIP network in many intermediaries for
providing some services that may or may not directly relate to the session establishment: service control
by SIP application servers, topology hiding, anonymization of call parties, crossing of network address
translator (NAT) by SIP signaling messages and media traffic using application-level gateway (ALG), and
generation of call detail record (CDR). For example, session border controller (SBC) defined in RFC 5853
(see Section 14.3) and SIP privacy service defined in RFC 3323 (see Section 20.2) employ B2BUAs.
2.4.4.3 SIP Servers
A SIP server uses SIP to manage real-time communication among SIP clients. In fact, SIP servers are the
key entities that enable communications among SIP clients by routing SIP messages through resolution of
addresses, and are the core of the SIP network. SIP servers act on requests sent by SIP clients and process
SIP messages and operate on rules per technical standards defined in the call control protocol SIP. SIP has
defined SIP proxy server, SIP registration server, and SIP redirect server.
2.4.4.3.1 Proxy Server
A SIP proxy server receives all SIP request and response messages from UAs or other SIP servers such as
proxies. It may use registrars/location servers, DNS, or database servers for routing of SIP messages for
resolving addresses to other UAs or proxies. A proxy is only allowed to forward SIP messages except the
generation of CANCEL and ACK message
described later. It should be noted that the proxy may have access to a database server that will not use
SIP. In this case, it is expected that a proxy may use a host of different protocols other than SIP in its
backend servers for address resolution or other purposes. These protocols are outside the scope of SIP. A
proxy can be stateful or stateless; a stateless proxy does not keep any state information of the call or
transaction, while a stateful proxy keeps all state information of a call or a transaction for the duration
of the call or transaction.
2.4.4.3.2 Redirect Server
A SIP redirect server receives a SIP request and, unlike a proxy server, responds back to it. It usually
provides a 3xx (redirection class) response, described later, to a UA or proxy indicating that the call
should be tried at a different location. The main purpose has been to deal with the temporary or permanent
location change of a user.
2.4.4.3.3 Registrar Server
A SIP registration server keeps the contact and other information of UAs sent using REGISTER message.
Registration creates bindings in a location service for a particular domain that associates an
address-of-record (AOR) URI with one or more contact addresses. When a proxy for that domain receives a
request whose Request-URI matches the AOR, the proxy will forward the request to the contact addresses
registered to that AOR. It is the usual case to register an AOR at a domain’s location service when
requests for that AOR would be routed to that domain. In most cases, this means that the domain of the
registration will need to match the domain in the URI of the AOR. A registrar may store all the information
including the contact sent via REGISTER message in a location server. The protocol between the SIP
registrar and the location server is outside the scope of SIP.
The registrar server along with location server offers a discovery capability in SIP. If a user wants to
initiate a session with another user, SIP must discover the current host(s) at which the destination user
is reachable. This discovery process is frequently accomplished by SIP network elements such as proxy
servers and redirect servers, which are responsible for receiving a request, determining where to send it
based on knowledge of the location of the user, and then sending it there. To do this, SIP network elements
consult the location service, which provides address bindings for a particular domain.
2.4.4.3.4 Location Server
A location server is envisioned in a SIP network that keeps the SIP contacts and other information in a
database. The SIP registrar stores contacts and other information of users in the location server. The SIP
servers communicate with the location server for address resolutions in order to route SIP messages to
users or other servers such as proxies. However, the communication protocol between the location server and
the SIP servers is not a part of SIP.
2.4.4.3.5 Application Server
A SIP application server acting as the SIP UA can send SIP request and receive SIP response messages. In
fact, SIP application servers have emerged as the most important areas for the creation and offering of
multimedia services using SIP. With the inception of SIP, new feature-rich multimedia services integrated
with other services like web services have been enabled by SIP, and have opened up a new frontier for
creating real-time multimedia services that are yet to come.
2.5 SIP Request Messages
SIP request messages are known as methods that specify the purpose of SIP messages for taking actions by
SIP UA or server: REGISTER, INVITE, BYE, ACK, CANCEL, OPTIONS, REFER, SUBSCRIBE, NOTIFY, PUBLISH, MESSAGE,
UPDATE, PRACK, and INFO. Table 2.3 describes each method briefly. The SIP method (or request message)
names are case sensitive, and all uppercase letters are used for distinguishing from the header fields,
which can be a mixture of uppercase and lowercase letters. The SIP UAs are required to understand the SIP
methods, while proxy servers are required to know the relevant header fields for routing of SIP request
messages, keeping the intermediaries of the SIP network simple, thereby making the SIP-based multimedia
communications network more scalable. Note that, in addition to RFC 3261, all SIP methods are discussed
throughout the whole book as appropriate, although we have provided a brief description for each method in
Table 2.3.
2.6 SIP Response Messages
The SIP response message is generated by a SIP UAS or a SIP server in response to a SIP UAC carrying the
result of the request. The response message may contain the reason phrases that are also usually
understandable by humans. The response status code determines the type and interpretation of the message
body sent by the request message, and all responses may include a body. Some warning codes also provide
information supplemental to the status code in SIP response messages when the failure of the transaction
results from an SDP specified in RFC 4566 (see Section 7.7). Table 2.4 provides the list of six SIP
response classes:
ACK The ACK method is sent for the final acknowledgment of INVITEs. ACK is end-to-end for 2xx final
responses, but is hop-by-hop for all other final responses such as 3xx, 4xx, 5xx, or 6xx. ACK may contain
the message body if the initial INVITE does not contain an SDP message body. ACK also may not modify the
message body containing the media description contained in the initial INVITE because a re-INVITE must be
used to modify the media description provided in SDP. In some exceptional cases, such as interworking
between H.323 and SIP where the media description may not be known a priori, ACK may contain SDP. The
sequence number, CSeq, is never incremented in an ACK because a UAS needs to match the CSeq
number of the ACK with the number of the corresponding INVITE request. The mandatory header fields
in an ACK are Via, To, From, Call-ID, CSeq, and Max-Forwards (RFC 3261: Standards Track).
BYE The BYE method is used to tear down an already established session and can be sent only by the UAs, never
by any proxies or by third parties. It is an end-to-end method and can only be generated by participant UAs
of the session. It is not recommended that a BYE be used to cancel pending INVITEs because it will not be
forked like an INVITE and may not reach the same set of UAs as the INVITE. The mandatory header fields in
a BYE are Via, To, From, Call-ID, CSeq, and Max-Forwards (RFC 3261: Standards Track).
CANCEL The CANCEL method is used to terminate pending SIP calls and can be generated by SIP UAs or
proxy servers, provided that only provisional (1xx) responses containing a tag have been received. CANCEL is a
hop-by-hop request and cannot be generated if final responses have been received. The mandatory
header fields in a CANCEL are Via, To, From, Call-ID, CSeq, and Max-Forwards (RFC 3261: Standards Track).
INFO The INFO method is used by a SIP UA to carry application-level information between end points, using
the SIP dialog signaling path. For example, DTMF tones can be conveyed during the established
session using the INFO message. It neither updates the characteristics of the SIP dialog or session nor
constitutes a separate dialog usage. It only allows the applications that use the SIP session to exchange
information that might update the state of those applications. INFO messages cannot be sent as part of
other dialog usages, or outside an existing dialog. The mandatory header fields in an INFO are Allow,
Call-ID, CSeq, Info-Package, From, Max-Forwards, Proxy-Authenticate, To, Via, and WWW-Authenticate
(RFC 6086: Standards Track).
INVITE The INVITE method is used for establishment of single media or multimedia sessions between UAs. It
may include SDP bodies that describe the type and characteristics of each media stream that the caller is
prepared to receive. It may also carry more bodies describing other features (e.g., tunneled Integrated
Services Digital Network [ISDN] User Part [ISUP]/Q Signaling [QSIG] Public Branch Exchange [PBX]
signaling information) of the session. Every INVITE is confirmed by sending an ACK for reliability. If the
call is in progress and has not been established yet, the caller may cancel it using CANCEL method. The
CANCEL and ACK methods are only used in association with the INVITE request. If INVITE does not
contain the media information, the ACK contains the media information of the UAC. A media session is
considered established when the INVITE, 200 OK, and ACK messages have been exchanged between
the UAC and UAS. Multiple INVITEs can be sent within a session to change its status. The mandatory
header fields in an INVITE are Via, To, From, Call-ID, CSeq, Contact, and Max-Forwards (RFC 3261:
Standards Track).
MESSAGE The MESSAGE method that is described in Section 6.3.1 is used by SIP UA for transferring instant
messaging (IM) that is often used in conversational mode for fast transfer of messages in near real-time
using SIP. MESSAGE requests carry the content in the form of MIME body parts. MESSAGE requests do
not themselves initiate a SIP dialog; under normal usage, each IM stands alone, much like pager
messages. MESSAGE requests may be sent in the context of a dialog initiated by some other SIP
request. It may be noted that an IM session can also be established in an alternative way using
INVITE/200 OK/ACK with SDP body that describes that IM protocol will be used directly between the
SIP UAs and then sending the IM as a part of media (data) between UAs. However, it takes longer time
to set up the IM session. The mandatory header fields in a MESSAGE are Accept, Accept-Encoding,
Accept-Language, Allow, Call-ID, CSeq, From, Max-Forwards, Proxy-Authenticate, To, Via, and WWW-
Authenticate (RFC 3428: Standards Track).
NOTIFY NOTIFY, which is described in more detail in Section 5.2, is a method used by a SIP UA to inform about
the occurrence of a particular SIP-specific event asynchronously if the user subscribes to that specific
event. The notification occurs within a dialog while a subscription exists between the SIP UAs
(subscriber and notifier). The SIP request type REFER and other non-SIP means can also be used to
establish an implicit subscription for getting the notifications of the event with the NOTIFY method
without sending the SUBSCRIBE message. For example, the message store in a voice-mail server and
the status of the voice server can be notified if those service events are subscribed. The mandatory
header fields in a NOTIFY are Allow, Allow-Events, Call-ID, Contact, CSeq, Event, From, Max-Forwards,
Proxy-Authenticate, Subscription-State, To, Via, and WWW-Authenticate (RFC 6665: Standards Track).
OPTIONS The OPTIONS method allows a UA to query another UA or a proxy server as to its capabilities. The response
of the request lists the capabilities of the UA or server. As a result, this allows a client to discover
information about the supported methods, content types, extensions, codecs, etc. without ringing the
other party. For example, before a client inserts a Require header field into an INVITE listing an option that it
is not certain the destination UAS supports, the client can query the destination UAS with an OPTIONS to
see if this option is returned in a Supported header field. All UAs must support the OPTIONS method. The
OPTIONS method is an important mechanism of SIP for discovering the capabilities of SIP entities. The
mandatory header fields in an OPTIONS are Via, To, From, Call-ID, CSeq, and Max-Forwards.
PRACK The SIP (RFC 3261) is a request/response protocol for initiating and managing communication sessions.
SIP defines two types of responses: provisional and final. Final responses convey the result of the
request processing, and are sent reliably. Provisional responses provide information on the progress of
the request processing, but are not sent reliably in RFC 3261. It was later observed that reliability was
important in several cases, including interoperability scenarios with the PSTN. Therefore, an optional
capability was needed to support reliable transmission of provisional responses. That capability is
provided in this specification. The reliability mechanism works by mirroring the current reliability
mechanisms for 2xx final responses to INVITE. Those responses are transmitted periodically by the
transaction user (TU) until a separate transaction, ACK, is received that indicates reception of the 2xx
by the UAC. The reliability for the 2xx responses to INVITE and ACK messages are end-to-end.
To achieve reliability for provisional responses, we do nearly the same thing. Reliable provisional responses
are retransmitted by the TU with an exponential backoff. Those retransmissions cease when a PRACK
message, which is defined in RFC 3262 and described here, is received. The PRACK request plays the same
role as ACK, but for provisional responses. There is an important difference, however. PRACK is a normal
SIP message, like BYE. As such, its own reliability is ensured hop-by-hop through each stateful proxy. Also
like BYE, but unlike ACK, PRACK has its own response. If this were not the case, the PRACK message could
not traverse proxy servers compliant to RFC 2543 (obsoleted by RFC 3261). Each provisional response is
given a sequence number, carried in the RSeq header field in the response. The PRACK messages contain
an RAck header field, which indicates the sequence number of the provisional response that is being
acknowledged. The acknowledgments are not cumulative, and the specifications recommend a single
outstanding provisional response at a time, for purposes of congestion control.
UAS Behavior
A UAS MAY send any non-100 provisional response to INVITE reliably, so long as the initial INVITE
request (the request whose provisional response is being sent reliably) contained a Supported header
field with the option tag 100rel. While this specification does not allow reliable provisional responses
for any method but INVITE, extensions that define new methods that can establish dialogs may make
use of the mechanism. The UAS must send any non-100 provisional response reliably if the initial
request contained a Require header field with the option tag 100rel. If the UAS is unwilling to do so, it
must reject the initial request with a 420 Bad Extension and include an Unsupported header field
containing the option tag 100rel. A UAS must not attempt to send a 100 Trying response reliably. Only
provisional responses numbered 101 to 199 may be sent reliably. If the request did not
include either a Supported or Require header field indicating this feature, the UAS must not send the
provisional response reliably. 100 Trying responses are hop-by-hop only. For this reason, the reliability
mechanisms described here, which are end-to-end, cannot be used.
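The rules above for when a UAS may, must, or must not send a provisional response reliably can be condensed into a small decision function. This is only a sketch: the function name `reliable_1xx_policy` and its string return values are hypothetical, and the caller is assumed to have already extracted the option tags from the Supported and Require header fields of the initial INVITE.

```python
def reliable_1xx_policy(status_code: int, supported: set, require: set) -> str:
    """Return 'must', 'may', or 'must_not' for sending this 1xx reliably."""
    if not 100 <= status_code <= 199:
        raise ValueError("not a provisional response")
    if status_code == 100:
        # 100 Trying is hop-by-hop and is never sent reliably.
        return "must_not"
    if "100rel" in require:
        # The UAC insists on reliability; a UAS unwilling to comply
        # rejects the initial request with 420 instead.
        return "must"
    if "100rel" in supported:
        return "may"
    # Neither Supported nor Require listed 100rel.
    return "must_not"
```

For example, a 180 response to an INVITE that carried `Supported: 100rel` yields "may", while the same response to an INVITE with neither header field yields "must_not".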
An element that can act as a proxy can also send reliable provisional responses. In this case, it acts
as a UAS for purposes of that transaction. However, it must not attempt to do so for any request
that contains a tag in the To field. That is, a proxy cannot generate reliable provisional responses to
requests sent within the context of a dialog. Of course, unlike a UAS, when the proxy element
receives a PRACK that does not match any outstanding reliable provisional response, the PRACK
must be proxied. There are several reasons why a UAS might want to send a reliable provisional
response. One reason is if the INVITE transaction will take some time to generate a final response.
As discussed in RFC 3261 (see Section 3.7.3.1.1), the UAS will need to send periodic provisional
responses to request an extension of the transaction at proxies. The requirement is that a proxy
receive them every 3 minutes, but the UAS needs to send those more frequently (once a minute is
recommended) because of the possibility of packet loss. As a more efficient alternative, the UAS
can send the response reliably, in which case the UAS should send provisional responses once
every 2.5 minutes. Use of reliable provisional responses for extending transactions is
recommended.
The rest of this discussion assumes that the initial request contained a Supported or Require header
field listing 100rel, and that there is a provisional response to be sent reliably. The provisional response
to be sent reliably is constructed by the UAS core according to the procedures of RFC 3261 (see
Section 3.1.3.6). In addition, it must contain a Require header field containing the option tag 100rel, and
must include an RSeq header field. The value of the header field for the first reliable provisional
response in a transaction must be between 1 and 2^31 − 1. It is recommended that it be chosen uniformly
in this range. The RSeq numbering space is within a single transaction. This means that provisional
responses for different requests may use the same values for the RSeq number. The reliable
provisional response may contain a body. The usage of session descriptions is described later in the
context of offer–answer. The reliable provisional response is passed to the transaction layer periodically
with an interval that starts at T1 seconds and doubles for each retransmission (T1 is defined in RFC
3261; see Section 3.12).
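The retransmission timing just described (an interval that starts at T1 and doubles each time), together with the 64*T1 limit stated later in this table entry, can be made concrete with a short calculation. The function name `reliable_1xx_schedule` is hypothetical, and the default of 0.5 seconds assumes the usual RFC 3261 value of T1 (500 ms).

```python
def reliable_1xx_schedule(t1: float = 0.5) -> list:
    """Elapsed times (seconds after the first send) of each retransmission.

    The interval starts at T1 and doubles; retransmission stops once the
    next attempt would fall at or beyond 64*T1, at which point the UAS
    should reject the original request with a 5xx response.
    """
    limit = 64 * t1
    interval = t1
    elapsed = 0.0
    times = []
    while elapsed + interval < limit:
        elapsed += interval
        times.append(elapsed)
        interval *= 2
    return times
```

With T1 = 0.5 s the intervals are 0.5, 1, 2, 4, 8, and 16 seconds, so the retransmissions fall at 0.5, 1.5, 3.5, 7.5, 15.5, and 31.5 seconds, just inside the 32-second limit.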
Once passed to the server transaction, it is added to an internal list of unacknowledged reliable
provisional responses. The transaction layer will forward each retransmission passed from the UAS
core. This differs from retransmissions of 2xx responses, whose intervals cap at T2 seconds. This is
because retransmissions of ACK are triggered on receipt of a 2xx, but retransmissions of PRACK take
place independently of reception of 1xx. Retransmissions of the reliable provisional response cease
when a matching PRACK is received by the UA core. PRACK is like any other request within a dialog,
and the UAS core processes it according to the procedures of RFC 3261 (see Sections 3.1.3 and 3.6.2.1).
A matching PRACK is defined as one within the same dialog as the response, and whose method,
CSeq-num, and response-num in the RAck header field match, respectively, the method from the
CSeq, the sequence number from the CSeq, and the sequence number from the RSeq of the reliable
provisional response. If a PRACK request is received by the UA core that does not match any
unacknowledged reliable provisional response, the UAS must respond to the PRACK with a 481 Dialog/
Transaction Does Not Exist response. If the PRACK does match an unacknowledged reliable provisional
response, it must be responded to with a 2xx response. The UAS can be certain at this point that the
provisional response has been received in order. It should cease retransmissions of the reliable
provisional response, and must remove it from the list of unacknowledged provisional responses.
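The matching rule above can be sketched as a small tracker of unacknowledged reliable provisional responses. The class name `ReliableProvisionalTracker` and its methods are hypothetical. The sketch assumes the RAck header field value has the form "response-num CSeq-num method" (for example, `776656 1 INVITE`) and that the caller has already confirmed the PRACK belongs to the same dialog as the response.

```python
class ReliableProvisionalTracker:
    """Tracks reliable provisional responses awaiting a PRACK."""

    def __init__(self):
        # Each entry is (RSeq number, CSeq number, CSeq method).
        self.unacked = set()

    def sent(self, rseq: int, cseq_num: int, method: str) -> None:
        """Record a reliable provisional response that was just sent."""
        self.unacked.add((rseq, cseq_num, method))

    def on_prack(self, rack_value: str) -> int:
        """Return the status code to answer a PRACK with this RAck value."""
        rseq_s, cseq_s, method = rack_value.split()
        key = (int(rseq_s), int(cseq_s), method)
        if key not in self.unacked:
            return 481  # Dialog/Transaction Does Not Exist
        # Matched: stop retransmitting and drop it from the list.
        self.unacked.remove(key)
        return 200
```

A second PRACK carrying the same RAck value no longer matches anything in the list and therefore receives 481.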
If a reliable provisional response is retransmitted for 64*T1 seconds without reception of a corresponding
PRACK, the UAS should reject the original request with a 5xx response. If the PRACK contained a session
description, it is processed as described in Section 2.1 of this document. If the PRACK instead contained
any other type of body, the body is treated in the same way that a body in an ACK would be treated. After
the first reliable provisional response for a request has been acknowledged, the UAS may send additional
reliable provisional responses. The UAS must not send a second reliable provisional response until the
first is acknowledged. After the first, it is recommended that the UAS not send an additional reliable
provisional response until the previous one is acknowledged. The first reliable provisional response
receives special treatment because it conveys the initial sequence number. If additional reliable
provisional responses were sent before the first was acknowledged, the UAS could not be certain these
were received in order. The value of the RSeq in each subsequent reliable provisional response for the
same request must be greater by exactly 1. RSeq numbers must not wrap around. Because the initial one
is chosen to be less than 2^31 − 1, but the maximum is 2^32 − 1, there can be up to 2^31 reliable provisional
responses per request, which is more than sufficient.
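The timing and numbering rules in this paragraph can be illustrated with a short Python sketch. The function names are hypothetical, the value of 0.5 s for T1 is an assumed default, and the 32-bit upper bound in `next_rseq` is an assumption about the RSeq field width. The doubling interval, the 64*T1 limit, and the increment of exactly 1 are taken from the text.

```python
from __future__ import annotations

import random

T1 = 0.5  # seconds; assumed default (T1 itself is defined in RFC 3261)


def retransmission_times(t1: float = T1) -> list[float]:
    """Offsets, in seconds after the first send, at which the response is resent.

    The interval starts at T1 and doubles each time, with no T2 cap. The UAS
    gives up once 64*T1 seconds pass without a matching PRACK.
    """
    times = []
    elapsed, interval = 0.0, t1
    while elapsed + interval < 64 * t1:
        elapsed += interval
        times.append(elapsed)
        interval *= 2
    return times


def initial_rseq() -> int:
    """Pick the first RSeq at random from the range 1 to 2**31 - 1."""
    return random.randint(1, 2**31 - 1)


def next_rseq(previous: int) -> int:
    """Each later reliable provisional response uses exactly previous + 1."""
    if previous >= 2**32 - 1:          # assumed 32-bit maximum
        raise ValueError("RSeq must not wrap around")
    return previous + 1
```

With T1 = 0.5 s, the sketch yields six retransmissions before the 32-second limit is reached.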
The UAS MAY send a final response to the initial request before having received PRACKs for all
unacknowledged reliable provisional responses, unless the final response is 2xx and any of the
unacknowledged reliable provisional responses contained a session description. In that case, it must
not send a final response until those provisional responses are acknowledged. If the UAS does send a
final response when reliable responses are still unacknowledged, it should not continue to retransmit
the unacknowledged reliable provisional responses, but it must be prepared to process PRACK
requests for those outstanding responses. A UAS must not send new reliable provisional responses (as
opposed to retransmissions of unacknowledged ones) after sending a final response to a request.
UAC Behavior
When the UAC creates a new request, it can insist on a reliable delivery of provisional responses for that
request. To do that, it inserts a Require header field with the option tag 100rel into the request. A Require
header with the value 100rel must not be present in any request other than INVITE, although extensions to
SIP may allow its usage with other request methods. If the UAC does not wish to insist on usage of reliable
provisional responses, but merely indicate that it supports them if the UAS needs to send one, a Supported
header must be included in the request with the option tag 100rel. The UAC should include this in all
INVITE requests. If a provisional response is received for an initial request, and that response contains a
Require header field containing the option tag 100rel, the response is to be sent reliably. If the response is a
100 Trying (as opposed to 101 to 199), this option tag must be ignored, and the procedures below must not
be used. The provisional response must establish a dialog if one is not yet created. Assuming the response
is to be transmitted reliably, the UAC must create a new request with the PRACK method. This request is
sent within the dialog associated with the provisional response (indeed, the provisional response may have
created the dialog). PRACK requests may contain bodies, which are interpreted according to their type and
disposition. Note that the PRACK is like any other non-INVITE request within a dialog. In particular, a UAC
should not retransmit the PRACK request when it receives a retransmission of the provisional response
being acknowledged, although doing so does not create a protocol error.
Once a reliable provisional response is received, retransmissions of that response must be discarded. A
response is a retransmission when its dialog ID, CSeq, and RSeq match the original response. The UAC
must maintain a sequence number that indicates the most recently received in-order reliable
provisional response for the initial request. This sequence number must be maintained until a final
response is received for the initial request. Its value must be initialized to the RSeq header field in the
first reliable provisional response received for the initial request. Handling of subsequent reliable
provisional responses for the same initial request follows the same rules as above, with the following
difference: reliable provisional responses are guaranteed to be in order. As a result, if the UAC receives
another reliable provisional response to the same request, and its RSeq value is not one higher than
the value of the sequence number, that response must not be acknowledged with a PRACK, and must
not be processed further by the UAC. An implementation may discard the response, or may cache the
response in the hopes of receiving the missing responses. The UAC may acknowledge reliable
provisional responses received after the final response or may discard them.
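The UAC-side ordering rules can be sketched as a small state tracker. This is a simplified illustration: the class and function names are hypothetical, and the tracker assumes a single dialog and a single initial request, so only the RSeq value is compared. Any RSeq at or below the last in-order value is treated as an already-seen response and discarded.

```python
from typing import Optional


class ReliableResponseTracker:
    """Hypothetical per-request tracker kept by the UAC until a final response."""

    def __init__(self) -> None:
        self.last_rseq: Optional[int] = None   # most recent in-order RSeq

    def classify(self, rseq: int) -> str:
        """Return 'prack', 'retransmission', or 'out_of_order'."""
        if self.last_rseq is None:
            self.last_rseq = rseq              # first reliable response sets the base
            return "prack"
        if rseq == self.last_rseq + 1:
            self.last_rseq = rseq              # next in order: acknowledge it
            return "prack"
        if rseq <= self.last_rseq:
            return "retransmission"            # already seen: discard
        return "out_of_order"                  # gap: must not be acknowledged yet


def rack_value(rseq: int, cseq_num: int, method: str) -> str:
    """Build the RAck header value for the PRACK acknowledging a response."""
    return f"{rseq} {cseq_num} {method}"
```

Whether an out-of-order response is dropped or cached for later is left to the caller, mirroring the choice the text gives to implementations.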
PUBLISH The PUBLISH method (RFC 3903, see Section 5.2) is used by a SIP UA to publish the SIP-specific event
state that it is responsible for compositing, so that the state can be distributed to interested parties
through the SIP event mechanisms (SUBSCRIBE/NOTIFY). For instance, an application of SIP events for message
waiting indications might choose to collect the statuses of voice-mail boxes across a set of UAs using the
PUBLISH mechanism. Similarly, a presence UA may also PUBLISH the presence states for distribution to
the interested parties. The mandatory header fields in a PUBLISH are Accept, Accept-Encoding, Accept-
Language, Allow, Allow-Events, Event, Call-ID, CSeq, From, Max-Forwards, Min-Expires, Proxy-Authenticate,
SIP-ETag, To, Via, and WWW-Authenticate (RFC 3903: Standards Track).
REFER RFC 3515 defines the REFER method, which extends the basic SIP defined in RFC 3261. It is a new request
message that asks the recipient to refer to a resource provided in the request. It provides a mechanism
allowing the party sending the REFER to be notified of the outcome of the referenced request. In
addition to the REFER method, RFC 3515 defines the refer event package and the Refer-To request
header (see Section 2.8). This can be used to enable many applications, including Call Transfer. For
instance, if Alice is in a call with Bob, and decides Bob needs to talk to Carol, Alice can instruct her SIP
UA to send a SIP REFER request to Bob’s UA providing Carol’s SIP Contact information. Assuming Bob
has given it permission, Bob’s UA will attempt to call Carol using that contact. Bob’s UA will then report
whether it succeeded in reaching the contact to Alice’s UA.
REFER Method
The REFER method indicates that the recipient (identified by the Request-URI) should contact a third
party using the contact information provided in the request. Unless stated otherwise, the procedures for
emitting and responding to a REFER request are identical to those for a BYE request in RFC 3261. The
behavior of SIP entities not implementing the REFER (or any other unknown) method is explicitly
defined in RFC 3261. A REFER request implicitly establishes a subscription to the refer event. Event
subscriptions are defined in RFC 6665 (see Section 5.2). A REFER request may be placed outside the
scope of a dialog created with an INVITE. REFER creates a dialog, and may be Record-Routed; hence, it
must contain a single Contact header field value. REFERs occurring inside an existing dialog must
follow the Route/Record-Route logic of that dialog. The mandatory header fields in a REFER are Allow,
Call-ID, Contact, CSeq, From, Max-Forwards, Proxy-Authenticate, Refer-To, To, Via, and
WWW-Authenticate (see Section 2.8).
Message-Body Inclusion
A REFER method may contain a body. This specification assigns no meaning to such a body. A receiving
agent may choose to process the body according to its Content-Type.
Behavior of SIP User Agents
Forming a REFER Request
REFER is a SIP request and is constructed as defined in RFC 3261 (see Section 16.2). A REFER request
must contain exactly one Refer-To header field value.
Processing a REFER Request
A UA accepting a well-formed REFER request should request approval from the user to proceed (this
request could be satisfied with an interactive query or through accessing configured policy). If
approval is granted, the UA must contact the resource identified by the URI in the Refer-To header field
value, as discussed below. If the approval sought above for a well-formed REFER request is immediately
denied, the UA may decline the request. An agent responding to a REFER method MUST return a 400
Bad Request if the request contained zero or more than one Refer-To header field values. An agent
(including proxies generating local responses) may return a 100 Trying or any appropriate 4xx-6xx class
response as prescribed by RFC 3261.
Care should be taken when implementing the logic that determines whether or not to accept the REFER
request. A UA not capable of accessing non-SIP URIs should not accept REFER requests to them. If no
final response has been generated according to the rules above, the UA must return a 202 Accepted
response before the REFER transaction expires. If a REFER request is accepted (i.e., a 2xx class response
is returned), the recipient MUST create a subscription and send notifications of the status of the refer
as described below.
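The decision logic for an incoming REFER can be sketched as below. This is a simplified illustration rather than a complete UA: the function name, the `(name, value)` header representation, and the `approved` flag standing in for the user or policy decision are assumptions, and 603 is only one possible choice for declining the request.

```python
from __future__ import annotations


def respond_to_refer(headers: list[tuple[str, str]], approved: bool = True) -> int:
    """Pick a final status code for an incoming REFER request."""
    # Count Refer-To values; "r" is the compact form of the header name.
    refer_to = [value for name, value in headers
                if name.lower() in ("refer-to", "r")]
    if len(refer_to) != 1:
        return 400   # zero or more than one Refer-To value: Bad Request
    if not approved:
        return 603   # approval immediately denied (assumed choice of code)
    return 202       # accepted: a refer subscription and NOTIFYs follow
```

Returning 202 here commits the recipient to creating the subscription and reporting the outcome through NOTIFY.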
Accessing the Referred-to Resource
The resource identified by the Refer-To URI is contacted using the normal mechanisms for that URI
type. For example, if the URI is a SIP URI indicating INVITE (e.g., using a method=INVITE URI
parameter), the UA would issue a new INVITE using all of the normal rules for sending an INVITE
defined in RFC 3261.
Using SIP Events to Report the Results of the Reference
The NOTIFY mechanism defined in RFC 6665 (see Section 5.2) must be used to inform the agent sending
the REFER of the status of the reference. The dialog identifiers (To, From, and Call-ID) of each NOTIFY
must match those of the REFER as they would if the REFER had been a SUBSCRIBE request. Each
NOTIFY must contain an Event header field with a value of refer and possibly an id parameter
described below. Each NOTIFY must contain a body of type message/sipfrag defined in RFC 3420 (see
Section 2.8.2). The creation of a subscription as defined by RFC 6665 (see Section 5.2) always results in
an immediate NOTIFY. Analogous to the case for SUBSCRIBE described in that document, the agent
that issued the REFER must be prepared to receive a NOTIFY before the REFER transaction completes.
The implicit subscription created by a REFER is the same as a subscription created with a SUBSCRIBE
request. The agent issuing the REFER can terminate this subscription prematurely by unsubscribing
using the mechanisms described in RFC 6665 (see Section 5.2). Terminating a subscription, either by
explicitly unsubscribing or rejecting NOTIFY, is not an indication that the referenced request should be
withdrawn or abandoned. In particular, an agent acting on a REFER request should not issue a CANCEL
to any referenced SIP requests because the agent sending the REFER terminated its subscription to the
refer event before the referenced request completes. The agent issuing the REFER may extend its
subscription using the subscription refresh mechanisms described in RFC 6665 (see Section 5.2). REFER
is the only mechanism that can create a subscription to event refer. If a SUBSCRIBE request for event
refer is received for a subscription that does not already exist, it must be rejected with a 403 Forbidden.
Notice that unlike SUBSCRIBE, the REFER transaction does not contain a duration for the subscription
in either the request or the response. The lifetime of the state being subscribed to is determined by
the progress of the referenced request. The duration of the subscription is chosen by the agent
accepting the REFER, and is communicated to the agent sending the REFER in the subscription’s initial
NOTIFY (using the Subscription-State expires header parameter). Note that agents accepting REFER
and not wishing to hold subscription state can terminate the subscription with this initial NOTIFY.
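How the accepting agent communicates the subscription duration can be sketched as follows. The function name and the dictionary representation of header fields are hypothetical, and the `noresource` termination reason is an assumption chosen for the example.

```python
from __future__ import annotations

from typing import Optional


def refer_notify_headers(expires: Optional[int]) -> dict[str, str]:
    """Header fields for a NOTIFY on the implicit refer subscription.

    Pass a number of seconds to keep the subscription active, or None to
    terminate the subscription with this NOTIFY.
    """
    if expires is None:
        state = "terminated;reason=noresource"   # assumed reason value
    else:
        state = f"active;expires={expires}"
    return {
        "Event": "refer",
        "Subscription-State": state,
        "Content-Type": "message/sipfrag",
    }
```

An agent that does not wish to hold subscription state would call `refer_notify_headers(None)` for its initial NOTIFY.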
Body of NOTIFY
Each NOTIFY must contain a body of type message/sipfrag defined in RFC 3420 (see Section 2.8.2). The
body of a NOTIFY must begin with a SIP Response Status-Line as defined in RFC 3261. The response
class in this status line indicates the status of the referred action. The body may contain other SIP
header fields to provide information about the outcome of the referenced action. This body provides a
complete statement of the status of the referred action. The refer event package does not support state
deltas. If a NOTIFY is generated when the subscription state is pending, its body should consist only of
a status line containing a response code of 100 Trying. A minimal, but complete, implementation can
respond with a single NOTIFY containing either the body
SIP/2.0 100 Trying
if the subscription is pending, the body
SIP/2.0 200 OK
if the reference was successful, the body
SIP/2.0 503 Service Unavailable
if the reference failed, or the body
SIP/2.0 603 Declined
if the REFER request was accepted before approval to follow the reference could be obtained and that
approval was subsequently denied as described below.
An implementation may include more of a SIP message in that body to convey more information.
Warning header field values received in responses to the referred action are good candidates. In fact, if
the reference was to a SIP URI, the entire response to the referenced action could be returned
(perhaps to assist with debugging). However, doing so could have grave security repercussions (see
Section 19.2). Implementers must carefully consider what they choose to include. Note that if the
reference was to a non-SIP URI, status in any NOTIFY to the referrer must still be in the form of SIP
Response Status-Lines. The minimal implementation discussed above is sufficient to provide a basic
indication of success or failure. For example, if a client receives a REFER to an HTTP URL, and is
successful in accessing the resource, its NOTIFY to the referrer can contain the message/sipfrag body
of SIP/2.0 200 OK. If the notifier wishes to return additional non-SIP-specific information about the
status of the request, it may place it in the body of the sipfrag message.
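The minimal NOTIFY bodies described above can be generated with a small lookup. The outcome labels (`pending`, `success`, `failed`, `denied`) and the function name are hypothetical; the four status lines are the ones given in the text.

```python
from __future__ import annotations

SIPFRAG_STATUS = {
    "pending": "SIP/2.0 100 Trying",
    "success": "SIP/2.0 200 OK",
    "failed": "SIP/2.0 503 Service Unavailable",
    "denied": "SIP/2.0 603 Declined",
}


def sipfrag_body(outcome: str, extra_headers: tuple[str, ...] = ()) -> str:
    """Build a message/sipfrag body: a status line, then optional header lines."""
    lines = [SIPFRAG_STATUS[outcome], *extra_headers]
    return "\r\n".join(lines) + "\r\n"
```

Extra header lines should be chosen with care, since echoing too much of the referenced response can expose information to the referrer.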
REGISTER The REGISTER method is used by the UA to register Contact URI information against the AOR of the user,
notifying the SIP network where the user can be reached for a given period of time. In addition to
self-registration by the user, a third party can also register on the user’s behalf. For security, a
challenge–response exchange can be used to authenticate the registration. The mandatory header
fields in a REGISTER are Via, To, From, Call-ID, CSeq, and Max-Forwards (RFC 3261: Standards Track).
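As a rough illustration, the mandatory header list for REGISTER can be checked against a raw message as sketched below. The helper name, the example host names, and the example message are hypothetical, and the check is deliberately simple: it ignores compact header forms and does not validate header values.

```python
from __future__ import annotations

REGISTER_MANDATORY = ("Via", "To", "From", "Call-ID", "CSeq", "Max-Forwards")


def missing_headers(message: str, mandatory=REGISTER_MANDATORY) -> list[str]:
    """Return the mandatory header names that are absent from a raw SIP message."""
    head = message.split("\r\n\r\n", 1)[0]        # drop the body, if any
    present = set()
    for line in head.split("\r\n")[1:]:           # skip the request line
        if ":" in line and not line[0].isspace():  # skip folded continuation lines
            present.add(line.split(":", 1)[0].strip().lower())
    return [name for name in mandatory if name.lower() not in present]


# Hypothetical REGISTER request used only to exercise the check.
EXAMPLE_REGISTER = "\r\n".join([
    "REGISTER sip:registrar.example.com SIP/2.0",
    "Via: SIP/2.0/UDP client.example.com:5060;branch=z9hG4bK74bf9",
    "Max-Forwards: 70",
    "To: Bob <sip:bob@example.com>",
    "From: Bob <sip:bob@example.com>;tag=456248",
    "Call-ID: 843817637684230@client.example.com",
    "CSeq: 1826 REGISTER",
    "Contact: <sip:bob@192.0.2.4>",
    "Content-Length: 0",
]) + "\r\n\r\n"
```

The same helper can be reused for other methods by passing a different `mandatory` tuple.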
SUBSCRIBE The SUBSCRIBE method (RFC 6665, see Chapter 3) is used by a SIP UA to subscribe to a SIP-specific
event at another SIP UA in order to receive notifications asynchronously for a certain period of time.
A successful subscription establishes a dialog between UAC and UAS. The mandatory header fields in a
SUBSCRIBE are Allow, Allow-Events, Call-ID, Contact, CSeq, Expires, Event, From, Max-Forwards, Min-
Expires, Proxy-Authenticate, To, Via, and WWW-Authenticate (RFC 6665: Standards Track).
UPDATE The UPDATE method (RFC 3311, see Chapter 3) is used by a SIP UA to update parameters of a session (such
as the set of media streams and their codecs) but has no impact on the state of a dialog. It is like re-INVITE,
but unlike re-INVITE, UPDATE can be sent before the initial INVITE has been completed. This makes it very
useful for updating session parameters within early dialogs, where a re-INVITE cannot be sent because the
transaction started by the initial INVITE has not yet completed. The mandatory
header fields in an UPDATE are Allow, Call-ID, Contact, CSeq, From, Max-Forwards, Proxy-Authenticate, To,
Unsupported, Via, and WWW-Authenticate (RFC 3311: Standards Track).
Provisional (1xx), Success (2xx), Redirection (3xx), Client Error (4xx), Server Error (5xx), and Global Failure (6xx).
Table 2.4 describes the SIP response messages briefly.

2.7 SIP Call and Media Trapezoid Operation

We would like to briefly introduce here the SIP operation at a high level, although the actual protocol details are
provided in the subsequent chapters. We will provide a review of SIP in a nutshell: how registration is performed,
and how a session is created, updated, and terminated between the participants. Figure 2.6a illustrates a SIP
network with two SIP UAs known as Alice and Bob, and two SIP proxies designated as outgoing proxy and incoming
proxy. SIP signaling flows between the UAs and the proxies, and media flows directly between the UAs; this
arrangement, shown in Figure 2.6a, is frequently referred to as the SIP trapezoid. We are assuming that Alice,
being in the atlanta.com domain, is trying to
Provisional (1xx): Provisional responses indicate that the server contacted is performing some further action
and does not yet have a definitive response, typically because the final response will take more time (say, more than
around 200 ms). This class of responses is not transmitted reliably, and they never cause the client to send an ACK. These
responses may contain message bodies, including session descriptions. However, if reliable delivery of provisional
responses is desired, the 100rel extension can be used, in which the client acknowledges each such response with a PRACK.
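The six response classes follow directly from the first digit of the status code, which a short Python sketch can make concrete. The function names are hypothetical. The `needs_prack` check reflects the rule that a 100 Trying response is never acknowledged with PRACK, even if it carries the 100rel option tag.

```python
RESPONSE_CLASSES = {
    1: "Provisional",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
    6: "Global Failure",
}


def response_class(status_code: int) -> str:
    """Map a SIP status code to the name of its response class."""
    if not 100 <= status_code <= 699:
        raise ValueError(f"not a SIP status code: {status_code}")
    return RESPONSE_CLASSES[status_code // 100]


def is_final(status_code: int) -> bool:
    """Every response outside the 1xx class is a final response."""
    return status_code >= 200


def needs_prack(status_code: int, require_tags: set) -> bool:
    """True for a 101 to 199 response whose Require header lists 100rel."""
    return 101 <= status_code <= 199 and "100rel" in require_tags
```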
100 Trying This response indicates that the request has been received by the next-hop server and that some
unspecified action is being taken on behalf of this call (e.g., a database is being consulted). This
response, like all other provisional responses, stops retransmissions of an INVITE by a UAC.
This response is different from other provisional responses, in that it is never forwarded
upstream by a stateful proxy (RFC 3261: Standards Track).
180 Ringing The UA receiving the INVITE is trying to alert the user. This response may be used to initiate
local ringback (RFC 3261: Standards Track).
181 Call Is Being A server may use this status code to indicate that the call is being forwarded to a different set of
Forwarded destinations (RFC 3261: Standards Track).
182 Call Queued The called party is temporarily unavailable, but the server has decided to queue the call rather
than reject it. When the callee becomes available, it will return the appropriate final status
response. The reason phrase MAY give further details about the status of the call, for example,
“five calls queued; expected waiting time is 15 minutes.” The server may issue several 182
Queued responses to update the caller about the status of the queued call (RFC 3261: Standards
Track).
183 Session The 183 Session Progress response is used to convey information about the progress of the
Progress call that is not otherwise classified. The reason phrase, header fields, or message body MAY
be used to convey more details about the call progress (RFC 3261: Standards Track).
199 Early Dialog RFC 6228 (see Section 3.6.6 for more details) defines a new SIP response code, 199 Early Dialog
Terminated Terminated, that a SIP forking proxy and a UAS can use to indicate to upstream SIP entities,
including the UAC, that an early dialog has been terminated before a final response is sent
toward those SIP entities. A UAS can send a 199 response before sending a non-2xx final
response for the same purpose. SIP entities that receive the 199 response can use it to trigger
the release of resources associated with the terminated early dialog. In addition, SIP entities
might also use the 199 response to make policy decisions related to early dialogs. For example,
a SIP entity that controls a media gate might use the 199 response when deciding for which early
dialogs media will be passed (RFC 6228: Standards Track).
Success (2xx): The Success class of responses is sent by a server to indicate that the request has succeeded.
200 OK The request has succeeded. The information returned with the response depends on the
method used in the request (RFC 3261: Standards Track).
202 Accepted This response indicates that the request has been accepted and understood by the server, but
the request may not yet have been authorized or processed by the server (RFC 6665: Standards
Track).
204 No Notification The 204 No Notification response code indicates that the request was successful, but the
notification associated with the request will not be sent. It is valid only in response to a
SUBSCRIBE message sent within an established dialog (RFC 5839: Standards Track).
Redirection (3xx): The Redirection class of responses gives information about the user’s new location, or about
alternative services that might be able to satisfy the call.
300 Multiple The address in the request resolved to several choices, each with its own specific location, and
Choices the user (or UA) can select a preferred communication end point and redirect its request to that
location. The response may include a message body containing a list of resource characteristics
and location(s) from which the user or UA can choose the one most appropriate, if allowed by
the Accept request header field. However, no Multipurpose Internet Mail Extensions (MIME)
types have been defined for this message body. The choices should also be listed as Contact
fields in RFC 3261 (see Section 2.8).
Unlike HTTP, the SIP response may contain several Contact fields or a list of addresses in a Contact
field. UAs may use the Contact header field value for automatic redirection or may ask the user to
confirm a choice. However, this specification does not define any standard for such automatic
selection. This status response is appropriate if the callee can be reached at several different
locations, and the server cannot or prefers not to proxy the request (RFC 3261: Standards Track).
301 Moved The user can no longer be found at the address in the Request-URI, and the requesting client
Permanently should retry at the new address given by the Contact header field. The requestor should update
any local directories, address books, and user location caches with this new value and redirect
future requests to the address (or addresses) listed (RFC 3261: Standards Track).
302 Moved The requesting client should retry the request at the new address (or addresses) given by the
Temporarily Contact header field. The Request-URI of the new request uses the value of the Contact header
field in the response.
The duration of the validity of the Contact URI can be indicated through an Expires header field
or an Expires parameter in the Contact header field. Both proxies and UAs may cache this URI
for the duration of the expiration time. If there is no explicit expiration time, the address is only
valid once for recursing, and must not be cached for future transactions.
If the URI cached from the Contact header field fails, the Request-URI from the redirected request
may be tried again a single time. The temporary URI may have become out of date sooner than
the expiration time, and a new temporary URI may be available (RFC 3261: Standards Track).
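The caching rule for a 302 redirect target can be sketched as below. This is an illustrative simplification: the function name is hypothetical, the header values are passed in as plain strings, and checking the Contact `expires` parameter before the Expires header field is an assumption of the sketch.

```python
import re
from typing import Optional


def contact_cache_lifetime(contact: str,
                           expires_header: Optional[str] = None) -> Optional[int]:
    """Seconds a redirect target may be cached, or None if it must not be cached."""
    # Look only at header parameters after the closing '>' so that URI
    # parameters inside the angle brackets are not mistaken for them.
    params = contact.rsplit(">", 1)[-1]
    match = re.search(r";\s*expires\s*=\s*(\d+)", params, re.IGNORECASE)
    if match:
        return int(match.group(1))
    if expires_header is not None and expires_header.strip().isdigit():
        return int(expires_header.strip())
    return None   # no explicit expiration: valid once for recursing only
```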
305 Use Proxy The requested resource must be accessed through the proxy given by the Contact field. The Contact
field gives the URI of the proxy. The recipient is expected to repeat this single request via the proxy.
The 305 Use Proxy responses must only be generated by UASs (RFC 3261: Standards Track).
380 Alternative The call was not successful, but alternative services are possible. The
Service alternative services are described in the message body of the response. Formats for such
bodies are not defined here, and may be the subject of future standardization (RFC 3261:
Standards Track).
Client Error (4xx): The Client Error class of responses are definite failure responses from a particular server. The client
should not retry the same request without modification (e.g., adding appropriate authorization). However, the same
request to a different server might be successful.
400 Bad Request The request could not be understood because of malformed syntax. The Reason-Phrase should
identify the syntax problem in more detail, for example, Missing Call-ID header field (RFC 3261:
Standards Track).
401 Unauthorized The request requires user authentication. This response is issued by UASs and registrars, while
407 Proxy Authentication Required is used by proxy servers (RFC 3261: Standards Track).
402 Payment This response is reserved for future use of SIP calls such as call completion charges (RFC 3261:
Required Standards Track).
403 Forbidden The server understood the request but is refusing to fulfill it. Authorization will not help, and the
request should not be repeated (RFC 3261: Standards Track).
404 Not Found The server has definitive information that the user does not exist at the domain specified in the
Request-URI. This status is also returned if the domain in the Request-URI does not match any
of the domains handled by the recipient of the request (RFC 3261: Standards Track).
405 Method Not The method specified in the Request-Line is understood but not allowed for the address
Allowed identified by the Request-URI. The response must include an Allow header field containing a
list of valid methods for the indicated address (RFC 3261: Standards Track).
406 Not Acceptable The resource identified by the request is only capable of generating response entities that have
content characteristics not acceptable according to the Accept header field sent in the request
(RFC 3261: Standards Track).
407 Proxy This code is similar to 401 Unauthorized, but indicates that the client must first authenticate
Authentication itself with the proxy. This status code can be used for applications where access to the
Required communication channel (e.g., a telephony gateway) rather than the callee requires
authentication (RFC 3261: Standards Track).
408 Request The server could not produce a response within a suitable amount of time, for example, if it
Timeout could not determine the location of the user in time. The client may repeat the request without
modifications at any later time (RFC 3261: Standards Track).
409 Conflict This response code indicates that the request has created a conflict and cannot be processed.
However, RFC 3261, which supersedes RFC 2543, has removed this response code (RFC 3261:
Standards Track).
410 Gone The requested resource is no longer available at the server, and no forwarding address is known.
This condition is expected to be considered permanent. If the server does not know, or has no
facility to determine, whether or not the condition is permanent, the status code 404 Not Found
should be used instead (RFC 3261: Standards Track).
411 Length This response code can be generated by a SIP proxy that switches the transport protocol from
Required UDP to TCP if a request message contains a message body but not the Content-Length header
because the Content-Length is more critical to TCP requests. However, RFC 3261, which
supersedes RFC 2543, has removed this response code (RFC 3261: Standards Track).
412 Conditional The 412 Conditional Request Failed response code is added to the Client-Error header field
Request Failed definition. 412 Conditional Request Failed is used to indicate that the precondition given for the
request has failed. That is, if there is no matching event state, for example, the event state to be
refreshed has already expired, the Event Publication Agent, acting as a SIP UAC in issuing the
PUBLISH request, receives a 412 Conditional Request Failed response to the PUBLISH request
(RFC 3903: Standards Track).
413 Request Entity The server is refusing to process a request because the request entity-body is larger than the
Too Large server is willing or able to process. The server may close the connection to prevent the client
from continuing the request. If the condition is temporary, the server should include a Retry-
After header field to indicate that it is temporary and after what time the client may try again
(RFC 3261: Standards Track).
414 Request-URI The server is refusing to service the request because the Request-URI is longer than the server is
Too Long willing to interpret (RFC 3261: Standards Track).
415 Unsupported The server is refusing to service the request because the message body of the request is in a
Media Type format not supported by the server for the requested method. The server must return a list of
acceptable formats using the Accept, Accept-Encoding, or Accept-Language header field,
depending on the specific problem with the content (RFC 3261: Standards Track).
416 Unsupported The server cannot process the request because the scheme of the URI in the Request-URI is
URI Scheme unknown to the server (RFC 3261: Standards Track).
417 Unknown This response code is used by a SIP functional entity that acts as a resource-priority (RP) actor
Resource-Priority when it does not understand any of the resource values in the request. However, the treatment
depends on the presence of the Require resource-priority option tag:
1. Without the option tag, the RP actor treats the request as if it contained no Resource-Priority
header field and processes it with default priority. Resource values that are not understood
must not be modified or deleted.
2. With the option tag, it must reject the request with a 417 Unknown Resource-Priority
response code.
Making case 1 the default is necessary since otherwise, there would be no way to successfully
complete any calls in the case where a proxy on the way to the UAS shares no common
namespaces with the UAC, but the UAC and UAS do have such a namespace in common. In
general, as noted, a SIP request can contain more than one Resource-Priority header field. This
is necessary if a request needs to traverse different administrative domains, each with its own
set of valid resource values. For example, the ETS namespace might be enabled for US
government networks that also support the Defense Switched Network (DSN) or Defense
Red Switched Network (DRSN) namespaces for most individuals in those domains. A 417
Unknown Resource-Priority response may, according to local policy, include an Accept-
Resource-Priority header field enumerating the acceptable resource values.
SIP UACs supporting RFC 4412 must be able to generate the Resource-Priority header field for
requests that require elevated resource access priority. The UAC should be able to generate
more than one resource value in a single SIP request. Upon receiving a 417 Unknown
Resource-Priority response, the UAC may attempt a subsequent request with the same or
different resource value. If available, it should choose authorized resource values from the set
of values returned in the Accept-Resource-Priority header field (RFC 4412: Standards Track).
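The two cases for an RP actor can be sketched in Python as follows. This is a simplified illustration: the function name and return labels are hypothetical, and a resource value is treated as understood whenever its namespace is known, which ignores checks on the priority part of the value.

```python
from __future__ import annotations


def handle_resource_priority(resource_values: list[str],
                             known_namespaces: set[str],
                             require_tags: set[str]) -> tuple[str, list[str]]:
    """Decide how an RP actor treats the Resource-Priority values of a request.

    Values look like 'namespace.priority', for example 'dsn.flash'.
    """
    understood = [value for value in resource_values
                  if value.split(".", 1)[0].lower() in known_namespaces]
    if understood:
        return "prioritize", understood      # at least one value is usable
    if "resource-priority" in require_tags:
        return "reject_417", []              # case 2: option tag present
    return "default", []                     # case 1: default priority, values left untouched
```

In the `default` case the sketch returns no values to act on but does not alter the request, in line with the rule that values that are not understood must not be modified or deleted.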
420 Bad Extension The server did not understand the protocol extension specified in a Proxy-Require or Require
header field. The server must include a list of the unsupported extensions in an Unsupported
header field in the response (RFC 3261: Standards Track).
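A server-side check that produces a 420 response can be sketched as below. The function name and the representation of option tags as Python collections are assumptions made for the example.

```python
from __future__ import annotations

from typing import Optional


def check_required_extensions(require_tags: list[str],
                              server_supports: set[str]) -> tuple[Optional[int], dict[str, str]]:
    """Return (420, headers) if any required option tag is unsupported, else (None, {})."""
    unsupported = [tag for tag in require_tags if tag not in server_supports]
    if unsupported:
        # The 420 response lists the offending option tags in Unsupported.
        return 420, {"Unsupported": ", ".join(unsupported)}
    return None, {}
```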
421 Extension The UAS needs a particular extension to process the request; however, this extension is not
Required listed in a Supported header field in the request. Responses with this status code must contain
a Require header field listing the required extensions. A UAS should not use this response
unless it truly cannot provide any useful service to the client. Instead, if a desirable extension is
not listed in the Supported header field, servers should process the request using baseline SIP
capabilities and any extensions supported by the client (RFC 3261: Standards Track).
422 Session Interval Too Small This response code is used by a server to reject a request whose Session-Expires
header field specifies too short an interval. Rejecting such requests prevents excessive SIP signaling
traffic, especially from re-INVITE and UPDATE: if the session has to be refreshed
frequently because of a short interval, it creates excessive traffic (RFC 4028: Standards Track).
423 Interval Too Brief The server is rejecting the request because the expiration time of the resource refreshed by
the request is too short. This response can be used by a registrar to reject a registration
whose Contact header field expiration time was too small (RFC 3261: Standards Track).
424 Bad Location Information The 424 Bad Location Information response code is a rejection of the request due to its location
contents, indicating location information that was malformed or not satisfactory for the
recipient’s purpose, or could not be dereferenced. A SIP intermediary can also reject a location it
receives from a Target when it understands the Target to be in a different location; in that case,
the intermediary includes the proper location in the 424 response. This location should be
included in the response as a MIME message body (i.e., a location value) rather than as a URI;
however, in cases where the intermediary is willing to share location with recipients but not
with a UA, a reference might be necessary.
SIP intermediaries that are forwarding (as opposed to generating) a 424 response must not add,
modify, or delete any location appearing in that response. This specifically applies to
intermediaries that are between the 424 response generator and the original UAC. The
Geolocation and Geolocation-Error header fields and Presence Information Data Format
(PIDF)–Location Object (LO) (PIDF–LO) body parts must remain unchanged, never added to, or
deleted.
The Geolocation-Error header field must be included in the 424 response. It is only appropriate
to generate a 424 response when the responding entity needs a locationValue and there are no
values in the request that are usable by the responder, or when the responder has additional
location information to provide.
A 424 response must not be sent in response to a request that lacks a Geolocation header
entirely, as the UA in that case may not support this extension at all. If a SIP intermediary
inserted a locationValue into a SIP request where one was not previously present, it must take
any and all responsibility for the corrective action if it receives a 424 response to a SIP request it
sent.
A 424 (Bad Location Information) response is a final response within a transaction and must not
terminate an existing dialog (RFC 6442: Standards Track).
428 Use Identity Header This response is sent when a verifier receives a SIP request that lacks an Identity header; it indicates
that the request should be re-sent with an Identity header (RFC 4474: Standards Track).
429 Provide Referrer Identity This response is used by a server to indicate that the Referred-By header field is to be re-sent with
a valid Referred-By security token, which is carried in an S/MIME message body
(RFC 3892: Standards Track).
430 Flow Failed This response code is used by an edge proxy to indicate to the authoritative proxy that a specific
flow to a UA instance has failed. Other flows to the same instance could still succeed. The
authoritative proxy should attempt to forward to another target (flow) with the same instance-id
and AOR. End points should never receive a 430 Flow Failed response. If an end point receives a
430 Flow Failed response, it should treat it as a 400 Bad Request per the normal procedures of
RFC 3261 (RFC 5626: Standards Track).
433 Anonymity Disallowed This response indicates that the server refused to fulfill the request because the requestor was
anonymous. Its default reason phrase is Anonymity Disallowed (RFC 5079: Standards Track).
436 Bad Identity-Info This response is sent when the Identity-Info header contains a URI that cannot be dereferenced
by the verifier (either the URI scheme is unsupported by the verifier, or the resource
designated by the URI is otherwise unavailable) (RFC 4474: Standards Track).
437 Unsupported Certificate This response is sent when the verifier cannot validate the certificate referenced by the URI of
the Identity-Info header, because, for example, the certificate is self-signed, or signed by a root
certificate authority for which the verifier does not possess a root certificate (RFC 4474:
Standards Track).
438 Invalid Identity Header This response is sent when the verifier receives a message with an Identity signature that does not
correspond to the digest-string calculated by the verifier (RFC 4474: Standards Track).
439 First Hop Lacks Outbound Support This response code is used by a registrar to indicate that it supports the outbound feature
described in RFC 5626, but that the first outbound proxy that the user is attempting to
register through does not. It should be noted that this response code is only appropriate in the
case that the registering UA advertises support for outbound processing by including the
outbound option tag in a Supported header field. Proxies must not send a 439 response to any
requests that do not contain a reg-id parameter and an outbound option tag in a Supported
header field (RFC 5626: Standards Track).
440 Max-Breadth Exceeded A proxy sends this response code if the value in the Max-Breadth header field is too small to carry out
a desired parallel fork. A client receiving a 440 Max-Breadth Exceeded response
can infer that its request did not reach all possible destinations (RFC 5393: Standards Track).
469 Bad Info Package If a UA receives an INFO request associated with an Info Package that the UA has not indicated
willingness to receive, the UA must send a 469 Bad Info Package response, which contains a
Recv-Info header field with Info Packages for which the UA is willing to receive INFO requests. The
UA must not use the response to update the set of Info Packages, but simply to indicate the current
set. In the terminology of multiple dialog usages defined in RFC 5057 (see Sections 3.6.6 and 16.2),
this represents a Transaction-Only failure, and does not terminate the invite dialog usage. If a UA
receives an INFO request associated with an Info Package, and the message-body part with
Content-Disposition Info-Package has a MIME type that the UA supports but not in the context of
that Info Package, it is recommended that the UA send a 415 Unsupported Media Type response.
The UA may send other error responses, such as Request Failure (4xx), Server Failure (5xx), and
Global Failure (6xx), in accordance with the error-handling procedures defined in RFC 3261.
Otherwise, if the INFO request is syntactically correct and well structured, the UA must send a
200 OK response. It should be noted that if the application needs to reject the information that
it received in an INFO request, it needs to be done on the application level. That is, the
application needs to trigger a new INFO request that contains information that the previously
received application data was not accepted. Individual Info Package specifications need to
describe the details for such procedures (RFC 6086: Standards Track).
480 Temporarily Unavailable The callee’s end system was contacted successfully but the callee is currently unavailable (e.g., is
not logged in, logged in but in a state that precludes communication with the callee, or has
activated the do not disturb feature). The response may indicate a better time to call in the
Retry-After header field. The user could also be available elsewhere (unbeknownst to this
server). The reason phrase should indicate a more precise cause as to why the callee is
unavailable. This value should be settable by the UA. Status 486 (Busy Here) may be used to
more precisely indicate a particular reason for the call failure. This status is also returned by a
redirect or proxy server that recognizes the user identified by the Request-URI, but does not
currently have a valid forwarding location for that user (RFC 3261: Standards Track).
481 Call/Transaction Does Not Exist This status indicates that the UAS received a request that does not match any existing dialog or
transaction (RFC 3261: Standards Track).
482 Loop Detected The server has detected a loop. It means that the request has been routed back to the server that
previously forwarded the same request message (RFC 3261: Standards Track).
483 Too Many Hops The server received a request that contains a Max-Forwards header field with the value zero
(RFC 3261: Standards Track).
484 Address Incomplete The server received a request with a Request-URI that was incomplete. Additional information
should be provided in the reason phrase. This status code allows overlapped dialing. With
overlapped dialing, the client does not know the length of the dialing string. It sends strings of
increasing lengths, prompting the user for more input, until it no longer receives a 484 Address
Incomplete status response (RFC 3261: Standards Track).
485 Ambiguous The Request-URI was ambiguous. The response may contain a listing of possible unambiguous
addresses in Contact header fields. Revealing alternatives can infringe on the privacy of the
user or the organization. It must be possible to configure a server to respond with status 404
Not Found or to suppress the listing of possible choices for ambiguous Request-URIs. Example
response to a request with the Request-URI:
Some e-mail and voice-mail systems provide this functionality. A status code separate from 3xx is
used since the semantics are different: for 300, it is assumed that the same person or service
will be reached by the choices provided. While an automated choice or sequential search
makes sense for a 3xx response, user intervention is required for a 485 Ambiguous response
(RFC 3261: Standards Track).
486 Busy Here The callee’s end system was contacted successfully, but the callee is currently not willing or
able to take additional calls at this end system. The response may indicate a better time to
call in the Retry-After header field. The user could also be available elsewhere, such as
through a voice-mail service. Status 600 Busy Everywhere should be used if the client knows
that no other end system will be able to accept this call (RFC 3261: Standards Track).
470 Consent Needed A 470 Consent Needed response indicates that the request that triggered the response
contained a URI list with at least one URI for which the relay had no permissions. A UAS
generating a 470 Consent Needed response should include a Permission-Missing header field
in it. This header field carries the URI or URIs for which the relay had no permissions. A UAC
receiving a 470 Consent Needed response without a Permission-Missing header field needs to
use an alternative mechanism, for example, Extensible Markup Language (XML) Configuration
Access Protocol (XCAP), to discover for which URI or URIs there were no permissions. A client
receiving a 470 Consent Needed response uses a manipulation mechanism (e.g., XCAP) to add
those URIs to the relay’s list of URIs. The relay will obtain permissions for those URIs as usual
(RFC 5360: Standards Track).
487 Request Terminated The request was terminated by a BYE or CANCEL request. This response is never returned for a
CANCEL request itself (RFC 3261: Standards Track).
488 Not Acceptable Here The response has the same meaning as 606 Not Acceptable, but only applies to the specific
resource addressed by the Request-URI and the request may succeed elsewhere. A message
body containing a description of media capabilities may be present in the response, which is
formatted according to the Accept header field in the INVITE (or application/sdp if not
present), the same as a message body in a 200 OK response to an OPTIONS request (RFC 3261:
Standards Track).
489 Bad Event This response is used by a server to reject a subscription request or notification containing an
event package that is unknown or not supported by the server (RFC 6665: Standards Track).
491 Request Pending The request was received by a UAS that had a pending request within the same dialog. This
response can be used to resolve glare situations (RFC 3261: Standards Track).
493 Undecipherable The UAS received a request containing an encrypted MIME body for which the
recipient does not possess or will not provide an appropriate decryption key. This response
may have a single body containing an appropriate public key that should be used to encrypt
MIME bodies sent to this UA (RFC 3261: Standards Track).
494 Security Agreement Required A server receiving an unprotected request that contains a Require or Proxy-Require header field
with the value sec-agree must respond to the client with a 494 Security Agreement Required
response. The server must add a Security-Server header field to this response listing the
security mechanisms that the server supports. The server must add its list to the response even
if there are no common security mechanisms in the client’s and server’s lists. The server’s list
must not depend on the contents of the client’s list.
If digest is chosen, the 494 Security Agreement Required response will contain an HTTP Digest
authentication challenge. The client must use the algorithm and quality-of-protection (qop)
parameters in the Security-Server header field to replace the same parameters in the HTTP
Digest challenge. The client must also use the digest-verify parameter in the Security-Verify
header field to protect the Security-Server header field as specified in RFC 3329 (see Section
19.3) (RFC 3329: Standards Track).
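Several of the 4xx responses above are described as carrying a specific header field that tells the client what to change before it retries. A minimal sketch of that pairing is given below. The dictionary and the function name retry_hint are illustrative only and are not part of any SIP library; the pairs are taken from the descriptions above (for 417 and 470 the header field may be absent, which the function reports as None).

```python
# Header field that each response is described as carrying in the table above.
# The mapping is a hypothetical helper, not an API of any SIP stack.
COMPANION_HEADER = {
    417: "Accept-Resource-Priority",  # optional, subject to local policy
    420: "Unsupported",
    421: "Require",
    424: "Geolocation-Error",
    469: "Recv-Info",
    470: "Permission-Missing",        # recommended, may be missing
    494: "Security-Server",
}


def retry_hint(status, headers):
    """Return (header name, value) the client can use to adjust a retry, or None."""
    name = COMPANION_HEADER.get(status)
    if name is None:
        return None
    # SIP header field names are case-insensitive, so compare in lower case.
    lowered = {key.lower(): value for key, value in headers.items()}
    value = lowered.get(name.lower())
    return (name, value) if value is not None else None


if __name__ == "__main__":
    print(retry_hint(420, {"Unsupported": "100rel"}))
    print(retry_hint(470, {}))
```

A client that gets None for a 470 response would, as noted above, fall back to another mechanism such as XCAP to find the URIs that lack permissions.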
Server Error (5xx): The Server Error (5xx) class of responses consists of failure responses given when a server itself
has erred. The Retry-After header may be used with this class of response, indicating that the request may be re-sent
after a certain period of time.
500 Server Internal Error The server encountered an unexpected condition that prevented it from fulfilling the request.
The client may display the specific error condition and may retry the request after several
seconds. If the condition is temporary, the server may indicate when the client may retry the
request using the Retry-After header field (RFC 3261: Standards Track).
501 Not Implemented The server does not support the functionality required to fulfill the request. This is the
appropriate response when a UAS does not recognize the request method and is not capable
of supporting it for any user. However, proxies forward all requests regardless of method. Note
that a 405 Method Not Allowed is sent when the server recognizes the request method, but that
method is not allowed or supported (RFC 3261: Standards Track).
502 Bad Gateway The server, while acting as a gateway or proxy, received an invalid response from the downstream
server it accessed in attempting to fulfill the request (RFC 3261: Standards Track).
503 Service Unavailable The server is temporarily unable to process the request due to a temporary overloading or
maintenance of the server. The server may indicate when the client should retry the request in
a Retry-After header field. If no Retry-After is given, the client must act as if it had received a 500
Server Internal Error response. A client (proxy or UAC) receiving a 503 Service Unavailable
should attempt to forward the request to an alternate server. It should not forward any other
requests to that server for the duration specified in the Retry-After header field, if present.
Servers may refuse the connection or drop the request instead of responding with 503 Service
Unavailable (RFC 3261: Standards Track).
504 Server Time-out The server did not receive a timely response from an external server it accessed in attempting to
process the request. 408 Request Timeout should be used instead if there was no response
within the period specified in the Expires header field from the upstream server (RFC 3261:
Standards Track).
505 Version Not Supported The server does not support, or refuses to support, the SIP version that was used in the request.
The server is indicating that it is unable or unwilling to complete the request using the same
major version as the client, other than with this error message (RFC 3261: Standards Track).
513 Message Too Large The server was unable to process the request since the message length exceeded its capabilities
(RFC 3261: Standards Track).
580 Precondition Failure When a UAS acting as an answerer cannot or is not willing to meet the preconditions in the
offer, it should reject the offer by returning a 580 Precondition Failure response (RFC 3312:
Standards Track).
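The client-side rule for 503 Service Unavailable described above (avoid the server for the Retry-After period, or fall back to 500 handling when no Retry-After is present) can be sketched as follows. The function name handle_503 and the tuple it returns are assumptions made for illustration; only the leading delta-seconds value of Retry-After is interpreted, and any comment or parameter after it is ignored.

```python
def handle_503(headers, now):
    """Sketch of a client's reaction to a 503 Service Unavailable response.

    headers: dict of response header fields.
    now: current time in seconds.
    Returns (effective_status, avoid_server_until).
    """
    lowered = {key.lower(): value for key, value in headers.items()}
    retry_after = lowered.get("retry-after")
    if retry_after is None:
        # No Retry-After: act as if a 500 Server Internal Error was received.
        return (500, None)
    # Keep only the leading number, e.g. "120 (maintenance)" or
    # "18000;duration=3600" both start with the delta-seconds value.
    seconds = int(retry_after.split(";")[0].split("(")[0].strip())
    # Send no further requests to this server until the period has elapsed;
    # the caller is expected to try an alternate server in the meantime.
    return (503, now + seconds)


if __name__ == "__main__":
    print(handle_503({"Retry-After": "120"}, now=1000.0))
    print(handle_503({}, now=1000.0))
```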
Global Failure (6xx): The Global Failure (6xx) class of responses indicates that a server has definitive information about a
particular user, not just the particular instance indicated in the Request-URI.
600 Busy Everywhere The callee’s end system was contacted successfully but the callee is busy and does not wish to
take the call at this time. The response may indicate a better time to call in the Retry-After
header field. If the callee does not wish to reveal the reason for declining the call, the callee
uses status code 603 Decline instead. This status response is returned only if the client knows
that no other end point (such as a voice-mail system) will answer the request. Otherwise, 486
Busy Here should be returned (RFC 3261: Standards Track).
603 Decline The callee’s machine was successfully contacted but the user explicitly does not wish to or
cannot participate. The response may indicate a better time to call in the Retry-After header
field. This status response is returned only if the client knows that no other end point will
answer the request (RFC 3261: Standards Track).
604 Does Not Exist Anywhere The server has authoritative information that the user indicated in the Request-URI does not
exist anywhere (RFC 3261: Standards Track).
606 Not Acceptable The UA was contacted successfully but some aspects of the session description such as the
requested media, bandwidth, or addressing style were not acceptable. A 606 Not Acceptable
response means that the user wishes to communicate, but cannot adequately support the
session described. The 606 Not Acceptable response may contain a list of reasons in a
Warning header field describing why the session described cannot be supported.
A message body containing a description of media capabilities may be present in the response,
which is formatted according to the Accept header field in the INVITE (or application/sdp if not
present), the same as a message body in a 200 OK response to an OPTIONS request.
It is hoped that negotiation will not frequently be needed, and when a new user is being invited
to join an already existing conference, negotiation may not be possible. It is up to the invitation
initiator to decide whether or not to act on a 606 Not Acceptable response.
This status response is returned only if the client knows that no other end point will answer the
request.
Operators of UAs or proxy servers that authenticate received requests must adhere to the
following guidelines for creation of a realm string for their server:
• Realm strings must be globally unique. It is recommended that a realm string contain a host
name or domain name, following the recommendation in RFC 2617 (see Section 19.4.5).
• Realm strings should present a human-readable identifier that can be rendered to a user.
For example:
Generally, SIP authentication is meaningful for a specific realm, a protection domain. Thus, for
Digest authentication, each such protection domain has its own set of user names and
passwords. If a server does not require authentication for a particular request, it may accept a
default user name, anonymous, which has no password (password of “”). Similarly, UACs
representing many users, such as PSTN gateways, may have their own device-specific user
name and password, rather than accounts for particular users, for their realm.
While a server can legitimately challenge most SIP requests, there are two requests defined by
this document that require special handling for authentication: ACK and CANCEL (RFC 3261:
Standards Track).
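To illustrate why the realm matters in Digest authentication, the sketch below computes a Digest response in the basic form of RFC 2617 (MD5, without the qop extension). The realm is hashed together with the user name and password, so the same credentials produce different responses in different protection domains. The user name, password, nonce, and Request-URI used here are hypothetical values chosen only for the example.

```python
import hashlib


def md5_hex(text):
    """Hex-encoded MD5 of a string, as used by RFC 2617 Digest."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce):
    """Digest response without qop: MD5(HA1 ":" nonce ":" HA2)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # ties credentials to the realm
    ha2 = md5_hex(f"{method}:{uri}")                 # ties the response to the request
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


if __name__ == "__main__":
    nonce = "example-nonce"  # hypothetical; a real nonce comes from the challenge
    uri = "sip:bob@biloxi.com"
    in_biloxi = digest_response("bob", "biloxi.com", "secret", "INVITE", uri, nonce)
    in_atlanta = digest_response("bob", "atlanta.com", "secret", "INVITE", uri, nonce)
    print(in_biloxi != in_atlanta)
    # Default user name "anonymous" with an empty password, as described above.
    print(len(digest_response("anonymous", "biloxi.com", "", "INVITE", uri, nonce)))
```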
Figure 2.6 SIP network with trapezoid operation with signaling and media: (a) SIP network with two UAs and two proxies
with SIP trapezoid operation, (b) URIs and IP addresses for SIP entities, and (c) SIP session establishment and termination.
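The message sequence of Figure 2.6c, as walked through in the text of this section, can be written down as a simple list. The sketch below is only a transcription aid under the designations used in the text (P1 is the outgoing proxy of atlanta.com and P2 the incoming proxy of biloxi.com); the names PATH, FLOW, is_hop_by_hop, and render are illustrative. It checks that every signaling message travels between adjacent entities, while media flows directly between the phones and is not listed.

```python
# Entities along the signaling path, in order.
PATH = ["Alice", "P1", "P2", "Bob"]

# (label, message, sender, receiver)
FLOW = [
    ("F1", "REGISTER", "Alice", "P1"), ("F2", "200 OK", "P1", "Alice"),
    ("F3", "REGISTER", "Bob", "P2"), ("F4", "200 OK", "P2", "Bob"),
    ("F5", "INVITE", "Alice", "P1"), ("F6", "INVITE", "P1", "P2"),
    ("F7", "100 Trying", "P1", "Alice"), ("F8", "INVITE", "P2", "Bob"),
    ("F9", "100 Trying", "P2", "P1"), ("F10", "100 Trying", "Bob", "P2"),
    ("F11", "180 Ringing", "Bob", "P2"), ("F12", "180 Ringing", "P2", "P1"),
    ("F13", "180 Ringing", "P1", "Alice"),
    ("F14", "200 OK", "Bob", "P2"), ("F15", "200 OK", "P2", "P1"),
    ("F16", "200 OK", "P1", "Alice"),
    ("F17", "ACK", "Alice", "P1"), ("F18", "ACK", "P1", "P2"),
    ("F19", "ACK", "P2", "Bob"),
    ("F20", "BYE", "Bob", "P2"), ("F21", "BYE", "P2", "P1"),
    ("F22", "BYE", "P1", "Alice"),
    ("F23", "200 OK", "Alice", "P1"), ("F24", "200 OK", "P1", "P2"),
    ("F25", "200 OK", "P2", "Bob"),
]


def is_hop_by_hop(sender, receiver):
    """True when the two entities are neighbors on the signaling path."""
    return abs(PATH.index(sender) - PATH.index(receiver)) == 1


def render(flow):
    """One text line per message, e.g. ' F5 Alice -> P1    INVITE'."""
    return [f"{label:>3} {sender:>5} -> {receiver:<5} {message}"
            for label, message, sender, receiver in flow]


if __name__ == "__main__":
    for line in render(FLOW):
        print(line)
```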
establish the call via the proxy (P1) residing in its own administrative domain with Bob who is residing in the biloxi.com
administrative domain. Consequently, we designate the proxy server of the atlanta.com domain as the outgoing proxy (P1)
and the proxy of the biloxi.com domain as the incoming proxy (P2). First, the SIP signaling path is created between Alice’s
phone, outgoing proxy, incoming proxy, and Bob’s phone for establishment of the session. Then, the media is passed between
the phones directly and the human users, Alice and Bob, communicate with each other using their phones. We are assuming
that the proxy servers will also act as the registration servers for their users in the respective domains. Figure 2.6b shows
the URIs and IP addresses of the SIP entities over the network.
Figure 2.6c shows an example of SIP call flows for registration of users along with establishment and termination
of a session between two SIP UAs using an outgoing and incoming proxy. In the beginning, Alice registers with the
proxy of her administrative domain by sending a REGISTER (F1) request, and the SIP server (P1) confirms the request with
a positive response of 200 OK (F2). In this way, the SIP server (P1) knows the addresses of all phones in its
administrative domain through registration of all users. Similarly, Bob also registers his phone with the proxy server (P2)
of his administrative domain by sending the REGISTER (F3) request, and the server confirms his registration by sending a
200 OK (F4) response. We discussed the location server in Section 2.4.4.3.4; however, it is not shown in Figure
2.6 for simplicity. It is quite logical that the SIP servers of both administrative domains can store all addresses of the
phones in the location server that acts as the global database for all addresses of the phones if a business relationship
between these domains exists. In this way, both P1 and P2
proxy servers can resolve the addresses of the cross-domain phone numbers when calls need to be routed between different
administrative domains.
In the SIP signaling protocol, if Alice needs to place a call to Bob, she has to send an INVITE (F5) request that
contains Bob’s phone URI among other information to her outgoing proxy (P1). The proxy server (P1) examines the
destination address after querying the location server database, as the destination address does not reside in its local
registration database (not shown in Figure 2.6 for simplicity), and finds that the callee (Bob) resides in the separate
biloxi.com administrative domain. Thus, the proxy server (P1) forwards the INVITE (F6) to the incoming proxy server (P2),
and it also confirms the receipt of the INVITE (F5) by sending a provisional 100 Trying (F7) hop-by-hop response to the
caller (Alice). Receiving the INVITE (F6) message, the incoming proxy (P2) consults its local registration database, finds
that the callee (Bob) resides in its administrative domain, forwards the INVITE (F8) to the callee (Bob), and also
sends a provisional 100 Trying (F9) hop-by-hop response to the outgoing proxy (P1).
Receiving the INVITE (F8) message, the callee (Bob) immediately sends a provisional 100 Trying (F10) hop-by-hop
response to the incoming proxy (P2). If 100 Trying responses (F7, F9, and F10) are not sent, the senders will
continue to retransmit the request after certain time intervals when timers expire, as the processing of an INVITE message
usually takes a substantial amount of time. In the meantime, the phone of the callee (Bob) generates the ringtone for
alerting the callee (Bob) and sends the 180 Ringing (F11) message to the incoming proxy (P2). In turn, P2 forwards the
180 Ringing (F12) message to P1; P1 then sends the 180 Ringing (F13) message to the caller (Alice); and the caller
(Alice) starts to hear that the phone of the callee (Bob) is ringing.
In this example, we have shown that the callee (Bob) has accepted the call without further negotiations with respect
to audio/video codecs or data applications and their corresponding performances, and the callee (Bob) answers the
call. The callee (Bob) chooses the audio/video codec or data application parameter used in the call by the caller (Alice)
and sends a final 200 OK (F14) response back to the incoming proxy (P2), and then P2 forwards the 200 OK (F15)
response to P1, and P1 forwards the 200 OK (F16) response to the caller (Alice). The caller (Alice) acknowledges this for
reliability purposes by sending the ACK (F17) message to P1, and P1 forwards the ACK (F18) to P2, and P2 forwards
the ACK (F19) to the callee (Bob). It should be noted that the ACK message can be sent directly end-to-end between the
UAs without going via the proxies. At this point in time, the session is established and both phones begin to exchange
media (audio/video over RTP or data applications using respective application protocols).
At the end of the session, either party can decide to terminate the session by sending the BYE request. In this
example, the callee (Bob) sends the BYE (F20) request to P2, and P2 forwards the BYE (F21) request to P1, and then
P1 forwards the BYE (F22) request to the caller (Alice). Like ACK, the BYE message can also be sent directly end-to-end
between the UAs without going via the proxies. In turn, the caller (Alice) sends the confirmation with the 200 OK (F23)
response to P1, and P1 forwards the 200 OK (F24) response to P2, and finally P2 forwards it to the callee (Bob). Now the
session has been terminated and media does not flow anymore. The call flows containing the SIP signaling message
path and media path complete as the logical trapezoid path, as shown in Figure 2.6a.
The SIP session establishment and termination shown here is a simple one. We have not shown how a session can
be modified and updated, nor the many other features for each of these audio, video, or data applications that a rich
multimedia session can have. We have not dealt with any security features or QOS issues in this example. A call may
fail due to failures in the network during the call setup or after establishment of the session, and the paths may be
shaped differently than a trapezoid. Moreover, this is only a point-to-point call where only two users are involved. A
multipoint conference call with multiple users and with multiple media of audio, video, or data applications will be much
more complex. In the real world, even a point-to-point call is much more complicated, where a call may traverse many
more administrative domains with their different security and QOS features, a host of different administrative security
policies, and middle boxes like network address translators, not to speak of different kinds of networks with different
call control protocols. In the subsequent chapters, we will be describing many of those functional features.

2.8 SIP Header Fields

2.8.1 Overview
SIP header fields are mostly constructed following the HTTP/1.1 specifications in RFC 2616, although not all
headers are used in SIP. We have described the general syntax for header fields in Section 2.4.1.2. Table 2.5 lists the full
set of header fields along with notes on syntax, meaning, and usage.
The Where column of Table 2.5 describes the request and response types in which the header field can be used. Values
in this column are given in Table 2.6.
The Proxy column of Table 2.5 describes the operations a proxy may perform on a header field shown in Table 2.7.
The remaining columns of Table 2.5 relate to the presence of a header field in a method shown in Table 2.8.
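As a reading aid for Table 2.5, the sketch below decodes the Proxy column and the per-method presence codes. Tables 2.6 through 2.8 are not reproduced in this part of the text, so the legend used here follows the notation of RFC 3261 Section 20 and is assumed to match those tables. The names PROXY_OPS, PRESENCE, and proxy_ops are illustrative, and the two sample rows are copied from Table 2.5.

```python
# Proxy column letters (legend assumed from RFC 3261, Section 20).
PROXY_OPS = {
    "a": "add or concatenate",
    "m": "modify",
    "d": "delete",
    "r": "read",
}

# Presence codes used in the per-method columns (same assumption).
PRESENCE = {
    "c": "conditional",
    "m": "mandatory",
    "m*": "should be sent; receivers must cope with its absence",
    "o": "optional",
    "t": "should be sent; mandatory over stream-based transport",
    "*": "required if the message body is not empty",
    "–": "not applicable",
}

# Two rows of Table 2.5: header field -> (Where, Proxy).
SAMPLE_ROWS = {
    "Max-Forwards": ("R", "amr"),
    "Call-ID": ("c", "r"),
}


def proxy_ops(column):
    """Expand a Proxy column entry such as 'amr' into a list of operations."""
    return [PROXY_OPS[letter] for letter in column]


if __name__ == "__main__":
    for name, (where, proxy) in SAMPLE_ROWS.items():
        print(name, where, proxy_ops(proxy))
```

For example, the Max-Forwards row (Proxy column amr) expands to add or concatenate, modify, and read, which matches the way proxies decrement this header field while forwarding a request.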
Table 2.5 SIP Header Fields
Header Field Where Proxy ACK BYE CANCEL INVITE OPTIONS REGISTER REFER UPDATE SUBSCRIBE NOTIFY PUBLISH MESSAGE PRACK INFO
Accept R ar – o – o m* o o o o o o – o o
Accept 2xx – – – o m* o – o – – – – – –
Accept 415 – c – c c c c c o o m* m* c o
Accept-Contact R ar o o o o o – o o o o o o
Accept-Encoding R – o – o o o o o o o o – o o
Accept-Encoding 2xx – – – o m* o – o – – – – – o
Accept-Encoding 415 – c – c c c c c o o m* m* c c
Accept-Language R – o – o o o o o o o – o o
Accept-Language 2xx – – – o m* o – o – – – – – o
Accept-Language 415 – c – c c c c c o o m* m* c o
Alert-Info R ar – – – o – – – – – – – – –
Alert-Info 180 ar – – – o – – – – – – – – –
Alert-Info –
Allow R – o – o o o o o o o o o o o
Allow 2xx – o – m* m* m* o o o o o –
Allow r – o – o o o o o o o o o o o
Allow 405 – m – m m m m m m m m m m m
Allow-Events R o o – o o o o o o o o
Allow-Events 2xx – o – o o o o o o o
Allow-Events 489 – – – – – – – – m m m
Allow-Events (1) –
Authentication-Info 2xx – o – o o o o o o o o o o o
Authorization R o o o o o o o o o o o o o o
Call-ID c r m m m m m m m m m m m m m m
Call-Info ar – – – o o o – o o o – o
Contact R o – – m o o m m m m – – – –
Contact 1xx – – – o – – – o o o – – –
Contact 2xx – – – m o o m m m o – – –
Contact 3xx d – o – o o o o m m o o o
Contact 3xx–6xx o
Contact 485 – o – o o o o o o o o o
Content-Disposition o o – o o o o o o o o o o o
Content-Encoding o o – o o o o o o o o o o o
Content-Language o o – o o o o o o o o o o o
Content-Length ar t t t t t t o t t t t t t o
Content-Type * * – * * * * * * * * * * *
CSeq c c m m m m m m m m m m m m m m
Date a o o o o o o o o o o o o o o
Encryption (+) o
Error-Info 300-699 a – o o o o o o o o o o o o
Event R – – – – – – – – m m m
Expires – – – o – o – o – o – –
Expires R o o
Expires 2xx m –
Flow-Timer r o
From c r m m m m m m m m m m m m m m
Geolocation R o o o o o o o o o o o o o o
Geolocation-Routing o o o o o o o o o o o o o
Geolocation-Error r o o o o o o o o o o o o o o
Hide (+) R o o o o o o
History-Info admr – – – o o o o – o o – o –
Identity R a o o o – o o o o o o o
Identity-Info R a o o o – o o o o o o o
Info-Package R – – – – – – – – – – – – – – m1
In-Reply-To R – – – o – – – – – – o –
In-Reply-To –
Join R – – – o – – – – – – – – – –
Max-Breadth R –
Max-Forwards R amr m m m m m m m m m m m m m m
Min-Expires 423 – – – – – m – – m – m – –
Min-SE R amr – – – o – – – o – – – – –
Min-SE 422 – – – m – – – m – – – – –
MIME-Version o o – o o o o o o o o o
Organization ar – – – o o o o o o – o o – –
P-Access-Network-Info ad – – o – o o o o o o o o
P-Asserted-Identity adr – o – o o – o – o o –
P-Asserted-Service R admr – – – o o – o – o – o o –
P-Associated-URI 2xx – – – – – o – – – – – –
Path R ar – – – – – o
Path 2xx – – – – – o
P-Called-Party-ID R amr – – – o o – o – o – o –
P-Charging-Function-Addresses adr – o – o o o o o o o o o
P-Charging-Vector admr – o – o o o o o o o o o
P-Early-Media R amr – – – o – – o o
P-Preferred-Identity adr – o – o o – o – o o –
P-Preferred-Service R dr – – – o o – o – o – o o –
P-Profile-Key R admr o
P-Served-User R admr o
P-User-Database R admr o
Priority R ar – – – o – – – – o – o o – –
Privacy admr o o o o o o o o o o o o o o
Proxy-Authenticate 407 ar – m – m m m m m m m m m m m
Proxy-Authenticate 401 ar – o o o o o o o o o o o
Proxy-Authorization R dr o o – o o o o o o o o o o o
Proxy-Require R ar – o – o o o o o o o o o o o
P-Visited-Network-ID R ad – – – o o o o – o – o –
RAck R – – – – – – – – – – m
Reason R a o o o
Recv-Info R – – – – m – o – o – – – – o –
Recv-Info 2xx – – – – o2 – – – o2 – – – – o3 –
Recv-Info 1xx – – – – o2 – – – – – – – – –
Recv-Info 469 – – – – – – – – – – – – – – m1
Recv-Info r – – – – o – – – o – – – – o –
Record-Route R ar o o o o o – o o o o o o
Record-Route 2xx, 18x mr – o o o o – o o o o o
Record-Route –
Record-Route ar –
Refer-Sub R, 2xx – – – – – o – – – – – –
Refer-To R – – – – – – m – o –
Referred-By R ar – o – o o o o
Reject-Contact R o o o o o – o o o o o
Replaces R – – – o – – – – – – – – – –
Reply-To – – – o – – – – – – –
Request-Disposition R ar o o o o o o o o o o o o
Require ar – c – c c c c c o o o c c o
Resource-Priority R admr o o o o o o o o o o o o o o
Response-Key
Retry-After 404, 413, 480, 486 – o o o o o o o o o o o o o
500, 503 – o o o o o o o o o o o o o
600, 603 – o o o o o o o o o o o o o
Route R adr c c c c c c c c c c c o o
RSeq – – – o – – – – o o
Security-Client R adr o o o o o o o o o
Security-Server 421, 494 o o o o o o o o o
Security-Verify R adr o o o o o o o o o
Server r – o o o o o o o o o o o
Service-Route 2xx ar o
Session-Expires R amr – – – o – – – o – – – – –
Session-Expires 2xx ar – – – o – – – o – – – – –
SIP-ETag 2xx – – – – – – – – – – m – – –
SIP-If-Match R – – – – – – – – – – o – – –
Subject R – – – o – – – – – – o o – o
Subscription-State R – – – – – – – – – m
Supported R – o o m* o o o o o o o o o
Supported 2xx – o o m* m* o o o o o o o o
Suppress-If-Match R o
Target-Dialog R – – – – o – – o – o – – – –
Timestamp o o o o o o o o o o o o o
To c(1) r m m m m m m m m m m m m m m
Trigger-Consent R amr o o o o o
Unsupported 420 – m – m m m o m o o o m o
User-Agent o o o o o o o o o o o o o
Via c m m m m m
Via c(2) m
Via R amr m m m m m m m m m m
Via rc dr m m m m m m m m m
Warning r – o o o o o o o o o o o o o
WWW-Authenticate 401 ar – m – m m m m m m m m m m m
WWW-Authenticate 407 ar – o – o o o o o o o o
Table 2.6 Notation and Description of the Where Column of Table 2.5
Notation Description
2xx, 4xx, etc. A numerical value or range indicating response codes with which the header field can be used.
Table 2.7 Notation and Description of the Proxy Column of Table 2.5
Notation Description
r A proxy must be able to read the header field, and thus this header field cannot be encrypted.
c Conditional; requirements on the header field depend on the context of the message.
m* The header field should be sent, but clients/servers need to be prepared to receive messages without that header field.
t The header field should be sent, but clients/servers need to be prepared to receive messages without that header field.
1 Not applicable to INFO requests and responses associated with legacy INFO usages.
2 Mandatory in at least one reliable 18x/2xx response, if sent, to the INVITE request, if the associated INVITE request contained a Recv-Info header field.
For a header field marked t, if a stream-based protocol (such as TCP) is used as the transport, then the header field must be sent. Some special notations used in Table 2.5 are described in Table 2.9.
Optional means that an element may include the header field in a request or response, and a UA may ignore the header field if present in the request or response. The exception to this rule is the Require header field. A mandatory header field must be present in a request, and must be understood by the UAS receiving the request. A mandatory response header field must be present in the response, and the header field must be understood by the UAC processing the response. Not applicable means that the header field must not be present in a request. If one is placed in a request by mistake, it must be ignored by the UAS receiving the request. Similarly, a header field labeled not applicable for a response means that the UAS must not place the header field in the response, and the UAC must ignore the header field in the response.
Table 2.9 Special Notation and Description of All Columns of Table 2.5
Notation Description
(+) Used by RFC 2543 but not supported by RFC 3261 that obsoletes RFC 2543.
A UA should ignore extension header parameters that are not understood.
A compact form of some common header field names is also defined for use when overall message size is an issue.
The Contact, From, and To header fields contain a URI. If the URI contains a comma, question mark, or semicolon, the URI must be enclosed in angle brackets (< and >). Any URI parameters are contained within these brackets. If the URI is not enclosed in angle brackets, any semicolon-delimited parameters are header parameters, not URI parameters.

2.8.2 Header-Field Descriptions
SIP header fields that are used in different SIP methods that may or may not contain message bodies can be categorized on the basis of their usages as follows:

1. Request and Response
2. Request
3. Response
4. Message Body

In addition, some of the header fields can be modified including insertion by the SIP proxy. Table 2.10 describes the SIP header fields briefly. However, ABNF syntaxes of these headers are defined earlier in Section 2.4.1.2.

2.9 SIP Tags
The tag parameter is used in the To and From header fields of SIP messages. It serves as a general mechanism to identify a dialog, which is the combination of the Call-ID along with two tags, one from each participant in the dialog. When a UA sends a request outside of a dialog, it contains a From tag only, providing half of the dialog ID. The dialog is completed from the response(s), each of which contributes the second half in the To header field. The forking of SIP requests means that multiple dialogs can be established from a single request. This also explains the need for the two-sided dialog identifier; without a contribution from the recipients, the originator could not disambiguate the multiple dialogs established from a single request. When a tag is generated by a UA for insertion into a request or response, it must be globally unique and cryptographically random with at least 32 bits of randomness. A property of this selection requirement is that a UA will place a different tag into the From header of an INVITE than it would place into the To header of the response to the same INVITE. This is needed in order for a UA to invite itself to a session, a common case for hairpinning of calls in public switched telephone network (PSTN) gateways. Similarly, two INVITEs for different calls will have different From tags, and two responses for different calls will have different To tags.
Besides the requirement for global uniqueness, the algorithm for generating a tag is implementation specific. Tags are helpful in fault-tolerant systems, where a dialog is to be recovered on an alternate server after a failure. A UAS can select the tag in such a way that a backup can recognize a request as part of a dialog on the failed server, and therefore determine that it should attempt to recover the dialog and any other state associated with it. SIP has some support for expression of capabilities. The Allow, Accept, Accept-Language, and Supported header fields convey some information about the capabilities of a UA. However, these header fields convey only a small part of the information that is needed. They do not provide a general framework for expression of capabilities. Furthermore, they only specify capabilities indirectly; the header fields really indicate the capabilities of the UA as they apply to this request. SIP also has no ability to convey characteristics, that is, information that describes a UA.

2.10 SIP Option Tags
Option tags are unique identifiers used to designate new options (extensions) in SIP. These tags are used in the Require, Proxy-Require, Supported, and Unsupported header fields defined in RFC 3261 (see Section 2.8). Note that these options appear as parameters in those header fields in an option-tag = token form for the definition of token defined in RFC 3261 (see Section 2.4.1). Option tags are defined in Standards Track RFCs. This is a change from past practice, and is instituted to ensure continuing multivendor
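
The two-sided dialog identifier of Section 2.9 (Call-ID plus one tag from each participant) can be sketched in Python as follows. This is only an illustration: the function names new_tag and dialog_id, the sample Call-ID, and the choice of 64 random bits (which simply exceeds the 32-bit minimum stated in Section 2.9) are assumptions, not part of any SIP specification.

```python
import secrets

def new_tag(nbytes: int = 8) -> str:
    """Return a cryptographically random tag (hypothetical helper).

    8 bytes give 64 bits of randomness, above the 32-bit minimum.
    """
    if nbytes < 4:
        raise ValueError("a tag needs at least 32 bits of randomness")
    return secrets.token_hex(nbytes)

def dialog_id(call_id: str, local_tag: str, remote_tag: str) -> tuple:
    """Combine the Call-ID with one tag from each participant."""
    return (call_id, local_tag, remote_tag)

# A forked INVITE: the request carries a single From tag, and every
# answering UAS contributes its own To tag, so the caller can tell the
# resulting dialogs apart.
call_id = "a84b4c76e66710@pc33.example.com"  # made-up Call-ID
from_tag = new_tag()
dialogs = {dialog_id(call_id, from_tag, new_tag()) for _ in range(2)}
```

Because each responder generates its own tag, the set above holds two distinct dialog identifiers even though both share the same Call-ID and From tag.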
Accept/RFC 3261 (Standards Track)/Request and Response
The Accept header field follows the syntax defined in Section 2.4.1.2. The semantics are also identical to those defined for HTTP in Section 14.1 of RFC 2616, with the exception that if no Accept header field is present, the server should assume a default value of application/sdp. An empty Accept header field means that no formats are acceptable. Example:
Accept: application/sdp;level=1, application/x-private, text/html
Accept-Contact/RFC 3841 (Standards Track)/Request
The Accept-Contact header field allows the UAC to specify that a UA should be contacted if it matches some or all of the values of the header field. Each value of the Accept-Contact header field contains a *, and is parameterized by a set of feature parameters. Any UA whose capabilities match the feature set described by the feature parameters matches the value. In fact, it defines some additional parameters for Contact header fields such as media, duplex, and language, indicating some preferences of the caller. Despite the ABNF, there must not be more than one req-param or explicit-param in an ac-params. Furthermore, there can only be one instance of any feature tag in feature-param. Example:
Accept-Contact: *;audio;require
Accept-Contact: *;video;explicit
Accept-Contact: *;methods="BYE";class="business";q=1.0
Accept-Encoding/RFC 3261 (Standards Track)/Request and Response
The Accept-Encoding header field is similar to Accept, but restricts the content-codings defined in Section 3.5 of RFC 2616 (obsoleted by RFCs 7230–7235) that are acceptable in the response. See Section 14.3 of RFC 2616. The semantics in SIP are identical to those defined in Section 14.3 of RFC 2616. An empty Accept-Encoding header field is permissible. It is equivalent to Accept-Encoding: identity, that is, only the identity encoding, meaning no encoding, is permissible. If no Accept-Encoding header field is present, the server should assume a default value of identity. This differs slightly from the HTTP definition, which indicates that when not present, any encoding can be used, but the identity encoding is preferred. Example:
Accept-Encoding: gzip
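
The defaulting rule for Accept-Encoding (an absent header field and an empty one both mean identity) can be sketched as below. The function name acceptable_codings is hypothetical, and the sketch deliberately ignores q-values, which a real implementation would need to honor.

```python
def acceptable_codings(header_value):
    """Return the content-codings a SIP server may use in its response.

    header_value is None when the Accept-Encoding header field is absent,
    or the raw field value (possibly empty) when it is present.
    """
    if header_value is None or header_value.strip() == "":
        # Absent and empty both mean: only the identity coding.
        return ["identity"]
    # Drop any ";q=..." parameters; q-value handling is out of scope here.
    return [item.split(";")[0].strip().lower()
            for item in header_value.split(",") if item.strip()]
```

For instance, acceptable_codings("gzip") yields only gzip, whereas a missing header field falls back to identity.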
Accept-Language/RFC 3261 (Standards Track)/Request and Response
The Accept-Language header field is used in requests to indicate the preferred languages for reason phrases, session descriptions, or status responses carried as message bodies in the response. If no Accept-Language header field is present, the server should assume all languages are acceptable to the client. The Accept-Language header field follows the syntax defined in Section 14.4 of RFC 2616 (obsoleted by RFCs 7230–7235). The rules for ordering the languages based on the q parameter apply to SIP as well. Example:
Accept-Language: da, en-gb;q=0.8, en;q=0.7
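
The q-parameter ordering mentioned for Accept-Language can be illustrated with a short Python sketch. The function name order_languages is hypothetical; the sketch assumes a default of q=1 when the parameter is missing and treats q=0 as "not acceptable", following the HTTP rules the text refers to.

```python
def order_languages(header_value):
    """Return language ranges sorted by descending q-value (default q=1)."""
    ranked = []
    for position, item in enumerate(header_value.split(",")):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value)
        if q > 0:  # q=0 marks a language as not acceptable
            # Negate q so that sorting ascending puts the highest q first;
            # position keeps the original order among equal q-values.
            ranked.append((-q, position, parts[0]))
    return [lang for _, _, lang in sorted(ranked)]
```

With the sample value "da, en-gb;q=0.8, en;q=0.7", Danish is preferred, followed by British English and then any other English.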
Accept-Resource-Priority/RFC 4412 (Standards Track)/Response
The Accept-Resource-Priority response header field enumerates the resource values (r-values) a SIP UAS is willing to process. (This does not imply that a call with such values will find sufficient resources and succeed.) Some administrative domains may choose to disable the use of the Accept-Resource-Priority header for revealing too much information about that domain in responses. However, this behavior is not recommended, as this header field aids in troubleshooting.
Alert-Info/RFC 3261 (Standards Track)/Request and Response
When present in an INVITE request, the Alert-Info header field specifies an alternative ringtone to the UAS. When present in a 180 Ringing response, the Alert-Info header field specifies an alternative ringback tone to the UAC. A typical usage is for a proxy to insert this header field to provide a distinctive ring feature. The Alert-Info header field can introduce security risks. These risks and the ways to handle them are discussed in Section 19.6.4, which discusses the Call-Info header field since the risks are identical. In addition, a user should be able to disable this feature selectively. This helps prevent disruptions that could result from the use of this header field by untrusted elements. Example:
Alert-Info: <http://www.example.com/sounds/moo.wav>
RFC 7463 registers the SIP header field defining a new parameter as shown below
through IANA registration:
2. Use Cases
This section describes some use cases for which the alert URN mechanism is needed
today.
2.1. PBX Ringtones
This section defines some commonly encountered ringtones on PBX or business
phones. They are as listed in the following subsections.
2.1.1. Normal
This tone indicates that the default or normal ringtone should be rendered. This is
essentially a no-operation alert URN and should be treated by the UA as if no alert
URN is present. This is most useful when Alert-Info header field parameters are being
used. For example, in RFC 7463 (see Section 16.2.11), an Alert-Info header field needs
to be present containing the appearance parameter, but no special ringtone needs to
be specified.
2.1.2. External
This tone is used to indicate that the caller is external to the enterprise or PBX system.
This could be a call from the PSTN or from a SIP trunk.
2.1.3. Internal
This tone is used to indicate that the caller is internal to the enterprise or PBX system.
The call could have been originated from another user on this PBX or on another PBX
within the enterprise.
2.1.4. Priority
A PBX tone needs to indicate that a priority level alert should be applied for the type of
alerting specified (e.g., internal alerting).
2.1.5. Short
In this case, the alerting type specified (e.g., internal alerting) should be rendered
shorter than normal. In contact centers, this is sometimes referred to as abbreviated
ringing or a zip tone.
2.1.6. Delayed
In this case, the alerting type specified should be rendered after a short delay. In some
bridged-line/shared-line-appearance implementations, this is used so that the bridged
line does not ring at exactly the same time as the main line but is delayed a few
seconds.
2.2. Service Tones
These tones are used to indicate specific PBX and public network telephony services.
2.2.1. Call Waiting
The call-waiting service [TS24.615] permits a callee to be notified of an incoming call
while the callee is engaged in an active or held call. Subsequently, the callee can either
accept, reject, or ignore the incoming call. There is an interest on the caller side to be
informed about the call-waiting situation on the callee side. Having this information,
the caller can decide whether to continue waiting for callee to pick up or better to call
some time later when it is estimated that the callee could have finished the ongoing
conversation. To provide this information, a callee’s UA (or proxy) that is aware of the
call-waiting condition can add the call-waiting indication to the Alert-Info header field
in the 180 Ringing response.
2.2.2. Forward
This feature is used in a 180 (Ringing) response when a call-forwarding feature has been
initiated on an INVITE. Many PBX systems implement a forwarding beep followed by
normal ringing to indicate this. Note that a 181 response can be used in place of this URN.
2.2.3. Transfer Recall
This feature is used when a blind transfer (RFC 5589, see Section 16.2) has been
performed by a server on behalf of the transferor and fails. Instead of failing the call,
the server calls back the transferor, giving them another chance to transfer or
otherwise deal with the call. This service tone is used to distinguish this INVITE from a
normal incoming call.
2.2.4. Auto Callback
This feature is used when a user has utilized a server to implement an automatic
callback service (RFC 6910, see Section 16.2.12). When the user is available, the server
calls back the user and utilizes this service tone to distinguish this INVITE from a
normal incoming call.
2.2.5. Hold Recall
This feature is used when a server implements a call hold timer on behalf of an end
point. After a certain period of time of being on hold, the user who placed the call on
hold is alerted to either retrieve the call or otherwise dispose of the call. This service
tone is used to distinguish this case from a normal incoming call.
2.3. Country-Specific Ringback Tone Indications for the PSTN
In the PSTN, different tones are used in different countries. End users are accustomed
to hear the callee’s country ringback tone and would like to have this feature for SIP.
urn:alert:<alert-category>:<alert-indication>.
The ABNF for the alert URNs is shown below defined by RFC 7462:
<alert-label>s must comply with the syntax for Non-reserved LDH labels (RFC 5890).
Registered URNs and components thereof must be transmitted as registered
(including case).
Relevant ancillary documentation: RFC 7462
Namespace considerations:
This specification defines a URN namespace alert for URNs representing signals or
renderings that are presented to users to inform them of events and actions. The initial
usage is to specify ringtones and ringback tones when dialogs are established in SIP, but
they can also be used for other communication-initiation protocols (e.g., H.323), and more
generally, in any situation (e.g., web pages or end-point device software configurations) to
describe how a user should be signaled.
An alert URN does not describe a complete signal, but rather it describes a particular
characteristic of the event it is signaling or a feature of the signal to be presented. The
complete specification of the signal is a sequence of alert URNs specifying the desired
characteristics/significance of the signal in priority order, with the most important
aspects specified by the earlier URNs. This allows the sender of a sequence of URNs to
compose very detailed specifications from a restricted set of URNs, and to clearly
specify which aspects of the specification it considers most important.
The initial scope of usage is in the Alert-Info header field, in initial INVITE requests (to
indicate how the called user should be alerted regarding the call) and non-100
provisional (1xx) responses to those INVITE requests (to indicate the ringback, how
the calling user should be alerted regarding the progress of the call).
To ensure widespread adoption of these URNs for indicating ringtones and ringback
tones, the scheme must allow replication of the current diversity of these tones.
Currently, these tones vary between the PSTNs of different nations and between
equipment supplied by different vendors. Thus, the scheme must accommodate
national variations and proprietary extensions in a way that minimizes the information
that is lost during interoperation between systems that follow different national
variations or that are supplied by different vendors.
The scheme allows definition of private-extension URNs that refine and extend the
information provided by standard URNs. Private-extension URNs can also refine and
extend the information provided by other private-extension URNs. Private extensions
can also define entirely new categories of information about calls. We expect these
extensions to be used extensively when existing PBX products are converted to
support SIP operation.
The device that receives an Alert-Info header field containing a sequence of alert URNs
provides to the user a rendering that represents the semantic content of the URNs.
The device is given great leeway in choosing the rendering, but it is constrained by
rules that maximize interoperability between systems that support different sets of
private extensions. In particular, earlier URNs in the sequence have priority of
expression over later URNs in the sequence, and URNs that are not usable in their
entirety (because they contain unknown extensions or are incompatible with previous
URNs) are successively truncated in an attempt to construct a URN that retains some
information and is renderable in the context.
Owing to the practical importance of private extensions for the adoption of URNs for
alerting calls, and the very specific rules for private extensions and the corresponding
processing rules that allow quality interoperation in the face of private extensions, the
requirements of the alert URN scheme cannot be met by a fixed enumeration of URNs
and corresponding meanings. In particular, the existing namespace urn:ietf:params does
not suffice (unless the private-extension apparatus is applied to that namespace).
There do not appear to be other URN namespaces that uniquely identify the semantic
of a signal or rendering feature. Unlike most other currently registered URN
namespaces, the alert URN does not identify documents and protocol objects (e.g.,
RFCs 3044, 3120, 3187, 3188, 4179, 4195, and 4198), types of telecommunications
equipment (RFC 4152), people, or organizations (RFC 3043).
The <alert-URN>s are hierarchical identifiers. An <alert-URN> asserts some fact or
feature of the offered SIP dialog, or some fact or feature of how it should be presented
to a user, or of how it is being presented to a user. Removing an <alert-ind-part> from
the end of an <alert-URN> (which has more than one <alert-ind-part>) creates a
shorter <alert-URN> with a less specific meaning; the set of dialogs to which the
longer <alert-URN> applies is necessarily a subset of the set of dialogs to which the
shorter <alert-URN> applies. (If the starting <alert-URN> contains only one
<alert-ind-part>, and thus the <alert-ind-part> cannot be removed to make a shorter
<alert-URN>, we can consider the set of dialogs to which the <alert-URN> applies to be a
subset of the set of all dialogs.)
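
The hierarchical shortening described above can be sketched in Python. This is a minimal illustration, not the resolution algorithm of RFC 7462: the function name successive_truncations and the sample URN with its private extension are made up, and the sketch only splits on colons without validating label syntax.

```python
def successive_truncations(urn):
    """List an alert URN followed by its progressively shorter forms."""
    parts = urn.split(":")
    # parts[0:3] are "urn", "alert", and the <alert-category>; the rest
    # are <alert-ind-part>s, of which at least one must remain.
    if len(parts) < 4 or [p.lower() for p in parts[:2]] != ["urn", "alert"]:
        raise ValueError("not an alert URN: %r" % urn)
    return [":".join(parts[:n]) for n in range(len(parts), 3, -1)]

# Illustrative URN carrying a made-up private extension as its last part.
forms = successive_truncations("urn:alert:source:internal:vip@example.com")
```

Each entry in the resulting list applies to a superset of the dialogs covered by the entry before it, so a renderer that does not understand the full URN can fall back to the next, shorter one.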
The specific criteria defining the subset to which the longer <alert-URN> applies,
within the larger set of dialogs, is considered to be the meaning of the final
<alert-ind-part>. This meaning is relative to and depends on the preceding <alert-category> and
<alert-ind-part>s (if any). The meanings of two <alert-ind-part>s that are textually the
same but are preceded by different <alert-category>s or <alert-ind-part>s have no
necessary connection. (An <alert-category> considered alone has no meaning in this
sense.)
The organization owning the <provider> within a <private-name> specifies the
meaning of that <private-name> when it is used as an <alert-ind-part>. (The
organization owning a <provider> is specified by the IANA registry.)
The organization owning the <provider> within a <private-name> (in either an
<alert-category> or an <alert-ind-part>) specifies the meaning of each <alert-ind-part>,
which is an <alert-label> that follows that <private-name> and that precedes the next
<alert-ind-part>, which is a <private-name> (if any).
The meaning of all other <alert-ind-part>s (i.e., those that are not <private-name>s and
do not follow a <private-name>) is defined by standardization.
Community considerations:
The alert URNs are relevant to a large cross section of Internet users, namely those that
initiate and receive communication connections via the SIP. These users include both
technical and nontechnical users, on a variety of devices and with a variety of
perception capabilities. The alert URNs will allow Internet users to receive more
information about offered calls and enable them to better make decisions about
accepting an offered call, and to get better feedback on the progress of a call they
have made.
User interfaces that utilize alternative sensory modes can better render the ring and
ringback tones based on the alert URNs because the URNs provide more detailed
information regarding the intention of communications than is provided by current
SIP mechanisms.
Process of identifier assignment:
The assignment of standardized alert URNs is by insertion into the IANA registry. This
process defines the meanings of <alert-ind-part>s that have standardized meanings, as
described in “Namespace Considerations.”
A new URN must not be registered if it is equal by the comparison rules to an already
registered URN.
Private extensions are alert URNs that include <alert-ind-part>s that are
<private-name>s and <alert-label>s that appear after a <private-name> (either as an
<alert-category> or an <alert-indication>). If such an <alert-ind-part> is a <private-name>, its
meaning is defined by the organization that owns the <provider> that appears in the
<private-name>. If the <alert-ind-part> is an <alert-label>, its meaning is defined by the
organization that owns the <provider> that appears in the closest <private-name>
preceding the <alert-label>. The organization owning a <provider> is specified by the
IANA registry.
Identifier uniqueness and persistence considerations:
An alert URN identifies a semantic feature of a call or a sensory feature of how the call
alerting should be rendered at the caller’s or callee’s end device. For standardized
<alert-ind-part>s in URNs, uniqueness and persistence of their meanings is
guaranteed by the fact that they are registered with IANA; the feature identified by a
particular alert URN is distinct from the feature identified by any other standardized
alert URN.
Assuring uniqueness and persistence of the meanings of private extensions is
delegated to the organizations that define private extension <alert-ind-part>s. The
organization responsible for a particular <alert-ind-part> in a particular alert URN is
the owner of a syntactically determined <provider> part within the URN.
An organization should use only one <provider> value for all of the <private-name>s it
defines.
Process for identifier resolution:
The process of identifier resolution is the process by which a rendering device chooses
a rendering to represent a sequence of alert URNs. The device is allowed great leeway
in making this choice, but the process must obey the rules defined in this specification
(RFC 7462). The device is expected to provide renderings that users associate with the
meanings assigned to the URNs within their cultural context. A nonnormative example
resolution algorithm is given in RFC 7462.
Rules for lexical equivalence:
Alert URNs are compared according to case-insensitive string equality.
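
The lexical-equivalence rule, together with the earlier requirement that a new URN must not be registered if it equals an already registered one, can be sketched as follows. The function names and the sample registry contents are assumptions made for illustration; the real registry is maintained by IANA.

```python
def alert_urns_equal(a, b):
    """Compare two alert URNs by case-insensitive string equality."""
    return a.lower() == b.lower()

def is_registrable(new_urn, registered):
    """A new URN may be registered only if it equals no existing entry."""
    return not any(alert_urns_equal(new_urn, urn) for urn in registered)

# Made-up registry snapshot using URNs that appear in this section.
registry = ["urn:alert:duration:short", "urn:alert:delay:yes"]
```

Under this rule, URN:ALERT:Delay:YES is considered equal to urn:alert:delay:yes, so it could not be registered as a new entry.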
Examples: <urn:alert:duration:short>.
4.2.5. <alert-indication> Values for the <alert-category> delay: none (default), yes, and
<private-name>
Examples: <urn:alert:delay:yes>.
4.2.6. <alert-indication> Values for the <alert-category> locale: default (default),
country:<ISO 3166-1 country code>, and <private-name>
The ISO 3166-1 country code [ISO3166-1] is used to inform the renderer on the other
side of the call that a country-specific rendering should be used. For example, to
indicate ringback tones from South Africa, the following URN would be used:
<urn:alert:locale:country:za>.
[ISO3166-1] ISO, “English country names and code elements,” ISO 3166-1. Available at
http://www.iso.org/iso/english_country_names_and_code_elements.
[TS24.615] 3GPP, “Communication Waiting (CW) using IP Multimedia (IM) Core
Network (CN) subsystem; Protocol Specification,” 3GPP TS 24.615, September 2015.
Allow/RFC 3261 (Standards Track)/Message-Body
The Allow header field lists the set of methods supported by the UA generating the message. All methods, including ACK and CANCEL, understood by the UA must be included in the list of methods in the Allow header field, when present. The absence of an Allow header field must not be interpreted to mean that the UA sending the message supports no methods. Rather, it implies that the UA is not providing any information on what methods it supports. Supplying an Allow header field in responses to methods other than OPTIONS reduces the number of messages needed. Example:
Allow: INVITE, ACK, OPTIONS, CANCEL, BYE
Allow-Events/RFC 6665 (Standards Track)/Request and Response
The Allow-Events header field indicates a list of SIP event packages supported by a SIP UA that can be subscribed to. SIP SUBSCRIBE/NOTIFY messages are used by SIP UAs for subscriptions and then notifications of those SIP events.
Answer-Mode and Priv-Answer-Mode/RFC 5373 (Standards Track)/Request and Response
RFC 5373 extends SIP with two header fields and associated option tags that can be used in INVITE requests to convey the requester’s preference for user-interface handling related to answering of that request. The first header, Answer-Mode, expresses a preference as to whether the target node’s user interface waits for user input before accepting the request or, instead, accepts the request without waiting on user input. The second header, Priv-Answer-Mode, is similar to the first, except that it requests administrative-level access and has consequent additional authentication and authorization requirements. These behaviors have applicability to applications such as push-to-talk and to diagnostics like loop-back. Usage of each header field in a response to indicate how the request was handled is also defined.
The conventional model for session establishment using SIP involves (i) sending a
request for a session (a SIP INVITE) and notifying the user receiving the request,
(ii) acceptance of the request and of the session by that user, and (iii) the sending of a
response (SIP 200 OK) back to the requester before the session is established. Some
usage scenarios deviate from this model, specifically with respect to the notification
and acceptance phase. While it has always been possible for the node receiving the
request to skip the notification and acceptance phases, there has been no standard
mechanism for the party sending the request to specifically indicate a desire (or
requirement) for this sort of treatment. This document defines a SIP extension header
field that can be used to request specific treatment related to the notification and
acceptance phase.
The first usage scenario is the requirement for diagnostic loop-back calls. In this sort of
scenario, a testing service sends an INVITE to a node being tested. The tested node
accepts and a dialog is established. However, rather than establishing a two-way media
flow, the tested node loops back or echoes media received from the testing service
back toward the testing service. The testing service can then analyze the media flow
for quality and timing characteristics. Session Description Protocol (SDP) usage for
this sort of flow is described in [LOOPBACK]. In this sort of application, it might not be
necessary that the human using the tested node interact with the node in any way for
the test to be satisfactorily executed. In some cases, it might be appropriate to alert
the user to the ongoing test, and in other cases it might not be.
The second scenario is that of push-to-talk applications, which have been specified by
the Open Mobile Alliance. In this sort of environment, SIP is used to establish a dialog
supporting asynchronous delivery of unidirectional media flow, providing a user
experience like that of a traditional two-way radio. It is conventional for the INVITEs
used to be automatically accepted by the called UA, and the media is commonly
played out on a loudspeaker. The called party’s UA’s microphone is not engaged until
the user presses the local talk button to respond. A third scenario is the Private Branch
Exchange (PBX) attendant. Traditional office PBX systems often include intercom
functionality. A typical use for the intercom function is to allow a receptionist to
activate a loudspeaker on a desk telephone in order to announce a visitor. Not every
caller can access the loudspeaker, only the receptionist or operator, and it is not
expected that these callers will always want intercom functionality—they might instead
want to make an ordinary call.
There are presumably many more use cases for the extensions defined in this
specification; however, this document was developed to specifically meet the
requirements of these scenarios, or others with essentially similar properties. These
sorts of mechanisms are not required to provide the functionality of an answering
machine or voice-mail recorder. Such a device knows that it is expected to answer and
does not require a SIP extension to support its behavior. Much of the discussion of
this topic in working group meetings and on the mailing list dealt with differentiating
answering mode from alerting mode. Some early work did not make this distinction.
We therefore proceed with the following definitions:
RFC 5373 deals only with Answering Mode. Issues relating to Alerting Mode are outside
its scope. This document defines two SIP extension header fields: Answer-Mode and
Priv-Answer-Mode. These two extensions take the same parameters and operate in the
same general way. The distinction between Answer-Mode and Priv-Answer-Mode
relates to the level of authorization claimed by the UAC and verified and policed by the
UAS. Requests are usually made using Answer-Mode. Requests made using Priv-
Answer-Mode request privileged treatment from the UAS. Priv-Answer-Mode is not an
assertion of privilege. Instead, it is a request for privileged treatment. This is similar to
the UNIX model, where a user might run a command normally or use sudo to request
administrative privilege for the command. Including Priv- is equivalent to prefixing a
UNIX command with sudo. In other words, a separate policy table (like /etc/sudoers) is
consulted to determine whether the user may receive the requested treatment.
Option Tags:
The answermode option tag is for support of the Answer-Mode and Priv-Answer-Mode extensions
used to negotiate automatic or manual answering of a request (see Section 2.10).
Usage of the Answer-Mode and Priv-Answer-Mode Header Fields:
RFC 5373 defines usage of the Answer-Mode and Priv-Answer-Mode header fields in
initial (dialog-forming) SIP INVITE requests and in 200 OK responses to those requests.
This document specifically does not define usage in any other sort of request or
response, including but not limited to ACK, CANCEL, or any mid-dialog usage. This
limitation stems from the intended usage of this extension, which is to affect the way
that users interact with communications devices when requesting new
communications sessions and when responding to such requests. This sort of
interaction occurs only during the formation of a dialog and its initial usage, not
during subsequent operations such as re-INVITE. However, the security aspects of the
session initiation must be applied to changes in media description introduced by
re-INVITEs or similar requests.
Examples of Usage:
The following examples show Bob registering a contact that supports the negotiation
of answering mode. Alice then calls Bob with an INVITE request, asking for automatic
answering and explicitly asking that the request not be routed to contacts that have
not indicated support for this extension. Furthermore, Alice requires that the request
be rejected if Bob’s UA does not support the negotiation of answering mode. Bob
replies with a 200 OK response indicating that the call was answered automatically.
The Content-Length header field shown in the examples contains a placeholder “...”
instead of a valid Content-Length. Furthermore, the SDP bodies that would be
expected in the INVITE requests and 200 OK responses are not shown.
REGISTER Request:
In the following example, Bob’s UA is registering and indicating that it supports the
answermode extension.
In this example, Alice is calling Bob and asking Bob’s UA to answer automatically.
However, Alice is willing for Bob to answer manually if Bob’s policy is to prefer manual
answer, so Alice does not include a ;require modifier on Answer-Mode: Auto.
Here, Bob has accepted the call and his UA has answered automatically, which it
indicates in the 200 OK response.
SIP/2.0 200 OK
Via: SIP/2.0/TCP client-alice.example.com:5060;branch=z9hG4bK74b43
From: Alice <sip:[email protected]>;tag=9fxced76sl
To: Bob <sip:[email protected]>;tag=8321234356
Call-ID: [email protected]
CSeq: 1 INVITE
Contact: <sip:[email protected];transport=tcp>
Answer-Mode: Auto
Content-Type: application/sdp
Content-Length:...
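As a minimal sketch (not taken from RFC 5373), a UAC could read the Answer-Mode or Priv-Answer-Mode value of a response to learn how its request was handled. The function name parse_answer_mode and the shape of its return value are assumptions made for illustration; only the Auto and Manual values and the ;require modifier come from the text above.

```python
def parse_answer_mode(header_value):
    """Split an Answer-Mode / Priv-Answer-Mode value into (mode, require).

    Hypothetical helper: returns the mode token as written and a flag
    telling whether the ;require modifier was present.
    """
    parts = [p.strip() for p in header_value.split(";")]
    mode = parts[0]
    if not mode:
        raise ValueError("empty answer-mode value")
    # Parameter names are compared case-insensitively here (an assumption
    # in line with general SIP parameter handling).
    require = any(p.lower() == "require" for p in parts[1:])
    return mode, require


print(parse_answer_mode("Auto"))          # the value carried in the 200 OK above
print(parse_answer_mode("Auto;require"))  # a request that insists on automatic answer
```

Keeping the mode and the require flag separate lets the caller apply its own policy, for example falling back to manual answering when require is absent.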
The extensions described in this document provide mechanisms by which a UAC can
request that a UAS not deploy two of the five defensive mechanisms listed below: user
alerting and user acceptance. For this not to produce undue risk of insertion attacks or
increased risk of interception attacks, we are therefore forced to rely on the remaining
defensive mechanisms. This document defines a minimum threshold for satisfactory
security. Certainly, more restrictive policies might reasonably be used; however, any
policy less restrictive than the approach described below is very likely to result in
significant security issues. From the previous discussion of risks, attacks, and
vulnerabilities, we can derive five defensive mechanisms available at the application
level:
Since SIP and related work already provide several mechanisms (including SIP Digest
Authentication [see Section 19.4], the SIP Identity mechanism [see Section 19.4.8], and
the SIP mechanism for asserted identity within private networks [see Section 20.3], in
networks for which they are suitable) for establishing the identity of the originator of a
request, we presume that an appropriately selected mechanism is available for UAs
implementing the extensions described in this document. In short, UAs implementing
these extensions must be equipped with and must exercise a request-identity
mechanism. The analysis below proceeds from an assumption that the identity of the
sender of each request is either known or is known to be unknown, and can therefore
be considered in related policy considerations. Failure to meet this identity
requirement either opens the door to a wide range of attacks or requires operational
policy so tight as to make these extensions useless. We previously established a class
distinction between inbound and outbound media flows, and can model bidirectional
flows as worst-case sums of the risks of the other two classes. Given this distinction, it
seems reasonable to provide separate directionality policy classes for
For each directionality policy class, we can divide the set of request identities into three
classes:
Note that not all combinations of policies possible in this decomposition are generally
useful. Specifically, a policy of “inbound media denied, outbound media allowed”
equates to a “bug my phone” attack, and is disallowed by the minimal policy described
below, which as written excludes all cases of “outbound media explicitly authorized.”
Priv-Answer-Mode require No
Authentication-Info/RFC The Authentication-Info header field provides for mutual authentication with HTTP
3261 (Standards Track)/ Digest. A UAS may include this header field in a 2xx response to a request that was
Response successfully authenticated using digest based on the Authorization header field.
Syntax and semantics follow those specified in RFC 2617. Example:
Authentication-Info: nextnonce="47364c23432d2e131a5fb210812c"
Authorization/RFC 3261 The Authorization header field contains authentication credentials of a UA. Section
(Standards Track)/Request 2.4.1 describes the syntax and semantics when used with HTTP authentication. This
header field, along with Proxy-Authorization, breaks the general rules about multiple
header field values. Although not a comma-separated list, this header field name may
be present multiple times, and must not be combined into a single header line using
the usual rules. In the example below, there are no quotes around the Digest
parameter:
Authorization: Digest username="Alice", realm="atlanta.com",
nonce="84a4cc6f3082121f32b42a2187831a9e",
response="7587245234b3434cc3412213e5f113a5432"
Call-ID/RFC 3261 (Standards The Call-ID header field uniquely identifies a particular invitation or all registrations of
Track)/Request and a particular client. A single multimedia conference can give rise to several calls with
Response different Call-IDs, for example, if a user invites a single individual several times to the
same (long-running) conference. Call-IDs are case sensitive and are simply compared
byte by byte. The compact form of the Call-ID header field is i. Examples:
Call-ID: [email protected]
i:[email protected]
Call-Info/RFC 3261 The Call-Info header field provides additional information about the caller or callee,
(Standards Track)/Request depending on whether it is found in a request or response. The purpose of the URI is
described by the purpose parameter. The icon parameter designates an image suitable
as an iconic representation of the caller or callee. The info parameter describes the
caller or callee in general, for example, through a web page. The card parameter
provides a business card, for example, in vCard format (RFC 6350) or in Lightweight
Directory Access Protocol (LDAP) Data Interchange Format (LDIF, RFC 2849).
Additional tokens can be registered with IANA.
Use of the Call-Info header field can pose a security risk. If a callee fetches the URIs
provided by a malicious caller, the callee may be at risk for displaying inappropriate or
offensive content, dangerous or illegal content, and so on. Therefore, it is
recommended that a UA only render the information in the Call-Info header field if it
can verify the authenticity of the element that originated the header field and trusts
that element. This need not be the peer UA; a proxy can insert this header field into
requests. Example:
Call-Info: <http://www.example.com/alice/photo.jpg>;
purpose=icon,
<http://www.example.com/alice/>;purpose=info
Contact/RFC 3261 (Standards A Contact header field value provides a URI whose meaning depends on the type of
Track)/Request and request or response it is in. A Contact header field value can contain a display name, a
Response URI with URI parameters, and header parameters. This document defines the Contact
parameters q and expires. These parameters are only used when the Contact is
present in a REGISTER request or response, or in a 3xx response. Additional
parameters may be defined in other specifications. When the header field value
contains a display name, the URI including all URI parameters is enclosed in < and >. If
no < and > are present, all parameters after the URI are header parameters, not URI
parameters. The display name can be tokens, or a quoted string, if a larger character
set is desired.
Even if the display-name is empty, the name-addr form MUST be used if the addr-spec
contains a comma, semicolon, or question mark. There may or may not be LWS
between the display-name and the <. These rules for parsing a display name, URI and
URI parameters, and header parameters also apply for the header fields To and From.
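The bracket rule above can be illustrated with a small Python sketch. It is a simplified illustration rather than a full RFC 3261 parser: the function name split_name_addr is hypothetical, only a single header value is handled, and escaped quotes and comments are ignored.

```python
def split_name_addr(value):
    """Return (display_name, uri, header_params) for a Contact/To/From value.

    Simplified illustration: one value only, no escaped quotes or comments.
    """
    value = value.strip()
    if "<" in value:
        # name-addr form: parameters inside <...> belong to the URI,
        # parameters after '>' are header parameters.
        display, rest = value.split("<", 1)
        uri, _, tail = rest.partition(">")
        display = display.strip().strip('"')
    else:
        # addr-spec form: everything after the first ';' is a header
        # parameter, not a URI parameter.
        display = ""
        uri, _, tail = value.partition(";")
    params = [p.strip() for p in tail.split(";") if p.strip()]
    return display, uri.strip(), params


print(split_name_addr('"Mr. Watson" <sip:watson@example.com;transport=tcp>;q=0.7'))
print(split_name_addr("sip:watson@example.com;transport=tcp"))
```

In the second call the transport=tcp parameter is reported as a header parameter, which is why a sender that wants it treated as a URI parameter has to use the angle-bracket form.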
The Contact header field has a role similar to the Location header field in HTTP.
However, the HTTP header field only allows one address, unquoted. Since URIs can
contain commas and semicolons as reserved characters, they can be mistaken for
header or parameter delimiters, respectively. The compact form of the Contact header
field is m (for moved). Examples:
Content-Disposition/RFC The Content-Disposition header field describes how the message body or, for multipart
3261 (Standards Track)/ messages, a message-body part is to be interpreted by the UAC or UAS. This SIP
Message-Body header field extends the MIME Content-Type defined in RFC 2183. Several new
disposition-types of the Content-Disposition header are defined by SIP. The value
session indicates that the body part describes a session, for either calls or early
(precall) media. The value render indicates that the body part should be displayed or
otherwise rendered to the user. Note that the value render is used rather than inline
to avoid the connotation that the MIME body is displayed as a part of the rendering of
the entire message (since the MIME bodies of SIP messages oftentimes are not
displayed to users). For backward compatibility, if the Content-Disposition header
field is missing, the server should assume bodies of Content-Type application/sdp have
the disposition session, while other content types have the disposition render.
The disposition type icon indicates that the body part contains an image suitable as an
iconic representation of the caller or callee that could be rendered for information by
a UA when a message has been received, or persistently while a dialog takes place.
The value alert indicates that the body part contains information, such as an audio
clip, that should be rendered by the UA in an attempt to alert the user to the receipt of
a request, generally a request that initiates a dialog; this alerting body could, for
example, be rendered as a ringtone for a phone call after a 180 Ringing provisional
response has been sent.
Any MIME body with a disposition-type that renders content to the user should only be
processed when a message has been properly authenticated. The handling parameter,
handling-param, describes how the UAS should react if it receives a message body
whose content type or disposition type it does not understand. The parameter has
defined values of optional and required. If the handling parameter is missing, the
value required should be assumed. The handling parameter is described in MIME
Media Type RFC 3204. If this header field is missing, the MIME type determines the
default content disposition. If there is none, render is assumed. Example:
Content-Disposition: session
RFC 3893 defines a new MIME Content-Disposition disposition-type value of aib. This
value is reserved for MIME bodies that contain an authenticated identity. Example:
Content-Encoding/RFC 3261 The Content-Encoding header field is used as a modifier to the media-type. When
(Standards Track)/ present, its value indicates what additional content codings have been applied to the
Message-Body entity body, and thus what decoding mechanisms MUST be applied in order to obtain
the media-type referenced by the Content-Type header field. Content-Encoding is
primarily used to allow a body to be compressed without losing the identity of its
underlying media type. If multiple encodings have been applied to an entity body, the
content codings must be listed in the order in which they were applied.
All content-coding values are case insensitive. IANA acts as a registry for content-
coding value tokens. RFC 2616 (obsoleted by RFCs 7230–7235; see Section 2.4.1)
defines the syntax for content coding. Clients may apply content
encodings to the body in requests. A server may apply content encodings to the
bodies in responses. The server must only use encodings listed in the Accept-
Encoding header field in the request. The compact form of the Content-Encoding
header field is e. Examples:
Content-Encoding: gzip
e: tar
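The ordering rule for multiple content codings can be sketched as follows: codings are applied in the order listed, so a receiver undoes them in reverse order. This is a minimal Python sketch under stated assumptions; the CODINGS table and the function names are hypothetical, and only the gzip and deflate codings are covered.

```python
import gzip
import zlib

# Hypothetical table mapping content-coding tokens to (encode, decode) pairs.
CODINGS = {
    "gzip": (gzip.compress, gzip.decompress),
    "deflate": (zlib.compress, zlib.decompress),
}


def encode_body(body, codings):
    """Apply the codings in the order they will be listed in Content-Encoding."""
    for name in codings:
        body = CODINGS[name.lower()][0](body)
    return body


def decode_body(body, content_encoding):
    """Undo the codings named in a Content-Encoding value, last applied first."""
    names = [n.strip().lower() for n in content_encoding.split(",") if n.strip()]
    for name in reversed(names):
        body = CODINGS[name][1](body)
    return body


sdp = b"v=0\r\n"
wire = encode_body(sdp, ["deflate", "gzip"])
print(decode_body(wire, "deflate, gzip") == sdp)
```

Tokens are lower-cased before lookup because content-coding values are case insensitive.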
RFC 6140 defined a new parameter for the Contact header with no predefined value as
follows:
Parameter name: temp-gruu-cookie
Predefined values: No
Content-Language/RFC 3261 This header field is defined per Section 14.12 of RFC 2616 (obsoleted by RFCs 7230–
(Standards Track)/ 7235). The Content-Language entity-header field describes the natural language(s) of
Message-Body the intended audience for the enclosed entity. Note that this might not be equivalent
to all the languages used within the entity body. The primary purpose of Content-
Language is to allow a user to identify and differentiate entities according to the user’s
own preferred language. Thus, if the body content is intended only for a Danish-
literate audience, the appropriate field is
Content-Language: da
If no Content-Language is specified, the default is that the content is intended for all
language audiences. This might mean that the sender does not consider it to be
specific to any natural language, or that the sender does not know for which language
it is intended. Multiple languages may be listed for content that is intended for
multiple audiences. For example, a rendition of the “Treaty of Waitangi,” presented
simultaneously in the original Maori and English versions, would call for
Content-Language: mi, en
However, just because multiple languages are present within an entity does not mean
that it is intended for multiple linguistic audiences. An example would be a beginner’s
language primer, such as “A First Lesson in Latin,” which is clearly intended to be used
by an English-literate audience. In this case, the Content-Language would properly
only include en. Content-Language may be applied to any media type—it is not limited
to textual documents.
Content-Length/RFC 3261 The Content-Length header field indicates the size of the message body, in decimal
(Standards Track)/ number of octets, sent to the recipient. Applications should use this field to indicate the
Message-Body size of the message body to be transferred, regardless of the media type of the entity. If a
stream-based protocol (such as TCP) is used as transport, the header field must be used.
The size of the message body does not include the CRLF separating header fields and
body. Any Content-Length greater than or equal to zero is a valid value. If no message
body is present in a message, then the Content-Length header field value must be set to
zero. The ability to omit Content-Length simplifies the creation of cgi-like scripts that
dynamically generate responses. The compact form of the header field is l. Examples:
Content-Length: 349
l: 173
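A short Python sketch of the counting rule: Content-Length is the number of octets in the body, not the number of characters, and it excludes the CRLF that separates the header fields from the body. The helper name with_content_length is hypothetical, and encoding text bodies as UTF-8 is an assumption of this sketch.

```python
CRLF = b"\r\n"


def with_content_length(header_lines, body=b""):
    """Serialize header lines plus a body, appending a computed Content-Length."""
    if isinstance(body, str):
        body = body.encode("utf-8")  # count octets, not characters
    lines = [line.encode("utf-8") for line in header_lines]
    lines.append(b"Content-Length: " + str(len(body)).encode("ascii"))
    # The empty line between header fields and body is not part of the count.
    return CRLF.join(lines) + CRLF + CRLF + body


print(with_content_length(["SIP/2.0 200 OK", "Content-Type: text/plain"], "Hi There!\r\n"))
print(with_content_length(["SIP/2.0 200 OK"]))  # no body: Content-Length is 0
```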
Content-Type/RFC 3261 The Content-Type header field indicates the media type of the message body sent to
(Standards Track)/ the recipient. If the body has undergone any encoding such as compression, then this
Message-Body must be indicated by the Content-Encoding header field; otherwise, Content-
Encoding must be omitted. If applicable, the character set of the message body is
indicated as part of the Content-Type header-field value. The multipart MIME type
defined in RFC 2046 may be used within the body of the message. Implementations
that send requests containing multipart message bodies must send a session
description as a non-multipart message body if the remote implementation requests
this through an Accept header field that does not contain multipart. SIP messages may
contain binary bodies or body parts. When no explicit charset parameter is provided
by the sender, media subtypes of the text type are defined to have a default charset
value of UTF-8. If the Content-Disposition header field is missing, bodies of Content-
Type application/sdp imply the disposition session, while other content types imply
render. The presence or absence of a parameter might be significant to the processing
of a media-type, depending on its definition within the media-type registry.
The Content-Type header field must be present if the body is not empty. If the body is
empty, and a Content-Type header field is present, it indicates that the body of the
specific type has zero length (e.g., an empty audio file). RFC 4483 defines an extension
to the URL MIME External-Body access-type to satisfy the content indirection
requirements for SIP, while the access-type parameter is specified in the syntax of
the Content-Type header field (see Section 2.4.1). These extensions are aimed at
allowing any MIME part in a SIP message to be referred to indirectly via a URI. There
are numerous reasons why it might be desirable to specify the content of the SIP
message body indirectly. For bandwidth-limited applications such as cellular wireless,
indirection provides a means to annotate the (indirect) content with meta-data, which
may be used by the recipient to determine whether or not to retrieve the content over
a resource-limited link. More generally, indirection keeps SIP signaling entities
from being overwhelmed with media content.
A UAC/UAS indicates support for content indirection by including the message/
external-body MIME type in the Accept header. The UAC/UAS may supply additional
values in the Accept header to indicate the content types that it is willing to accept,
either directly or through content indirection. UAs supporting content indirection
must support content indirection of the application/sdp MIME type. Applications that
use this content indirection mechanism must support the HTTP URI scheme.
Additional URI schemes may be used, but a UAC/UAS must support receiving an HTTP
URI for indirect content if it advertises support for content indirection. The UAS may
advertise alternate access schemes in the schemes parameter of the Contact header in
the UAS response to the UAC’s session establishment request (e.g., INVITE,
SUBSCRIBE), as described in RFC 3840 (see Section 3.4).
If a UAS receives a SIP request that contains a content indirection payload and the UAS
cannot or does not wish to support such a content type, it must reject the request
with a 415 Unsupported Media Type response. In particular, the UAC should note the
absence of the message/external-body MIME type in the Accept header of this
response to indicate that the UAS does not support content indirection, or the
absence of the particular MIME type of the requested content to indicate that the
UAS does not support the particular media type.
Some content is not critical to the context of the communication if there is a fetch or
conversion failure. The content indirection mechanism uses the Critical-Content
mechanism described in RFC 3459 (see Section 14.3).
In particular, if the UAS is unable to fetch or render an optional body part, then the
server must not return an error to the UAC. To determine whether the content
indirectly referenced by the URI has changed, a Content-ID entity header is used. The
Content-ID and Message-ID syntax for the URLs specified in RFC 2392 are as follows:
content-id = url-addr-spec
message-id = url-addr-spec
url-addr-spec = addr-spec
; URL encoding of RFC 5322 addr-spec
cid-url = "cid" ":" content-id
mid-url = "mid" ":" message-id
["/" content-id]
Note that in Internet mail messages, the addr-spec in a Content-ID defined in RFC 2045
or Message-ID specified in RFC 5322 header is enclosed in angle brackets (<>). Since
addr-spec in a Message-ID or Content-ID might contain characters not allowed within
a URL, any such character (including /, which is reserved within the mid scheme) must
be hex-encoded using the %hh escape mechanism in RFCs 4248 and 4266.
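The conversion between a Content-ID header value and a cid URL can be sketched in Python as follows. The function names are hypothetical, and the set of characters left unescaped (letters, digits, "-", "_", ".", "~", and "@") is an assumption of this sketch rather than a rule quoted from RFC 2392.

```python
from urllib.parse import quote, unquote


def content_id_to_cid_url(content_id):
    """Turn a Content-ID value such as <a@b> into a cid: URL."""
    addr_spec = content_id.strip()
    if addr_spec.startswith("<") and addr_spec.endswith(">"):
        addr_spec = addr_spec[1:-1]  # drop the angle brackets
    # '/' is not listed as safe, so it is hex-encoded as %2F.
    return "cid:" + quote(addr_spec, safe="@")


def cid_url_to_content_id(url):
    """Reverse the conversion: strip the scheme, undo %hh escapes, add <>."""
    if not url.lower().startswith("cid:"):
        raise ValueError("not a cid URL")
    return "<" + unquote(url[4:]) + ">"


print(content_id_to_cid_url("<foo4%foo1@bar.net>"))
print(cid_url_to_content_id("cid:foo4%25foo1@bar.net"))
```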
A mid URL with only a message-id refers to an entire message. With the appended
content-id, it refers to a body part within a message, as does a cid URL. The
Content-ID of a MIME body part is required to be globally unique. However, in many
systems that store messages, body parts are not indexed independently of
their context (message). The mid URL long form was designed to supply the context
needed to support interoperability with such systems.
Content-ID values must be generated to be world-unique. The Content-ID value may
be used for uniquely identifying MIME entities in several contexts, particularly for
caching data referenced by the message/external-body mechanism. Changes in the
underlying content referred to by a URI must result in a change in the Content-ID
associated with that URI. Multiple SIP messages carrying URIs that refer to the same
content should reuse the same Content-ID, to allow the receiver to cache this content
and to avoid unnecessary retrievals. The Content-ID is intended to be globally unique
and should be temporally unique across SIP dialogs. For example:
Content-ID: <[email protected]>
The URI supplied by the Content-Type header is not required to be accessible or valid
for an indefinite period of time. Rather, the supplier of the URI must specify the time
period for which this URI is valid and accessible. This is done through an EXPIRATION
parameter of the Content-Type. The format of this expiration parameter is an RFC 1123
date–time value. This is further restricted in this application to use only GMT time,
consistent with the Date: header in SIP. This is a mandatory parameter. Note that the
date–time value can range from minutes to days or even years.
If the sender knows the specific content being referenced by the indirection, and if the
sender wishes the recipient to be able to validate that this content has not been
altered from that intended by the sender, the sender includes a SHA-1 (RFC 3174)
hash of the content. If it is included, the hash is encoded by extending the
MIME syntax defined in RFC 2046 to include a hash parameter for the content type
message/external-body, whose value is a hexadecimal encoding of the hash. One may
use the Content-Description entity header to provide optional, freeform text to
comment on the indirect content. This text may be displayed to the end user but must
not be used by other elements to determine the disposition of the body.
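The expiration and hash parameters described above can be assembled as in the following Python sketch. It is an illustration under stated assumptions, not an implementation from RFC 4483: the function name, the parameter order, and the use of upper-case hexadecimal digits are choices made here.

```python
import calendar
import hashlib
from email.utils import formatdate


def external_body_content_type(url, content, expires_epoch):
    """Return a message/external-body Content-Type value (illustrative only)."""
    params = [
        'access-type="URL"',
        # RFC 1123 date-time, always rendered in GMT.
        'expiration="%s"' % formatdate(expires_epoch, usegmt=True),
        'URL="%s"' % url,
        "size=%d" % len(content),
        # Hexadecimal SHA-1 digest of the referenced content.
        "hash=" + hashlib.sha1(content).hexdigest().upper(),
    ]
    return "message/external-body; " + "; ".join(params)


expires = calendar.timegm((2002, 6, 24, 9, 0, 0))
print(external_body_content_type(
    "http://www.example.com/the-indirect-content.au", b"abc", expires))
```

A recipient that fetches the content can recompute the SHA-1 digest and compare it with the hash parameter before rendering.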
SIP defines Call-Info, Error-Info, and Alert-Info headers that supply additional
information with regard to a session, a particular error response, or alerting. All three
of these headers allow the UAC or UAS to indicate additional information through a
URI. They may be considered a form of content indirection. The content indirection
mechanism defined in this document is not intended as a replacement for these
headers. Rather, the headers defined in SIP must be used in preference to this
mechanism, where applicable, because of the well-defined semantics of those headers.
The compact form of the header field is c. Examples:
Content-Type: application/sdp
c: text/html; charset=ISO-8859-4
Content-Type: message/external-body;
access-type="URL";
expiration="Mon, 24 June 2002 09:00:00 GMT";
URL="http://www.example.com/the-indirect-content.au";
size=52723;
hash=10AB568E91245681AC1B
<CRLF>
Content-Description: Multicast Gaming
Content-Disposition: render
The message/sip MIME Message-Body Type:
RFC 3261 registers the message/sip MIME media type in order to allow SIP messages to
be tunneled as bodies within SIP, primarily for end-to-end security purposes. This
media type is defined by the following information:
The following ABNF rule describes a message/sipfrag part using the SIP grammar
elements defined in RFC 3261 (see Section 2.4.1). The expansion of any element is
subject to the restrictions on valid SIP messages defined there.
sipfrag = [start-line]
*message-header
[CRLF [message-body]]
If the message/sipfrag part contains a body, it must also contain the appropriate header
fields describing that body (such as Content-Length) and the null-line separating the
header from the body. We are providing some valid message/sipfrag message-body
examples using a vertical bar and a space to the left of each example to illustrate the
example’s extent. Each line of the message/sipfrag element begins with the first
character after the “|” pair. The first two examples show that a message/sipfrag part can
consist of only a start line.
The next two show that subsets of a full SIP message may be represented.
A message/sipfrag part does not have to contain a start line. This example shows a part
that might be signed to make assertions about a particular message.
The next two examples show message/sipfrag parts that contain bodies.
| SIP/2.0 200 OK
| Content-Type: application/sdp
| Content-Length: 247
|
| v=0
| o=alice 2890844526 2890844526 IN IP4 host.anywhere.com
| s=
| c=IN IP4 host.anywhere.com
| t=0 0
| m=audio 49170 RTP/AVP 0
| a=rtpmap:0 PCMU/8000
| m=video 51372 RTP/AVP 31
| a=rtpmap:31 H261/90000
| m=video 53000 RTP/AVP 32
| a=rtpmap:32 MPV/90000

| Content-Type: text/plain
| Content-Length: 11
|
| Hi There!
CSeq/RFC 3261 (Standards A CSeq header field in a request contains a single decimal sequence number and the
Track)/Request and request method. The sequence number MUST be expressible as a 32-bit unsigned
Response integer. The method part of CSeq is case sensitive. The CSeq header field serves to
order transactions within a dialog, to provide a means to uniquely identify
transactions, and to differentiate between new requests and request retransmissions.
Two CSeq header fields are considered equal if the sequence number and the request
method are identical. Example:
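The parsing and comparison rules for CSeq can be sketched in Python as follows. The function names are hypothetical; the bound used is the 32-bit unsigned limit stated above, and the method is compared case-sensitively.

```python
def parse_cseq(value):
    """Parse a CSeq value such as '1 INVITE' into (number, method)."""
    fields = value.split()
    if len(fields) != 2:
        raise ValueError("CSeq needs a sequence number and a method")
    number_text, method = fields
    if not (number_text.isascii() and number_text.isdigit()):
        raise ValueError("sequence number must be decimal")
    number = int(number_text)
    if number >= 2**32:
        raise ValueError("sequence number does not fit in 32 bits")
    return number, method


def cseq_equal(a, b):
    """Two CSeq values match when number and method are both identical."""
    return parse_cseq(a) == parse_cseq(b)


print(parse_cseq("1 INVITE"))
print(cseq_equal("1 INVITE", "1 invite"))  # methods differ in case
```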
Date/RFC 3261 (Standards The Date header field contains the date and time. Unlike HTTP/1.1, SIP only supports
Track)/Request and the most recent RFC 1123 format for dates. However, SIP restricts the time zone in
Response SIP-date to GMT, while RFC 1123 allows any time zone. An RFC 1123 date is case
sensitive. The Date header field reflects the time when the request or response is first
sent. The Date header field can be used by simple end systems without a battery-
backed clock to acquire a notion of current time. However, in its GMT form, it requires
clients to know their offset from GMT. Example:
Date: Sat, 13 Nov 2010 23:29:00 GMT
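A Date value of this form can be produced with the Python standard library. The following is a sketch only; the function name sip_date is hypothetical, and the choice to reject timestamps without time zone information is an assumption made for this example so that the conversion to GMT is unambiguous.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def sip_date(moment: datetime) -> str:
    """Format an aware datetime as an RFC 1123 date restricted to GMT."""
    if moment.tzinfo is None:
        # Assumption for this sketch: naive timestamps are rejected.
        raise ValueError("a time zone aware datetime is required")
    # Convert to UTC first, then let the library emit the literal 'GMT'.
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


print("Date: " + sip_date(datetime(2010, 11, 13, 23, 29, 0, tzinfo=timezone.utc)))
```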
Encryption/RFC 2543 (Standards Track)/Request and Response
The Encryption header field, defined in RFC 2543 (obsoleted by RFC 3261), indicates that
the content has been encrypted. It is not included in RFC 3261, which instead defines
encryption using S/MIME.
Error-Info/RFC 3261 (Standards Track)/Response
The Error-Info header field provides a pointer to additional information about the error
status response. SIP UACs have user interface capabilities ranging from pop-up
windows and audio on PC soft clients to audio-only on black phones or end points
connected via gateways. Rather than forcing a server generating an error to choose
between sending an error status code with a detailed reason phrase and playing an
audio recording, the Error-Info header field allows both to be sent.
The UAC then has the choice of which error indicator to render to the caller. A UAC
may treat a SIP or SIPS URI in an Error-Info header field as if it were a Contact in a
redirect and generate a new INVITE, resulting in a recorded announcement session
being established. A non-SIP URI may be rendered to the user. Examples:
SIP/2.0 404 The number you have dialed is not in service
Error-Info: <sip:[email protected]>
Event/RFC 6665 (Standards Track)/Request
The Event header field is used by SIP UAs in the SUBSCRIBE (or NOTIFY) method to indicate
the event or class of events to which they are subscribing. The Event header will contain a
token that indicates the type of state for which a subscription is being requested. This
token will be registered with the IANA and will correspond to an event package that
further describes the semantics of the event or event class. The Event header may also
contain an id parameter. This id parameter, if present, contains an opaque token that
identifies the specific subscription within a dialog. An id parameter is only valid within
the scope of a single dialog.
For the purposes of matching responses and NOTIFY messages with SUBSCRIBE
messages, the event-type portion of the Event header is compared byte by byte, and
the id parameter token (if present) is compared byte by byte. An Event header
containing an id parameter never matches an Event header without an id parameter.
No other parameters are considered when performing a comparison.
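The matching rule above can be illustrated with a short Python sketch. The function names parse_event and events_match are hypothetical, and the parsing is deliberately simplified (it assumes parameters are separated by semicolons and that the id parameter name is written in lowercase).

```python
from typing import Optional, Tuple


def parse_event(value: str) -> Tuple[bytes, Optional[bytes]]:
    """Split an Event header value into (event-type, id or None) as bytes."""
    event_type, *params = [part.strip() for part in value.split(";")]
    event_id = None
    for param in params:
        name, _, param_value = param.partition("=")
        if name.strip() == "id":
            event_id = param_value.strip().encode("utf-8")
        # Any other parameter is ignored for comparison purposes.
    return event_type.encode("utf-8"), event_id


def events_match(left: str, right: str) -> bool:
    """Compare event-type and id byte by byte; a missing id never matches a present one."""
    return parse_event(left) == parse_event(right)
```

In this sketch, "dialog;id=abc" matches "dialog;id=abc;foo=bar" because only the event type and the id parameter are compared, but it does not match a bare "dialog".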
RFC 7463 registers the SIP header field defining a new parameter as shown below
through IANA registration:
Expires/RFC 3261 (Standards Track)/Request and Response
The Expires header field gives the relative time after which the message (or content)
expires. The precise meaning of this is method dependent. The expiration time in an
INVITE does not affect the duration of the actual session that may result from the
invitation. Session description protocols may offer the ability to express time limits on
the session duration, however. The value of this field is an integral number of seconds
(in decimal) between 0 and (2^32 − 1), measured from the receipt of the request.
Example:
Expires: 5
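Since the Expires value is relative to the receipt of the request, a receiver has to turn it into an absolute deadline. The Python sketch below shows one way to do so; the names expiry_deadline and MAX_EXPIRES are hypothetical, and rejecting out-of-range values with an exception is an assumption of this example rather than behavior mandated by the specification.

```python
from datetime import datetime, timedelta, timezone

# Upper bound of the Expires value: 2^32 - 1 seconds.
MAX_EXPIRES = 2**32 - 1


def expiry_deadline(value: str, received_at: datetime) -> datetime:
    """Return the absolute time at which a message received at received_at expires."""
    text = value.strip()
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"Expires must be a decimal integer: {value!r}")
    seconds = int(text)
    if seconds > MAX_EXPIRES:
        raise ValueError("Expires value exceeds 2^32 - 1")
    # The interval is measured from the receipt of the request.
    return received_at + timedelta(seconds=seconds)


received = datetime(2010, 11, 13, 23, 29, 0, tzinfo=timezone.utc)
print(expiry_deadline("5", received).isoformat())
```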
Flow-Timer/RFC 5626 (Standards Track)/Response
The Flow-Timer header field defined in RFC 5626 (see Section 13.2) indicates the
amount of time remaining for a registered flow with the registration server before
considering it dead if no keep-alive message is sent by the UA to the registrar. This
header field is very important for maintaining the outbound connections, which are
usually considered of long duration especially for real-time teleconferencing or video
conferencing, managed by SIP proxies for the SIP request messages that may
frequently be disconnected or disturbed by middle boxes like NATs and firewalls. A
UA may not have any clue if those outbound connections are disconnected. The
Flow-Timer header field contains parameters like reg-id and instance-id that are used
to identify the uniqueness of flow even if a UA or proxy fails and reboots. The same is
also used by the REGISTER message in the Contact header field if a UA and registrar
supports the outbound connection management specified in RFC 5626.
To set up connections between the clients by an outbound proxy or outbound-proxy-
set, a lot of processing and communications are done, which can easily make the SIP
connection setup nonscalable even for a moderately large network. RFC 3263 (see
Section 8.2.4) has mandated some IP connection setup procedures to make the SIP
network scalable where millions of calls need to be handled for the large-scale
network. In view of the disconnections by middle boxes like NATs and firewalls that
remain in the path, the connection setup in the SIP network needs to be further
scaled. The client-initiated connection management defined by RFC 5626, which uses
the reg-id and instance-id parameters in the REGISTER message and the Flow-Timer header
field, has optimized connection setup, and hence connection management, making
the SIP network connection highly scalable. The detail of the registration using RFC
5626 is described in Section 13.2. Example:
Flow-Timer: 3600
From/RFC 3261 (Standards Track)/Request and Response
The From header field indicates the initiator of the request. This may be different from
the initiator of the dialog. Requests sent by the callee to the caller use the callee’s
address in the From header field. The optional display-name is meant to be rendered
by a human user interface. A system should use the display name Anonymous if the
identity of the client is to remain hidden. Even if the display-name is empty, the
name-addr form must be used if the addr-spec contains a comma, question mark, or
semicolon. Syntax issues are discussed in Section 2.4.1.2.
Two From header fields are equivalent if their URIs match, and their parameters match.
Extension parameters in one header field, not present in the other, are ignored for the
purposes of comparison. This means that the display name and presence or absence
of angle brackets do not affect matching. See Section 4.2 (RFC 3261) for the rules for
parsing a display name, URI and URI parameters, and header field parameters. The
compact form of the From header field is f. Examples:
Geolocation/RFC 6442 (Standards Track)/Request
The Geolocation header field in SIP conveys the location information of SIP
functional entities end to end, and SIP entities in the SIP network may use this
location information for making routing decisions.
pres-URI is defined in RFC 3859 (see Section 6.2.2). http-URI and https-URI are defined
according to RFC 2616 (obsoleted by RFCs 7230–7235) and RFC 2818, respectively. The
cid-url is defined in RFC 2392 to locate message-body parts. This URI type is present in
a SIP request when location is conveyed as a MIME body in the SIP message. GEO-URIs
defined in RFC 5870 are not appropriate for usage in the SIP Geolocation header
because they do not include retention and retransmission flags as part of the location
information. Other URI schemes used in the location URI must be reviewed against
the criteria defined in RFC 3693 for a Using Protocol that uses the location object (LO).
The generic-param in the definition of locationValue is included as a mechanism for
future extensions that might require parameters. This document defines no
parameters for use with locationValue. If a Geolocation header field is received that
contains generic-params, each parameter should be ignored and should not be
removed when forwarding the locationValue. If a need arises to define parameters for
use with locationValue, a revision/extension to this document is required.
The Geolocation header field must have at least one locationValue. A SIP intermediary
should not add location to a SIP request that already contains location. This will quite
often lead to confusion within location recipients (LRs). However, if a SIP intermediary
adds location, even if location was not previously present in a SIP request, that SIP
intermediary is fully responsible for addressing the concerns of any 424 Bad Location
Information SIP response it receives about this location addition and must not pass on
(upstream) the 424 Bad Location Information response.
A SIP intermediary that adds a locationValue must position the new locationValue as
the last locationValue within the Geolocation header field of the SIP request. The
Geolocation header field is valid in the following SIP requests: INVITE, REGISTER,
OPTIONS, BYE, UPDATE, INFO, MESSAGE, REFER, SUBSCRIBE, NOTIFY, and PUBLISH.
The Geolocation header field may be included by a UA in any one of the above-listed
requests, and in a 424 response to any one of those requests. Fully appreciating
the caveats/warnings mentioned above, a SIP intermediary may add the Geolocation
header field if one is not present, for example, when a UA does not support the
Geolocation mechanism but its outbound proxy does and knows the Target’s location,
or in any of a number of other use cases.
The Geolocation header field may be present in a SIP request or response without the
presence of a Geolocation-Routing header. The default value of Geolocation-Routing
header-value is no, meaning SIP intermediaries must not view (i.e., process, inspect, or
actively dereference) any direct or indirect location within this SIP message. This is for
at least two fundamental reasons:
• To make the possibility of retention of the Target’s location moot (because it was
not viewed in the first place).
• To prevent a different treatment of this SIP request based on the contents of the
Location Information in the SIP request.
Any locationValue must be related to the original Target. This is equally true for the
location information in a SIP response, that is, from a SIP intermediary back to the
Target. SIP intermediaries should not modify or delete any existing locationValue(s). A
use case in which this would not apply would be where the SIP intermediary is an
anonymizer. The problem with this scenario is that the geolocation included by the
Target then becomes useless for the purpose or service for which they wanted to use
(include) it. For example, 911 (emergency calling) or finding the nearest (towing
company/pizza delivery/dry cleaning) service(s) will not yield the intended results if
the Location Information were to be modified or deleted from the SIP request.
Example:
Geolocation: <cid:[email protected]>
Geolocation-Error/RFC 6442 (Standards Track)/Response
The Geolocation-Error header field is used for providing more granular error
notifications specific to location errors within a received SIP request message that
carries the location information if the location inserting entity is to know what was
wrong within the original request. That is, the Geolocation-Error header field is used
to convey location-specific errors within a response. The Geolocation-Error header
field must contain only one locationErrorValue to indicate what was wrong with the
locationValue the Location Recipient determined was bad. The locationErrorValue
contains a three-digit error code indicating what was wrong with the location in the
request. This error code has a corresponding quoted error text string that is human
understandable. The text string is optional, but recommended for human readability,
similar to the string phrase used for SIP response codes. The strings are complete
enough for rendering to the user, if so desired. The strings in this document are
recommendations, and are not standardized—meaning an operator can change the
strings but must not change the meaning of the error code. Similar to RFC 3261
specification, there must not be more than one string per error code.
The Geolocation-Error header field may be included in any response to one of the SIP
methods mentioned in the case of Geolocation header field, so long as a
locationValue was in the request part of the same transaction. For example, Alice
includes her location in an INVITE to Bob. Bob can accept this INVITE, thus creating a
dialog, even though his UA determined the location contained in the INVITE was bad.
Bob merely includes a Geolocation-Error header value in the 200 OK response to the
INVITE informing Alice the INVITE was accepted but the location provided was bad. If,
on the other hand, Bob cannot accept Alice’s INVITE without a suitable location, a 424
Bad Location Information response is sent.
If Alice is deliberately leaving location information out of the location object because
she does not want Bob to have this additional information, implementations should be
aware that Bob could return errors repeatedly in order to receive more
location information about Alice in a subsequent SIP request. Implementations must
be on guard for this, by not allowing continually more information to be revealed
unless it is clear that any LR is permitted by Alice to know all that Alice knows about
her location. A limit on the number of such rejections to learn more location
information should be configurable, with a recommended maximum of three times
for each related transaction.
A SIP intermediary that requires Alice’s location in order to properly process Alice’s
INVITE also sends a 424 Bad Location Information response with a Geolocation-Error
code. If more than one locationValue is present in a SIP request and at least one
locationValue is determined to be valid by the LR, the location in that SIP request must
be considered good as far as location is concerned, and no Geolocation-Error is to be
sent.
Here is an initial list of location-based error code ranges for any SIP response, including
provisional responses (other than 100 Trying) and the new 424 Bad Location
Information response. These error codes are divided into three categories, based on
how the response receiver should react to these errors. There must be no more than
one Geolocation-Error code in a SIP response, regardless of how many locationValues
there are in the correlating SIP request. When more than one locationValue is present
in a SIP request, this mechanism provides no indication to which one the Geolocation-
Error code corresponds. If multiple errors are present, the LR applies local policy to
select one.
• 1xx errors mean the LR cannot process the location within the request: A
nonexclusive list of reasons for returning a 1xx is as follows:
– The location was not present or could not be found in the SIP request.
– There was not enough location information to determine where the Target was.
– The location information was corrupted or known to be inaccurate.
• 2xx errors mean some specific permission is necessary to process the included
location information.
• 3xx errors mean there was trouble dereferencing the Location URI sent.
If an error recipient cannot process a specific error code (such as the 201 or 202 below),
perhaps because it does not understand that specific error code, the error recipient
should process the error code as if it originally were a top-level error code where the
X in X00 matches the specific error code. If the error recipient cannot process a
non-100 error code, for whatever reason, then the error code 100 must be processed.
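The fallback rule for unrecognized Geolocation-Error codes can be sketched as follows. The function name effective_error_code and the idea of passing in the set of codes an implementation understands are assumptions made for this illustration.

```python
def effective_error_code(code: int, understood: set) -> int:
    """Map a received Geolocation-Error code to the code that should be processed."""
    if code in understood:
        return code
    # Fall back to the top-level code X00 whose X matches the specific code.
    top_level = (code // 100) * 100
    if top_level in understood:
        return top_level
    # If even the top-level code cannot be processed, treat it as 100.
    return 100


print(effective_error_code(202, {100, 200, 300}))
```

With the codes 100, 200, and 300 understood, a received 202 is processed as 200 in this sketch, and any code whose top-level class is not understood is processed as 100.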
There are two specific Geolocation-Error codes necessary to include in this document;
both have to do with permissions necessary to process the SIP request; they are
Geolocation-Error: 201; code = “Permission to Retransmit Location Information to a Third
Party”
This location error is specific to having the Geolocation-Routing header value set to no.
This location error is stating it requires permission (i.e., the Geolocation-Routing
header value set to yes) to process this SIP request further. If the LS sending the
location information does not want to give this permission, it will not change this
permission in a new request. If the LS wants this message processed with the
<retransmission-allowed> element set to yes, it must choose another logical path (if
one exists) for this SIP request.
Geolocation-Routing/RFC 6442 (Standards Track)/Request
The Geolocation-Routing header field is used in SIP request messages to indicate
whether or not SIP functional entities can route the messages within the SIP network
based on the information provided in the location object. The only defined values for
the Geolocation-Routing header field are yes or no. When the value is yes, the
locationValue can be used for routing decisions along the downstream signaling path
by intermediaries. Values other than yes or no are left for future extensions.
Implementations not aware of an extension must treat any other received value the
same as no. If no Geolocation-Routing header field is present in a SIP request, a SIP
intermediary may insert this header. Without knowledge from a Rule Maker, the SIP
intermediary inserting this header-value should not set the value to yes, as this may be
more permissive than the originating party intends. An easy way around this is to have
the Target always insert this header-value as no.
When this Geolocation-Routing header-value is set to no, this means no locationValue
(inserted by the originating UAC or any intermediary along the signaling path) can be
used by any SIP intermediary to make routing decisions. Intermediaries that attempt
to use the location information for routing purposes in spite of this counter indication
could end up routing the request improperly as a result. The practical implication is
that when the Geolocation-Routing header-value is set to no, if a cid:url is present in
the SIP request, intermediaries must not view the location (because it is not for
intermediaries to consider when processing the request); if a location URI is present,
intermediaries must not dereference it.
UAs are allowed to view location in the SIP request even when the
Geolocation-Routing header-value is set to no. An LR must by default consider the
Geolocation-Routing header-value as set to no, with no exceptions, unless the header
field value is set to yes. A Geolocation-Routing header-value that is set to no has no
special security properties. At most, it is a request for behavior within SIP
intermediaries. That said, if the Geolocation-Routing header-value is set to no, SIP
intermediaries are still to process the SIP request and send it further downstream
within the signaling path if there are no errors present in this SIP request.
The Geolocation-Routing header field satisfies the recommendations made in Section 3.5
of RFC 5606 regarding indication of permission to use location-based routing in SIP. SIP
implementations are advised to pay special attention to the policy elements for location
retransmission and retention described in RFC 4119. The Geolocation-Routing header
field cannot appear without a header-value in a SIP request or response; that is, a null
value is not allowed. The absence of a Geolocation-Routing header-value in a SIP
request is always the same as the following header field: Geolocation-Routing: no.
The Geolocation-Routing header field may be present without a Geolocation header
field in the same SIP request. The Geolocation header field contains a Target’s
location, and it must not be present if there is no location information in this SIP
request. The location information is contained in one or more locationValues. These
locationValues may be contained in a single Geolocation header field or distributed
among multiple Geolocation header fields as indicated in RFC 3261.
The Geolocation-Routing header field indicates whether or not SIP intermediaries can
view and then route this SIP request based on the included (directly or indirectly)
location information. The Geolocation-Routing header field must not appear more
than once in any SIP request, and must not lack a header-value. The default or implied
policy of a SIP request that does not have a Geolocation-Routing header field is the
same as if one were present and the header-value were set to no.
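The interpretation rules for Geolocation-Routing can be summarized in a short Python sketch. The function name routing_permitted and the representation of a request as a list of (name, value) pairs are assumptions of this example; the case-insensitive comparison of yes and no is also an assumption, based on the usual ABNF treatment of literal strings.

```python
from typing import Iterable, Tuple


def routing_permitted(headers: Iterable[Tuple[str, str]]) -> bool:
    """Decide whether intermediaries may route on location in this request."""
    values = [value.strip() for name, value in headers
              if name.strip().lower() == "geolocation-routing"]
    if len(values) > 1:
        raise ValueError("Geolocation-Routing must not appear more than once")
    if not values:
        # An absent header field is the same as Geolocation-Routing: no.
        return False
    if values[0] == "":
        raise ValueError("Geolocation-Routing must not lack a header-value")
    # Any value other than yes (including unknown extensions) is treated as no.
    return values[0].lower() == "yes"
```

Only an explicit yes enables location-based routing in this sketch; an absent header field, an explicit no, and any unrecognized extension value all lead to the same restrictive result.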
There are only three possible states regarding the Geolocation-Routing header field:
No, Yes, or No header-field present in this SIP request. The expected results in each
state are as shown below:
If Geolocation-Routing
Is Only Possible Interpretations
Example:
Geolocation-Routing: no
Geolocation-Routing: yes
Geolocation-Routing: Geolocation-Routing absent
Hide/RFC 2543 (Standards Track)/Request
The Hide header field defined in RFC 2543 (obsoleted by RFC 3261) is used by UAs or
proxies to request that the next hop proxy encrypt the Via header fields to hide
message route information. However, RFC 3261, which supersedes RFC 2543, has
deprecated the use of this header field.
History-Info/RFC 4244 (Standards Track)/Request and Response
The History-Info header field is used in SIP request and response messages to inform
proxies and UAs involved in processing a request about the history or progress of that
request. This header field captures the history of a request that would be lost with the
normal SIP processing involved in the subsequent forwarding of the request. The
support of the History-Info header field requires no changes in the fundamental
determination of request targets or in the request forwarding as defined in RFC 3261
(see Section 3.11). The History-Info header can appear in any request not associated
with an established dialog, for example, INVITE, REGISTER, MESSAGE, REFER,
OPTIONS, PUBLISH, and SUBSCRIBE request messages and any valid response to
these requests. This capability enables many enhanced services by providing the
information as to how and why a call arrives at a specific application or user.
The History-Info header is added to a Request when a new request is created by a UAC
or forwarded by a proxy, or when the target of a request is changed. That is, the
History-Info header provides the useful information especially to application servers,
proxies, and UAs when the request/response messages are retargeted. The term
retarget refers to the changing of the target of a request and the subsequent
forwarding of that request.
It should be noted that retargeting only occurs when the Request-URI indicates a
domain for which the processing entity is responsible. In terms of SIP, the
processing associated with retargeting is described in RFC 3261 (see Section 3.11). It is
possible for the target of a request to be changed by the same proxy multiple times
(referred to as internal retargeting), as the proxy may add targets to the target set after
beginning Request Forwarding. RFC 3261 (see Section 3.11) describes Request
Forwarding. It is during this process of Request Forwarding that the History
Information is captured as an optional, additional header field. Thus, the addition of
the History-Info header does not affect fundamental SIP Request Forwarding. An
entity (UA or proxy) changing the target of a request in response to a redirect or REFER
should also propagate any History-Info header from the initial Request in the new
request.
The History-Info header is optional in that neither UAs nor proxies are required to
support it.
A new option tag for the Supported header, histinfo, is included in the Request to
indicate whether the History-Info header is returned in Responses. In addition to the
histinfo option tag, local policy determines whether or not the header is added to any request, or
for a specific Request-URI, being retargeted. It is possible that this could restrict the
applicability of services that make use of the Request History Information to be
limited to retargeting within domain(s) controlled by the same local policy, or between
domain(s) that negotiate policies with other domains to ensure support of the given
policy, or services for which complete History Information is not required to provide
the service. All applications making use of the History-Info header must clearly define
the impact of the information not being available, and specify the processing of such a
request.
The History-Info header can reveal the detailed information on how a request or
response message has been targeted and retargeted. The History-Info header should
not be used where the Privacy header described in RFC 3323 (see Section 20.2) indicates
that the general routing information should not be viewed by any intermediaries. In
general, the Privacy header should be used to determine whether an intermediary can
include the History-Info header in a Request that it receives and forwards or that it
retargets. Thus, the History-Info header should not be included in Requests where the
requestor has indicated a priv-value of Session-level or Header-level privacy.
The local policy may also be used to determine whether to include the History-Info header
at all, whether to capture a specific Request-URI in the header, or whether it be included
only in the Request as it is retargeted within a specific domain. In the latter case, this is
accomplished by adding a new priv-value, history, to the Privacy header of RFC 3323,
indicating whether any or a specific History-Info header(s) should be forwarded.
It is recognized that satisfying the privacy requirements can influence the functionality of
this solution by overriding the request to generate the information. The applications
making use of History-Info should address any impact on security and privacy this
header may have, or must explain why it does not have an impact on security and
privacy. The History-Info header carries the following information, with the mandatory
parameters required when the header is included in a request or response:
Example:
History-Info: <sip:[email protected]>;index=1,
<sip:[email protected]>; index=1.1,
<sip:[email protected]>; index=1.2
Identity/RFC 4474 (Standards Track)/Request
The Identity header field defines a mechanism for securely identifying originators of
SIP messages, usable both within and between domains, by conveying a
signature used for validating the identity. SIP UAs or SIP servers can provide the
authentication service over the SIP network. The Identity string is constructed with
different parts of the SIP message that are separated by a “|” character. In fact, the
Identity header is the signed hash of a canonical identity string consisting of caller’s
AOR expressed in the form of SIP URI, SIPS URI, Tel URI, or any other URI that is
included in the caller’s From and callee’s To header fields; information from the
Call-Id, CSeq, Date, and Contact header fields; and all the body content that contains
the SDP part. However, the Date header field may not be present in the request and
Contact header field may be empty. If the Date field is not present, the authentication
service adds one. If the Contact header field is empty, then the corresponding field in
the string is left empty as well. Using the private key of the service provider, the
authentication service signs the hash calculated over the identity string and adds the
Identity header to the SIP request.
The signed-identity-digest is a signed hash of a canonical string generated from certain
components of a SIP request. To create the contents of the signed-identity-digest, the
following elements of a SIP message must be placed in a bit-exact string in the order
specified here, separated by a vertical line, “|” or %x7C character:
• The AOR of the UA sending the message, or addr-spec of the From header field
(referred to occasionally here as the identity field).
• The addr-spec component of the To header field, which is the AoR to which the
request is being sent.
• The callid from Call-Id header field.
• The digit (1*DIGIT) and method (method) portions from CSeq header field,
separated by a single space (ABNF SP, or %x20). Note that the CSeq header field
allows LWS rather than SP to separate the digit and method portions, and thus the
CSeq header field may need to be transformed to be canonicalized. The
authentication service must strip leading zeros from the digit portion of the CSeq
before generating the digest-string.
• The Date header field, with exactly one space each for each SP, and the weekday
and month items case set as shown in ABNF in RFC 3261 (see Section 2.4.1.2). RFC
3261 specifies that the ABNF for weekday and month is a choice among a set of
tokens. The RFC 2234 (obsoleted by RFC 5234) rules for the ABNF specify that tokens
are case sensitive. However, when used to construct the canonical string defined
here, the first letter of each weekday and month must be capitalized, and the
remaining two letters must be lowercase. This matches the capitalization provided
in the definition of each token. All requests that use the Identity mechanism must
contain a Date header.
• The addr-spec component of the Contact header field value. If the request does not
contain a Contact header, this field must be empty (i.e., there will be no white space
between the fifth and sixth “|” characters in the canonical string).
• The body content of the message with the bits exactly as they are in the Message (in
the ABNF for SIP, the message body). This includes all components of multipart
message bodies. Note that the message body does NOT include the CRLF
separating the SIP headers from the message body, but does include everything
that follows that CRLF. If the message has no body, then message body will be
empty, and the final “|” will not be followed by any additional characters.
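The construction of the canonical string listed above can be sketched in Python. This is an illustration under stated assumptions, not a complete implementation: the function names are hypothetical, the inputs are assumed to be already extracted from the message (addr-spec values, Call-Id, CSeq value, Date value, and raw body bytes), and the hashing and signing steps are not shown.

```python
from typing import Optional


def canonical_cseq(value: str) -> str:
    """Collapse LWS to a single space and strip leading zeros from the digits."""
    digits, method = value.split()
    return f"{int(digits)} {method}"


def canonical_date(value: str) -> str:
    """Use single spaces and capitalize only the first letter of weekday and month.

    Assumes the layout 'Sat, 13 Nov 2010 23:29:00 GMT'.
    """
    tokens = value.split()
    tokens[0] = tokens[0].capitalize()  # weekday, e.g. 'Sat,'
    tokens[2] = tokens[2].capitalize()  # month, e.g. 'Nov'
    return " ".join(tokens)


def build_digest_string(from_addr: str, to_addr: str, call_id: str,
                        cseq: str, date: str,
                        contact_addr: Optional[str], body: bytes) -> bytes:
    """Join the seven elements, in the order listed, with '|' separators."""
    fields = [from_addr, to_addr, call_id,
              canonical_cseq(cseq), canonical_date(date),
              contact_addr or ""]  # an absent Contact leaves this field empty
    # The body is appended bit-exact; an empty body leaves nothing after the final '|'.
    return "|".join(fields).encode("utf-8") + b"|" + body
```

When neither a Contact header nor a body is present, the string produced by this sketch ends with two consecutive "|" characters directly after the Date value.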
Note again that the first addr-spec must be taken from the From header field value, the
second addr-spec must be taken from the To header field value, and the third addr-
spec must be taken from the Contact header field value, provided the Contact header
is present in the request. After the digest-string is formed, it must be hashed and
signed with the certificate for the domain. The hashing and signing algorithm is
specified by the alg parameter of the Identity-Info header (see below this table for
more information on Identity-Info header parameters). This document defines only
one value for the alg parameter: rsa-sha1; further values must be defined in a
Standards Track RFC. All implementations of this specification must support rsa-sha1.
When the rsa-sha1 algorithm is specified in the alg parameter of Identity-Info, the
hash and signature must be generated as follows: compute the results of signing this
string with sha1WithRSAEncryption as described in RFC 3370 (updated by RFC 5754)
and base64 encode the results as specified in RFC 3548 (obsoleted by RFC 4648). A
1024-bit or longer RSA key must be used. The result is placed in the Identity header
field. For detailed examples of the usage of this algorithm, see Section 2.6.
The absoluteURI portion of the Identity-Info header must contain a URI that
dereferences to a resource containing the certificate of the authentication service. All
implementations of this specification must support the use of HTTP and HTTPS URIs
in the Identity-Info header. Such HTTP and HTTPS URIs must follow the conventions
of RFC 2585, and for those URIs the indicated resource must be of the form
application/pkix-cert described in that specification. Note that this introduces key
life-cycle management concerns; were a domain to change the key available at the
Identity-Info URI before a verifier evaluates a request signed by an authentication
service, this would cause obvious verifier failures. When a rollover occurs,
authentication services should thus provide new Identity-Info URIs for each new
certificate, and should continue to make older key acquisition URIs available for
a duration longer than the plausible lifetime of a SIP message (an hour would most
likely suffice). The Identity-Info header field must contain an alg parameter. No other
parameters are defined for the Identity-Info header in this document. Future
Standards Track RFCs may define additional Identity-Info header parameters.
An example is shown below:
Identity: "ZYNBbHC00VMZr2kZt6VmCvPonWJMGvQTBDqghoWeLxJfzB2a1pxAr3VgrB0
SsSAaifsRdiOPoQZYOy2wrVghuhcsMbHWUSFxI6p6q5TOQXHMmz6uEo3svJsSH49thy
GnFVcnyaZ++yRlBYYQTLqWzJ+KVhPKbfU/pryhVn9Yc6U="
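The following Python sketch illustrates how the digest-string might be assembled and how a signature would be encoded into the header. It is a hedged illustration, not a reference implementation: the full field list and order (Call-ID, CSeq, Date, and message body alongside the three addr-specs) is assumed from the digest-string grammar of RFC 4474, the function names are hypothetical, and the RSA signing step is left out because the Python standard library has no RSA support.

```python
import base64
import hashlib


def build_digest_string(from_uri, to_uri, call_id, cseq_number, method,
                        date, contact_uri, body):
    """Join the signed fields with '|' (field order assumed from RFC 4474)."""
    fields = [
        from_uri,                      # first addr-spec: From header
        to_uri,                        # second addr-spec: To header
        call_id,
        f"{cseq_number} {method}",     # CSeq digits, a space, then the method
        date,
        contact_uri or "",             # third addr-spec: empty if no Contact
        body,
    ]
    return "|".join(fields)


def identity_header(signature: bytes) -> str:
    """Base64-encode raw signature bytes into an Identity header line."""
    return 'Identity: "' + base64.b64encode(signature).decode("ascii") + '"'


digest_string = build_digest_string(
    "sip:alice@atlanta.example.com", "sip:bob@biloxi.example.org",
    "a84b4c76e66710", 314159, "INVITE",
    "Thu, 21 Feb 2002 13:02:03 GMT",
    "sip:alice@pc33.atlanta.example.com", "v=0\r\n",
)

# Hashing step only. The RSA signature over this hash needs a crypto
# library and the domain's private key, which this sketch leaves out.
sha1_hash = hashlib.sha1(digest_string.encode("utf-8")).digest()
```

In a real authentication service, the bytes passed to `identity_header` would be the RSA signature produced with the domain's private key, not the bare SHA-1 hash.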
In the same way, the caller might also want to know whether the callee that the call has
reached is actually the one with whom he/she wants to communicate. In theory, the
same approach can be used for authenticating the identity of the callee. In this case,
the authentication service can add an Identity header in the response to assert the
identity included in the To header. However, the responses cannot be authenticated
unless either the authentication service is located in the callee’s device or the
communication between the callee and the authentication service is secured. Even
then, problems can arise if the call is retargeted to reach the callee, because the URI
in the To header used by the caller will then differ from
the retargeted URI where the callee has been reached. Since the
From and To headers of SIP request messages and their responses must not be
changed, the authentication service of the retargeted domain cannot authenticate the
callee of the retargeted URI.
This problem of retargeting the callee has been solved in RFC 4916 (see Section 10.4.3.1)
by deprecating mandatory reflection of the original To and From URIs in mid-dialog
requests and their responses, which constitutes a change to RFC 3261. RFC 4916
makes no provision for proxies that are unable to tolerate a change of URI, since
changing the URI has been expected for a considerable time. To cater for any UAs that
are not able to tolerate a change of URI, a new option tag from-change is introduced
for providing a positive indication of support in the Supported header field. By
sending a request with a changed From header field URI only to targets that have
indicated support for this option, there is no need to send this option tag in a Require
header field.
The retargeted callee with whom the call has finally been established is defined as the
connected identity. Once the session is established, the callee sends an UPDATE or
re-INVITE request to the caller, using the connected identity as the From header. This
identity is asserted by the authentication service of the retargeted URI domain and is
then verified at the caller side. However, the caller must be willing to accept the
deviation from the SIP specifications defined in RFC 3261 and accept the in-dialog
request with the From header that differs from the To header that the caller has used
for setting up the dialog as stated earlier.
The Identity header defined in RFC 4474 (see Section 19.4.8) and the connected identity
mechanisms specified in RFC 4916 allow an authenticated identity. However, they
cannot protect against man-in-the-middle attacks. An attacker can strip the Identity
and Identity-Info headers from the request, and the request will still be
valid. In this case, the callee is not able to verify the identity of the caller, and the
callee would reject the call in the worst-case scenario.
Identity-Info/RFC 4474 (Standards Track)/Request
The Identity-Info header field conveys a reference to the certificate of the signer,
that is, the authentication service that signs the hash calculated over the identity
string and adds the Identity header to the SIP message. It contains a URI of a resource
that contains the certificate of the authentication service as well as the name of the
algorithm used for generating the Identity header. The syntax of this header is defined
in the Identity header entry of this table. Example:
Identity-Info: <https://atlanta.example.com/atlanta.cer>;alg=rsa-sha1
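As a rough illustration of the rules above (an HTTP or HTTPS URI plus a mandatory alg parameter, with rsa-sha1 as the only defined value), a verifier might pre-check the header value along these lines. The function name and the regular expression are assumptions made for this sketch and do not implement the full header grammar.

```python
import re
from urllib.parse import urlparse

SUPPORTED_ALGS = {"rsa-sha1"}  # the only value defined for alg


def parse_identity_info(value: str):
    """Return (uri, alg) from an Identity-Info header value, or raise ValueError."""
    match = re.fullmatch(r"\s*<([^>]+)>\s*((?:;[^;]+)*)\s*", value)
    if match is None:
        raise ValueError("expected <absoluteURI> followed by parameters")
    uri, raw_params = match.group(1), match.group(2)

    params = {}
    for item in raw_params.split(";"):
        if item.strip():
            name, _, val = item.strip().partition("=")
            params[name.lower()] = val

    # This sketch only handles the two URI schemes every implementation must support.
    if urlparse(uri).scheme not in ("http", "https"):
        raise ValueError("only HTTP and HTTPS URIs are handled in this sketch")
    alg = params.get("alg")
    if alg not in SUPPORTED_ALGS:
        raise ValueError("missing or unsupported alg parameter")
    return uri, alg


uri, alg = parse_identity_info("<https://atlanta.example.com/atlanta.cer>;alg=rsa-sha1")
```

Fetching the certificate from the URI and validating it are separate steps that this sketch does not attempt.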
Info-Package/RFC 6086 (Standards Track)/Request
The Info-Package header field is used by a UA to indicate which Info Package is
associated with the request, usually an INFO request. One particular INFO
request can only be associated with a single Info Package. RFC 6086 of the INFO
method (see Section 16.8) defines an Info Package mechanism. An Info Package
specification defines the content and semantics of the information carried in an INFO
message associated with the Info Package. The Info Package mechanism also provides
a way for UAs to indicate for which Info Packages they are willing to receive INFO
requests, and which Info Package a specific INFO request is associated with.
In-Reply-To/RFC 3261 (Standards Track)/Request
The In-Reply-To header field enumerates the Call-IDs that this call references or
returns. These Call-IDs may have been cached by the client, then included in this
header field in a return call. This allows automatic call distribution systems to route
return calls to the originator of the first call. This also allows callees to filter calls, so
that only return calls for calls they originated will be accepted. This field is not a
substitute for request authentication. Example:
Join/RFC 3911 (Standards Track)/Request
The Join header field is used to logically join an existing SIP dialog with a new SIP
dialog. Inserting a new participant into an existing two-party multimedia call in this
way essentially turns the call into a three-way conference in which media bridging
may be required. Use of an explicit Join header
is needed in some cases instead of addressing an INVITE to a conference URI for the
following reasons:
• A conference may not yet exist—the new invitation may be trying to join an
ordinary two-party call.
• The party joining may not know if the dialog it wants to join is part of a conference.
• The party joining may not know the conference URI.
This primitive can be used to enable a variety of features, for example: Barge-In,
answering-machine-style Message Screening, Call Center Monitoring, and other
multiparty conferencing services. A Join header must contain exactly one to-tag and
exactly one from-tag, as they are required for unique dialog matching. For
compatibility with dialogs initiated by UAs compliant with RFC 2543 (which has been
superseded by RFC 3261), a to-tag of zero matches both a to-tag value of zero and a null
to-tag. Likewise, a from-tag of zero matches both a from-tag value of zero and a null
from-tag.
Examples:
Join: [email protected]
;from-tag=r33th4x0r
;to-tag=ff87ff
Join: 12adf2f34456gs5;to-tag=12345;from-tag=54321
Join: [email protected];to-tag=24796;from-tag=0
Join: [email protected];to-tag=xyz;from-tag=pdq
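The tag rules described above can be sketched in a few lines of Python. The function names are hypothetical, the parser ignores any parameters other than the two tags, and comparing tags as exact strings is an assumption of this sketch.

```python
def parse_join(value: str):
    """Split a Join header value into (call_id, to_tag, from_tag)."""
    parts = [p.strip() for p in value.split(";")]
    call_id, params = parts[0], {}
    for part in parts[1:]:
        name, _, val = part.partition("=")
        name = name.strip().lower()
        if name in params:
            raise ValueError(f"duplicate parameter: {name}")
        params[name] = val.strip()
    if "to-tag" not in params or "from-tag" not in params:
        raise ValueError("Join needs exactly one to-tag and one from-tag")
    return call_id, params["to-tag"], params["from-tag"]


def tag_matches(join_tag: str, dialog_tag) -> bool:
    """A Join tag of '0' also matches a dialog tag that is absent (None or '')."""
    if join_tag == "0":
        return dialog_tag in (None, "", "0")
    return join_tag == dialog_tag


call_id, to_tag, from_tag = parse_join("12adf2f34456gs5;to-tag=12345;from-tag=54321")
```

A UA receiving the request would then look for a dialog whose Call-ID equals `call_id` and whose tags satisfy `tag_matches` for both values.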
Max-Breadth/RFC 5393 (Standards Track)/Request
The Max-Breadth mechanism (RFC 5393, see Section 19.9) limits the total number of
concurrent branches caused by a forked SIP request. With this mechanism, all
proxyable requests are assigned a positive integral Max-Breadth value, which denotes
the maximum number of concurrent branches this request may spawn through
parallel forking as it is forwarded from its current point. When a proxy forwards a
request, its Max-Breadth value is divided among the outgoing requests. In turn, each
of the forwarded requests has a limit on how many concurrent branches it may spawn.
As branches complete, their portion of the Max-Breadth value becomes available for
subsequent branches, if needed. If there is insufficient Max-Breadth to carry out a
desired parallel fork, a proxy can return the 440 Max-Breadth Exceeded response
defined in this document.
Max-Breadth does not prevent forking. It only limits the number of concurrent parallel
forked branches. In particular, a Max-Breadth of 1 restricts a request to pure serial
forking rather than restricting it from being forked at all. A client receiving a 440
Max-Breadth Exceeded response can infer that its request did not reach all possible
destinations.
The Max-Breadth header field value takes no parameters. For each response context
defined in Section 16 of RFC 3261 in a proxy, this mechanism defines two positive
integral values: Incoming Max-Breadth and Outgoing Max-Breadth. Incoming Max-
Breadth is the value in the Max-Breadth header field in the request that formed the
response context. Outgoing Max-Breadth is the sum of the Max-Breadth header field
values in all forwarded requests in the response context that have not received a final
response.
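To make the bookkeeping concrete, the sketch below divides an Incoming Max-Breadth among parallel branches so that the Outgoing Max-Breadth never exceeds it. The even split and the function name are assumptions chosen for illustration; the text above only requires that the value be divided among the outgoing requests.

```python
def split_max_breadth(incoming: int, targets: int):
    """Divide an Incoming Max-Breadth among parallel branches.

    Returns one positive share per branch that can start now. Targets
    beyond the budget must wait for a branch to complete (serial forking),
    or the proxy can answer 440 Max-Breadth Exceeded.
    """
    if incoming < 1:
        raise ValueError("Max-Breadth must be a positive integer")
    active = min(targets, incoming)  # each branch needs a share of at least 1
    if active == 0:
        return []
    base, extra = divmod(incoming, active)
    return [base + (1 if i < extra else 0) for i in range(active)]


shares = split_max_breadth(60, 3)      # three parallel branches
serial_only = split_max_breadth(1, 4)  # a value of 1 allows one branch at a time
```

With an Incoming Max-Breadth of 1 and four targets, only one branch can be active at a time, which corresponds to the pure serial forking described above.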
Max-Forwards/RFC 3261 (Standards Track)/Request
The Max-Forwards header field must be used with any SIP method to limit the number
of proxies or gateways that can forward the request to the next downstream server.
This can also be useful when the client is attempting to trace a request chain that
appears to be failing or looping in mid-chain. The Max-Forwards value is an integer in
the range 0–255 indicating the remaining number of times this request message is
allowed to be forwarded. This count is decremented by each server that forwards the
request. The recommended initial value is 70. This header field should be inserted by
elements that cannot otherwise guarantee loop detection. For example, a B2BUA
should insert a Max-Forwards header field. Example:
Max-Forwards: 6
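A forwarding element's handling of this count can be sketched as follows. The dictionary representation of the headers and the function name are assumptions of the sketch; the 483 (Too Many Hops) status used when the count is exhausted comes from RFC 3261.

```python
DEFAULT_MAX_FORWARDS = 70  # recommended initial value


def forward_max_forwards(headers: dict):
    """Return (status, headers). A status of None means the request is forwarded."""
    value = headers.get("Max-Forwards")
    if value is None:
        # No count present: add one with the recommended initial value.
        return None, {**headers, "Max-Forwards": str(DEFAULT_MAX_FORWARDS)}
    hops = int(value)
    if not 0 <= hops <= 255:
        raise ValueError("Max-Forwards must be in the range 0-255")
    if hops == 0:
        return 483, headers  # Too Many Hops: do not forward
    return None, {**headers, "Max-Forwards": str(hops - 1)}


status, forwarded = forward_max_forwards({"Max-Forwards": "6"})
```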
MIME-Version/RFC 3261 (Standards Track)/Response
This SIP header is adopted from HTTP/1.1 of RFC 2616 (obsoleted by RFCs 7230–7235).
According to this RFC, SIP RFC 3261 messages may include a single MIME-Version
general-header field to indicate what version of the MIME protocol was used to
construct the message. Use of the MIME-Version header field indicates that the
message is in full compliance with the MIME protocol as defined in RFC 2045. Proxies/
gateways are responsible for ensuring full compliance (where possible) when
exporting SIP messages to strict MIME environments.
MIME version 1.0 is the default for use in SIP. However, SIP message parsing and
semantics are defined by RFC 2616 (obsoleted by RFCs 7230–7235) and not the MIME
specification. Example:
MIME-Version: 1.0
Min-Expires/RFC 3261 (Standards Track)/Response
The Min-Expires header field conveys the minimum refresh interval supported for
soft-state elements managed by that server. This includes Contact header fields that
are stored by a registrar. The header field contains a decimal integer number of
seconds from 0 to (2^32 − 1). The use of the header field in a 423 (Interval Too Brief)
response is described in RFC 3261 (see Section 3.3). Example:
Min-Expires: 60
Min-SE/RFC 4028 (Standards Track)/Request and Response
The Min-SE header field indicates the minimum value for the session interval, in units
of delta-seconds. When used in an INVITE or UPDATE request, it indicates the smallest
value of the session interval that can be used for that session. When present in a
request or response, its value must not be less than 90 seconds. When the header field
is not present, its default value is 90 seconds. The Min-SE header field must not be
used in responses except for those with a 422 response code. It indicates the
minimum value of the session interval that the server is willing to accept. Example:
Min-SE: 360
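The interplay between the requested session interval and a server's minimum might look like the sketch below. The function name and return shape are assumptions, and the requested interval is taken to come from the Session-Expires header field of RFC 4028.

```python
ABSOLUTE_MIN_SE = 90  # seconds; Min-SE must never be lower than this


def check_session_interval(session_expires: int, local_min_se: int = ABSOLUTE_MIN_SE):
    """Return (status, headers): 422 with Min-SE if the interval is too small."""
    if local_min_se < ABSOLUTE_MIN_SE:
        raise ValueError("Min-SE must not be less than 90 seconds")
    if session_expires < local_min_se:
        # Tell the requester the smallest interval this server will accept.
        return 422, {"Min-SE": str(local_min_se)}
    return None, {}


rejected = check_session_interval(60, local_min_se=360)
accepted = check_session_interval(1800)
```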
Organization/RFC 3261 (Standards Track)/Request and Response
The Organization header field conveys the name of the organization to which the SIP
element issuing the request or response belongs. The field may be used by client
software to filter calls. Example:
Organization: Boxes by Bob
P-Access-Network-Info/RFC 7315 (Informational)/Request and Response
The P-Access-Network-Info header field can appear in all SIP methods except ACK and
CANCEL. This header field is useful in SIP-based networks that also provide OSI Layer
2 (L2)/OSI Layer 3 (L3) connectivity through different access technologies. SIP UAs may
use this header field to relay information about the access technology to proxies that
are providing services. The serving proxy may then use this information to optimize
services for the UA. For example, a 3GPP (Third Generation Partnership Project) UA
may use this header field to pass information about the access network, such as radio
access technology and radio cell identity, to its home service provider. For the purpose
of this extension, we define an access network as the network providing the L2/L3 IP
connectivity, which, in turn, provides a user with access to the SIP capabilities and
services provided. In some cases, the SIP server that provides the user with services
may wish to know information about the type of access network that the UA is
currently using. Some services are more suitable or less suitable depending on the
access type, and some services are of more value to subscribers if the access network
details are known by the SIP proxy that provides the user with services.
In other cases, the SIP server that provides the user with services may simply wish to
know crude location information in order to provide certain services to the user. For
example, many of the location-based services available in wireless networks today
require the home network to know the identity of the cell the user is being served by.
Some regulatory requirements exist mandating that for cellular radio systems, the
identity of the cell where an emergency call is established is made available to the
emergency authorities.
The SIP server that provides services to the user may desire to have knowledge about
the access network. This is achieved by defining a new private SIP extension header
field: P-Access-Network-Info. This header field carries information relating to the
access network between the UAC and its serving proxy in the home network. A proxy
providing services based on the P-Access-Network-Info header field must consider
the trust relationship to the UA or outbound proxy including the P-Access-Network-
Info header field.
Applicability Statement for the P-Access-Network-Info Header Field:
This mechanism is appropriate in environments where SIP services are dependent on
SIP elements knowing details about the IP and lower-layer technologies used by a UA
to connect to the SIP network. Specifically, the extension requires that the UA know
the access technology it is using, and that a proxy desires such information to provide
services. Generally, SIP is built on the everything-over-IP and IP-over-everything
principles, where the access technology is not relevant for the operation of SIP. Since
SIP systems generally should not care or even know about the access technology, this
SIP extension is not for general SIP usage.
The information revealed in the P-Access-Network-Info header field is potentially very
sensitive. Proper protection of this information depends on the existence of specific
business and security relationships among the proxies that will see SIP messages
containing this header field. It also depends on explicit knowledge of the UA of the
existence of those relationships. Therefore, this mechanism is only suitable in
environments where the appropriate relationships are in place, and the UA has
explicit knowledge that they exist.
P-Asserted-Identity/RFC 3325 (Informational)/Request
The P-Asserted-Identity header field is used among trusted SIP entities (typically
intermediaries) to carry the identity of the user sending a SIP message as it was
verified by authentication. It contains a URI (commonly a SIP URI) and an optional
display-name. There may be one or two P-Asserted-Identity values. If there is one
value, it must be a sip, sips, or tel URI. If there are two values, one value must be a sip
or sips URI and the other must be a tel URI. It is worth noting that proxies can (and
will) add and remove this header field. Example:
P-Asserted-Service/RFC 6050 (Informational)/Request
The private P-Asserted-Service header field enables a network of trusted SIP servers to
assert the service of authenticated users. This header carries the service information
of the user sending a SIP message. The use of this header is only applicable inside an
administrative domain with previously agreed-upon policies for generation, transport,
and usage of such information. However, this header does not offer a general service
identification model suitable for use between different Trust Domains or for use in the
Internet at large. The P-Asserted-Service header field carries information that is
derived service identification. While declarative service identification can assist in
deriving the value transferred in this header field, this should be in the form of
streamlining the correct derived service identification.
By providing a mechanism to compute and store the results of the domain-specific
service calculation, that is, the derived service identification, this optimization allows a
single trusted proxy to perform an analysis of the request and authorize the
requestor’s permission to request such a service. The proxy may then include a service
identifier that relieves other trusted proxies and trusted UAs from performing further
duplicate analysis of the request for their service identification purposes. In addition,
this header allows UACs outside the Trust Domain to provide a hint of the requested
service.
This header does not provide for the dialog or transaction to be rejected if the service
is not supported end-to-end. SIP provides other mechanisms, such as the option tag
and use of the Require and Proxy-Require header fields, where such functionality is
required. No explicitly signaled service identification exists, and the session proceeds
for each node’s definition of the service in use, on the basis of information contained
in the SDP and in other SIP header fields.
This mechanism is specifically for managing the information needs of intermediate
routing devices between the calling user and the user represented by the Request-
URI. In support of this mechanism, an informal URN is defined to identify the services.
This provides a hierarchical structure for defining services and subservices, and
provides an address that can be resolvable for various purposes. It should be noted
that how a service can be uniquely resolved has not been addressed by this header or
URN.
A proxy server that handles a request can, after authenticating the originating user in
some way (e.g., digest authentication) to ensure that the user is entitled to that
service, insert a header field such as a P-Asserted-Service into the request and forward
it. However, it is not sufficient to uniquely identify a service, and the remedy for this is
to extend the SIP signaling to capture the missing element. A proxy that is about to
forward a request to a proxy server or UA that it does not trust must remove all the
P-Asserted-Service header field values. Syntactically,
there may be multiple P-Preferred-Service header fields in a request. The semantics of
multiple P-Preferred-Service header fields appearing in the same request is not
defined at this time. Implementations of this specification must only provide one
P-Preferred-Service header field value.
The naming convention described above uses the term service; however, all the
constructs are equally applicable to identifying applications within the UA. The URN
consists of a hierarchical service identifier or application identifier, with a sequence of
labels separated by periods. The leftmost label is the most significant one and is called
top-level service identifier, while names to the right are called subservices or
subapplications. The set of allowable characters is the same as that for domain names
and a subset of the labels allowed in RFC 3958. Labels are case insensitive and MUST
be specified in all lowercase. For any given service identifier, labels can be removed
right-to-left and the resulting URN is still valid, referring to a more generic service, with
the exception of the top-level service identifier and possibly the first subservice or
subapplication identifier. Labels cannot be removed beyond a defined basic service;
for example, the label w.x may define a service, but the label w may only define an
assignment authority for assigning subsequent values and not define a service in its
own right. In other words, if a service identifier w.x.y.z exists, the URNs w.x and w.x.y
are also valid service identifiers, but w may not be a valid service identifier if it merely
defines who is responsible for defining x. Example:
P-Asserted-Service: urn:urn-7:3gpp-service.exampletelephony.version1
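The right-to-left generalization of service identifiers can be illustrated with a short Python sketch. The function name and the `basic_depth` parameter (the number of leading labels that together form the smallest valid service, two for the w.x case above) are assumptions introduced for this example.

```python
def generalizations(service_id: str, basic_depth: int = 2):
    """List the identifier and its more generic forms, most specific first.

    Labels are removed right-to-left, stopping at the basic service, which
    is assumed here to consist of the first `basic_depth` labels.
    """
    labels = service_id.lower().split(".")  # labels are specified in lowercase
    return [".".join(labels[:n]) for n in range(len(labels), basic_depth - 1, -1)]


generic_forms = generalizations("w.x.y.z")
example_forms = generalizations("3gpp-service.exampletelephony.version1")
```

For w.x.y.z this yields w.x.y.z, w.x.y, and w.x, but not w, matching the case where w only names an assignment authority.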
P-Associated-URI/RFC 7315 (Informational)/Response
The P-Associated-URI header field can appear in the SIP REGISTER method and 2xx
responses. This extension allows a registrar to return a set of associated URIs for a
registered SIP AOR. We define the P-Associated-URI header field, used in the 200 OK
response to a REGISTER request. The P-Associated-URI header field contains the set of
URIs that are associated with the registered AOR. In addition to the AOR, an
associated URI is a URI that the service provider has allocated to a user. A registrar
contains information that allows zero or more URIs to be associated with an AOR.
Usually, all these URIs (the AOR and the associated URIs) are allocated for the usage of
a particular user.
This extension to SIP allows the UAC to know, upon a successful authenticated
registration, which other URIs, if any, the service provider has associated with an AOR
URI. Note that, in standard SIP usage (RFC 3261), the registrar does not register the
associated URIs on behalf of the user. Only the AOR that is present in the To header
field of the REGISTER is registered and bound to the contact address. The only
information conveyed is that the registrar is aware of other URIs that can be used by
the same user. A situation may be possible, however, in which an application server (or
even the registrar itself) registers any of the associated URIs on behalf of the user by
means of a third-party registration.
However, this third-party registration is beyond the scope of this document. A UAC
must not assume that the associated URIs are registered. If a UAC wants to check
whether any of the associated URIs is registered, it can do so by mechanisms specified
outside this document; for example, the UA may send a REGISTER request with the To
header field value set to any of the associated URIs and without a Contact header
field. The 200 OK response will include a Contact header field with the list of AORs
that have been registered with contact addresses. If the associated URI is not
registered, the UA may register it before its utilization.
Path/RFC 3327 (Standards Track)/Request and Response
The SIP Path header field is very similar to the Record-Route header and is used in
conjunction with SIP REGISTER requests and with 200 class messages in response to
REGISTER (REGISTER responses). A preloaded Path header field may be inserted into a
REGISTER by any SIP node traversed by that request. Like the Route header field,
sequential Path header fields are evaluated in the sequence in which they are present
in the request, and multiple Path values may be combined, separated by commas, in a
single Path header field. The registrar reflects the accumulated Path back into the
REGISTER response, and intermediate nodes propagate this back toward the
originating UA.
The difference between Path and Record-Route is that Path applies to REGISTER and
200 class responses to REGISTER. Record-Route is not, and cannot be, defined for
REGISTER for reasons of backward compatibility. Furthermore, the vector established
by Record-Route applies only to requests within the dialog that established that
Record-Route, whereas the vector established by Path applies to future dialogs.
Note that the Path header field values conform to the syntax of a Route element as
defined in RFC 3261. As suggested therein, such values must include the loose-routing
indicator parameter ;lr for full compliance with RFC 3261. Support for the Path header
field may be indicated by a UA by including the option tag path in a Supported header
field. Example:
Path: <sip:P3.EXAMPLEHOME.COM;lr>,<sip:P1.EXAMPLEVISITED.COM;lr>
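As a small illustration, a registrar-side check of a Path value might split it into URIs and confirm that each carries the ;lr parameter. The function names are hypothetical, and splitting on commas is a simplification that assumes no display names or quoted strings in the value.

```python
def parse_path(value: str):
    """Split a Path header value into its URIs, in the order they appear."""
    uris = []
    for item in value.split(","):
        item = item.strip()
        if not item.startswith("<") or ">" not in item:
            raise ValueError(f"expected <uri> in Path entry: {item!r}")
        uris.append(item[1:item.index(">")])
    return uris


def has_loose_routing(uri: str) -> bool:
    """True if the URI carries the lr parameter."""
    params = uri.split(";")[1:]
    return any(p.split("=")[0].strip().lower() == "lr" for p in params)


path_uris = parse_path("<sip:P3.EXAMPLEHOME.COM;lr>,<sip:P1.EXAMPLEVISITED.COM;lr>")
all_loose = all(has_loose_routing(u) for u in path_uris)
```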
P-Called-Party-ID/RFC 7315 (Informational)/Request
The P-Called-Party-ID header field can appear in SIP INVITE, OPTIONS, PUBLISH,
SUBSCRIBE, and MESSAGE methods and all responses. A proxy server inserts a
P-Called-Party-ID header field, typically in an INVITE request, en route to its
destination. The header is populated with the Request-URI received by the proxy in
the request. The UAS identifies to which AOR, out of several registered AORs, the
invitation was sent (e.g., the user may be simultaneously using one personal SIP URI
and one business SIP URI to receive invitation to sessions). The UAS can use the
information to render different distinctive audiovisual alerting tones, depending on
the URI used to receive the invitation to the session. Users in the 3GPP IP Multimedia
Subsystem (IMS) may get one or several SIP URIs (AORs) to identify the user. For
example, a user may get one business SIP URI and one personal SIP URI. As an
example of utilization, the user may make available the business SIP URI to coworkers
and may make available the personal SIP URI to members of the family.
At a certain point in time, both the business SIP URI and the personal SIP URI are
registered in the SIP registrar, so both URIs can receive invitations to new sessions.
When the user receives an invitation to join a session, he/she should be aware of
which of the registered SIP URIs this session was sent to. This requirement is stated in
the 3GPP Release 5 requirements on SIP (RFC 4083). The problem arises during the
terminating side of a session establishment. At that time, the SIP proxy that is serving a
UA gets an INVITE request, and the SIP server retargets the SIP URI that is present in
the Request-URI, and replaces that SIP URI with the SIP URI published by the user in
the Contact header field of the REGISTER request at registration time. One can argue
that the To header field conveys the semantics of the called user, and therefore, this
extension to SIP is not needed. Although the To header field in SIP may convey the
called party ID in most situations, there are two particular cases when the above
assumption is not correct:
1. The session has been forwarded, redirected, etc., by previous SIP proxies, before
arriving at the proxy that is serving the called user.
2. The UAC builds an INVITE request and the To header field is not the same as the
Request-URI. The problem of using the To header field is that this field is populated
by the UAC and not modified by proxies in the path. If the UAC, for any reason, did
not populate the To header field with the AOR of the destination user, then the
destination user is not able to distinguish to which AOR the session was destined.
Another possible solution to the problem is built upon the differentiation of the
Contact header field value between different AORs at registration time. The UA can
differentiate each AOR it registers by assigning a different Contact header field value.
For example, when the UA registers the AOR sip:id1, the Contact header field value
can be sip:id1@ua, while the registration of the AOR sip:id2 can be bound to the
Contact header field value sip:id2@ua. The solution described above assumes that the
UA explicitly registers each of its AORs, and therefore, it has full control over the
contact address values assigned to each registration.
However, if the UA does not have full control of its registered AORs, because of, for
example, a third-party registration, the solution does not work. This may be the case of
the 3GPP registration, where the UA may have previously indicated to the network, by
means outside of SIP, that some other AORs may be automatically registered when the
UA registers a particular AOR. The requirement is covered in the 3GPP Release 5
requirements on SIP (RFC 4083). In the next paragraphs, we show an example of the
problem, in the case in which there has been some sort of call forwarding in the
session, so that the UAC is not aware of the intended destination URI in the current
INVITE request. We assume that a UA is registering to its proxy (P1).
Scenario UA–P1
F1 Register UA -> P1
Later, the proxy/registrar (P1) receives an INVITE request from another proxy (P2)
destined to the user’s business SIP AOR. We assume that this INVITE request has
undergone some sort of forwarding in the past, and as such, the To header field is not
populated with the SIP URI of the user. In this case, we assume that the session was
initially addressed to sip:[email protected]. The SIP server at
othernetwork.com has forwarded this session to sip:[email protected].
Scenario UA–P1–P2
F3 Invite P2 -> P1
The proxy P1 retargets the user and replaces the Request-URI with the SIP URI
published during registration time in the Contact header field value.
F4 Invite P1 -> UA
When the UAS receives the INVITE request, it cannot determine whether it got the
session invitation due to the user's registration of the business or personal AOR.
Neither the UAS nor proxies/application servers can provide this user a service based
on the destination AOR of the session. We solve this problem by allowing the proxy
that is responsible for the home domain (as defined in SIP) of the user to insert a
P-Called-Party-ID header field that identifies the AOR to which this session is destined.
If this SIP extension is used, the proxy serving the called user will get the message flow
F5; it will populate the P-Called-Party-ID header field in message flow F6 with the
contents of the Request-URI in F4. This is shown in flows F5 and F6, as follows:
F5 Invite P2 -> P1
When the UA receives the INVITE request F6, it can determine the intended AOR of the
session and apply whatever service is needed for that AOR.
Applicability Statement for the P-Called-Party-ID Header Field:
The P-Called-Party-ID header field is applicable when the UAS needs to be aware of the
intended AOR that was present in the Request-URI of the request, before the proxy
retargets to the contact address. The UAS may be interested in applying different
audiovisual alerting effects or other filtering services, depending on the intended
destination of the request. It is especially valuable when the UAS has registered
several AORs to his registrar, and therefore, the UAS is not aware of the AOR that was
present in the INVITE request when it hit his proxy/registrar, unless this extension is
used. It is acknowledged that the History-Info header field will provide equivalent
coverage to that of the P-Called-Party-ID header field. However, the P-Called-Party-ID
header field is used entirely within the 3GPP system and does not appear to SIP
entities outside that of a single 3GPP operator.
Usage of the P-Called-Party-ID Header Field:
The P-Called-Party-ID header field provides proxies and the UAS with the AOR that was
present in the Request-URI of the request, before a proxy retargets the request. This
information is intended to be used by subsequent proxies in the path or by the UAS.
Typically, a SIP proxy inserts the P-Called-Party-ID header field before retargeting the
Request-URI in the SIP request. The header field value is populated with the contents
of the Request-URI, before replacing it with the contact address.
Procedures at the UA:
A UAC must not insert a P-Called-Party-ID header field in any SIP request or response.
A UAS may receive a SIP request that contains a P-Called-Party-ID header field. The
header field will be populated with the AOR received by the proxy in the Request-URI
of the request, before its forwarding to the UAS. The UAS may use the value in the
P-Called-Party-ID header field to provide services based on the called party URI, such
as, for example, filtering of calls depending on the date and time, distinctive
presentation services, distinctive alerting tones, and others.
Procedures at the Proxy:
A proxy that has access to the contact information of the user can insert a P-Called-
Party-ID header field in any of the SIP INVITE, OPTIONS, PUBLISH, SUBSCRIBE, and
MESSAGE requests. When included, the proxy must populate the header field value
with the contents of the Request-URI present in the SIP request that the proxy
received. It is necessary that the proxy that inserts the P-Called-Party-ID header field
has information about the user, in order to prevent a wrong delivery of the called
party ID. This information may, for example, have been learned through a registration
process.
A proxy or application server that receives a request containing a P-Called-Party-ID
header field may use the contents of the header field to provide a service to the user
based on the URI of that header field value. A SIP proxy must not insert a P-Called-
Party-ID header field in REGISTER requests.
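The proxy procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the dictionary layout of the request, the function name retarget, and the example URIs are hypothetical and do not come from RFC 7315.

```python
# Methods in which the text allows a proxy to insert P-Called-Party-ID.
ALLOWED_METHODS = {"INVITE", "OPTIONS", "PUBLISH", "SUBSCRIBE", "MESSAGE"}


def retarget(request, contact_uri):
    """Return a retargeted copy of `request` (hypothetical dict layout).

    The original Request-URI is saved in P-Called-Party-ID before it is
    replaced with the registered contact address. REGISTER and other
    methods not listed above are retargeted without the header field.
    """
    headers = dict(request["headers"])  # do not mutate the caller's request
    if request["method"] in ALLOWED_METHODS:
        headers["P-Called-Party-ID"] = "<%s>" % request["request_uri"]
    return {
        "method": request["method"],
        "request_uri": contact_uri,
        "headers": headers,
    }


incoming = {
    "method": "INVITE",
    "request_uri": "sip:user1-business@example.com",  # hypothetical AOR
    "headers": {},
}
forwarded = retarget(incoming, "sip:user1@192.0.2.4")
print(forwarded["request_uri"])
print(forwarded["headers"]["P-Called-Party-ID"])
```

A UAS that has registered several AORs can then read the P-Called-Party-ID value to choose, for example, a distinctive alerting tone.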
P-Charging-Function-Addresses/RFC 7315 (Informational)/Request
3GPP has defined a distributed architecture that results in multiple network entities
becoming involved in providing access and services. There is a need to inform each
SIP proxy involved in a transaction about the common charging functional entities to
receive the generated charging records or charging events. The solution provided by
3GPP is to define two types of charging functional entities: Charging Collection
Function (CCF) and Event Charging Function (ECF). CCF is used for offline charging
(e.g., for postpaid account charging). ECF is used for online charging (e.g., for prepaid
account charging). A network may contain more than one instance of the CCF and the ECF
to provide redundancy. When more than one CCF or ECF address is present,
implementations should attempt to send the charging data to the first CCF or ECF
address of the sequence (if any) in the P-Charging-Function-Addresses header field. If
the first address of the sequence is not available, then the next address (ccf-2 or ecf-2)
must be used if available. The CCF and ECF addresses may be passed during the
establishment of a dialog or in a stand-alone transaction. More detailed information
about charging can be found in 3GPP TS 32.240 [4] and 3GPP TS 32.260 [5].
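The address-selection rule described above (try the first address, then fall back to the next one) can be sketched as follows. The function name, the availability callback, and the example addresses are assumptions made for illustration; how availability is actually detected is not covered by the text.

```python
def pick_charging_address(addresses, is_available):
    """Return the first available address of the sequence, or None.

    `addresses` is the ordered list taken from the header field (for
    example the ccf or ecf values); `is_available` is a hypothetical
    callback that reports whether an address can currently be reached.
    """
    for address in addresses:
        if is_available(address):
            return address
    return None


ccf_addresses = ["192.0.2.10", "192.0.2.11"]  # hypothetical ccf, ccf-2
down = {"192.0.2.10"}                         # pretend the first CCF is down
chosen = pick_charging_address(ccf_addresses, lambda a: a not in down)
print(chosen)
```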
RFC 7315 defines the private SIP header field P-Charging-Function-Addresses. The
P-Charging-Function-Addresses header field can appear in all SIP methods except ACK
and CANCEL. A proxy may include this header field, if not already present, in either
the initial request or response for a dialog or in the request and response of a stand-
alone transaction outside a dialog. When present, only one instance of the header
must be present in a particular request or response. The mechanisms by which a SIP
proxy collects the values to populate the P-Charging-Function-Addresses header field
values are outside the scope of this document. However, as an example, a SIP proxy
may have preconfigured these addresses or may obtain them from a subscriber
database.
Applicability Statement for the P-Charging-Function-Addresses Header Field:
The P-Charging-Function-Addresses header field is applicable within a single private
administrative domain where coordination of charging is required, for example,
according to the architecture specified in 3GPP TS 32.240 [4]. The P-Charging-Function-Addresses
header field is not included in SIP messages sent outside that
administrative domain. The header is not applicable if the administrative domain does
not provide a charging function. The P-Charging-Function-Addresses header field is
applicable whenever the following circumstances are met:
F2 Invite P1 -> P2
Now both P1 and P2 are aware of the IP addresses of the entities that collect charging
records or charging events. Both proxies can send the charging information to the
same entities.
P-Charging-Vector/RFC 7315 (Informational)/Request
3GPP has defined a distributed architecture that results in multiple network entities
becoming involved in providing access and services. Operators need the ability and
flexibility to charge for the access and services as they see fit. This requires
coordination among the network entities (e.g., SIP proxies), which includes correlating
charging records generated from different entities that are related to the same
session. The correlation information includes, but is not limited to, a globally unique
charging identifier that makes the billing effort easy. A charging vector is defined as a
collection of charging information. The charging vector may be filled in during the
establishment of a dialog or stand-alone transaction outside a dialog. The information
inside the charging vector may be filled in by multiple network entities (including SIP
proxies) and retrieved by multiple network entities. There are three types of
correlation information to be transferred: the IMS Charging Identity (ICID) value, the
address of the SIP proxy that creates the ICID value, and the Inter-operator Identifier
(IOI).
ICID is a charging value that identifies a dialog or a transaction outside a dialog. It is
used to correlate charging records. ICID must be a globally unique value. One way to
achieve global uniqueness is to generate the ICID using two components: a locally
unique value and the host name or IP address of the SIP proxy that generated the
locally unique value. The IOI identifies both the originating and terminating networks
involved in a SIP dialog or transaction outside a dialog. There may be an IOI generated
from each side of the dialog to identify the network associated with each side.
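The two-component approach to ICID generation mentioned above can be sketched as follows. The class name, the counter-based local value, and the "local@host" layout are assumptions chosen for the example; the text only requires a locally unique value combined with the host name or IP address of the generating proxy. A real proxy would also need the local value to stay unique across restarts, which a plain in-memory counter does not guarantee.

```python
import itertools


class IcidGenerator:
    """Builds ICID values from a local counter plus the proxy host name."""

    def __init__(self, proxy_host):
        self._proxy_host = proxy_host
        self._counter = itertools.count(1)  # locally unique within this process

    def next_icid(self):
        # Hypothetical layout: <locally unique value>@<proxy host>
        return "%d@%s" % (next(self._counter), self._proxy_host)


gen_p1 = IcidGenerator("p1.example.com")
gen_p2 = IcidGenerator("p2.example.com")
print(gen_p1.next_icid())
print(gen_p2.next_icid())
```

Two proxies that happen to produce the same local value still yield different ICIDs, because the host component differs.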
P-Early-Media/RFC 5009 (Informational)/Request and Response
The private P-Early-Media header field with the supported parameter may be included
in an INVITE request to indicate that the UAC or a proxy on the path recognizes the
header field. This header is not used in the Internet. A network entity may request the
authorization of early media or change a request for authorization of early media by
including the P-Early-Media header field in any message allowed by Table 2.5 (Section
2.8), within the dialog toward the sender of the INVITE request. The P-Early-Media
header field includes one or more direction parameters where each has one of the
values sendrecv, sendonly, recvonly, or inactive, following the convention used for
SDP stream directionality. Each parameter applies, in order, to the media lines in the
corresponding SDP messages establishing session media. Unrecognized parameters
shall be silently discarded. Nondirection parameters are ignored for purposes of
early-media authorization. If there are more direction parameters than media lines,
the excess shall be silently discarded.
If there are fewer direction parameters than media lines, the value of the last direction
parameter shall apply to all remaining media lines. A message directed toward the
UAC containing a P-Early-Media header field with no recognized direction parameters
shall not be interpreted as an early-media authorization request.
The parameter value sendrecv indicates a request for authorization of early media
associated with the corresponding media line, both from the UAS toward the UAC and
from the UAC toward the UAS (both backward and forward early media). The value
sendonly indicates a request for authorization of early media from the UAS toward the
UAC (backward early media), and not in the other direction. The value recvonly
indicates a request for authorization of early media from the UAC toward the UAS
(forward early media), and not in the other direction. The value inactive indicates
either a request that no early media associated with the corresponding media line be
authorized, or a request for revocation of authorization of previously authorized early
media.
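The rules for applying direction parameters to media lines can be sketched as follows. The function name and the representation of the header parameters as a list of strings are assumptions made for illustration; the sketch only covers the mapping rules stated above (unrecognized parameters discarded, excess parameters discarded, last parameter repeated when there are too few).

```python
DIRECTIONS = {"sendrecv", "sendonly", "recvonly", "inactive"}


def authorizations(params, media_line_count):
    """Map direction parameters, in order, onto SDP media lines.

    Returns one direction per media line, or None when the header field
    carries no recognized direction parameter (in which case it is not
    an early-media authorization request).
    """
    directions = [p for p in params if p in DIRECTIONS]
    if not directions:
        return None
    if len(directions) >= media_line_count:
        # Excess direction parameters are silently discarded.
        return directions[:media_line_count]
    # Too few parameters: the last one applies to all remaining media lines.
    missing = media_line_count - len(directions)
    return directions + [directions[-1]] * missing


print(authorizations(["sendrecv", "inactive", "sendonly"], 2))
print(authorizations(["sendonly"], 3))
```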
The P-Early-Media header field in any message within a dialog toward the sender of the
INVITE request may also include the nondirection parameter gated to indicate that a
network entity on the path toward the UAS is already gating the early media,
according to the direction parameter(s). When included in the P-Early-Media header
field, the gated parameter shall come after all direction parameters in the parameter
list.
When receiving a message directed toward the UAC without the P-Early-Media header
field and no previous early-media authorization request has been received within the
dialog, the default early-media authorization depends on local policy and may depend
on whether the header field was included in the INVITE request. After an early-media
authorization request has been received within a dialog, and a subsequent message is
received without the P-Early-Media header field, the previous early-media
authorization remains unchanged. The P-Early-Media header field in any message
within a dialog toward the UAS may be ignored or interpreted according to local
policy.
The P-Early-Media header field does not interact with SDP offer–answer procedures in
any way. Early-media authorization is not influenced by the state of the SDP offer–
answer procedures (including preconditions and directionality) and does not
influence the state of the SDP offer–answer procedures. The P-Early-Media header
field may or may not be present in messages containing SDP. The most recently
received early-media authorization applies to the corresponding media line in the
session established for the dialog until receipt of the 200 OK response to the INVITE
request, at which point all media lines in the session are implicitly authorized. Early-
media flow in a particular direction requires that early media in that direction is
authorized, that media flow in that direction is enabled by the SDP direction attribute
for the stream, and that any applicable preconditions for resources management
described in RFC 3312 (see Section 15.4) are met. Early-media authorization does not
override the SDP direction attribute or preconditions state, and the SDP direction
attribute does not override early-media authorization.
The syntax of the P-Early-Media header field is described below in ABNF as an
extension to the ABNF for SIP in RFC 3261. Note that not all combinations of em-param
elements are semantically valid.
Permission-Missing/RFC 5360 (Standards Track)/Response
The Permission-Missing header field is carried in SIP responses with the 470 Consent
Needed response code, defined for consent-based communications in
RFC 5360 (see Section 19.8). It indicates that the sender of the request has not obtained
permission before communicating with the called party. In the consent-based
communications defined in RFC 5360, a relay, which can be any SIP server (a proxy, a
B2BUA, or some hybrid), helps obtain the consent of the users
before sending request messages. The triggering of communications without explicit
consent can cause a number of problems. These include amplification and DoS
(denial of service) attacks. These problems are described in more detail in the
consent-based communications requirements (RFC 4453, see Section 19.8.1).
On receiving a request-contained URI list, the relay checks whether or not it has
permissions for all the URIs contained in the incoming URI list. If it does, the relay
performs the translation. If it lacks permissions for one or more URIs, the relay must
not perform the translation and should return an error response. A relay that receives
a request-contained URI list with a URI for which the relay has no permissions should
return a 470 Consent Needed response. The relay should add a Permission-Missing
header field with the URIs for which the relay has no permissions. For example, a relay
receives an INVITE that contains URIs for which the relay does not have permission
(the INVITE carries the recipient URIs in its message body). The relay rejects the
request with a 470 Consent Needed response. That response contains a Permission-
Missing header field with the URIs for which there was no permission. RFC 5360 also
defines the format of the permission document that a user sends to the
relay to grant permission.
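The relay check described above can be sketched as follows. The function name, the returned dictionary, and the example URIs are assumptions made for illustration; the sketch only shows the decision (translate, or reject with 470 and list the URIs lacking permission), not how a real relay stores permissions or serializes the header field.

```python
def check_uri_list(uri_list, permitted):
    """Decide whether a relay may translate a request-contained URI list.

    `permitted` is the set of URIs for which the relay holds permissions.
    Returns None when translation may proceed; otherwise returns a
    description of the 470 response, including the URIs that would go
    into the Permission-Missing header field.
    """
    missing = [uri for uri in uri_list if uri not in permitted]
    if not missing:
        return None
    return {
        "status": 470,
        "reason": "Consent Needed",
        "permission_missing": missing,
    }


permitted = {"sip:a@example.com", "sip:b@example.com"}   # hypothetical URIs
result = check_uri_list(
    ["sip:a@example.com", "sip:c@example.com"], permitted
)
print(result)
```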
Relays implementing this framework obtain and store permissions associated to their
translation logic. These permissions indicate whether or not a particular recipient has
agreed to receive traffic at any given time. Recipients that have not given the relay
permission to send them traffic are simply ignored by the relay when performing a
translation. In principle, permissions are valid as long as the context where they were
granted is valid or until they are revoked. For example, the permissions obtained by a
URI-list SIP service that distributes MESSAGE requests to a set of recipients will be
valid as long as the URI-list SIP service exists or until the permissions are revoked.
Additionally, if a recipient is removed from a relay’s translation logic, the relay should
delete the permissions related to that recipient. For example, if the registration of a
Contact URI expires or is otherwise terminated, the registrar deletes the permissions
related to that contact address. It is also recommended that relays request recipients
to refresh their permissions periodically. If a recipient fails to refresh its permissions
for a given period of time, the relay should delete the permissions related to that
recipient.
This framework does not provide any guidance for the values of the refreshment
intervals because different applications can have different requirements to set those
values. For example, a relay dealing with recipients that do not implement this
framework may choose to use longer intervals between refreshes. The refresh process
in such recipients has to be performed manually by their users (since the recipients
do not implement this framework), and having too short refresh intervals may become
too heavy a burden for those users. Example:
Permission-Missing: sip:[email protected]
P-Preferred-Identity/RFC 3325 (Informational)/Request
The P-Preferred-Identity header field is sent by a UA to a trusted proxy to carry the
identity that the user sending the SIP message wishes to be used for the P-Asserted-Identity
header field value that the trusted element will insert. Like P-Asserted-Identity, there
may also be one or two P-Preferred-Identity values. If there is one value, it must be a
sip, sips, or tel URI. If there are two values, one value must be a sip or sips URI and the
other must be a tel URI. It is worth noting that proxies can (and will) remove this
header field. Example:
P-Preferred-Service/RFC 6050 (Informational)/Request
The P-Preferred-Service header field is used by a UA sending the SIP request to provide
a hint to a trusted proxy of the preferred service that the user wishes to be used for
the P-Asserted-Service header field value that the trusted element will insert. Its use
and applicability are similar to those explained for the
P-Asserted-Service header field.
P-Profile-Key/RFC 5002 (Informational)/Request
The private P-Profile-Key header field contains the key to be used by a proxy to query
the user database for a given profile. This header field carries the key of a service
profile that is stored in a user database used by SIP proxies that belong to the same
administrative domain and share a common frame of reference to the user database.
Typically, when SIP is used on the Internet, there are no multiple proxies with a trust
relationship between them querying the same user database. Consequently, the
P-Profile-Key header field does not seem useful in a general Internet environment.
There can be scenarios where a set of proxies handling a request need to consult the
same user database for providing services to the users through using a variety of
service features. That is, they address a specific application in an application server.
Those proxies typically use the destination SIP URI of the request as the key for their
database queries where service identities are SIP URIs that refer to services instead of
users. There can be wildcarded service identities that are a set of service identities that
match a regular expression and share the same profile. For example, the service
identities sip:[email protected] and sip:[email protected] would
match the wildcarded service identity sip:chatroom-!.*[email protected]. Nevertheless,
when a proxy handles a wildcarded service identity, the key to be used in its database
query is not the destination SIP URI of the request, but a regular expression instead.
When a proxy queries the user database for a service identity for which there is no
profile in the user database, the user database needs to find its matching wildcarded
service identity. For example, if the user database receives a query for
sip:[email protected], the user database needs to go through all the
wildcarded service identities it has until it finds a matching one; in this case,
sip:chatroom-!.*[email protected]. The process to find a matching wildcarded service
identity can be computationally expensive, time consuming, or both.
When two proxies query the user database for the same service identity, which
matches a wildcarded service identity, the user database needs to perform the
matching process twice. Having to perform that process twice can be avoided by
having the first proxy obtain the wildcarded service identity from the user database
and transfer it, piggy-backed in the SIP message, to the second proxy. This way, the
second proxy can query the user database using the wildcarded service identity
directly.
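The matching step that the first proxy spares the second one can be sketched as follows. This is an illustration under explicit assumptions: the regular-expression part of a wildcarded service identity is taken to be delimited by a pair of exclamation marks, the pattern itself is assumed not to contain an exclamation mark, and the function names and the svc.invalid identities are hypothetical.

```python
import re


def wildcard_to_regex(wildcard):
    """Compile a wildcarded identity such as 'sip:room-!.*!@svc.invalid'.

    Assumption: the text between the two '!' characters is a regular
    expression; everything outside them is matched literally.
    """
    prefix, pattern, suffix = wildcard.split("!", 2)
    return re.compile(re.escape(prefix) + pattern + re.escape(suffix) + r"\Z")


def find_profile_key(service_uri, wildcards):
    """Linear search the user database would perform; returns the match."""
    for wildcard in wildcards:
        if wildcard_to_regex(wildcard).match(service_uri):
            return wildcard
    return None


wildcards = ["sip:lobby-!.*!@svc.invalid", "sip:room-!.*!@svc.invalid"]
key = find_profile_key("sip:room-12@svc.invalid", wildcards)
print(key)
```

The first proxy would carry the returned key in the P-Profile-Key header field, so the second proxy can query the user database with it directly instead of triggering the search again.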
An alternative, but undesirable, solution would consist of having the user database
store every service identity and its matching wildcarded service identity. The
scalability and manageability properties of this approach are considerably worse than
those of the approach described earlier.
We, therefore, can derive the usefulness of the P-Profile-Key header field from the
above analysis as follows:
P-Profile-Key: <sip:chatroom-!.*[email protected]>
Priority/RFC 3261 (Standards Track)/Request
The Priority header field indicates the urgency of the request as perceived by the client.
The Priority header field describes the priority that the SIP request should have to the
receiving human or its agent. For example, it may be factored into decisions about call
routing and acceptance. For these decisions, a message containing no Priority header
field should be treated as if it specified a Priority of normal. The Priority header field
does not influence the use of communications resources such as packet forwarding
priority in routers or access to circuits in PSTN gateways. The header field can have the
values non-urgent, normal, urgent, and emergency, but additional values can be
defined elsewhere. It is recommended that the value of emergency only be used
when life, limb, or property are in imminent danger. Otherwise, there are no
semantics defined for this header field. These are the values of RFC 2076, with the
addition of emergency. Example:
Subject: A tornado is heading our way!
Priority: emergency
or
Subject: Weekend plans
Priority: non-urgent
Privacy/RFC 3323 (Standards Track)/Request
The Privacy header field allows a SIP UA to request a certain degree of privacy for a
message. A UA cannot itself conceal some headers, because they are
used in routing, but an intermediary that subsequently takes
responsibility for directing messages to and from the anonymous user can conceal them. The UA must
have some way to request such privacy services from the network. For that purpose,
RFC 3323 defines a SIP header, Privacy, that can be used to specify privacy
handling for requests and responses.
UAs should include a Privacy header when network-provided privacy is required. Note
that some intermediaries may also add the Privacy header to messages, including
privacy services. However, such intermediaries should only do so if they are operating
at a user’s behest, for example, if a user has an administrative arrangement with the
operator of the intermediary that it will add such a Privacy header. An intermediary
must not modify the Privacy header in any way if the none priv-value is already
specified. The values of priv-value today are restricted to the options listed below, although
further options can be defined as appropriate. Each legitimate priv-value can appear
zero or one time in a Privacy header. The current values are
• header: The user requests that a privacy service obscure those headers that cannot
be completely expunged of identifying information without the assistance of
intermediaries (such as Via and Contact). Also, no unnecessary headers should be
added by the service that might reveal personal information about the originator of
the request.
• session: The user requests that a privacy service provide anonymization for the
session(s) (described, for example, in an SDP body defined in RFC 4566, see Section
7.7) initiated by this message. This will mask the IP address from which the session
traffic would ordinarily appear to originate. When session privacy is requested, UAs
must not encrypt SDP bodies in messages.
• user: This privacy level is usually set only by intermediaries, in order to
communicate that user-level privacy functions must be provided by the network,
presumably because the UA is unable to provide them. UAs may, however, set this
privacy level for REGISTER requests, but should not set user-level privacy for other
requests.
• none: The user requests that a privacy service apply no privacy functions to this
message, regardless of any preprovisioned profile for the user or default behavior
of the service. UAs can specify this option when they are forced to route a message
through a privacy service that will, if no Privacy header is present, apply some
privacy functions that the user does not desire for this message. Intermediaries
must not remove or alter a Privacy header whose priv-value is none. UAs must not
populate any other priv-values (including critical) in a Privacy header that contains a
value of none.
• critical: The user asserts that the privacy services requested for this message are
critical, and that, therefore, if these privacy services cannot be provided by the
network, this request should be rejected. Criticality cannot be managed
appropriately for responses. When a Privacy header is constructed, it must consist
of either the value none, or one or more of the values user, header, and session
(each of which must appear at most once), which may, in turn, be followed by the
critical indicator.
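The construction rule stated at the end of the list above can be checked with a short sketch. The function name is hypothetical, and the sketch assumes that the priv-values in a Privacy header are separated by semicolons; it covers only the five values listed above and does not handle priv-values added by later extensions.

```python
LEVELS = {"user", "header", "session"}


def is_valid_privacy(value):
    """Check a Privacy header value against the construction rule above.

    Valid forms: the single value 'none', or one or more of user, header,
    and session (each at most once), optionally followed by 'critical'.
    """
    tokens = [token.strip() for token in value.split(";")]
    if tokens == ["none"]:
        return True
    if tokens[-1] == "critical":
        tokens = tokens[:-1]  # critical may only follow the other values
    return (
        len(tokens) >= 1
        and all(token in LEVELS for token in tokens)
        and len(set(tokens)) == len(tokens)
    )


for candidate in ("none", "header; session; critical", "none; critical", "user; user"):
    print(candidate, is_valid_privacy(candidate))
```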
IANA registration for the Privacy header field values is required along with the RFC
publication. RFC 6050 (see Section 12.3) adds a new privacy type (priv-value) to the
Privacy header, defined in RFC 3323 (see Section 20.2). The presence of this privacy
type in a Privacy header field indicates that the user would like the Network Asserted
Identity to be kept private with respect to SIP entities outside the Trust Domain with
which the user authenticated. Note that a user requesting multiple types of privacy
must include all of the requested privacy types in its Privacy header field value.
priv-value = "id"
Example:
Privacy: id
Proxy-Authenticate/RFC 3261 (Standards Track)/Response
A Proxy-Authenticate header field value contains an authentication challenge. The use
of this header field is defined in Section 14.33 of RFC 2616 (obsoleted by RFCs 7230–
7235). See Section 22.3 of RFC 3261 for further details on its usage. Example:
Proxy-Authorization/RFC 3261 (Standards Track)/Request
The Proxy-Authorization header field allows the client to identify itself (or its user) to a
proxy that requires authentication. A Proxy-Authorization field value consists of
credentials containing the authentication information of the UA for the proxy or realm
of the resource being requested. See Section 3.6.3 for a definition of the usage of this
header field. This header field, along with Authorization, breaks the general rules
about multiple header field names. Although not a comma-separated list, this header
field name may be present multiple times, and must not be combined into a single
header line using the usual rules described in RFC 3261. Example:
P-Refused-URI-List/RFC 5318 (Informational)/Request and Response
RFC 5318 specifies the SIP P-Refused-URI-List Private-Header (P-Header). This P-Header
is used in the Open Mobile Alliance’s (OMA) Push to talk over Cellular (PoC) system. It
enables URI-list servers to refuse the handling of incoming URI lists that have
embedded URI lists. This P-Header also makes it possible for the URI-list server to
inform the client about the embedded URI list that caused the rejection and the
individual URIs that form such a URI list.
Usage Scenario:
An ad hoc PoC group session is a type of multiparty PoC session. The originator of a
particular ad hoc PoC group session chooses in an ad hoc manner (e.g., selecting from
an address book) the set of desired participants. To establish the ad hoc PoC group
session, the originator sends an INVITE request with a URI list that contains the URIs
of those participants. The PoC network, following the procedures defined in RFC 5366,
receives such an INVITE request and generates an individual INVITE request toward
each of the URIs in the URI list. In previous versions of the OMA PoC service, the
originator of an ad hoc PoC group session was only allowed to populate the initial URI
list with URIs identifying individual PoC users. Later versions of the service allow the
originator to also include URI lists whose entries represent URI lists. That is, the initial
URI list contains entries that are URI lists themselves. The expected service behavior
then is that the members of the embedded URI lists are invited to join the ad hoc PoC
group session. Figure 2.7a illustrates the expected behavior of the PoC. The originator
(not shown) places the URI list [email protected], along with the URI alice@
example.com, in the initial URI list. The PoC network resolves [email protected]
into its members, [email protected] and [email protected], and sends INVITE
requests to all the recipients.
The PoC network in Figure 2.7a consists of PoC servers, which are SIP entities that can
behave as proxies or B2BUAs. There are two types of logical PoC servers: controlling
and participating. In an ad hoc PoC group session, there is always exactly one
controlling PoC server. The controlling PoC server of an ad hoc PoC group session
resolves an incoming URI list and sends INVITEs to the members of the list. The
controlling PoC server also functions as the focus of the session. Every participant in
an ad hoc PoC group has an associated participating PoC server, which resides in the
home domain of the participant. Figure 2.7b shows how the PoC servers of the PoC
network behave in the scenario shown in Figure 2.7a. An originating PoC UA sends an
INVITE request (F1) with a URI list to its participating PoC server. The participating PoC
server of the originator receives the INVITE request, assumes the role of controlling
PoC server for the ad hoc PoC group session, and sends an INVITE request to the
recipient.
The first URI of the list, [email protected], identifies a single user. The second URI of
the URI list, [email protected], identifies a URI list. In PoC terminology, friends@
example.com identifies a prearranged PoC group. The PoC server at example.org,
which knows the membership of [email protected], cannot send INVITE requests
to the members of [email protected] because that PoC server does not act as a
controlling PoC server for the ad hoc PoC group session being established. Instead, it
informs the controlling PoC server that [email protected] is a list whose members
are [email protected] and [email protected]. Upon receiving this information, the
controlling PoC server generates INVITE requests toward [email protected] and
[email protected]. Although not shown in the above example, the participating PoC
server (example.org) can include—based on policy, presence of the members,
etc.—just a partial list of URIs of the URI list. Furthermore, a URI that the participating
PoC server returns can be a URI list. At present, there is no mechanism for a
participating PoC server to inform a controlling PoC server that a URI identifies a list
and the members of that list, nor is there a mechanism to indicate the URIs contained
in the list. RFC 5318 defines such a mechanism: the P-Refused-URI-List P-Header.
Overview of Operation:
When a URI-list server receives an INVITE request with a URI list containing entries that
are URI lists themselves, and the server cannot handle the request, it returns a 403
Forbidden response with a P-Refused-URI-List P-Header, as shown in Figure 2.7c. The
P-Refused-URI-List P-Header contains the members of the URI list or lists that caused
the rejection of the request. This way, the client can send requests directly to those
member URIs.
Response Generation:
A 403 Forbidden response can contain more than one P-Refused-URI-List entry. The
P-Refused-URI-List header field MUST NOT be used with any other response. The
P-Refused-URI-List P-Header contains one or more URIs, which were present in the URI
list in the incoming request and could not be handled by the server. Additionally, the
P-Refused-URI-List can optionally carry some or all of the members of the URI lists
identified by those URIs. The 403 (Forbidden) response may contain body parts that
contain URI lists. Those body parts can be referenced by the P-Refused-URI-List
entries through their Content-IDs (RFC 2392). If there is a Content-ID defined in the
P-Refused-URI-List, one of the body parts must have an equivalent Content-ID. The
format of a URI list is service specific. This kind of message structure enables clients to
determine which URI relates to which URI list, if the URI-list server is willing to
disclose that information. Furthermore, the information enclosed in the URI lists
enables clients to take further actions to remedy the rejection situation (e.g., send
individual requests to the members of the URI list).
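How a client might act on the disclosed members can be sketched as follows. The function name, the dictionary that maps each refused list URI to its disclosed members, and the example URIs are assumptions made for illustration; parsing of the header field and of the referenced body parts is deliberately left out.

```python
def follow_up_targets(original_uris, refused):
    """Compute the URIs to which individual requests could be sent.

    `refused` maps each refused list URI to the member URIs disclosed by
    the server (possibly an empty list when nothing was disclosed).
    URIs that were not refused are kept as they are; duplicates are
    dropped while preserving order.
    """
    targets = []
    for uri in original_uris:
        for target in refused.get(uri, [uri]):
            if target not in targets:
                targets.append(target)
    return targets


original = ["sip:user-a@example.com", "sip:team-list@example.org"]  # hypothetical
refused = {
    "sip:team-list@example.org": ["sip:user-b@example.org", "sip:user-c@example.org"],
}
print(follow_up_targets(original, refused))
```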
Message Sequence Example:
In the following message sequence example, a controlling PoC server sends an INVITE
request to a participating PoC server. The participating PoC server rejects the request
with a 403 Forbidden response. The 403 response has a P-Refused-URI-List P-Header
that carries the members of the rejected URI lists that the participating PoC server
determines to disclose to this controlling PoC server in the body of the message. The
INVITE request shown in Figure 2.7c is as follows (Via header fields are not shown for
simplicity):
---boundary1
Content-Type: application/sdp
(SDP not shown)
---boundary1
Content-Type: application/resource-lists+xml
Content-Disposition: recipient-list
<?xml version="1.0" encoding="UTF-8"?>
<resource-lists xmlns="urn:ietf:params:xml:ns:resource-lists">
<list>
<entry uri="sip:[email protected]"/>
<entry uri="sip:[email protected]"/>
<entry uri="sip:[email protected]"/>
</list>
</resource-lists>
---boundary1---
Using the message body of the 403 Forbidden response above, the controlling PoC
server can determine the members of sip:[email protected] and sip:colleagues-
[email protected] that the participating PoC server determines to disclose to this
controlling PoC server. Furthermore, the controlling PoC server can deduce that the
participating PoC server has not sent any outgoing requests, per regular URI-list server
procedures.
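As a minimal sketch of how a controlling server might read the disclosed members out of such a body part, the following Python fragment parses a resource-lists document with the standard library. The function name refused_members and the example URIs are assumptions made for illustration; they do not come from the message above.

```python
import xml.etree.ElementTree as ET

# Namespace of the resource-lists format used in the body part.
NS = {"rl": "urn:ietf:params:xml:ns:resource-lists"}

def refused_members(xml_body: str) -> list:
    """Return the uri attribute of every <entry> in a resource-lists document."""
    root = ET.fromstring(xml_body)
    return [entry.get("uri") for entry in root.findall(".//rl:entry", NS)]

# Hypothetical body; the URIs are placeholders, not those of the original example.
body = """<resource-lists xmlns="urn:ietf:params:xml:ns:resource-lists">
  <list>
    <entry uri="sip:alice@example.com"/>
    <entry uri="sip:bob@example.com"/>
  </list>
</resource-lists>"""

members = refused_members(body)
```

With the members in hand, the client could, for instance, send individual requests to each URI.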
Applicability:
The P-Refused-URI-List header field is intended to be used in OMA PoC networks. This
header field is used between PoC servers and carries information about those URI lists
that were rejected by the server receiving the request. The OMA PoC service is
designed so that, in a given session, only one PoC server can resolve incoming URI
lists and send INVITEs to members of these lists. This restriction is not present on
services developed to be used on the public Internet. Therefore, the P-Refused-URI-
List P-Header does not seem to have general applicability outside the OMA PoC
service. Additionally, the use of the P-Refused-URI-List P-Header requires special trust
relationships between servers that do not typically exist on the public Internet. It is
important to note that the P-Refused-URI-List is optional and does not change the
basic behavior of a SIP URI-list service. The P-Refused-URI-List only provides clients
with additional information about the refusal of the request.
Proxy-Require/RFC 3261 (Standards Track)/Request
The Proxy-Require header field is used to indicate proxy-sensitive features that must be
supported by the proxy. See Section 3.11 (RFC 3261) for more details on the mechanics
of proxy behavior and a usage example. Example:
Proxy-Require: foo
P-Served-User/RFC 5502 (Informational)/Request
The private P-Served-User header field conveys the identity of the served user and the
session case parameter that applies to this particular communication session and
application invocation. The header is not used in the Internet in general. The session
case parameter is used to indicate the category of the served user: Originating served
user or Terminating served user and Registered user or Unregistered user. Note that
there can be many kinds of ASs in the farm for providing a variety of services to the
SIP users based on different subscription types.
This header field can be added to initial requests for a dialog or stand-alone requests,
which are routed between nodes in a Trust Domain for P-Served-User. The P-Served-
User header field contains an identity of the user that represents the served user. The
sescase parameter may be used to convey whether the initial request is originated by
the served user or destined for the served user. The regstate parameter may be used
to indicate whether the initial request is for a registered or unregistered user. The
following is an example of a P-Served-User header field:
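To make the roles of the two parameters concrete, here is a small Python sketch that splits a P-Served-User value into the served-user identity, sescase, and regstate. The header value and the function name parse_served_user are hypothetical and shown for illustration only; the sketch also assumes that the URI inside the angle brackets carries no semicolon-separated parameters of its own.

```python
def parse_served_user(value: str) -> dict:
    """Split a P-Served-User value into the identity and its parameters."""
    identity, *params = [part.strip() for part in value.split(";")]
    result = {"identity": identity.strip("<>")}
    for param in params:
        name, _, val = param.partition("=")
        result[name.strip().lower()] = val.strip().lower()
    return result

# Hypothetical value: originating session case, registered user.
info = parse_served_user("<sip:user@example.com>; sescase=orig; regstate=reg")
```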
P-User-Database/RFC 4457 (Informational)/Request
A distributed SIP network architecture results in multiple network entities, and the
SIP registrar database that keeps the users’ profiles may be distributed across the
network. The SIP REGISTER sent by a UA may have to traverse through multiple SIP
proxies before reaching the SIP registrar database. The private P-User-Database
header field is designed to meet this requirement. This header is a private extension
of the SIP header and is not used in the Internet. This header field can be added to
requests routed from one SIP proxy to another one to convey the address of the
database that contains the user profiles. The P-User-Database P-header contains the
address of the Home Subscriber Server (HSS) handling the user that generated the
request. The key benefit of using this SIP header field is to reduce the time and the
traffic handling it takes for a UA to register to the distributed SIP network with
multiple registrar databases in a given administrative domain. Because Diameter
URIs are used by this header, the following examples of valid Diameter host
identities are provided:
aaa://host.example.com;transport=tcp
aaa://host.example.com:6666;transport=tcp
aaa://host.example.com;protocol=diameter
aaa://host.example.com:6666;protocol=diameter
aaa://host.example.com:6666;transport=tcp;protocol=diameter
aaa://host.example.com:1813;transport=udp;protocol=radius
Example:
P-User-Database: <aaa://host.example.com;transport=tcp>
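The structure of these Diameter URIs can be illustrated with a short Python sketch that separates the scheme, host, optional port, and the transport and protocol parameters. The function name parse_aaa_uri is an assumption for illustration, and the sketch handles only the simple forms listed above (no IPv6 literals, no default-port lookup).

```python
def parse_aaa_uri(uri: str) -> dict:
    """Break a Diameter URI such as those listed above into its parts."""
    uri = uri.strip().strip("<>")          # tolerate the <...> header form
    scheme, _, rest = uri.partition("://")
    hostport, *params = rest.split(";")
    host, _, port = hostport.partition(":")
    parsed = {"scheme": scheme, "host": host,
              "port": int(port) if port else None}
    for param in params:
        name, _, val = param.partition("=")
        parsed[name] = val
    return parsed

hss = parse_aaa_uri("<aaa://host.example.com:6666;transport=tcp;protocol=diameter>")
```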
P-Visited-Network-ID/RFC 7315 (Informational)/Request
3GPP networks are composed of a collection of so-called home networks, visited
networks, and subscribers. A particular home network may have roaming agreements
with one or more visited networks. The effect of this is that when a mobile terminal is
roaming, it can use resources provided by the visited network in a transparent fashion.
One of the conditions for a home network to accept the registration of a UA roaming
to a particular visited network is the existence of a roaming agreement between the
home and the visited network. There is a need to indicate to the home network which
network is the visited network that is providing services to the roaming UA.
3GPP UAs always register to the home network. The REGISTER request is proxied by
one or more proxies located in the visited network toward the home network. For the
sake of a simple approach, it seems sensible that the visited network includes an
identification that is known to the home network. This identification should be
globally unique, and it takes the form of a quoted-text string or a token. The home
network may use this identification to verify the existence of a roaming agreement
with the visited network, and to authorize the registration through that visited
network. Note that P-Visited-Network-ID information reveals the location of the user,
to the level of the coverage area of the visited network. For a national network, for
example, P-Visited-Network-ID would reveal that the user is in the country in question.
Applicability Statement for the P-Visited-Network-ID Header Field:
The P-Visited-Network-ID header field is applicable whenever the following
circumstances are met:
1. There is transitive trust in intermediate proxies between the UA and the home
network proxy via established relationships between the home network and the
visited network, supported by the use of standard security mechanisms, e.g., IPsec,
Authentication and Key Agreement (AKA), or Transport Layer Security (TLS).
2. An end point is using resources provided by one or more visited networks (a
network to which the user does not have a direct business relationship).
3. A proxy that is located in one of the visited networks wants to be identified at the
user’s home network.
4. There is no requirement that every visited network need be identified at the home
network. Those networks that want to be identified make use of this extension.
Those networks that do not want to be identified do nothing.
5. A commonly pre-agreed text string or token identifies the visited network at the
home network.
6. The UAC sends a REGISTER request or dialog-initiating request (e.g., INVITE
request) or a stand-alone request outside a dialog (e.g., OPTIONS request) to a
proxy in a visited network.
7. The request traverses, en route to its destination, a first proxy located in the visited
network and a second proxy located in the home network or its destination is the
registrar in the home network.
8. The registrar or home proxy verifies and authorizes the usage of resources (e.g.,
proxies) in the visited network.
The P-Visited-Network-ID header field assumes that there is a trust relationship between
a home network and one or more transited visited networks. It is possible for other
proxies between the proxy in the visited network that inserts the header, and the
registrar and the home proxy, to modify the value of P-Visited-Network-ID header
field. Therefore, intermediaries participating in this mechanism must apply a hop-by-
hop integrity-protection mechanism such as IPsec or other available mechanisms in
order to prevent such attacks.
Usage of the P-Visited-Network-ID Header Field:
The P-Visited-Network-ID header field is used to convey to the registrar or home proxy
in the home network the identifier of a visited network. The identifier is a text string
or token that is known by both the registrar or the home proxy at the home network
and the proxies in the visited network. Typically, the home network authorizes the UA
to roam to a particular visited network. This action requires an existing roaming
agreement between the home and the visited network.
While it is possible for a home network to identify one or more visited networks by
inspecting the domain name in the Via header fields, this approach has a heavy
dependency on DNS. It is an option for a proxy to populate the Via header field with
an IP address, for example, and in the absence of a reverse DNS entry, the IP address
will not convey the desired information. The P-Visited-Network-ID header field can
appear in requests of all SIP methods except ACK, BYE, and CANCEL; it does not appear
in responses. Any SIP proxy in the visited network that receives any of these requests
may insert a P-Visited-Network-ID header field when it forwards the request.
In case a REGISTER request or other request is traversing different administrative
domains (e.g., different visited networks), a SIP proxy may insert a new P-Visited-
Network-ID header field if the request does not contain a P-Visited-Network-ID
header field with the same network identifier as its own network identifier (e.g., if the
request has traversed other different administrative domains). Note also that there is
no requirement for this header field value to be readable in the proxies. Therefore, a
first proxy may insert an encrypted header field that only the registrar can decrypt. If
the request traverses a second proxy located in the same administrative domain as the
first proxy, the second proxy may not be able to read the contents of the P-Visited-
Network-ID header field.
In this situation, the second proxy will consider that its visited network identifier is not
already present in the value of the header field, and therefore, it will insert a new
P-Visited-Network-ID header field value (hopefully with the same identifier that the
first proxy inserted, although perhaps, not encrypted). When the request arrives at the
registrar or proxy in the home network, it will notice that the header field value is
repeated (both the first and the second proxy inserted it). The decrypted values
should be the same because both proxies were part of the same administrative
domain.
While this situation is not desirable, it does not create any harm at the registrar or
proxy in the home network. The P-Visited-Network-ID header field is normally used at
registration. However, this extension does not preclude other usages. For example, a
proxy located in a visited network that does not maintain registration state may insert
a P-Visited-Network-ID header field into any stand-alone request outside a dialog or a
request that creates a dialog. At the time of writing this document, the only requests
that create dialogs are INVITE requests, SUBSCRIBE requests, and REFER requests.
To avoid conflicts with identifiers, especially when the number of roaming agreements
between networks increases, care must be taken when selecting the value of the
P-Visited-Network-ID header field. The identifier must be globally unique to avoid
duplications. Although there are many mechanisms to create globally unique
identifiers across networks, one such mechanism is already in operation, and that is
DNS. The P-Visited-Network-ID header field does not have any connection to DNS,
but the values in the header field can be chosen from the DNS entry representing the
domain name of the network. This guarantees the uniqueness of the value.
Procedures at the UA:
In the context of the network to which the header fields defined in this document
apply, a UA has no knowledge of the P-Visited-Network-ID when sending the
REGISTER request. Therefore, UACs must not insert a P-Visited-Network-ID header
field in any SIP message.
Procedures at the Registrar and Proxy:
A SIP proxy that is located in a visited network may insert a P-Visited-Network-ID
header field in requests of any SIP method except ACK, BYE, and CANCEL; it does not
insert the header field in responses. The header field must be populated with the contents of a text
string or a token that identifies the administrative domain of the network where the
proxy is operating toward the user’s home network. A SIP proxy or registrar that is
located in the home network can use the contents of the P-Visited-Network-ID header
field as an identifier of one or more visited networks that the request traversed. The
proxy or registrar in the home network may take local-policy-driven actions based on
the existence (or nonexistence) of a roaming agreement between the home and the
visited networks.
This means, for instance, that the authorization of the actions of the request is based
on the contents of the P-Visited-Network-ID header field. A SIP proxy that is located in
the home network MUST delete this header field when forwarding the message
outside the home network administrative domain, in order to retain the user’s privacy.
A SIP proxy that is located in the home network should delete this header field when
the home proxy has used the contents of the header field or the request is routed
based on the called party’s identification, even when the request is not forwarded
outside the home network administrative domain. Note that a received P-Visited-
Network-ID from a UA is not allowed and must be deleted when the request is
forwarded.
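The proxy procedures above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the list-of-strings representation of the header values, and the choice to place a newly inserted identifier first are all hypothetical, and the sketch compares identifiers as plain (unencrypted) strings.

```python
EXCLUDED_METHODS = {"ACK", "BYE", "CANCEL"}

def visited_proxy_insert(method: str, values: list, own_id: str) -> list:
    """Visited-network proxy: add its identifier unless already present."""
    if method.upper() in EXCLUDED_METHODS or own_id in values:
        return list(values)
    return [own_id, *values]          # insertion position is an assumption

def home_proxy_forward(values: list, leaving_home_domain: bool) -> list:
    """Home proxy: delete the header field when the request leaves the home domain."""
    return [] if leaving_home_domain else list(values)

after_p1 = visited_proxy_insert("REGISTER", [], "visited.example.net")
after_p1_again = visited_proxy_insert("REGISTER", after_p1, "visited.example.net")
```

A second proxy in the same administrative domain that can read the value leaves the list unchanged, which is the behavior `after_p1_again` demonstrates.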
Examples of Usage:
We present an example in the context of the scenario shown in the following network
diagram:
Scenario UA–P1–P2–REGISTRAR
This example shows the message sequence for a REGISTER transaction originating from
the UA eventually arriving at the REGISTRAR. P1 is an outbound proxy in the visited
network for UA. In this case, P1 inserts the P-Visited-Network-ID header field. Then, P1
routes the REGISTER request to REGISTRAR via P2. Message sequence for REGISTER
using P-Visited-Network-ID header field:
F1 Register UA -> P1
In flow F2, proxy P1 adds its own identifier in a quoted string to the P-Visited-
Network-ID header field.
F2 Register P1 -> P2
Finally, in flow F3, proxy P2 decides to insert its own identifier, derived from its own
domain name to the P-Visited-Network-ID header field.
F3 Register P2 -> REGISTRAR
RAck/RFC 3262 (Standards Track)/Request
The RAck header is sent in a PRACK request to support the reliability of provisional (1xx
class) responses. It contains two numbers and a method tag. The first number is the
value from the RSeq header in the provisional response that is being acknowledged.
The next number, and the method, are copied from the CSeq in the response that is
being acknowledged. The method name in the RAck header is case sensitive. Example:
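As an illustrative sketch of how the three RAck components are assembled, the Python function below takes the RSeq value and the CSeq value of the provisional response being acknowledged. The function name build_rack and the sample numbers are assumptions for illustration.

```python
def build_rack(rseq: int, cseq: str) -> str:
    """Build a RAck value from the RSeq number and the CSeq ("<number> <method>")."""
    number, method = cseq.split()
    # The method is copied unchanged because it is case sensitive.
    return f"{rseq} {int(number)} {method}"

rack = build_rack(776656, "1 INVITE")   # gives "776656 1 INVITE"
```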
Reason/RFC 3326 (Standards Track)/Request
The Reason header field indicates why a SIP request was issued, which is often useful
when creating services. It is also intended to be used to encapsulate a final status code
in a provisional response. This functionality is needed to resolve the heterogeneous
error response forking problem (HERFP). The Reason header field may appear in any request within a
dialog, in any CANCEL request, and in any response whose status code explicitly
allows the presence of this header field. A SIP message may contain more than one
Reason value (i.e., multiple Reason lines), but all of them must have different protocol
values (e.g., one SIP and another Q.850). The following values for the protocol field
have been defined for SIP and Q.850 [ITU-T Q.850] for interworking between SIP and
ISUP over the IP and PSTN network, respectively:
Example:
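The rule that all Reason values in a message must carry different protocol values can be sketched as follows. The parsing helper, the function names, and the sample header values are hypothetical and serve only as an illustration; the sketch assumes that the quoted text contains no semicolons.

```python
def parse_reason(value: str) -> dict:
    """Split one Reason value into its protocol and parameters."""
    protocol, *params = [part.strip() for part in value.split(";")]
    reason = {"protocol": protocol}
    for param in params:
        name, _, val = param.partition("=")
        reason[name.strip()] = val.strip().strip('"')
    return reason

def protocols_are_distinct(values: list) -> bool:
    """True if no protocol (e.g., SIP, Q.850) appears in more than one value."""
    protocols = [parse_reason(v)["protocol"].lower() for v in values]
    return len(protocols) == len(set(protocols))

sip_reason = 'SIP ;cause=200 ;text="Call completed elsewhere"'
q850_reason = 'Q.850 ;cause=16 ;text="Terminated"'
ok = protocols_are_distinct([sip_reason, q850_reason])
```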
RFC 4411 (see Section 15.3) defines two use cases in which new preemption Reason values
are necessary:
• Access Preemption Event: This is when a UA receives a new SIP session request
message with a valid RP (Resource-Priority) value that is higher than the one
associated with the currently active session at that UA. The UA must discontinue the
existing session in order to accept the new one (according to local policy of some
domains).
• Network Preemption Event: This is when a network element—such as a router—
reaches capacity on a particular interface and has the ability to statefully choose
which session(s) will remain active when a new session/reservation is signaled for
under the parameters outlined in SIP Preconditions per RFC 3312 (see Section 15.4)
that would otherwise overload that interface (perhaps adversely affecting all
sessions). In this case, the router must terminate one or more reservations of lower
priority in order to allow this higher-priority reservation access to the requested
amount of bandwidth (according to local policy of some domains).
RFC 3312 has also registered Precondition Type with IANA as follows:
Record-Route/RFC 3261 (Standards Track)/Request and Response
The Record-Route header field is inserted by proxies in a request to force future
requests in the dialog to be routed through the proxy. Examples of its use with the
Route header field are described in Section 3.11 (RFC 3261). Example:
Record-Route: <sip:server10.biloxi.com;lr>,
<sip:bigbox3.site3.atlanta.com;lr>
Recv-Info/RFC 6086 (Standards Track)/Request and Response
A UA uses the Recv-Info header field, on a per-dialog basis, to indicate for which Info
Packages it is willing to receive INFO requests. A UA can indicate an initial
set of Info Packages during dialog establishment and can indicate a new set during the
lifetime of the invite dialog usage. A UA can also use an empty Recv-Info header field
(a header field without a value) to indicate that it is not willing to receive INFO
requests for any Info Package, while still informing other UAs that it supports the Info
Package mechanism.
Referred-By/RFC 3892 (Standards Track)/Request
The Referred-By header field is a request header field used by the REFER method. It
carries a SIP URI representing the identity of the referrer and, optionally, the
Content-ID of a body part (the Referred-By token) that
provides a more secure statement of that identity. The Referred-By header field may
appear in any SIP request, but is meaningless for ACK and CANCEL. Proxies do not
need to be able to read Referred-By header field values and must not remove or
modify them. Example (indicates the token is in the body part with Content-ID:
<[email protected]>):
Referred-By: sip:[email protected];cid="[email protected]"
Refer-Sub/RFC 4488 (Standards Track)/Request and Response
The Refer-Sub header field, which is meaningful only within a REFER transaction, indicates
whether an implicit subscription has been created by issuing a REFER
request. This header field may be used only with a REFER request and the corresponding
2xx response. When this header field is set to false, it specifies that the REFER-Issuer
requests that the REFER-Recipient not establish an implicit subscription and the
resultant dialog. When the Refer-Sub header field is set to true, it specifies that a
subscription has been established with the issuing of the REFER request.
This header field clarifies the SIP REFER extension as defined in RFC 3515 that
automatically establishes a typically short-lived implicit event subscription used to
notify the party sending a REFER request about the receiver’s status in executing the
transaction requested by the REFER. The fact of the matter is that these notifications
are not needed in all cases. This header field provides a way to prevent the automatic
establishment of an event subscription and subsequent notifications using a new SIP
extension header field that may be included in a REFER request.
It should be noted that the Refer-Sub header field set to false may be used by the
REFER-Issuer only when the REFER-Issuer can be certain that the REFER request will not
be forked. If the REFER-Recipient supports the extension and is willing to process the
REFER transaction without establishing an implicit subscription, it must insert the
Refer-Sub header field set to false in the 2xx response to the REFER-Issuer. In this case,
no implicit subscription is created. Consequently, no new dialog is created if this
REFER was issued outside any existing dialog.
If the REFER-Issuer inserts the Refer-Sub header field set to false, but the REFER-
Recipient does not grant the suggestion (i.e., either does not include the Refer-Sub
header field or includes the Refer-Sub header field set to true in the 2xx response), an
implicit subscription is created as in the default case. The Refer-Sub header field may
be encrypted as part of end-to-end encryption.
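The negotiation described above reduces to a simple rule, sketched here in Python: the implicit subscription is suppressed only when both the REFER request and the 2xx response carry Refer-Sub set to false. The function name and the use of None to represent an absent header field are assumptions made for this illustration.

```python
from typing import Optional

def implicit_subscription_created(request_value: Optional[bool],
                                  response_value: Optional[bool]) -> bool:
    """Return True if the REFER results in an implicit subscription.

    None means the Refer-Sub header field was absent from that message.
    """
    suppressed = request_value is False and response_value is False
    return not suppressed

granted = implicit_subscription_created(False, False)      # suggestion granted
not_granted = implicit_subscription_created(False, None)   # default case applies
```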
The REFER specification allows for the possibility of forking a REFER request that is sent
outside of an existing dialog. In addition, a proxy may fork an unknown method type.
Should forking occur, the sender of the REFER with Refer-Sub will not be aware as only
a single 2xx response will be forwarded by the forking proxy. As a result, the
responsibility is on the issuer of the REFER with Refer-Sub to ensure that no forking
will result. If a REFER request to a given Request-URI might fork, the REFER-Issuer
should not include the Refer-Sub header field. The REFER-Issuer should use
standardized mechanisms for ensuring the REFER request does not fork. In the
absence of any other mechanism, the Request-URI of the REFER request should have
Globally Routable User Agent URI (GRUU) properties according to the definitions of
RFC 5627 (see Section 4.3) as those properties ensure the request will not fork.
Refer-To/RFC 3515 (Standards Track)/Request
The Refer-To header field is a request header field (request-header) used in the REFER
method. It provides a URL to reference. The Refer-To header field may be encrypted as
part of end-to-end encryption. Examples:
Refer-To: sip:[email protected]
Refer-To: <sip:[email protected];method=SUBSCRIBE>
Reject-Contact/RFC 3841 (Standards Track)/Request
The Reject-Contact header field allows the UAC to specify that a UA should not be
contacted if it matches any of the values of the header field. Each value of the Reject-
Contact header field contains a "*" and is parameterized by a set of feature
parameters. Some additional parameters are also defined for Contact header field,
such as media, duplex, and language, when used in the header field. Any UA whose
capabilities match the feature set described by the feature parameters matches the
value. For each contact predicate, each Reject-Contact predicate (i.e., each predicate
associated with the Reject-Contact header field) is examined. If that Reject-Contact
predicate contains a filter for a feature tag, and that feature tag is not present
anywhere in the contact predicate, that Reject-Contact predicate is discarded for the
processing of that contact predicate. If the Reject-Contact predicate is not discarded,
it is matched with the contact predicate using the matching operation of RFC 2533 (see
Section 3.4.3). If the result is a match, the URI corresponding to that contact predicate
is discarded from the target set. The ABNF syntax of the Reject-Contact header is
provided with the Accept-Contact header described in this table. Example:
Reject-Contact: *;actor="msg-taker";video
Replaces/RFC 3891 (Standards Track)/Request
The Replaces header field indicates that a single dialog identified by the header field is
to be shut down and logically replaced by the incoming INVITE in which it is
contained. It is a request header only, and defined only for INVITE requests. The
Replaces header field may be encrypted as part of end-to-end encryption. Only a
single Replaces header field value may be present in a SIP request. A Replaces header
field must contain exactly one to-tag and exactly one from-tag, as they are required for
unique dialog matching. For compatibility with dialogs initiated by UAs compliant with
RFC 2543 (since obsoleted by RFC 3261), a tag of zero matches both tags of zero and
null. A Replaces header field may contain the early-flag. Examples:
Replaces: [email protected];from-tag=r33th4x0r;to-tag=ff87ff
Replaces: 12adf2f34456gs5;to-tag=12345;from-tag=54321
Replaces: [email protected];to-tag=24796;from-tag=0
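The requirement for exactly one to-tag and exactly one from-tag can be checked with a short parser, sketched below in Python. The function name parse_replaces and the returned dictionary layout are assumptions for illustration; the early-flag is taken to be the early-only parameter of RFC 3891.

```python
def parse_replaces(value: str) -> dict:
    """Parse a Replaces value into call-id, to-tag, from-tag, and early flag."""
    call_id, *params = [part.strip() for part in value.split(";")]
    parsed = {"call-id": call_id, "early-only": False}
    for param in params:
        name, _, val = param.partition("=")
        name = name.strip().lower()
        if name == "early-only":
            parsed["early-only"] = True
        elif name in ("to-tag", "from-tag"):
            if name in parsed:
                raise ValueError(f"duplicate {name}")
            parsed[name] = val.strip()
    if "to-tag" not in parsed or "from-tag" not in parsed:
        raise ValueError("Replaces needs exactly one to-tag and one from-tag")
    return parsed

dialog = parse_replaces("12adf2f34456gs5;to-tag=12345;from-tag=54321")
```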
Reply-To/RFC 3261 (Standards Track)/Request
The Reply-To header field contains a logical return URI that may be different from the
From header field. For example, the URI may be used to return missed calls or
unestablished sessions. If the user wishes to remain anonymous, the header field
should either be omitted from the request or populated in such a way that does not
reveal any private information. Even if the display-name is empty, the name-addr form
must be used if the addr-spec contains a comma, question mark, or semicolon. Syntax
issues are discussed in Section 2.4.1 RFC 3261. Example:
Request-Disposition/RFC 3841 (Standards Track)/Request
The Request-Disposition header field specifies caller preferences for how a server
should process a request. Its value is a list of tokens, each of which specifies a
particular directive. The directives are grouped into types. There can only be one
directive of each type per request (e.g., both proxy and redirect cannot be put in the
same Request-Disposition header field). Note that a compact form, using the letter d,
has been defined. Example:
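The one-directive-per-type rule can be illustrated with the following Python sketch. The directive names follow the six directive types of RFC 3841; the table layout, the function name, and the sample header values are assumptions made for this illustration.

```python
# Each directive maps to the type it belongs to.
DIRECTIVE_TYPES = {
    "proxy": "proxy", "redirect": "proxy",
    "cancel": "cancel", "no-cancel": "cancel",
    "fork": "fork", "no-fork": "fork",
    "recurse": "recurse", "no-recurse": "recurse",
    "parallel": "parallel", "sequential": "parallel",
    "queue": "queue", "no-queue": "queue",
}

def check_request_disposition(value: str) -> list:
    """Return the directives, or raise ValueError if a type is used twice."""
    directives = [d.strip().lower() for d in value.split(",")]
    seen = set()
    for directive in directives:
        kind = DIRECTIVE_TYPES.get(directive)
        if kind is None:
            raise ValueError(f"unknown directive: {directive}")
        if kind in seen:
            raise ValueError(f"more than one directive of type {kind}")
        seen.add(kind)
    return directives

accepted = check_request_disposition("proxy, recurse, parallel")
```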
Require/RFC 3261 (Standards Track)/Request
The Require header field is used by UACs to tell UASs about options that the UAC
expects the UAS to support in order to process the request. Although an optional
header field, the Require header field must not be ignored if it is present. The Require header field
contains a list of option tags, described in Section 2.10 (RFC 3261). Each option tag
defines a SIP extension that must be understood to process the request. Frequently,
this is used to indicate that a specific set of extension header fields need to be
understood. A UAC compliant to this specification MUST only include option tags
corresponding to Standards Track RFCs. Example:
Require: 100rel
Resource-Priority/RFC 4412 (Standards Track)/Request
The Resource-Priority request header field marks a SIP request as desiring prioritized
access to resources. There is no protocol requirement that all requests within a SIP
dialog or session use the Resource-Priority header field. Local administrative policy
may mandate the inclusion of the Resource-Priority header field in all requests.
Implementations of this specification must allow inclusion to be either by explicit user
request or automatic for all requests. The syntax of the Resource-Priority header field
is described below. An example Resource-Priority header field is shown below:
Resource-Priority: dsn.flash
The r-value parameter in the Resource-Priority header field indicates the resource
priority desired by the request originator. Each resource value (r-value) is formatted as
namespace.priority-value. The priority value is drawn from the namespace identified by the
namespace token. Namespaces and priorities are case-insensitive ASCII tokens that do
not contain periods. Thus, dsn.flash and DSN.Flash, for example, are equivalent. Each
namespace has at least one priority value. Namespaces and priority values within each
namespace must be registered with the IANA. The five initially registered namespaces are
dsn, drsn, q735, ets, and wps. Since a request may traverse multiple
administrative domains with multiple different namespaces, it is necessary to be able
to enumerate several different namespaces within the same message.
However, a particular namespace must not appear more than once in the same SIP
message. Multiple r-values may be expressed equivalently as comma-separated lists within
a single header field, as multiple header fields, or as some combination. The ordering
of r-values within the header field has no significance. Thus, for example, the following
three header snippets are equivalent:
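The equivalence rules above (case-insensitive tokens, no significance to ordering, and comma-separated or repeated header fields) can be demonstrated with a small Python sketch. The function name r_values and the sample values (dsn.flash and wps.3) are chosen for illustration and are not quoted from the source.

```python
def r_values(header_values: list) -> set:
    """Collect (namespace, priority) pairs from one or more header field values."""
    pairs = set()
    seen_namespaces = set()
    for value in header_values:
        for item in value.split(","):
            namespace, _, priority = item.strip().lower().partition(".")
            if namespace in seen_namespaces:
                raise ValueError(f"namespace {namespace} appears more than once")
            seen_namespaces.add(namespace)
            pairs.add((namespace, priority))
    return pairs

single_list = r_values(["dsn.flash, wps.3"])
reordered = r_values(["wps.3, dsn.flash"])
two_fields = r_values(["wps.3", "dsn.flash"])
```

All three forms yield the same set of r-values, which is what equivalence means here.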
Response-Key/RFC 2543 (Standards Track)/Request
The Response-Key request-header field defined in RFC 2543 is used by a client to
request the key that the called UA should use to encrypt the response with; however,
it has been deprecated in RFC 3261.
Retry-After/RFC 3261 (Standards Track)/Response
The Retry-After header field can be used with a 500 Server Internal Error or 503 Service
Unavailable response to indicate how long the service is expected to be unavailable to
the requesting client and with a 404 Not Found, 413 Request Entity Too Large, 480
Temporarily Unavailable, 486 Busy Here, 600 Busy, or 603 Decline response to indicate
when the called party anticipates being available again. The value of this field is a
positive integer number of seconds (in decimal) after the time of the response.
An optional comment can be used to indicate additional information about the time of
callback. An optional duration parameter indicates how long the called party will be
reachable starting at the initial time of availability. If no duration parameter is given,
the service is assumed to be available indefinitely. Examples:
Retry-After: 18000;duration=3600
Retry-After: 120 (I’m in a meeting)
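A minimal Python sketch of parsing the two example forms follows. The regular expression and the function name parse_retry_after are assumptions for illustration; the pattern covers only a delay in seconds, an optional parenthesized comment, and an optional duration parameter.

```python
import re

# seconds, optional "(comment)", optional ";duration=N"
RETRY_AFTER = re.compile(
    r"^\s*(\d+)\s*(?:\(([^)]*)\))?\s*(?:;\s*duration\s*=\s*(\d+))?\s*$"
)

def parse_retry_after(value: str) -> dict:
    """Return the delay, comment, and duration carried in a Retry-After value."""
    match = RETRY_AFTER.match(value)
    if match is None:
        raise ValueError(f"unsupported Retry-After value: {value!r}")
    seconds, comment, duration = match.groups()
    return {
        "seconds": int(seconds),
        "comment": comment,
        # None means no duration parameter was given.
        "duration": int(duration) if duration else None,
    }

with_duration = parse_retry_after("18000;duration=3600")
with_comment = parse_retry_after("120 (I'm in a meeting)")
```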
Route/RFC 3261 (Standards Track)/Request
The Route header field is used to force routing for a request through the listed set of proxies.
Examples of the use of the Route header field are in Section 3.11 (RFC 3261). Example:
Route: <sip:bigbox3.site3.atlanta.com;lr>,
<sip:server10.biloxi.com;lr>
RSeq/RFC 3262 (Standards Track)/Response
The RSeq header is used in provisional (1xx class) responses in order to transmit them
reliably. This header field may only be sent if the INVITE request contains the
Supported: 100rel header field. If RSeq is present in a provisional response, the UAC
should acknowledge the receipt of the response with a PRACK method. It contains a
single numeric value from 1 to (2^32 − 1). Each provisional response is given a sequence
number, carried in the RSeq header field in the response. The RSeq numbering space
is within a single transaction. This means that provisional responses for different
requests may use the same values for the RSeq number. The value of the RSeq in each
subsequent reliable provisional response for the same request must be greater by
exactly 1. RSeq numbers must not wrap around. Because the initial one is chosen to be
less than (2^31 − 1), but the maximum is (2^32 − 1), there can be up to 2^31 reliable
provisional responses per request, which is more than sufficient. Example:
RSeq: 7859254
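The numbering rules can be checked with a few lines of Python. The constant and function names below are assumptions for illustration; the last line simply verifies the arithmetic behind the figure of 2^31 responses, namely the gap between the largest permitted initial value and the maximum.

```python
MAX_RSEQ = 2**32 - 1          # largest value RSeq may carry
MAX_INITIAL_RSEQ = 2**31 - 1  # upper bound on the initial value

def next_rseq(current: int) -> int:
    """Return the RSeq for the next reliable provisional response."""
    if current >= MAX_RSEQ:
        raise OverflowError("RSeq numbers must not wrap around")
    return current + 1

def is_expected(last_rseq: int, received_rseq: int) -> bool:
    """A subsequent reliable response must be greater by exactly 1."""
    return received_rseq == last_rseq + 1

headroom = MAX_RSEQ - MAX_INITIAL_RSEQ   # equals 2**31
```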
Security-Client/RFC 3329 (Standards Track)/Request
With non-TLS connections, a SIP UAC that wishes to use the security agreement specified in
RFC 3329 (see Section 19.3) must add the Security-Client header field to a SIP
request message addressed to its first-hop server/proxy (i.e., the destination of the request
is the first-hop proxy). This header field contains a list of all the security mechanisms
that the client supports. The client should not add preference parameters to this list. The
client must add both a Require and Proxy-Require header field with the value sec-agree in
the option tag to its request. The contents of the Security-Client header field may be used
by the server to include any necessary information in its response. The parameters
described by the ABNF syntaxes described in Section 2.4.1.2 have the following semantics:
• Preference: The q-value indicates a relative preference for the particular mechanism.
The higher the value, the more preferred the mechanism is. All the security
mechanisms must have different q-values. It is an error to provide two mechanisms
with the same q-value.
• Digest-algorithm: This optional parameter is defined here only for HTTP Digest in
RFC 2617 (see Sections 19.4.5 and 19.12.2.3) in order to prevent the bidding-down
attack for the HTTP Digest algorithm parameter. The content of the field may have
the same values as defined in RFC 2617 for the algorithm field.
• Digest-qop: This optional parameter is defined here only for HTTP Digest RFC 2617
in order to prevent the bidding-down attack for the HTTP Digest qop parameter.
The content of the field may have the same values as defined in RFC 2617 for the qop
field.
• Digest-verify: This optional parameter is defined here only for HTTP Digest RFC 2617
to prevent the bidding-down attack for the SIP security mechanism agreement (RFC
3329). The content of the field is computed in exactly the same way as request-digest
in RFC 2617, except that the Security-Server header field is included in the A2
parameter. If the qop directive’s value is auth or is unspecified, then A2 is
A2 = Method “:” digest-uri-value “:” security-server
If the qop value is auth-int, then A2 is
A2 = Method “:” digest-uri-value “:” H(entity-body) “:” security-server
All linear white spaces in the Security-Server header field must be replaced by a single
SP before calculating or interpreting the digest-verify parameter. Method, digest-uri-
value, entity-body, and any other HTTP Digest parameter are as specified in RFC 2617.
Note that this specification does not introduce any extension or change to HTTP
Digest RFC 2617. RFC 3329 (see Section 19.3) only reuses the existing HTTP Digest
mechanisms to protect the negotiation of security mechanisms between SIP entities.
Security-Server/RFC 3329 The Security-Server header field provides a list of security capabilities of the server. A
(Standards Track)/Response server that by policy requires the use of this specification and receives a request that
does not have the sec-agree option tag in a Require, Proxy-Require, or Supported
header field must return a 421 Extension Required response. If the request had the
sec-agree option tag in a Supported header field, it must return a 494 Security
Agreement Required response. In both situations, the server must also include in the
response a Security-Server header field listing its capabilities and a Require header
field with an option tag sec-agree in it. The server must also add necessary information
so that the client can initiate the preferred security mechanism (e.g., a Proxy-
Authenticate header field for HTTP Digest).
Security-Verify/RFC 3329 The Security-Verify header field is used to protect the Security-Server header field of
(Standards Track)/Request SIP messages. For example, the client must also use the digest-verify parameter in the
Security-Verify header field to protect the Security-Server header field as specified in
RFC 3329 (see Section 19.3). When the client receives a response with a Security-Server
header field, it must choose the security mechanism in the server’s list with the
highest q-value among all the mechanisms that are known to the client. Then, it must
initiate that particular security mechanism as described in RFC 3329. This initiation
may be carried out without involving any SIP message exchange (e.g., establishing a
TLS connection). If an attacker modified the Security-Client header field in the
request, the server may not include in its response the information needed to
establish the common security mechanism with the highest preference value (e.g., the
Proxy-Authenticate header field is missing).
A client detecting such a lack of information in the response must consider the current
security agreement process specified in RFC 3329 (see Section 19.3) aborted, and may
try to start it again by sending a new request with a Security-Client header field. All the
subsequent SIP requests sent by the client to that server should make use of the security
mechanism initiated in the previous step. These requests must contain a Security-Verify
header field that mirrors the server’s list received previously in the Security-Server
header field. These requests must also have both Require and Proxy-Require header
fields with the value sec-agree. The server must check that the security mechanisms
listed in the Security-Verify header field of incoming requests correspond to its static
list of supported security mechanisms.
Server/RFC 3261 (Standards The Server header field contains information about the software used by the UAS to
Track)/Response handle the request. Revealing the specific software version of the server might allow
the server to become more vulnerable to attacks against software that is known to
contain security holes. Implementers should make the Server header field a
configurable option. Example:
Server: HomeServer v2
Service-Route/RFC 3608 The SIP Service-Route header field contains a route vector that will direct requests
(Standards Track)/Response through a specific sequence of proxies. A registrar may use a Service-Route header
field to inform a UA of a service route that, if used by the UA, will provide services
from a proxy or set of proxies associated with that registrar. The Service-Route header
field may be included by a registrar in the response to a REGISTER request.
Consequently, a registering UA learns of a service route that may be used to request
services from the system it just registered with. The routing established by the
Service-Route mechanism applies only to requests originating in the UA. That is, it
applies only to UA-originated requests, and not to requests terminated by that UA.
The registrar generates a service route for the registering UA and returns it in the response
to each successful REGISTER request. This service route has the form of a Route header
field that the registering UA may use to send requests through the service proxy selected
by the registrar. The UA would use this route by inserting it as a preloaded Route header
field in requests originated by the UA intended for routing through the service proxy.
Note that the Service-Route header field values MUST conform to the syntax of a
Route element as defined in RFC 3261. As suggested therein, such values must include
the loose-routing indicator parameter ;lr for full compliance with RFC 3261. Example:
Service-Route: <sip:P2.HOME.EXAMPLE.COM;lr>,
<sip:HSP.HOME.EXAMPLE.COM;lr>
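As a rough sketch of how a UA might turn a received Service-Route into a preloaded Route header field, consider the Python below. The function names are hypothetical, and extracting URIs with a simple angle-bracket pattern is an assumption that ignores display names and header parameters outside the brackets.

```python
import re


def parse_service_route(value: str) -> list[str]:
    """Extract the URIs between angle brackets, in order."""
    return re.findall(r"<([^>]+)>", value)


def has_loose_routing(uri: str) -> bool:
    """True if the URI carries the ;lr parameter."""
    return any(p.strip().lower() == "lr" for p in uri.split(";")[1:])


def preloaded_route(uris: list[str]) -> str:
    """Build a Route header line from the stored service route."""
    for uri in uris:
        if not has_loose_routing(uri):
            raise ValueError(f"missing ;lr parameter: {uri}")
    return "Route: " + ", ".join(f"<{u}>" for u in uris)
```

The UA would store the parsed list after a successful REGISTER and reuse it for requests it originates.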
Session-Expires/RFC 4028 The Session-Expires header field conveys the session interval for a SIP session. It is
(Standards Track)/Request placed only in INVITE or UPDATE requests, as well as in any 2xx response to an INVITE
or UPDATE. Like the SIP Expires header field, it contains a delta-time. The absolute
minimum for the Session-Expires header field is 90 seconds. This value represents a bit
more than twice the duration that a SIP transaction can take in the event of a timeout.
This allows sufficient time for a UA to attempt a refresh at the half point of the session
interval, and for that transaction to complete normally before the session expires.
However, 1800 seconds (30 minutes) is recommended as the value for the Session-
Expires header field. In other words, SIP entities must be prepared to handle Session-
Expires header field values of any duration greater than 90 seconds, but entities that
insert the Session-Expires header field should not choose values of less than 30 minutes.
Example:
Session-Expires: 4800
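The timing rules above (a 90-second floor and a refresh at the half point of the session interval) can be sketched as follows. The names are hypothetical, and raising an exception for a value under 90 seconds is a simplification chosen for this example rather than the protocol's own error handling.

```python
MIN_SESSION_INTERVAL = 90            # absolute minimum, in seconds
RECOMMENDED_SESSION_INTERVAL = 1800  # recommended value (30 minutes)


def parse_session_expires(value: str) -> int:
    """Return the delta-seconds part of a Session-Expires header field value."""
    delta = int(value.split(";", 1)[0].strip())
    if delta < MIN_SESSION_INTERVAL:
        raise ValueError("session interval below the 90-second minimum")
    return delta


def refresh_after(session_interval: int) -> float:
    """Seconds to wait before attempting a refresh (half of the interval)."""
    return session_interval / 2
```

With the example value of 4800 seconds, a refresh would be attempted after 2400 seconds.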
SIP-ETag/RFC 3903 (Standards The SIP-ETag header field must be included in the 2xx response to a PUBLISH request
Track)/Response by the Event State Compositor (ESC), which acts as the SIP UAS that processes the
PUBLISH request. The Event Publication Agent (EPA), which acts as the SIP UAC that
issues the PUBLISH request, must include a single Event header field in PUBLISH
requests to identify the type of the published event state. The value of this header
field indicates the event package for which this request is publishing event state.
For each successful PUBLISH request, the ESC will generate and assign an entity-tag
and return it in the SIP-ETag header field of the 2xx response.
SIP-If-Match/RFC 3903 The SIP-If-Match header field must be used by the EPA, acting as the SIP UAC that
(Standards Track)/Request issues the PUBLISH request, when updating previously published event state.
In other words, when updating a previously published event state, PUBLISH
requests must contain a single SIP-If-Match header field identifying the specific event
state that the request is refreshing, modifying, or removing. This header field must
contain a single entity-tag that was returned by the ESC in the SIP-ETag header field of
the response to a previous publication. The PUBLISH request may contain a body,
which contains event state that the client wishes to publish. The content format and
semantics are dependent on the event package identified in the Event header field.
The presence of a body and the SIP-If-Match header field determine the specific
operation that the request is performing, as described below:
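Operation   Body?   SIP-If-Match?   Expires Value
Initial     Yes     No              > 0
Refresh     No      Yes             > 0
Modify      Yes     Yes             > 0
Remove      No      Yes             0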
An Initial publication sets the initial event state for a particular EPA. There may, of
course, already be event state published by other EPAs (for the same AOR). Note that
this state is unaffected by an initial publication. A Refresh publication refreshes the
lifetime of a previous publication, whereas a Modify publication modifies the event
state of a previous publication. A Remove publication requests immediate removal of
event state. These operations are described in more detail in the following chapters.
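How the presence of a body, the SIP-If-Match header field, and the Expires value together determine the PUBLISH operation can be illustrated with a small classifier. This is a sketch that follows the operation table in RFC 3903; the function name is hypothetical, and combinations outside that table are simply rejected here.

```python
def publish_operation(has_body: bool, has_sip_if_match: bool, expires: int) -> str:
    """Classify a PUBLISH request as Initial, Refresh, Modify, or Remove."""
    if not has_sip_if_match:
        if has_body and expires > 0:
            return "Initial"
    else:
        if expires == 0 and not has_body:
            return "Remove"
        if expires > 0:
            return "Modify" if has_body else "Refresh"
    raise ValueError("combination not covered by the operation table")
```

For example, a request with no body, a SIP-If-Match header field, and a positive Expires value is classified as a Refresh.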
An EPA is responsible for refreshing its previously established publications before their
expiration interval has elapsed. To refresh a publication, the EPA must create a
PUBLISH request that includes in a SIP-If-Match header field the entity-tag of the
publication to be refreshed. The SIP-If-Match header field containing an entity-tag
conditions the PUBLISH request to refresh a specific event state established by a prior
publication. If the entity-tag matches the previously published event state at the ESC,
the refresh succeeds, and the EPA receives a 2xx response. Like the 2xx response to an
initial PUBLISH request, the 2xx response to a refresh PUBLISH request will contain a
SIP-ETag header field with an entity-tag. The EPA must store this entity-tag, replacing
any existing entity-tag for the refreshed event state.
Subject/RFC 3261 (Standards The Subject header field provides a summary or indicates the nature of the call,
Track)/Request allowing call filtering without having to parse the session description. The session
description does not have to use the same subject indication as the invitation. The
compact form of the Subject header field is s. Example:
Subject: Need more boxes
s: Tech Support
Subscription-State/RFC 6665 The Subscription-State header is used by a SIP UA to know the current state of the
(Standards Track)/Request subscription and is a required header field in the NOTIFY method. The values defined
in this header are active, pending, or terminated. Additional parameters like Expires,
Reason, and Retry-After are also included. Values in the Reason parameter can include
deactivated, giveup, probation, noresource, rejected, or timeout. The ABNF syntaxes
of this header are provided with the Event header described in this table. Example:
Supported/RFC 3261 The Supported header field enumerates all the extensions supported by the UAC or
(Standards Track)/Request UAS. The Supported header field contains a list of option tags, described in Section
and Response 2.10 (RFC 3261), that are understood by the UAC or UAS. A UA compliant to this
specification must only include option tags corresponding to Standards Track RFCs. If
empty, it means that no extensions are supported. The compact form of the Supported
header field is k. Example:
Supported: 100rel
Suppress-If-Match/RFC 5839 The SUBSCRIBE request may include the conditional Suppress-If-Match header field
(Standards Track)/Request including an entity tag specified in RFC 5839 for reduction of the number of NOTIFY
requests the subscriber can expect to receive. The subscriber must include a single
conditional header field including an entity-tag in the request when generating a
conditional SUBSCRIBE request. The condition is evaluated by comparing the entity-
tag (see Section 2.9) of the subscribed resource with the entity-tag carried in the
conditional header field. If they match, the condition evaluates to true. Unlike the
condition introduced for the PUBLISH method specified in RFC 3903 (see Section
5.2.2.2), these conditions do not apply to the SUBSCRIBE request itself; however, they
result in changes in the behavior of NOTIFY requests with regard to sending the
notifications to the subscriber after sending the SUBSCRIBE request provided the
condition is true.
If the condition is true, it instructs the notifier either to omit the body of the resulting
NOTIFY message (if the SUBSCRIBE is not sent within an existing dialog) or to
suppress (i.e., block) the NOTIFY request that would otherwise be triggered by the
SUBSCRIBE (for an established dialog). In the latter case, the SUBSCRIBE message will
be answered with a 204 No Notification response.
If the condition is false, the notifier follows its default behavior specified in RFCs 6665
and 5839. If the subscriber receives a 204 No Notification response to an in-dialog
SUBSCRIBE, the subscriber must consider the event state and the subscription state
unchanged. The value of the Suppress-If-Match header field is an entity-tag, which is
an opaque token that the subscriber simply copies (byte-wise) from a previously
received NOTIFY request. The inclusion of an entity-tag in a Suppress-If-Match header
field of a SUBSCRIBE request indicates that the client has a copy of, or is capable of
recreating a copy of, the entity associated with that entity-tag (see Section 2.9 for more
detail).
Target-Dialog/RFC 4538 The Target-Dialog header field is used in requests that create SIP dialogs, where it
(Standards Track) helps the recipient authorize out-of-dialog SIP requests. It indicates to the
recipient that the sender is aware of an existing dialog with the recipient, either
because the sender is on the other side of that dialog or because it has access to the
dialog identifiers. The recipient can then authorize the request based on this
awareness. The SIP option tag tdialog can be used in a Supported header field
implying that the sender of the message supports it. This header field contains the
dialog identifier of the other dialog that includes Call-ID, local tag, and remote tag.
One such example is call transfer, accomplished through REFER. If UAs A and B are in
an INVITE dialog, and UA A wishes to transfer UA B to UA C, UA A needs to send a
REFER request to UA B, asking UA B to send an INVITE request to UA C. UA B needs to
authorize this REFER. The proper authorization decision is that UA B should accept the
request if it came from a user with whom B currently has an INVITE dialog
relationship. In this case, the better approach is for UA A to send the REFER request to
UA B outside of the dialog. In that case, UA B can authorize the REFER request through
using the Target-Dialog header field.
Another example is the application interaction framework specified in RFC 5629. In that
framework, proxy servers on the path of a SIP INVITE request can place user interface
components on the UA that generated or received the request. To do this, the proxy
server needs to send a REFER request to the UA, targeted to its GRUU specified in RFC
5627 (see Section 4.3), asking the UA to fetch an HTTP resource containing the user
interface component. In such a case, the Target-Dialog header will provide a means
for the UA to authorize the REFER because the application interaction framework
recommends that the request be authorized if it was sent from an entity on the path of
the original dialog.
Another example is if two UAs share an INVITE dialog, and an element on the path of
the INVITE request wishes to track the state of the INVITE. In such a case, it sends a
SUBSCRIBE request to the GRUU of the UA, asking for a subscription to the dialog
event package. If the SUBSCRIBE request came from an element on the INVITE
request path, it can be authorized using the Target-Dialog header.
In addition, the use of the Target-Dialog header should not be confused with the
In-Reply-To header. Target-Dialog is similar to In-Reply-To in that it also references a
previous session. Because of their similarities, it is important to understand the
differences, as these two header fields are not substitutes for each other.
• First, In-Reply-To is meant for consumption by a human or a user interface widget,
for providing the users with a context that allows them to decide what a call is
about and whether they should take it. Target-Dialog, on the other hand, is meant
for consumption by the UA itself, to facilitate authorization of session requests in
specific cases where authorization is not a function of the user, but rather the
underlying protocols. A UA will authorize a call containing Target-Dialog based on a
correct value of the Target-Dialog header field.
• Second, Target-Dialog references a specific dialog that must be currently in
progress. In-Reply-To references a previous call attempt, most likely one that did
not result in a dialog. This is why In-Reply-To uses a Call-ID, and Target-Dialog uses
a set of dialog identifiers.
• Finally, In-Reply-To implies cause and effect. When In-Reply-To is present, it means that
the request is being sent because of the previous request that was delivered. Target-
Dialog does not imply cause and effect, merely awareness for the purposes of
authorization. Example:
Timestamp/RFC 3261 The Timestamp header field describes when the UAC sent the request to the UAS. See
(Standards Track)/Request Section 3.1 (RFC 3261) for details on how to generate a response to a request that
and Response contains the header field. Although there is no normative behavior defined here that
makes use of the header, it allows for extensions or SIP applications to obtain round
trip time (RTT) estimates. Example:
Timestamp: 54
To/RFC 3261 (Standards The To header field specifies the logical recipient of the request. The optional display-
Track)/Request and name is meant to be rendered by a human-user interface. The tag parameter serves as
Response a general mechanism for dialog identification. See Section 2.9 (RFC 3261) for details of
the tag parameter. Comparison of To header fields for equality is identical to
comparison of From header fields. RFC 3261 (see Section 4.2) defines the rules for
parsing a display name, URI and URI parameters, and header field parameters.
The compact form of the To header field is t. The following are examples of valid To
header fields:
Trigger-Consent/RFC 5360 The Trigger-Consent header field specified in RFC 5360 (see Section 19.8) facilitates the
(Standards Track)/Request consent-based communications between the users providing the resource lists in SIP
request messages to trigger consent lookups. Receipt of these requests without
explicit consent can cause a number of problems. These include amplification and
DoS attacks. These problems are described in more detail in RFC 4453. RFC 5360
defines a relay as any SIP server, be it a proxy, B2BUA, or some
hybrid, that receives a request, translates its Request-URI into one or more next-hop
URIs, and delivers the request to those URIs. The Request-URI of the incoming
request is referred to as target URI, while the destination URIs of the outgoing
requests are referred to as recipient URIs.
Thus, an essential aspect of a relay is that of translation. When a relay receives a
request, it translates the Request-URI (target URI) into one or more additional URIs
(recipient URIs). Through this translation operation, the relay can create outgoing
requests to one or more additional recipient URIs, thus creating the consent problem.
The consent problem is created by two types of translations: translations based on
local data and translations that involve amplifications.
Translation operations based on local policy or local data (such as registrations) are the
vehicle by which a request is delivered directly to an end point, when it would not
otherwise be possible. In other words, if a spammer has the address of a user,
sip:[email protected], it cannot deliver a MESSAGE request to the UA of that user
without having access to the registration data that maps sip:[email protected] to the
UA on which that user is present. Thus, it is the usage of this registration data, and
more generally, the translation logic, that is expected to be authorized in order to
prevent undesired communications. Of course, if the spammer knows the address of
the UA, it will be able to deliver requests directly to it.
Translation operations that result in more than one recipient URI are a source of
amplification. Servers that do not perform translations, such as outbound proxy
servers, do not cause amplification. On the other hand, servers that perform
translations (e.g., inbound proxies authoritatively responsible for a SIP domain) may
cause amplification if the user can be reached at multiple end points (thereby
resulting in multiple recipient URIs). The Trigger-Consent header field allows potential
recipients of a translation to agree to be actual recipients by giving the relay
performing the translation permission to send them traffic. Example:
Trigger-Consent: sip:[email protected];
target-uri=sip:[email protected]
Unsupported/RFC 3261 The Unsupported header field lists the features not supported by the UAS. See Section
(Standards Track)/Response 3.4.32 of RFC 3261 for motivation. Example:
Unsupported: foo
User-Agent/RFC 3261 The User-Agent header field contains information about the UAC originating the
(Standards Track)/Request request. The semantics of this header field are defined in Section 14.43 of RFC 2616
and Response (obsoleted by RFCs 7230–7235). Revealing the specific software version of the UA
might allow the UA to become more vulnerable to attacks against software that is
known to contain security holes. Implementers should make the User-Agent header
field a configurable option. Example:
User-to-User/RFC 7433 RFC 7433 (see Section 16.9) defines a new SIP header field User-to-User to transport call
(Standards Track)/ control UUI data to meet the requirements specified in RFC 6567. To help tag and
Message-Body identify the UUI data used with this header field, purpose, content, and encoding
header field parameters are defined. The purpose header field parameter identifies
the package that defines the generation and usage of the UUI data for a particular
application. The value of the purpose parameter is the package name, as registered in
the UUI Packages subregistry defined in Section 6.3 of RFC 7433. For the case of
interworking with the ISDN UUI service, the ISDN UUI service interworking package
is used. The default value for the purpose header field is isdn-uui as defined in RFC
7434. If the purpose header field parameter is not present, the ISDN UUI must be
used. The content header field parameter identifies the actual content of the UUI data.
If not present, the default content defined for the package must be used.
Newly defined UUI packages must define or reference at least a default content value.
The encoding header field parameter indicates the method of encoding the
information in the UUI data associated with a particular content value. This
specification only defines encoding=hex. If the encoding header field parameter is not
present, the default encoding defined for the package MUST be used. UUI data is
considered an opaque series of octets. This mechanism must not be used to convey a
URL or URI, since the Call-Info header field already supports this use case. Example:
User-to-User: 342342ef34;encoding=hex
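The example header field above can be unpacked with a small parser. This is a hedged sketch: the function name and the returned dictionary layout are hypothetical, only token-form UUI data with `encoding=hex` is decoded, and the `isdn-uui` default for a missing purpose parameter follows the text above.

```python
def parse_user_to_user(value: str) -> dict:
    """Split a User-to-User header field value into UUI data and parameters."""
    data, *raw_params = [part.strip() for part in value.split(";")]
    params = {}
    for raw in raw_params:
        key, _, val = raw.partition("=")
        params[key.strip().lower()] = val.strip()
    encoding = params.get("encoding")  # if absent, the package default applies
    return {
        "purpose": params.get("purpose", "isdn-uui"),
        "content": params.get("content"),
        "encoding": encoding,
        # UUI data is an opaque series of octets; decode only the hex encoding.
        "octets": bytes.fromhex(data) if encoding == "hex" else None,
    }
```

Applied to `342342ef34;encoding=hex`, the sketch yields five opaque octets and the default purpose `isdn-uui`.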
RFC 7433 has also registered the UUI header field parameters with the IANA. The
following rows have been added to the “Header Field Parameters and Parameter
Values” section of the SIP parameter registry:
Header Field    Parameter Name    Predefined Values    Reference
Via/RFC 3261 (Standards The Via header field indicates the path taken by the request thus far and indicates the
Track)/Request and path that should be followed in routing responses. The branch ID parameter in the Via
Response header field values serves as a transaction identifier, and is used by proxies to detect
loops. A Via header field value contains the transport protocol used to send the
message, the client’s host name or network address, and possibly the port number at
which it wishes to receive responses. A Via header field value can also contain
parameters such as maddr, ttl, received, and branch, whose meaning and use are
described in other sections. For implementations compliant to this specification, the
value of the branch parameter must start with the magic cookie z9hG4bK, as discussed
in Section 3.1 (RFC 3261). Transport protocols defined here are UDP, TCP, TLS, and
SCTP. TLS means TLS over TCP. When a request is sent to a SIPS URI, the protocol still
indicates SIP, and the transport protocol is TLS.
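The structure of a Via header field value and the magic-cookie check on the branch parameter can be sketched as below. The function names and the returned dictionary layout are hypothetical, the host names used with it are illustrative, and the parser is deliberately simple: it tolerates white space around "/" and ":" but does not handle quoted parameter values.

```python
import re

MAGIC_COOKIE = "z9hG4bK"
_SENT_PROTOCOL = re.compile(r"\s*([^\s/]+)\s*/\s*([^\s/]+)\s*/\s*([^\s/]+)\s+(.+)")


def parse_via(value: str) -> dict:
    """Split a single Via header field value into protocol, sent-by, and params."""
    head, *raw_params = value.split(";")
    match = _SENT_PROTOCOL.fullmatch(head.strip())
    if match is None:
        raise ValueError("not a valid Via value")
    name, version, transport, sent_by = match.groups()
    params = {}
    for raw in raw_params:
        key, _, val = raw.partition("=")
        params[key.strip().lower()] = val.strip()
    return {
        "protocol": f"{name}/{version}",
        "transport": transport.upper(),
        # Drop the optional white space around the host:port separator.
        "sent_by": re.sub(r"\s*:\s*", ":", sent_by.strip()),
        "params": params,
    }


def has_rfc3261_branch(via: dict) -> bool:
    """True if the branch parameter starts with the magic cookie."""
    return via["params"].get("branch", "").startswith(MAGIC_COOKIE)
```

A branch value that lacks the cookie suggests the element that inserted it predates RFC 3261.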
The compact form of the Via header field is v. In this example, the message originated
from a multihomed host with two addresses, 192.0.2.1 and 192.0.2.207. The sender
guessed wrong as to which network interface would be used. erlang.belltelephone.
com noticed the mismatch and added a parameter to the previous hop’s Via header
field value, containing the address that the packet actually came from. The host or
network address and port number are not required to follow the SIP URI syntax.
Specifically, LWS on either side of the “:” or “/” is allowed, as shown here:
Even though this specification mandates that the branch parameter be present in all
requests, the ABNF for the header field indicates that it is optional. This allows
interoperation with RFC 2543 (obsoleted by RFC 3261) elements, which did not have to
insert the branch parameter. Two Via header fields are equal if their sent-protocol and
sent-by fields are equal, both have the same set of parameters, and the values of all
parameters are equal.
RFC 7339 (see Section 13.3) specification defines four new Via header parameters as
detailed below in the “Header Field Parameter and Parameter Values” subregistry as
per the registry created by RFC 3968. The required information is
Warning/RFC 3261 (Standards The Warning header field is used to carry additional information about the status of a
Track)/Response response. Warning header field values are sent with responses and contain a three-
digit warning code, host name, and warning text. The warn-text should be in a natural
language that is most likely to be intelligible to the human user receiving the
response. This decision can be based on any available knowledge, such as the location
of the user, the Accept-Language field in a request, or the Content-Language field in a
response. The default language is i-default as defined in RFC 2277.
The currently defined warn-codes are listed below, with a recommended warn-text in
English and a description of their meaning. These warnings describe failures induced
by the session description. The first digit of warning codes beginning with “3”
indicates warnings specific to SIP. Warnings 300 through 329 are reserved for
indicating problems with keywords in the session description, 330 through 339 are
warnings related to basic network services requested in the session description, 370
through 379 are warnings related to quantitative QOS parameters requested in the
session description, and 390 through 399 are miscellaneous warnings that do not fall
into one of the above categories.
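The warn-code ranges listed above map naturally onto a lookup function. The sketch below is illustrative only: the function name and the category strings are hypothetical, and codes in the 3xx range that fall outside the listed sub-ranges are reported as unassigned.

```python
def warn_code_category(code: int) -> str:
    """Map a three-digit warn-code to the category described in the text."""
    if 300 <= code <= 329:
        return "session description keyword problem"
    if 330 <= code <= 339:
        return "basic network service requested in the session description"
    if 370 <= code <= 379:
        return "quantitative QoS parameter requested in the session description"
    if 390 <= code <= 399:
        return "miscellaneous"
    if 300 <= code <= 399:
        return "SIP-specific, range not assigned in the text"
    return "not SIP-specific"
```

A receiver could use such a mapping to decide whether a warning relates to the session description before showing the warn-text to the user.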
1xx and 2xx have been taken from HTTP/1.1. Additional warn-codes can be defined
through IANA. Examples:
interoperability. An Internet Assigned Numbers Authority string, where line folding cannot take place. The production
(IANA) registry of option tags is used to ensure easy refer- for qdtext can be found in RFC 3261. There are additional
ence. The option tags defined in SIP are shown in Table 2.11. constraints on the usage of feature-param that cannot be rep-
resented in an ABNF. There must only be one instance of any
feature tag in feature-param. Any numbers present in a fea-
ture parameter must be representable using an ANSI C double.
2.11 SIP Media Feature Tags Following these rules, RFC 3840 updates the one in RFC 3261
RFC 3840 (also see Section 3.4) that is described here provides for contact-params (see detail ABNF syntax in Section 2.4.1.2):
a more general framework for an indication of capabilities and
characteristics in SIP. Capability and characteristic informa- contact-params
= c-p-q/c-p-expires/
feature-param
tion about a UA is carried as parameters of the Contact header /contact-extension
field. These parameters can be used within REGISTER
requests and responses, OPTIONS responses, and requests and responses that create dialogs such as INVITE.

2.11.1 Contact Header Field
RFC 3840 (also see Section 3.4) extends the Contact header field. In particular, it allows for the Contact header field parameters to include feature-param. Feature-param is a feature parameter that describes a feature of the UA associated with the URI in the Contact header field. Feature parameters are identifiable because they either belong to the well-known set of base feature tags, or they begin with a plus sign. It should be noted that the tag-value-list uses an actual comma instead of the comma construction because it appears within a quoted

In addition, RFC 5626 has extended the contact parameters using reg-id and instance-id as described earlier. The detail of all ABNF syntaxes can be seen in Section 2.4.1.2.

2.11.2 Feature Tag Name, Description, and Usage
RFC 3840 (also see Section 3.4) defines an initial set of SIP media feature tags, registered with IANA, as depicted in Table 2.12. If any new media type is defined in the future, the name of the feature tag must equal sip. concatenated with the name of the media type, unless there is an unlikely naming collision between the new media type and an existing feature tag registration. For example, if a new feature tag sip.gruu is registered in the SIP tree, the IANA registration would be for the tag sip.gruu and not +sip.gruu or gruu. As such, all registrations into the SIP tree will have the sip. prefix.

Figure 2.7 PoC behavior and operation: (a) expected behavior, (b) network behavior, and (c) operational view. (Copyright IETF. Reproduced with permission.)

2.11.3 Conveying Feature Tags with REFER
2.11.3.1 Overview
The SIP Caller Preferences extension defined in RFC 3840 provides a mechanism that allows a SIP request to convey information relating to the originator’s capabilities and preferences for the handling of that request. The SIP REFER method defined in RFC 3515 (see Section 2.5) provides a mechanism that allows one party to induce another to initiate a SIP request. RFC 4508 extends the REFER method to use the mechanism of RFC 3840. By doing so, the originator of a REFER may inform the recipient about the characteristics of the target that the induced request is expected to reach.
RFC 4508 extends the SIP REFER method to be used with feature parameters defined in RFC 3840. Feature tags are used by a UA to convey to another UA information about capabilities and features. This information can be shared by a UA using a number of mechanisms, including REGISTER requests and responses, and OPTIONS responses. This information can also be shared in the context of a dialog by inclusion with a remote target URI (Contact URI). Feature tag information can be very useful to another UA. It is especially useful before the establishment of a session. For example, if a UA knows (e.g., through an OPTIONS query) that the remote UA supports both video and audio, the calling
199/RFC 6228: This option tag is for indicating support of the 199 Early Dialog Terminated provisional response code. When present in a Supported header of a request, it indicates that the UAC supports the 199 response code. When present in a Require or Proxy-Require header field of a request, it indicates that the UAS, or proxies, must support the 199 response code. It does not require the UAS, or proxies, to actually send 199 responses.

100rel/RFC 3262: This option tag is for reliability of provisional responses. When present in a Supported header, it indicates that the UA can send or receive reliable provisional responses. When present in a Require header in a request, it indicates that the UAS must send all provisional responses reliably. When present in a Require header in a reliable provisional response, it indicates that the response is to be sent reliably.

answermode/RFC 5373: This answermode option tag is for support of the Answer-Mode and Priv-Answer-Mode extensions used to negotiate automatic or manual answering of a request. The SIP option tag indicating support for this extension is answermode. For implementers: SIP header field names and values are always compared in a case-insensitive manner. The pretty capitalization is just for readability. This syntax includes extension hooks (token for answer-mode values and generic-param for optional parameters) that could be defined in the future. This specification defines only the behavior for the values given explicitly above. To provide forward compatibility, implementations must ignore unknown values.

early-session/RFC 3959: A UA adding the early-session option tag to a message indicates that it understands the early-session disposition type.
Content-Disposition: early-session

eventlist/RFC 4662: The eventlist option tag allows subscriptions to lists of resources.

from-change/RFC 4916: This option tag is used to indicate that a UA supports changes to URIs in From and To header fields during a dialog.

gin/RFC 6140: This option tag is used to identify the extension that provides registration for Multiple Phone Numbers in SIP. When present in a Require or Proxy-Require header field of a REGISTER request, it indicates that support for this extension is required of registrars and proxies, respectively, which are a party to the registration transaction.

join/RFC 3911: RFC 3911 defines a Require/Supported header option tag join. UAs that support the Join header must include the join option tag in a Supported header field. UAs that want explicit failure notification if Join is not supported may include the join option in a Require header field.
Example:

multiple-refer/RFC 5368: The multiple-refer option tag indicates support for REFER requests that contain a resource list document describing multiple REFER targets.
norefersub/RFC 4488: This option tag specifies a UA's ability to accept a REFER request without establishing an implicit subscription (compared with the default case defined in RFC 3515). This option tag, when included in the Supported header field, specifies that a UA is capable of accepting a REFER request without creating an implicit subscription when acting as a REFER-Recipient. The REFER-Issuer can know the capabilities of the REFER-Recipient from the presence of the option tags in the Supported header field of the dialog-initiating request or response. Another way of learning the capabilities would be by using presence, such as defined in RFC 5196. However, if the capabilities of the REFER-Recipient are not known, using the norefersub tag with the Require header field is not recommended. This is because, in the event the REFER-Recipient does not support the extension, the REFER-Issuer will need to issue a new REFER transaction in order to fall back to the normal REFER, thus resulting in additional round trips. A REFER-Recipient will reject a REFER request containing a Require: norefersub header field with a 420 Bad Extension response unless it supports this extension. Note that Require: norefersub can be present with a Refer-Sub: false header field.

outbound/RFC 5626: The outbound option tag is used to identify UAs and registrars that support extensions for Client-Initiated Connections. A UA places this option in a Supported header to communicate its support for this extension. A registrar places this option tag in a Require header to indicate to the registering UA that the registrar used registrations using the binding rules defined in this extension.

resource-priority/RFC 4412: The resource-priority option tag indicates or requests support for the resource priority mechanism. The 417 Unknown Resource-Priority response code (see Section 2.6) defines its behavior.

sec-agree/RFC 3329: The sec-agree option tag indicates support for the Security Agreement mechanism. When used in the Require or Proxy-Require headers, it indicates that proxy servers are required to use the Security Agreement mechanism. When used in the Supported header, it indicates that the UA Client supports the Security Agreement mechanism. When used in the Require header in the 494 Security Agreement Required or 421 Extension Required responses, it indicates that the UAC must use the Security Agreement mechanism.

tdialog/RFC 4538: This option tag is used to identify the target dialog header field extension. When used in a Require header field, it implies that the recipient needs to support the Target-Dialog header field. When used in a Supported header field, it implies that the sender of the message supports it.

uui/RFC 7433: This option tag is used to indicate that a UA supports and understands the User-to-User header field.
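
The Require handling described for norefersub (reject with 420 Bad Extension unless the extension is supported) can be sketched roughly as follows. This is a minimal illustration, not a full UAS: the set of supported tags and the function names are hypothetical, and the Unsupported header returned with the 420 response follows the general RFC 3261 behavior rather than anything specific to this table.

```python
# Hypothetical set of option tags this UAS implements (illustrative only).
SUPPORTED_OPTION_TAGS = {"100rel", "norefersub", "tdialog"}


def parse_option_tags(header_value):
    """Split a comma-separated Require/Supported value into lowercase tags."""
    return [tag.strip().lower() for tag in header_value.split(",") if tag.strip()]


def check_require(require_value, supported=SUPPORTED_OPTION_TAGS):
    """Return (status_code, extra_headers) for a request's Require header.

    Unknown tags yield 420 Bad Extension plus an Unsupported header that
    lists them; (None, {}) signals that normal processing may continue.
    """
    unknown = [t for t in parse_option_tags(require_value) if t not in supported]
    if unknown:
        return 420, {"Unsupported": ", ".join(unknown)}
    return None, {}
```

A REFER carrying Require: norefersub would pass this check only when norefersub is in the supported set, which is why the table advises against using Require when the recipient's capabilities are unknown.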
Table 2.12 SIP Media Feature Tag Defined in RFC 3840 and Other RFCs
Each entry gives the Media Feature/Media Feature Tag Name/ASN.1 Identifier, followed by its Description, Value, Primary Usage, and Example of Typical Usage.

Audio/sip.audio/1.3.6.1.8.4.1
Description: It indicates that the device supports audio as a streaming media type.
Value: Boolean
Primary Usage: It is most useful in a communications application for describing the capabilities of a device, such as a phone or PDA.
Example of Typical Usage: Routing a call to a phone that can support audio

Application/sip.application/1.3.6.1.8.4.2
Description: It indicates that the device supports application as a streaming media type.
Value: Boolean
Primary Usage: It is most useful in a communications application for describing the capabilities of a device, such as a phone or PDA.
Example of Typical Usage: Routing a call to a phone that can support a media control application

Data/sip.data/1.3.6.1.8.4.3
Description: It indicates that the device supports data as a streaming media type.
Value: Boolean
Primary Usage: It is most useful in a communications application for describing the capabilities of a device, such as a phone or PDA.
Example of Typical Usage: Routing a call to a phone

Control/sip.control/1.3.6.1.8.4.4
Description: It indicates that the device supports control as a streaming media type.
Value: Boolean
Primary Usage: It is most useful in a communications application for describing the capabilities of a device, such as a phone or PDA.
Example of Typical Usage: Routing a call to a phone that can support a floor control application

Video/sip.video/1.3.6.1.8.4.5
Description: It indicates that the device supports video as a streaming media type.
Value: Boolean
Primary Usage: It is most useful in a communications application for describing the capabilities of a device, such as a phone or PDA.
Example of Typical Usage: Routing a call to a phone that can support video

Text/sip.text/1.3.6.1.8.4.6
Description: It indicates that the device supports text as a streaming media type.
Value: Boolean
Primary Usage: It is most useful in a communications application for describing the capabilities of a device, such as a phone or PDA.
Example of Typical Usage: Routing a call to a phone that can support text

Automata/sip.automata/1.3.6.1.8.4.7
Description: It indicates whether the UA represents an automata (such as a voice-mail server, conference server, IVR, or recording device) or a human.
Value: Boolean. TRUE indicates that the UA represents an automata.
Primary Usage: It is most useful in a communications application for describing the capabilities of a device, such as a phone or PDA.
Example of Typical Usage: Refusing to communicate with the automata when it is known that automated services are unacceptable

Class/sip.class/1.3.6.1.8.4.8
Description: It indicates the setting, business or personal, in which a communications device is used.
Value: Token with an equality relationship. Typical values include business (the device is used for business communications) and personal (the device is used for personal communications).
Primary Usage: It is most useful in a communications application for describing the capabilities of a device, such as a phone or PDA.
Example of Typical Usage: Choosing between a business phone and a home phone

Mobility/sip.mobility/1.3.6.1.8.4.10
Description: It indicates whether the device is fixed (meaning that it is associated with a fixed point of contact with the network), or mobile (meaning that it is not associated with a fixed point of contact). Note that cordless phones are fixed, not mobile, based on this definition.
Value: Token with an equality relationship. Typical values include fixed (the device is stationary) and mobile (the device can move around with the user).
Primary Usage: It is most useful in a communications application for describing the capabilities of a device, such as a phone or PDA.
Example of Typical Usage: Choosing to communicate with a wireless phone instead of a desktop phone

Description/sip.description/1.3.6.1.8.4.11
Description: It provides a textual description of the device.
Value: String with an equality relationship
Primary Usage: It is most useful in a communications application for describing the capabilities of a device, such as a phone or PDA.
Example of Typical Usage: Indicating that a device is of a certain make and model

Events/sip.events/1.3.6.1.8.4.12
Description: It indicates a SIP event package, defined in RFC 6665, supported by a SIP UA. The values for this tag equal the event package names that are registered by each event package.
Value: Token with an equality relationship. Values are taken from the IANA SIP Event type namespace registry.
Primary Usage: It is most useful in a communications application for describing the capabilities of a device, such as a phone or PDA.
Example of Typical Usage: Choosing to communicate with a server that supports the message waiting event package, such as a voice-mail server defined in RFC 3842

Priority/sip.priority/1.3.6.1.8.4.13
Description: It indicates the call priorities the device is willing to handle. A value of X means that the device is willing to take requests with priority X and higher. This does not imply that a phone has to reject calls of lower priority. As always, the decision on handling of such calls is a matter of local policy.
Value: An integer. Each integral value corresponds to one of the possible values of the Priority header field as specified in SIP of RFC 3261. The mapping is defined as nonurgent (integral value of 10; the device supports nonurgent calls), normal (integral value of 20; the device supports normal calls), urgent (integral value of 30; the device supports urgent calls), and emergency (integral value of 40; the device supports calls in the case of an emergency situation).
Primary Usage: It is most useful in a communications application for describing the capabilities of a device, such as a phone or PDA.
Example of Typical Usage: Choosing to communicate with the emergency cell phone of a user

Methods/sip.methods/1.3.6.1.8.4.14
Description: It indicates a SIP method supported by this UA. In this case, supported means that the UA can receive requests with this method. In that sense, it has the same connotation as the Allow header field.
Value: Token with an equality relationship. Values are taken from the Methods table defined in the IANA SIP parameters registry.
Primary Usage: It is most useful in a communications application for describing the capabilities of a device, such as a phone or PDA.
Example of Typical Usage: Choosing to communicate with a presence application on a PC, instead of a PC phone application

Extensions/sip.extensions/1.3.6.1.8.4.15
Description: It is a SIP extension (each of which is defined by an option tag registered with IANA) that is understood by the UA. Understood, in this context, means that the option tag would be included in a Supported header field in a request.
Value: Token with an equality relationship. Values are taken from the option tags table in the IANA SIP parameters registry.
Primary Usage: It is most useful in a communications application for describing the capabilities of a device, such as a phone or PDA.
Example of Typical Usage: Choosing to communicate with a phone that supports QOS preconditions instead of one that does not

Schemes/sip.schemes/1.3.6.1.8.4.16
Description: It indicates a URI scheme, defined in RFC 2396, that is supported by a UA. Supported implies, for example, that the UA would know how to handle a URI of that scheme in the Contact header field of a redirect response.
Value: Token with an equality relationship. Values are taken from the IANA URI scheme registry.
Primary Usage: It is most useful in a communications application for describing the capabilities of a device, such as a phone or PDA.
Example of Typical Usage: Choosing to get redirected to a phone number when a called party is busy, rather than a web page

Actor/sip.actor/1.3.6.1.8.4.17
Description: It indicates the type of entity that is available at this URI.
Value: Token with an equality relationship. The following values are defined:
Principal: the device provides communication with the principal that is associated with the device. Often this will be a specific human being, but it can be an automata (e.g., when calling a voice portal).
Attendant: the device provides communication with an automaton or person that will act as an intermediary in contacting the principal associated with the device, or a substitute.
Msg-Taker: the device provides communication with an automaton or person that will take messages and deliver them to the principal.
Information: the device provides communication with an automaton or person that will provide information about the principal.
Primary Usage: It is most useful in a communications application for describing the capabilities of a device, such as a phone or PDA.
Example of Typical Usage: Requesting that a call not be routed to voice mail

isfocus/sip.isfocus/1.3.6.1.8.4.18
Description: It indicates that the UA is a conference server, also known as a focus, and will mix together the media for all calls to the same URI defined in RFC 4353.
Value: Boolean
Primary Usage: It is most useful in a communications application for describing the capabilities of a device, such as a phone or PDA.
Example of Typical Usage: Indicating to a UA that the server to which it has connected is a conference server

sip.uui-isdn/1.3.6.1.8.4.x
Description: This media feature tag, when used in a Contact header field of a SIP request or a SIP response, indicates that the entity sending the SIP message supports the package uui-isdn specified in RFC 7434.
Value: None
Primary Usage: It is most useful for interworking and transporting User-to-User Information (UUI) from the ITU-T Digital Subscriber Signaling System No. 1 (DSS1). User-user information elements within SIP are described in RFC 6567.
Example of Typical Usage: Indicating that a mobile phone supports Single Radio Voice call Continuity (SRVCC) for calls in the alerting phase.
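
As a rough illustration of how the tag names in Table 2.12 relate to Contact header parameters, the sketch below strips the sip. prefix from base tags and prefixes every other tag with a plus sign, which is how feature parameters are said to be identifiable in Section 2.11.1. The BASE_TAGS set is an assumption built only from the rows of Table 2.12 (RFC 3840 defines the authoritative list), and the function names are hypothetical.

```python
# Base feature tags, taken from the rows of Table 2.12 (an assumed subset;
# the authoritative list is in RFC 3840).
BASE_TAGS = {
    "audio", "application", "data", "control", "video", "text",
    "automata", "class", "mobility", "description", "events",
    "priority", "methods", "extensions", "schemes", "actor", "isfocus",
}


def to_contact_param(feature_tag):
    """Map a registered feature tag name to its Contact parameter name."""
    name = feature_tag.lower()
    if name.startswith("sip.") and name[4:] in BASE_TAGS:
        return name[4:]   # base tag: drop the "sip." prefix
    return "+" + name     # any other tag: prefix with a plus sign


def is_feature_param(param_name):
    """True if a Contact parameter name looks like a feature parameter."""
    name = param_name.lower()
    return name in BASE_TAGS or name.startswith("+")
```

Under these assumptions, sip.audio appears in a Contact header as audio, whereas a non-base tag such as sip.gruu would appear as +sip.gruu even though its IANA registration is for sip.gruu.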
2.11.3.3 Feature Tag Usage Examples

2.11.3.3.1 isfocus
The example below shows how the isfocus feature tag can be used by the REFER-Issuer to tell the REFER-Recipient that the REFER-Target is a conference focus and, consequently, that sending an INVITE will bring the REFER-Recipient into the conference:

Refer-To: sip:[email protected];isfocus

2.11.3.3.2 Voice and Video
The example below shows how a REFER-Issuer can tell the REFER-Recipient that the REFER-Target supports audio and video and, consequently, that a video and audio session can be established by sending an INVITE to the REFER-Target:

Refer-To: "Alice’s Videophone" <sip:alice@videophone.example.com>;audio;video

2.11.3.3.3 URI and Multiple Feature Tags
The example below shows how the REFER-Issuer can tell the REFER-Recipient that the REFER-Target is a voice-mail server. Note that the transport URI parameter is enclosed within the < and > so that it is not interpreted as a header parameter.

Refer-To: <sip:[email protected];transport=tcp>;actor="msg-taker";automata;audio

2.12 Summary
We have described the key characteristics of the networked multimedia session for both point-to-point and multipoint communications. A multimedia session may consist of audio, video, or data applications. Each user may have many audio codecs, video codecs, or data applications. It is natural that negotiations between users for using audio codecs, video codecs, or data applications for setting up the session will take place along with meeting the requirements of QOS for audio/video codecs and data-sharing applications. For multimedia communications between more than two users, media bridging is an essential requirement. The SIP signaling protocol is designed to meet many of those requirements for the networked multimedia communications.
We explained the trapezoidal model of the SIP session setup for the point-to-point call between two users with signaling and media. In addition, the characteristics of the SIP network functional entities such as UA, Back-to-Back UA, Proxy Server, Redirect Server, Register Server, and Application Server are described, including the Location Server that is not a SIP entity. The ABNF syntax that is used for SIP signaling messages is also provided. We have described the details of the terminologies, request and response messages, message headers, message body, option tag, tag, and message format of the basic SIP.

PROBLEMS
1. What are the differences between networked and stand-alone multimedia communications?
2. Describe the key functional features of the point-to-point and multipoint-to-multipoint networked multimedia communications. What are their major differences?
3. Describe the media bridging characteristic of audio, video, and data applications. What are their major differences?
4. What are the fundamental differences between the circuit-switching and the IP packet-switching communication networks? What are the challenges in developing the multimedia call signaling protocol over these two fundamentally different kinds of networks?
5. What are the challenges in meeting the QOS requirements of multimedia communications over the IP packet-switching network? How can a multimedia call control signaling protocol deal with solving the QOS problems? How does the QOS solution differ in solving the QOS problems between the private IP network and the public Internet?
6. Describe the rules of ABNF. Describe the key differences between RFCs 2806, 4434, and 5234. What are the exceptions in SIP ABNF syntaxes adopted for SIP messages described in RFC 3261?
7. Describe the major features of the SIP call signaling protocol. Describe request, response, header, and body of SIP messages.
8. What are the request messages of SIP? Describe the characteristics of each SIP method. What are the key differences between all the SIP methods? Why is the INVITE method so special in SIP?
9. What are the key differences between SIP and HTTP? What are the pros and cons of SIP that is targeted for setting up the sessions for the real-time conversational networked multimedia communications because of adopting messaging structures similar to those of HTTP?
10. Describe in detail why SIP needs to use a host of protocols under its umbrella to set up the multimedia session.
11. Describe the concept of the SIP network. Describe the characteristics of each SIP server that is used over the SIP network. Why is a location server not a SIP functional entity?
12. Why is a SIP proxy server so important that it is termed as the call controller in the SIP network? Compare the functional/capability differences between each server of the SIP network.
13. Describe the call flows of SIP signaling for the point-to-point call using only a proxy server assuming both users are in the same administrative domain. Populate the headers of SIP request and response messages with the key features that may be needed conceptually. Develop a conceptual call model for a three-party conference call assuming that a centralized conference server is used where a user sets up the point-to-point call with the conference server.
14. What will be the modifications of the conceptual call flow of the three-party call flow of Q.11 if the media bridging server is separated from the conference application server?
15. Why is SIP so popular as a signaling protocol even for communications between application servers for the creation and invoking of services that go far beyond the primary objectives in designing SIP?

References
1. 3GPP, “TS 24.229: IP Multimedia Call Control Protocol based on SIP and SDP; Stage 3 (Release 5),” 3GPP 24.229, September 2002. Available at ftp://ftp.3gpp.org/Specs/archive/24_series/24.229/.
2. 3GPP, “Numbering, addressing and identification,” 3GPP TS 23.003 3.15.0, October 2006.
3. IEEE, “Standard for information technology - Portable operating system interface (POSIX). Base definitions,” IEEE 1003.1-2004, 2004.
4. 3GPP, “Telecommunication management; Charging management; Charging architecture and principles,” 3GPP TS 32.240 12.3.0, March 2013.
5. 3GPP, “IP Multimedia Subsystem (IMS) Charging,” 3GPP TS 32.260 V13.2.0 Release 13, June 2015.
Chapter 3
bodies using Secure/Multipurpose Internet Mail Extensions (S/MIME; see Section 19.6).

3.1.2 UAC General Behavior
This section covers UAC behavior outside of a dialog.

3.1.2.1 Generating the Request
A valid SIP request formulated by a UAC must, at a minimum, contain the following header fields: To, From, CSeq, Call-ID, Max-Forwards, and Via; all of these header fields are mandatory in all SIP requests. These six header fields are the fundamental building blocks of a SIP message, as they jointly provide for most of the critical message routing services, including the addressing of messages, the routing of responses, limiting message propagation, ordering of messages, and the unique identification of transactions. These header fields are in addition to the mandatory request line, which contains the method, Request-URI, and SIP version. Examples of requests sent outside of a dialog include an INVITE to establish a session (Section 3.7) and an OPTIONS request to query for capabilities (Section 3.4).

3.1.2.1.1 Request-URI
The initial Request-URI of the message should be set to the value of the Uniform Resource Identifier (URI) in the To field. One notable exception is the REGISTER method; the behavior for setting the Request-URI of REGISTER is given in Section 3.3. It may also be undesirable for privacy reasons or convenience to set these fields to the same value (especially if the originating UA expects that the Request-URI will be changed during transit). In some special circumstances, the presence of a preexisting route set can affect the Request-URI of the message. A preexisting route set is an ordered set of URIs that identify a chain of servers to which a UAC will send outgoing requests that are outside of a dialog. Commonly, they are configured on the UA by a user or service provider manually, or through some other non-SIP mechanism. When a provider wishes to configure a UA with an outbound proxy, it is recommended that this be done by providing it with a preexisting route set with a single URI, that of the outbound proxy. When a preexisting route set is present, the procedures for populating the Request-URI and Route header field detailed in Section 3.6.2.1.1 must be followed (even though there is no dialog), using the desired Request-URI as the remote target URI.

3.1.2.1.2 To
The To header field first and foremost specifies the desired logical recipient of the request, or the address of record (AOR) of the user or resource that is the target of this request. This may or may not be the ultimate recipient of the request. The To header field may contain a SIP or SIP Security (SIPS) URI, but it may also make use of other URI schemes (e.g., the tel URL; RFC 3966, see Section 4.2.2) when appropriate. All SIP implementations MUST support the SIP URI scheme. Any implementation that supports Transport Layer Security (TLS) MUST support the SIPS URI scheme. The To header field allows for a display name. A UAC may learn how to populate the To header field for a particular request in a number of ways. Usually, the user will suggest the To header field through a human interface, perhaps inputting the URI manually or selecting it from some sort of address book. Frequently, the user will not enter a complete URI, but rather a string of digits or letters (e.g., bob). It is at the discretion of the UA to choose how to interpret this input. Using the string to form the user part of a SIP URI implies that the UA wishes the name to be resolved in the domain to the right-hand side (RHS) of the at-sign in the SIP URI (e.g., sip:[email protected]).
Using the string to form the user part of a SIPS URI implies that the UA wishes to communicate securely, and that the name is to be resolved in the domain to the RHS of the at-sign. The RHS will frequently be the home domain of the requestor, which allows for the home domain to process the outgoing request. This is useful for features like speed dial that require interpretation of the user part in the home domain. The tel URL may be used when the UA does not wish to specify the domain that should interpret a telephone number that has been input by the user. Rather, each domain through which the request passes would be given that opportunity. As an example, a user in an airport might log in and send requests through an outbound proxy in the airport. If they enter 411 (this is the phone number for local directory assistance in the United States), that needs to be interpreted and processed by the outbound proxy in the airport, not the user’s home domain. In this case, tel:411 would be the right choice. A request outside of a dialog must not contain a To tag; the tag in the To field of a request identifies the peer of the dialog. Since no dialog is established, no tag is present. For further information on the To header field, see Section 2.8.2 (Table 2.10). The following is an example of a valid To header field:

To: Carol <sip:[email protected]>

3.1.2.1.3 From
The From header field indicates the logical identity of the initiator of the request, possibly the user’s AOR. Like the To header field, it contains a URI and optionally a display name. It is used by SIP elements to determine which processing rules to apply to a request (e.g., automatic call rejection). As such, it is very important that the From URI not contain Internet Protocol (IP) addresses or the fully qualified domain name (FQDN) of the host on which the UA is running, since these are not logical names. The From header field allows for a display name. A UAC should use the display name Anonymous, along with a syntactically correct, but otherwise meaningless, URI (like sip:[email protected]), if the identity of the client is to remain hidden.
Usually, the value that populates the From header field in requests generated by a particular UA is preprovisioned by the user or by the administrators of the user’s local domain. If a particular UA is used by multiple users, it might have switchable profiles that include a URI corresponding to the identity of the profiled user. Recipients of requests can authenticate the originator of a request in order to ascertain that they are who their From header field claims they are (see Sections 19.4.5 and 19.4.9 for more details on authentication). The From field MUST contain a new tag parameter, chosen by the UAC. See Section 2.9 for details on choosing a tag. For further information on the From header field, see Section 2.8.2. Examples:

information on the Call-ID header field, see Section 2.8.2. Example:

Call-ID: f81d4fae-7dec-11d0-a765[email protected]

3.1.2.1.5 CSeq
The CSeq header field serves as a way to identify and order transactions. It consists of a sequence number and a method. The method must match that of the request. For non-REGISTER requests outside of a dialog, the sequence number value is arbitrary. The sequence number value must be expressible as a 32-bit unsigned integer and MUST be less than 2**31. As long as it follows the above guidelines, a client may use any mechanism it would like to select CSeq header field values. Section 3.6.2.1.1 discusses construction of the CSeq for requests within a dialog. Example:

CSeq: 4711 INVITE
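
The rules for the six mandatory header fields can be pulled together in a short sketch. It is an illustration under stated assumptions, not a complete UAC: the function names, the UDP transport, and the random tag and branch formats are hypothetical choices; Max-Forwards: 70 is the initial value recommended by RFC 3261; and the z9hG4bK prefix is the branch magic cookie that RFC 3261 requires. The sketch covers non-REGISTER requests only, and a dialog-creating request such as INVITE would also need a Contact header field.

```python
import secrets

BRANCH_COOKIE = "z9hG4bK"  # magic cookie that RFC 3261 branch IDs begin with


def new_cseq():
    """Pick an arbitrary initial sequence number below 2**31."""
    return secrets.randbelow(2**31)


def build_request(method, to_uri, from_uri, sent_by, call_id, cseq=None):
    """Assemble a request outside of a dialog with the six mandatory headers."""
    if cseq is None:
        cseq = new_cseq()
    if not 0 <= cseq < 2**31:
        raise ValueError("CSeq sequence number must be less than 2**31")
    branch = BRANCH_COOKIE + secrets.token_hex(8)  # hypothetical branch format
    lines = [
        f"{method} {to_uri} SIP/2.0",               # Request-URI set to the To URI
        f"Via: SIP/2.0/UDP {sent_by};branch={branch}",
        "Max-Forwards: 70",                          # assumed initial value
        f"To: <{to_uri}>",                           # no tag: no dialog exists yet
        f"From: <{from_uri}>;tag={secrets.token_hex(4)}",  # new tag chosen by the UAC
        f"Call-ID: {call_id}",
        f"CSeq: {cseq} {method}",                    # method matches the request line
    ]
    return "\r\n".join(lines) + "\r\n\r\n"
```

For instance, build_request("OPTIONS", "sip:carol@example.com", "sip:alice@example.org", "host.example.org", "abc123@host.example.org", cseq=4711) yields a capability query whose CSeq line reads CSeq: 4711 OPTIONS; all addresses here are placeholders.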
this rule are CANCEL and ACK for non-2xx responses. As discussed below, a CANCEL request will have the same value of the branch parameter as the request it cancels. As discussed in Section 3.12.1.1.3, an ACK for a non-2xx response will also have the same branch ID as the INVITE whose response it acknowledges. The uniqueness property of the branch ID parameter, to facilitate its use as a transaction ID, was not part of RFC 2543 obsoleted by RFC 3261. The branch ID inserted by an element compliant with this specification MUST always begin with the characters z9hG4bK. These seven characters are used as a magic cookie (seven is deemed sufficient to ensure that an older RFC 2543 implementation would not pick such a value), so that servers receiving the request can determine that the branch ID was constructed in the fashion described by this specification (i.e., globally unique). Beyond this requirement, the precise format of the branch token is implementation defined. The Via header maddr, ttl, and sent-by components will be set when the request is processed by the transport layer (Section 3.13). Via processing for proxies is described in Sections 3.11.6 and 3.11.7.
3.1.2.1.8 Contact
The Contact header field provides a SIP or SIPS URI that can be used to contact that specific instance of the UA for subsequent requests. The Contact header field MUST be present and contain exactly one SIP or SIPS URI in any request that can result in the establishment of a dialog. For the methods defined in this specification, that includes only the INVITE request. For these requests, the scope of the Contact header field is global. That is, the Contact header field value contains the URI at which the UA would like to receive requests, and this URI must be valid even if used in subsequent requests outside of any dialogs. If the Request-URI or top Route header field value contains a SIPS URI, the Contact header field must contain a SIPS URI as well. For further information on the Contact header field, see Section 2.8.2.
3.1.2.1.9 Supported and Require
If the UAC supports extensions to SIP that can be applied by the server to the response, the UAC should include a Supported header field in the request listing the option tags (Section 2.10) for those extensions. The option tags listed must only refer to extensions defined in Standards Track RFCs. This is to prevent servers from insisting that clients implement nonstandard, vendor-defined features in order to receive service. Extensions defined by experimental and informational RFCs are explicitly excluded from usage with the Supported header field in a request, since they too are often used to document vendor-defined extensions.
If the UAC wishes to insist that a UAS understand an extension that the UAC will apply to the request in order to process the request, it must insert a Require header field into the request listing the option tag for that extension. If the UAC wishes to apply an extension to the request and insist that any proxies that are traversed understand that extension, it must insert a Proxy-Require header field into the request listing the option tag for that extension. As with the Supported header field, the option tags in the Require and Proxy-Require header fields must only refer to extensions defined in Standards Track RFCs.
3.1.2.1.10 Additional Message Components
After a new request has been created, and the header fields described above have been properly constructed, any additional optional header fields are added, as are any header fields specific to the method. SIP requests may contain a MIME-encoded message body. Regardless of the type of body that a request contains, certain header fields must be formulated to characterize the contents of the body. For further information on these header fields, see Section 2.8.2.
3.1.2.2 Sending the Request
The destination for the request is then computed. Unless there is local policy specifying otherwise, the destination must be determined by applying the Domain Name System (DNS) procedures described in RFC 3263 (see Section 8.2.4) as follows. If the first element in the route set indicated a strict router (resulting in forming the request as described in Section 3.6.2.1.1), the procedures MUST be applied to the Request-URI of the request. Otherwise, the procedures are applied to the first Route header field value in the request (if one exists), or to the request’s Request-URI if there is no Route header field present. These procedures yield an ordered set of address, port, and transports to attempt. Independent of which URI is used as input to the procedures of RFC 3263 (see Section 8.2.4), if the Request-URI specifies a SIPS resource, the UAC must follow the procedures of RFC 3263 (see Section 8.2.4) as if the input URI were a SIPS URI.
Local policy may specify an alternate set of destinations to attempt. If the Request-URI contains a SIPS URI, any alternate destinations must be contacted with TLS. Beyond that, there are no restrictions on the alternate destinations if the request contains no Route header field. This provides a simple alternative to a preexisting route set as a way to specify an outbound proxy. However, that approach for configuring an outbound proxy is not recommended; a preexisting route set with a single URI should be used instead. If the request contains a Route header field, the request should be sent to the locations derived from its topmost value, but may be sent to any server that the UA is certain will honor the Route and Request-URI policies specified in this document (as opposed to those in RFC 2543 obsoleted by RFC 3261). In particular, a UAC configured with an outbound proxy should attempt
to send the request to the location indicated in the first Route header field value instead of adopting the policy of sending all messages to the outbound proxy.
This ensures that outbound proxies that do not add Record-Route header field values will drop out of the path of subsequent requests. It allows end points that cannot resolve the first Route URI to delegate that task to an outbound proxy. The UAC should follow the procedures defined in RFC 3263 (see Section 8.2.4) for stateful elements, trying each address until a server is contacted. Each try constitutes a new transaction, and therefore each carries a different topmost Via header field value with a new branch parameter. Furthermore, the transport value in the Via header field is set to whatever transport was determined for the target server.
3.1.2.3 Processing Responses
Responses are first processed by the transport layer and then passed to the transaction layer. The transaction layer performs its processing and then passes the response to the transaction user (TU). The majority of response processing in the TU is method specific. However, there are some general behaviors independent of the method.
3.1.2.3.1 Transaction Layer Errors
In some cases, the response returned by the transaction layer will not be a SIP message, but rather a transaction layer error. When a timeout error is received from the transaction layer, it must be treated as if a 408 Request Timeout status code has been received. If a fatal transport error is reported by the transport layer (generally, due to fatal Internet Control Message Protocol [ICMP] errors in User Datagram Protocol [UDP] or connection failures in Transmission Control Protocol [TCP]), the condition must be treated as a 503 Service Unavailable status code.
3.1.2.3.2 Unrecognized Responses
A UAC must treat any final response it does not recognize as being equivalent to the x00 response code of that class, and must be able to process the x00 response code for all classes. For example, if a UAC receives an unrecognized response code of 431, it can safely assume that there was something wrong with its request and treat the response as if it had received a 400 Bad Request response code. A UAC must treat any provisional response other than 100 that it does not recognize as 183 Session Progress. A UAC must be able to process 100 and 183 responses.
3.1.2.3.3 Via
If more than one Via header field value is present in a response, the UAC should discard the message. The presence of additional Via header field values that precede the originator of the request suggests that the message was misrouted or possibly corrupted.
3.1.2.3.4 Processing 3xx Responses
Upon receipt of a redirection response (e.g., a 301 response status code), clients should use the URI(s) in the Contact header field to formulate one or more new requests based on the redirected request. This process is similar to that of a proxy recursing on a 3xx class response as detailed in Sections 3.11.5 and 3.11.6. A client starts with an initial target set containing exactly one URI, the Request-URI of the original request. If a client wishes to formulate new requests based on a 3xx class response to that request, it places the URIs to try into the target set. Subject to the restrictions in this specification, a client can choose which Contact URIs it places into the target set. As with proxy recursion, a client processing 3xx class responses must not add any given URI to the target set more than once. If the original request had a SIPS URI in the Request-URI, the client may choose to recurse to a non-SIPS URI, but should inform the user of the redirection to an insecure URI.
Any new requests may themselves receive 3xx responses containing the original URI as a contact. Two locations can be configured to redirect to each other. Placing any given URI in the target set only once prevents infinite redirection loops. As the target set grows, the client may generate new requests to the URIs in any order. A common mechanism is to order the set by the q parameter value from the Contact header field value. Requests to the URIs may be generated serially or in parallel. One approach is to process groups of decreasing q-values serially and process the URIs in each q-value group in parallel. Another is to perform only serial processing in decreasing q-value order, arbitrarily choosing between contacts of equal q-value.
If contacting an address in the list results in a failure, as defined in the next paragraph, the element moves to the next address in the list, until the list is exhausted. If the list is exhausted, then the request has failed. Failures should be detected through failure response codes (codes greater than 399); for network errors, the client transaction will report any transport layer failures to the TU. Note that some response codes (detailed in Section 3.1.2.3.5) indicate that the request can be retried; requests that are reattempted should not be considered failures. When a failure for a particular contact address is received, the client should try the next contact address. This will involve creating a new client transaction to deliver a new request. To create a request based on a contact address in a 3xx response, a UAC must copy the entire URI from the target set into the Request-URI, except for the method-param and header URI parameters (see Section 4.2.1 for a definition of these parameters).
It uses the header parameters to create header field values for the new request, overwriting header field values associated with the redirected request in accordance with the guidelines in Section 4.2.1.5. Note that in some instances, header fields that have been communicated in the contact address may instead append to existing request header fields in the original redirected request. As a general rule, if the header field can accept a comma-separated list of values, then the new header field value may be appended to any existing values in the original redirected request. If the header field does not accept multiple values, the value in the original redirected request may be overwritten by the header field value communicated in the contact address. For example, if a contact address is returned with the following value:
sip:user@host?Subject=foo&Call-Info=<http://www.foo.com>
(Section 2.6), the Request-URI used a URI scheme not supported by the server. The client should retry the request, this time using a SIP URI. If a 420 Bad Extension response is received (Section 2.6), the request contained a Require or Proxy-Require header field listing an option tag for a feature not supported by a proxy or UAS. The UAC should retry the request, this time omitting any extensions listed in the Unsupported header field in the response. In all of the above cases, the request is retried by creating a new request with the appropriate modifications. This new request constitutes a new transaction and should have the same value as the Call-ID, To, and From of the previous request, but the CSeq should contain a new sequence number that is one higher than the previous one. With other 4xx responses, including those yet to be defined, a retry may or may not be possible depending on the method and the use case.
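The retry rule for a 420 Bad Extension response can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the Request type and the retry_after_420 function are hypothetical names, the request is reduced to the fields the rule mentions, and option tags are compared case-insensitively as an assumption of this sketch.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Request:
    """Hypothetical, simplified view of a SIP request (not a real library type)."""
    method: str
    call_id: str
    from_hdr: str
    to_hdr: str
    cseq: int
    require: tuple = ()
    proxy_require: tuple = ()


def retry_after_420(prev: Request, unsupported: list) -> Request:
    """Build the retried request after a 420 Bad Extension response.

    Call-ID, To, and From are copied unchanged, the CSeq number is one
    higher, and every option tag named in Unsupported is dropped.
    """
    rejected = {tag.strip().lower() for tag in unsupported}

    def keep(tags):
        return tuple(t for t in tags if t.lower() not in rejected)

    # The retry is a new transaction, so a real client would also pick a
    # fresh Via branch; that part is outside this sketch.
    return replace(
        prev,
        cseq=prev.cseq + 1,
        require=keep(prev.require),
        proxy_require=keep(prev.proxy_require),
    )
```

For instance, a request with CSeq 4711 and Require values 100rel and timer, answered with Unsupported: 100rel, would be retried with CSeq 4712 and only timer left in Require.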
3.1.3.2.1 To and Request-URI
The To header field identifies the original recipient of the request designated by the user identified in the From field. The original recipient may or may not be the UAS processing the request, due to call forwarding or other proxy operations. A UAS may apply any policy it wishes to determine whether to accept requests when the To header field is not the identity of the UAS. However, it is recommended that a UAS accept requests even if they do not recognize the URI scheme (e.g., a tel: URI) in the To header field, or if the To header field does not address a known or current user of this UAS. If, on the other hand, the UAS decides to reject the request, it should generate a response with a 403 Forbidden status code and pass it to the server transaction for transmission.
However, the Request-URI identifies the UAS that is to process the request. If the Request-URI uses a scheme not supported by the UAS, it should reject the request with a 416 Unsupported URI Scheme response. If the Request-URI does not identify an address that the UAS is willing to accept requests for, it should reject the request with a 404 (Not Found) response. Typically, a UA that uses the REGISTER method to bind its AOR to a specific contact address will see requests whose Request-URI equals that contact address. Other potential sources of received Request-URIs include the Contact header fields of requests and responses sent by the UA that establish or refresh dialogs.
3.1.3.2.2 Merged Requests
If the request has no tag in the To header field, the UAS core must check the request against ongoing transactions. If the From tag, Call-ID, and CSeq exactly match those associated with an ongoing transaction, but the request does not match that transaction (based on the matching rules in Section 3.12.1.3), the UAS core should generate a 482 Loop Detected response and pass it to the server transaction. The same request has arrived at the UAS more than once, following different paths, most likely due to forking. The UAS processes the first such request received and responds with a 482 Loop Detected to the rest of them.
3.1.3.2.3 Require
Assuming the UAS decides that it is the proper element to process the request, it examines the Require header field, if present. The Require header field is used by a UAC to tell a UAS about SIP extensions that the UAC expects the UAS to support in order to process the request properly. Its format is described in Section 2.8.2. If a UAS does not understand an option tag listed in a Require header field, it must respond by generating a response with status code 420 Bad Extension. The UAS must add an Unsupported header field, and list in it those options it does not understand among those in the Require header field of the request. Note that Require and Proxy-Require must not be used in a SIP CANCEL request, or in an ACK request sent for a non-2xx response. These header fields must be ignored if they are present in these requests. An ACK request for a 2xx response must contain only those Require and Proxy-Require values that were present in the initial request. Example:
UAC->UAS:
INVITE sip:[email protected] SIP/2.0
Require: 100rel
UAS->UAC:
SIP/2.0 420 Bad Extension
Unsupported: 100rel
This behavior ensures that the client–server interaction will proceed without delay when all options are understood by both sides, and only slow down if options are not understood (as in the example above). For a well-matched client–server pair, the interaction proceeds quickly, saving a round trip often required by negotiation mechanisms. In addition, it also removes ambiguity when the client requires features that the server does not understand. Some features, such as call handling fields, are only of interest to end systems.
3.1.3.3 Content Processing
Assuming the UAS understands any extensions required by the client, the UAS examines the body of the message, and the header fields that describe it. If there are any bodies whose type (indicated by the Content-Type), language (indicated by the Content-Language), or encoding (indicated by the Content-Encoding) are not understood, and that body part is not optional (as indicated by the Content-Disposition header field), the UAS must reject the request with a 415 Unsupported Media Type response. The response must contain an Accept header field listing the types of all bodies it understands, in the event the request contained bodies of types not supported by the UAS. If the request contained content encodings not understood by the UAS, the response must contain an Accept-Encoding header field listing the encodings understood by the UAS. If the request contained content with languages not understood by the UAS, the response must contain an Accept-Language header field indicating the languages understood by the UAS. Beyond these checks, body handling depends on the method and type. For further information on the processing of content-specific header fields, see Sections 2.4.2.4, 3.9, and 16.6.
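As a rough illustration of the Require check (Section 3.1.3.2.3) and the content checks (Section 3.1.3.3), the Python sketch below screens a request in that order. Everything named here is an assumption made for illustration: the screen_request function, the header dictionary layout, the four capability sets, and the body_required flag, which stands in for what the Content-Disposition header field would indicate.

```python
# Hypothetical capabilities of the UAS in this sketch.
SUPPORTED_EXTENSIONS = {"timer"}
ACCEPTED_TYPES = {"application/sdp"}
ACCEPTED_ENCODINGS = {"identity", "gzip"}
ACCEPTED_LANGUAGES = {"en"}


def screen_request(headers: dict, body_required: bool = True):
    """Return (status, reason, extra_headers) to reject, or None to continue."""
    # Require: reject with 420 and list the option tags not understood.
    required = [t.strip() for t in headers.get("Require", "").split(",") if t.strip()]
    unknown = [t for t in required if t not in SUPPORTED_EXTENSIONS]
    if unknown:
        return 420, "Bad Extension", {"Unsupported": ", ".join(unknown)}

    # An optional body that is not understood does not cause a rejection.
    if not body_required:
        return None

    # Content checks: collect one Accept* header per failed dimension.
    extra = {}
    ctype = headers.get("Content-Type")
    if ctype and ctype.split(";")[0].strip().lower() not in ACCEPTED_TYPES:
        extra["Accept"] = ", ".join(sorted(ACCEPTED_TYPES))
    encoding = headers.get("Content-Encoding")
    if encoding and encoding.strip().lower() not in ACCEPTED_ENCODINGS:
        extra["Accept-Encoding"] = ", ".join(sorted(ACCEPTED_ENCODINGS))
    language = headers.get("Content-Language")
    if language and language.strip().lower() not in ACCEPTED_LANGUAGES:
        extra["Accept-Language"] = ", ".join(sorted(ACCEPTED_LANGUAGES))
    if extra:
        return 415, "Unsupported Media Type", extra
    return None
```

With these assumed capabilities, a request carrying Require: 100rel is rejected with 420 Bad Extension and Unsupported: 100rel, which mirrors the exchange shown in the Require example.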
In all other respects, a stateless UAS behaves in the same manner as a stateful UAS. A UAS can operate in either a stateful or stateless mode for each new request.
3.1.4 Redirect Server General Behavior
In some architectures, it may be desirable to reduce the processing load on proxy servers that are responsible for routing requests, and improve signaling path robustness, by relying on redirection. Redirection allows servers to push routing information for a request back in a response to the client, thereby taking themselves out of the loop of further messaging for this transaction while still aiding in locating the target of the request. When the originator of the request receives the redirection, it will send a new request based on the URI(s) it has received. By propagating URIs from the core of the network to its edges, redirection allows for considerable network scalability.
A redirect server is logically constituted of a server transaction layer and a TU that has access to a location service of some kind (see Section 2.6 for more information on registrars and location services). This location service is effectively a database containing mappings between a single URI and a set of one or more alternative locations at which the target of that URI can be found.
A redirect server does not issue any SIP requests of its own. After receiving a request other than CANCEL, the server either refuses the request or gathers the list of alternative locations from the location service and returns a final response of class 3xx. For well-formed CANCEL requests, it should return a 2xx response. This response ends the SIP transaction. The redirect server maintains transaction state for an entire SIP transaction. It is the responsibility of clients to detect forwarding loops between redirect servers. When a redirect server returns a 3xx response to a request, it populates the list of (one or more) alternative locations into the Contact header field. An expires parameter to the Contact header field values may also be supplied to indicate the lifetime of the Contact data.
The Contact header field contains URIs giving the new locations or user names to try, or may simply specify additional transport parameters. A 301 Moved Permanently or 302 Moved Temporarily response may also give the same location and user name that was targeted by the initial request, but specify additional transport parameters such as a different server or multicast address to try, or a change of SIP transport from UDP to TCP or vice versa. However, redirect servers must not redirect a request to a URI equal to the one in the Request-URI; instead, provided that the URI does not point to itself, the server may proxy the request to the destination URI, or may reject it with a 404. If a client is using an outbound proxy, and that proxy actually redirects requests, a potential arises for infinite redirection loops.
Note that a Contact header field value may also refer to a different resource than the one originally called. For example, a SIP call connected to a PSTN gateway may need to deliver a special informational announcement such as, “The number you have dialed has been changed.” A Contact response header field can contain any suitable URI indicating where the called party can be reached, not limited to SIP URIs. For example, it could contain URIs for phones, fax [1–3], or irc (if they were defined) or a mailto: (RFC 6068) URL. Section 19.12.2.2 discusses implications and limitations of redirecting a SIPS URI to a non-SIPS URI.
The expires parameter of a Contact header field value indicates how long the URI is valid. The value of the parameter is a number indicating seconds. If this parameter is not provided, the value of the Expires header field determines how long the URI is valid. Malformed values should be treated as equivalent to 3600. This provides a modest level of backwards compatibility with RFC 2543 obsoleted by RFC 3261, which allowed absolute times in this header field. If an absolute time is received, it will be treated as malformed, and then default to 3600. Redirect servers must ignore features that are not understood (including unrecognized header fields, any unknown option tags in Require, or even method names) and proceed with the redirection of the request in question.
3.2 Canceling a Request
The CANCEL request is a hop-by-hop message designed to cancel a previous request sent by a client, for a message such as INVITE that may take a long time to be answered and for which a UAS has not yet sent a final response. Specifically, it asks the UAS to cease processing the request and to generate an error response to that request. For this reason, CANCEL is best for INVITE requests, which can take a long time to generate a response. In that usage, a UAS that receives a CANCEL request for an INVITE, but has not yet sent a final response, would stop ringing, and then respond to the INVITE with a specific error response (e.g., 487 Request Terminated). CANCEL requests can be constructed and sent by both proxies and UACs.
A CANCEL request should not be sent by a client to cancel a request other than INVITE. Since requests other than INVITE are responded to immediately, sending a CANCEL for a non-INVITE request would always create a race condition. The Request-URI, Call-ID, To, the numeric part of CSeq, and From header fields in the CANCEL request must be identical to those in the request being cancelled, including tags. A CANCEL constructed by a client must have only a single Via header field value matching the top Via value in the request being cancelled. Using the same values for these header fields allows the CANCEL to be matched
with the request it cancels. However, the method part of the CSeq header field must have a value of CANCEL. This allows CANCEL to be identified and processed as a transaction, as a distinct SIP method. If the request being cancelled contains a Route header field, the CANCEL request must include that Route header field’s values. This is needed so that stateless proxies are able to route CANCEL requests properly. If no provisional response has been received, the CANCEL request must not be sent by the client; rather, the client must wait for the arrival of a provisional response before sending the request.
For a server, the CANCEL method requests that the TU at the server side cancel a pending transaction. A stateless proxy will forward it, a stateful proxy might respond to it and generate some CANCEL requests of its own, and a UAS will respond to it. This request cannot be challenged by the server in order to get proper credentials in an Authorization header field as it is a hop-by-hop request and cannot be resubmitted. If the UAS does not find a matching transaction for the CANCEL as described, it should respond to the CANCEL with a 481 Call Leg/Transaction Does Not Exist. If the UAS has not issued a final response for the original request, its behavior depends on the method of the original request. If the original request was an INVITE, the UAS should immediately respond to the INVITE with a 487 Request Terminated. A CANCEL request has no impact on the processing of transactions with any other method. Regardless of the method of the original request, as long as the CANCEL matched an existing transaction, the UAS answers the CANCEL request itself with a 200 OK response. An example of the CANCEL method is as follows:
CANCEL sip:[email protected] SIP/2.0
Via: SIP/2.0/UDP client.atlanta.example.com:5060;branch=z9hG4bK74bf9
Max-Forwards: 70
From: Alice <sip:[email protected].com>;tag=9fxced76sl
To: Bob <sip:[email protected]>
Route: <sip:ss1.atlanta.example.com;lr>
Call-ID: [email protected].com
CSeq: 1 CANCEL
Content-Length: 0
an AOR at a domain’s location service when requests for that AOR would be routed to that domain. In most cases, this means that the domain of the registration will need to match the domain in the URI of the AOR. The registration scheme indicated above using RFC 3261 procedures does not manage the outbound connection of the SIP requests. As a result, if the outbound connection of the REGISTER request is disconnected, as frequently happens with middle boxes like network address translators (NATs) or firewalls, the client will have no clue that it needs to refresh its binding with the registration server. RFC 5626 has devised a registration with Client-Initiated Connection Management, described in Section 13.2, to take care of this problem, thereby making client-initiated connection management much more scalable in a large-scale SIP network. Registration of a UA’s contact information for SIPS URIs is provided in Section 4.2.3. In view of multiple UAs registering using the same AOR, RFC 5627 describes Globally Routable UA URI (GRUU)-based registration using an instance ID so that the call can be routed to that particular UA. The details of GRUU registration are provided in Section 4.3. However, GRUU SIPS URI registration is described in Section 4.2.3.
3.3.1 Registration without Managing Client-Initiated Connection
Figure 3.1a illustrates a SIP trapezoidal network with two UAs, an outbound proxy server, and an inbound proxy server where each proxy is acting as a registration server as well. Earlier, we briefly described all of these entities, including the discovery of the registrar server. The location server is not a part of SIP, but a registrar for some domain must be able to read and write data to the location service, and a proxy or a redirect server for that domain must be capable of reading that same data. A registrar may be colocated with a particular SIP proxy server for the same domain.
Registration entails sending a REGISTER request to a special type of UAS known as a registrar. A registrar acts as the front end to the location service for a domain, reading and writing mappings based on the contents of REGISTER requests. This location service is then typically consulted by a proxy server that is responsible for routing requests for that domain. Here, we describe the REGISTER method in detail.
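Returning to the CANCEL construction rules of Section 3.2, the following Python sketch shows one way a client might derive a CANCEL from the INVITE it cancels. The dictionary layout and the names build_cancel and may_send_cancel are assumptions made for illustration, and the addresses in the usage note are placeholders rather than values from the example above.

```python
def build_cancel(invite: dict) -> dict:
    """Derive CANCEL header fields from the request being cancelled.

    `invite` is a hypothetical dict with the keys 'request_uri', 'via'
    (list, topmost value first), 'from', 'to', 'call_id', 'cseq' (number),
    and optionally 'route' (list of Route values).
    """
    cancel = {
        "method": "CANCEL",
        "request_uri": invite["request_uri"],   # identical Request-URI
        "via": [invite["via"][0]],              # single Via, same as the top Via
        "from": invite["from"],                 # identical, including the tag
        "to": invite["to"],
        "call_id": invite["call_id"],
        "cseq": (invite["cseq"], "CANCEL"),     # same number, method CANCEL
    }
    if invite.get("route"):
        # Needed so that stateless proxies can route the CANCEL properly.
        cancel["route"] = list(invite["route"])
    return cancel


def may_send_cancel(method: str, provisional_received: bool, final_received: bool) -> bool:
    """A client waits for a provisional response and only cancels INVITEs."""
    return method == "INVITE" and provisional_received and not final_received
```

Because the top Via value is copied unchanged, the CANCEL carries the same branch parameter as the INVITE, which is what lets the receiving element match the two.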
[Figure 3.1 graphic not reproduced. The call flow in panel (c) runs among Alice, outbound proxy (P1), inbound proxy (P2), and Bob, with messages labeled F1 REGISTER, F2 200 OK, F3 REGISTER, F4 200 OK, F5, F6, and F8 INVITE, F7, F9, and F10 100 Trying, and F23 to F25 200 OK.]
Figure 3.1 SIP registration: (a) SIP trapezoidal network, (b) SIP URIs and IP addresses of functional entities, and (c) reg-
istration call flows. (Copyright IETF. Reproduced with permission.)
an AOR. A REGISTER request does not establish a dialog. A UAC may include a Route header field in a REGISTER request based on a preexisting route set, as described in a separate section. The Record-Route header field has no meaning in REGISTER requests or responses, and must be ignored if present. In particular, the UAC must not create a new route set based on the presence or absence of a Record-Route header field in any response to a REGISTER request. The following header fields, except Contact, must be included in a REGISTER request. A Contact header field may be included:
◾◾ Request-URI: The Request-URI names the domain of the location service for which the registration is meant. The userinfo and @ components of the SIP URI must not be present.
◾◾ To: The To header field contains the AOR whose registration is to be created, queried, or modified. The To header field and the Request-URI field typically differ, as the former contains a user name. This AOR must be a SIP URI or SIPS URI.
◾◾ From: The From header field contains the AOR of the person responsible for the registration. The value is the same as the To header field unless the request is a third-party registration.
◾◾ Call-ID: All registrations from a UAC should use the same Call-ID header field value for registrations sent to a particular registrar. If the same client were to use different Call-ID values, a registrar could not detect whether a delayed REGISTER request might have arrived out of order.
◾◾ CSeq: The CSeq value guarantees proper ordering of REGISTER requests. A UA must increment the CSeq value by one for each REGISTER request with the same Call-ID.
◾◾ Contact: REGISTER requests may contain a Contact header field with zero or more values containing
address bindings. UAs must not send a new registration (i.e., containing new Contact header field values, as opposed to a retransmission) until they have received a final response from the registrar for the previous one, or the previous REGISTER request has timed out.
An example of the REGISTER method:
REGISTER sips:ss2.biloxi.example.com SIP/2.0
Via: SIP/2.0/TLS client.biloxi.example.com:5061;branch=z9hG4bKnashds7
Max-Forwards: 70
From: Bob <sips:[email protected].com>;tag=a73kszlfl
To: Bob <sips:[email protected]>
Call-ID: [email protected]
CSeq: 1 REGISTER
Authorization: Digest username="bob",
realm="atlanta.example.com",
nonce="df84f1cec4341ae6cbe5ap359a9c8e88",
opaque="",
uri="sips:ss2.biloxi.example.com",
response="aa7ab4678258377c6f7d4be6087e2f60"
Content-Length: 0
3.3.1.3 Setting the Expiration Interval of Contact Addresses
When a client sends a REGISTER request, it may suggest an expiration interval that indicates how long the client would like the registration to be valid. There are two ways in which a client can suggest an expiration interval for a binding: through an Expires header field or an expires Contact header parameter. The latter allows expiration intervals to be suggested on a per-binding basis when more than one binding is given in a single REGISTER request, whereas the former suggests an expiration interval for all Contact header field values that do not contain the expires parameter. If neither mechanism for expressing a suggested expiration time is present in a REGISTER, the client is indicating its desire for the server to choose.
3.3.1.3.1 Preferences among Contact Addresses
If more than one Contact is sent in a REGISTER request, the registering UA intends to associate all of the URIs in these Contact header field values with the AOR present in the To field. This list can be prioritized with the q parameter in the Contact header field. The q parameter indicates a relative preference for the particular Contact header field value compared with other bindings for this AOR.
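The REGISTER header rules, together with the expires and q options just described, can be sketched in Python as follows. This is a simplified illustration, not a complete client: the Binding and Registrant names are hypothetical, the Via, Max-Forwards, and Authorization header fields are left out, and the From tag is simply a random token.

```python
import secrets
from dataclasses import dataclass
from itertools import count
from typing import List, Optional


@dataclass
class Binding:
    """One Contact address binding (hypothetical helper type)."""
    uri: str
    expires: Optional[int] = None   # per-binding expires parameter, in seconds
    q: Optional[float] = None       # relative preference among bindings


class Registrant:
    """Builds successive REGISTER requests for one AOR and one registrar."""

    def __init__(self, aor: str, registrar_uri: str, call_id: str):
        if "@" in registrar_uri:
            raise ValueError("Request-URI must not contain userinfo or '@'")
        self.aor = aor
        self.registrar_uri = registrar_uri
        self.call_id = call_id          # same Call-ID for this registrar
        self._cseq = count(1)           # one higher for each REGISTER

    def build(self, bindings: List[Binding], expires: Optional[int] = None) -> List[str]:
        lines = [
            f"REGISTER {self.registrar_uri} SIP/2.0",
            f"To: <{self.aor}>",
            f"From: <{self.aor}>;tag={secrets.token_hex(4)}",
            f"Call-ID: {self.call_id}",
            f"CSeq: {next(self._cseq)} REGISTER",
        ]
        for b in bindings:
            params = ""
            if b.q is not None:
                params += f";q={b.q}"
            if b.expires is not None:
                params += f";expires={b.expires}"
            lines.append(f"Contact: <{b.uri}>{params}")
        # The Expires header covers every Contact without its own expires
        # parameter; omitting both leaves the choice to the server.
        if expires is not None:
            lines.append(f"Expires: {expires}")
        return lines
```

Keeping the counter inside the object is one simple way to make sure that every REGISTER sent with the same Call-ID carries the next CSeq value.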
3.3.2 Discovering a SIP Registrar
UAs can determine the address to which to send registrations in three ways (RFC 3261): by configuration, by using the AOR, and by multicast. A UA can be configured, in ways beyond the scope of this specification, with a registrar address. If there is no configured registrar address, the UA should use the host part of the AOR as the Request-URI and address the request there, using the normal SIP server location mechanisms described in RFC 3263 (see Section 8.2.4). For example, the UA for the user sip:carol@chicago.com addresses the REGISTER request to sip:chicago.com.
Finally, a UA can be configured to use multicast. Multicast registrations are addressed to the well-known all SIP servers multicast address sip.mcast.net (224.0.1.75 for IPv4). No well-known IPv6 multicast address has been allocated; such an allocation will be documented separately when needed. SIP UAs may listen to that address and use it to become aware of the location of other local users; however, they do not respond to the request. Multicast registration may be inappropriate in some environments, for example, if multiple businesses share the same local area network.

3.3.3 Multiple-AOR Registration
The present REGISTER method does not allow registering multiple AORs because it uses only a single AOR for each registration. In some situations, the registration of multiple AORs may be useful, as articulated in RFC 5947. For example, a SIP Private Branch Exchange (PBX) needs to provide the registration for all AORs of all SIP UAs behind it, while each AOR may have different requirements for Contact (e.g., SIP URI, tel URI), security (e.g., P-Associated-URI, P-Asserted-Identity, P-Preferred-Identity), and policy (e.g., quality of service [QOS], security). Owing to the lack of multiple-AOR registration in SIP, benefits such as reduction in registration response growth, proper handling of SIP-aware middle boxes, use of wildcard syntax, correct routing of incoming requests to the SIP middle boxes with the target-URI, and proper matching of security and QOS policies cannot be obtained. To meet these requirements, a private SIP header extension known as the P-Associated-URI header, specified in RFC 7315 (see Section 2.8.2), has been defined; it is not used in the Internet. The P-Associated-URI header field transports a set of associated URIs to the registered AOR, and allows a registrar to return a set of associated URIs for a registered AOR. The P-Associated-URI header field is used in 200 OK responses to a REGISTER request. A more detailed discussion of this header can be seen in the SIP header section (see Section 2.8.2).
In case of the GRUU defined in RFC 5627 (see Section 4.3), if a UA has a multiplicity of AORs, either in different domains or within the same domain, additional considerations apply. When a UA sends a request, the request will be sent using one of its AORs. This AOR will typically show up in the From header field of the request, and credentials unique to that AOR will be used to authenticate the request. The GRUU placed into the Contact header field of such a request should be one that is associated with the AOR used to send the request. In cases where the UA uses a tel URI to populate the From header field, the UA typically has a SIP AOR that is treated as an alias for the tel URI. The GRUU associated with that SIP AOR should be used in the Contact header field. When a UA receives a request, the GRUU placed into the Contact header field of a 2xx response should be the one associated with the AOR or GRUU to which the request was most recently targeted. There are several ways to determine the AOR or GRUU to which a request was sent. For example, if a UA registered a different contact to each AOR by using a different user part of the URI, the Request-URI (which contains that contact) will indicate the AOR.

3.3.4 Registration Call Flows
We use example registration call flows to explain most of the features of SIP registration defined in RFC 3261.

3.3.4.1 Successful New Registration
We use the example shown in Figure 3.1c. Bob sends a SIP REGISTER request to the SIP server. The request includes the user’s contact list. This flow shows the use of HTTP Digest for authentication using TLS transport. TLS transport is used because of the lack of integrity protection in HTTP Digest and the danger of registration hijacking without it, as described in RFC 3261. The SIP server provides a challenge to Bob. Bob enters his valid user ID and password. Bob’s SIP client encrypts the user information according to the challenge issued by the SIP server and sends the response to the SIP server. The SIP server validates the user’s credentials. It registers the user in its contact database and returns a response (200 OK) to Bob’s SIP client. The response includes the user’s current contact list in Contact headers. The format of the authentication shown is HTTP digest. It is assumed that Bob has not previously registered with this server. The message details are shown below:

F1 REGISTER Bob -> SIP Registration Server

   REGISTER sips:ss2.biloxi.example.com SIP/2.0
   Via: SIP/2.0/TLS client.biloxi.example.com:5061;branch=z9hG4bKnashds7
   Max-Forwards: 70
   From: Bob <sips:[email protected]>;tag=a73kszlfl
   To: Bob <sips:[email protected]>
   Call-ID: [email protected]
indicating that the user wishes to query the server for the user’s current contact list. It is known that the user already has authenticated with the server; the user supplies authentication credentials with the request and is not challenged by the server. The SIP server validates the user’s credentials. The server returns a response (200 OK) that includes the user’s current registration list in Contact headers. The message details are as follows:

F1 REGISTER Bob -> SIP Registration Server

   REGISTER sips:ss2.biloxi.example.com SIP/2.0
   Via: SIP/2.0/TLS client.biloxi.example.com:5061;branch=z9hG4bKnashds7
   Max-Forwards: 70
   From: Bob <sips:[email protected]>;tag=a73kszlfl
   To: Bob <sips:[email protected]>
   Call-ID: [email protected]
   CSeq: 1 REGISTER
   Authorization: Digest username="bob",
     realm="atlanta.example.com",
     nonce="df84f1cec4341ae6cbe5ap359a9c8e88", opaque="",
     uri="sips:ss2.biloxi.example.com",
     response="aa7ab4678258377c6f7d4be6087e2f60"
   Content-Length: 0

F2 200 OK SIP Registration Server -> Bob

   SIP/2.0 200 OK
   Via: SIP/2.0/TLS client.biloxi.example.com:5061;branch=z9hG4bKnashds7;received=192.0.2.201
   From: Bob <sips:[email protected]>;tag=a73kszlfl
   To: Bob <sips:[email protected]>;tag=jqoiweu75
   Call-ID: [email protected]
   CSeq: 1 REGISTER
   Contact: <sips:[email protected].com>;expires=3600
   Contact: <mailto:[email protected]>;expires=4294967295
   Content-Length: 0

3.3.4.3 Cancellation of Registration
In the final example using Figure 3.1c, Bob wishes to cancel his registration with the SIP server. Bob sends a SIP REGISTER request to the SIP server. The request has an expiration period of 0 and applies to all existing contact locations. As explained earlier, the user already has authenticated with the server; the user supplies authentication credentials with the request and is not challenged by the server. The SIP server validates the user’s credentials. It clears the user’s contact list, and returns a response (200 OK) to Bob’s SIP client. The message details are as follows:

F1 REGISTER Bob -> SIP Registration Server

   REGISTER sips:ss2.biloxi.example.com SIP/2.0
   Via: SIP/2.0/TLS client.biloxi.example.com:5061;branch=z9hG4bKnashds7
   Max-Forwards: 70
   From: Bob <sips:[email protected]>;tag=a73kszlfl
   To: Bob <sips:[email protected]>
   Call-ID: [email protected]
   CSeq: 1 REGISTER
   Expires: 0
   Contact: *
   Authorization: Digest username="bob",
     realm="atlanta.example.com",
     nonce="88df84f1cac4341aea9c8ee6cbe5a359", opaque="",
     uri="sips:ss2.biloxi.example.com",
     response="ff0437c51696f9a76244f0cf1dbabbea"
   Content-Length: 0

F2 200 OK SIP Registration Server -> Bob

   SIP/2.0 200 OK
   Via: SIP/2.0/TLS client.biloxi.example.com:5061;branch=z9hG4bKnashds7;received=192.0.2.201
   From: Bob <sips:[email protected]>;tag=a73kszlfl
   To: Bob <sips:[email protected]>;tag=1418nmdsrf
   Call-ID: [email protected]
   CSeq: 1 REGISTER
   Content-Length: 0

3.3.5 Registration for Multiple Phone Numbers in SIP
RFC 6140, described here, defines a mechanism by which a SIP server acting as a traditional PBX can register with a SIP Service Provider (SSP) to receive phone calls for SIP UAs. To function properly, this mechanism requires that each of the AORs registered in bulk maps to a unique set of contacts. This requirement is satisfied by AORs representing phone numbers regardless of the domain, since phone numbers are fully qualified and globally unique. This specification (RFC 6140) therefore focuses on the use case of the fully qualified and globally identifiable phone number that can be used for routing. Note that the security of this mechanism is described in Section 19.2.2.
In actual deployments, some SIP servers have been deployed in architectures that, for various reasons, have requirements to provide dynamic routing information for
large blocks of AORs, where all of the AORs in the block were to be handled by the same server. For purposes of efficiency, many of these deployments (Figure 3.2) do not wish to maintain separate registrations for each of the AORs in the block. For example, in virtually all models, the SIP–PBX generates a SIP REGISTER request using a mutually agreed-upon SIP AOR, typically based on the SIP–PBX’s main attendant-/reception-desk number.
The AOR is often in the domain of the SSP, and both the To and From URIs used for the REGISTER request identify that AOR. In all respects, it appears on the wire as a normal SIP REGISTER request, as if from a typical user’s UA. However, it generally implicitly registers other AORs associated with the SIP–PBX. Thus, an alternate mechanism to provide dynamic routing information for blocks of AORs is desirable. However, the following two constraints have the most profound effect on addressing multiple-AOR registration: the SIP–PBX cannot be assumed to be assigned a static IP address, and no DNS entry can be relied upon to consistently resolve to the IP address of the SIP–PBX.
Although the use of SIP REGISTER request messages to update reachability information for multiple users simultaneously is somewhat beyond the original semantics defined for REGISTER requests by RFC 3261 (see Section 3.3.1), this approach has seen significant deployment in certain environments. In particular, deployments in which small to medium SIP–PBX servers are addressed using E.164 numbers have used this mechanism to avoid the need to maintain DNS entries or static IP addresses for the SIP–PBX servers. In recognition of the momentum that REGISTER-based approaches have seen in deployments, this document defines a REGISTER-based approach. Since E.164-addressed UAs are very common today in SIP–PBX environments, and since SIP URIs in which the user portion is an E.164 number are always globally unique, regardless of the domain, this document focuses on registration of SIP URIs in which the user portion is an E.164 number.
Before describing multiple-AOR registration of globally routable telephone numbers in detail, we note that this specification (RFC 6140) satisfies the following mandatory requirements described in RFC 5947:

◾◾ The mechanism allows a SIP–PBX to enter into a trunking arrangement with an SSP, whereby the two parties have agreed on a set of telephone numbers assigned to the SIP–PBX.
◾◾ The mechanism allows a set of assigned telephone numbers to comprise E.164 numbers, which can be in contiguous ranges, discrete, or in any combination of the two. However, the Direct Inward Dialing (DID) numbers associated with a registration are established by a bilateral agreement between the SSP and the SIP–PBX; they are not part of the mechanism described in this specification (RFC 6140).
◾◾ The mechanism allows a SIP–PBX to register reachability information with its SSP, to enable the SSP to route to the SIP–PBX inbound requests targeted at assigned telephone numbers.
◾◾ The mechanism allows UAs attached to a SIP–PBX to register with the SIP–PBX for AORs based on assigned telephone numbers, to receive requests targeted at those telephone numbers, without needing to involve the SSP in the registration process; in the presumed architecture, SIP–PBX UAs register with the SIP–PBX and require no interaction with the SSP.
Figure 3.2 Multiple-AOR registration by SIP–PBX with SSP. (Copyright IETF. Reproduced with permission.)
◾◾ The mechanism allows a SIP–PBX to handle requests originating at its own UAs and targeted at its assigned telephone numbers, without routing those requests to the SSP; SIP–PBXs may recognize their own DID numbers and GRUUs, and perform on-SIP–PBX routing without sending the requests to the SSP.
◾◾ The mechanism allows a SIP–PBX to receive requests to its assigned telephone numbers originating outside the SIP–PBX and arriving via the SSP, so that the SIP–PBX can route those requests onwards to its UAs, as it would for internal requests to those telephone numbers.
◾◾ The mechanism provides a means whereby a SIP–PBX knows which of its assigned telephone numbers an inbound request from its SSP is targeted at. The requirement is satisfied. For ordinary calls and calls using public GRUUs, the DID number is indicated in the user portion of the Request-URI. For calls using Temp GRUUs constructed with the mechanism described here, the gr parameter provides a correlation token the SIP–PBX can use to identify to which UA the call should be routed.
◾◾ The mechanism provides a means of avoiding problems due to one side using the mechanism and the other side not using the mechanism: the gin option tag and the bnc Contact URI parameter.
◾◾ The mechanism observes SIP backwards compatibility principles through using the gin option tag.
◾◾ The mechanism works in the presence of a sequence of intermediate SIP entities on the SIP–PBX-to-SSP interface (i.e., between the SIP–PBX and the SSP’s domain proxy), where those intermediate SIP entities indicated during registration need to be on the path of inbound requests to the SIP–PBX. It is accomplished through the use of the path mechanism defined in RFC 3327 (see Section 2.8).
◾◾ The mechanism works when a SIP–PBX obtains its IP address dynamically. It is done by allowing the SIP–PBX to use an IP address in the bulk number contact (bnc) URI contained in a REGISTER Contact header field.
◾◾ The mechanism works without requiring the SIP–PBX to have a domain name or the ability to publish its domain name in the DNS. It is performed by allowing the SIP–PBX to use an IP address in the bnc URI contained in a REGISTER Contact header field.
◾◾ For a given SIP–PBX and its SSP, there is no impact on other domains, which are expected to be able to use normal RFC 3263 (see Section 8.2.4) procedures to route requests, including requests needing to be routed via the SSP in order to reach the SIP–PBX. It is accomplished by allowing the domain name in the Request-URI used by external entities to resolve to the SSP’s servers via normal RFC 3263 resolution procedures.
◾◾ The mechanism is able to operate over a transport that provides end-to-end integrity protection and confidentiality between the SIP–PBX and the SSP, for example, using TLS as specified in RFC 3261 (see Section 3.13); nothing in the proposed mechanism prevents the use of TLS between the SSP and the SIP–PBX.
◾◾ The mechanism supports authentication of the SIP–PBX by the SSP and vice versa, for example, using SIP digest authentication plus TLS server authentication as specified in RFC 3261 (see Section 19.4); SIP–PBXs may employ either SIP digest authentication or mutually authenticated TLS for authentication purposes.
◾◾ The mechanism allows the SIP–PBX to provide its UAs with public or temporary GRUUs (see Section 4.3). It is performed via the mechanisms described here.
◾◾ The mechanism works over any existing transport specified for SIP, including UDP. It is accomplished to the extent that UDP can be used for REGISTER requests in general. The application of certain extensions or network topologies may exceed UDP maximum transmission unit (MTU) sizes; however, such issues arise both with and without the mechanism described here. This specification (RFC 6140) does not exacerbate such issues.
◾◾ This specification (RFC 6140) provides guidance or warnings about how authorization policies may be affected by the mechanism, to address the problems described in RFC 5947.
◾◾ The mechanism is extensible to allow a set of assigned telephone numbers to comprise local numbers as specified in RFC 3966 (see Section 4.2.3), which can be in contiguous ranges, discrete, or in any combination of the two. Assignment of telephone numbers to a registration is performed by the SSP’s registrar, which is not precluded from assigning local numbers in any combination it desires.
◾◾ The mechanism is also extensible to allow a set of arbitrarily assigned SIP URIs as specified in RFC 3261 (see Section 4.2), as opposed to just telephone numbers, without requiring a complete change of mechanism as compared with that used for telephone numbers. The mechanism is extensible in such a fashion, as demonstrated by the document “GIN with Literal AORs for SIP in SSPs (GLASS)” [4].

The following desirable requirements described in RFC 5947 are also met by this specification (RFC 6140):

◾◾ The mechanism allows an SSP to exploit its mechanisms for providing SIP service to normal UAs in order to provide a SIP trunking service to SIP–PBXs; the routing mechanism described in this document is identical to the routing performed for singly registered AORs.
◾◾ The mechanism scales to SIP–PBXs of several thousand assigned telephone numbers. The desired property is satisfied; nothing in this specification (RFC 6140) precludes DID number pools of arbitrary size.
◾◾ The mechanism also scales to support several thousand SIP–PBXs on a single SSP. The desired property is satisfied; nothing in this specification (RFC 6140) precludes an arbitrary number of SIP–PBXs from attaching to a single SSP.

3.3.5.1 Mechanism Overview
The overall mechanism is achieved using a REGISTER request with a specially formatted Contact URI. This document also defines an option tag that can be used to ensure that a registrar and any intermediaries understand the mechanism described herein. The Contact URI itself is tagged with a URI parameter to indicate that it actually represents multiple phone-number-associated contacts. We also define some lightweight extensions to the GRUU mechanism defined by RFC 5627 (see Section 4.3) to allow the use of public and temporary GRUUs assigned by the SSP. Aside from these extensions, the REGISTER request itself is processed by a registrar in the same way as normal registrations: by updating its location service with additional AOR-to-Contact bindings. Note that the list of AORs associated with a SIP–PBX is a matter of local provisioning at the SSP and the SIP–PBX. The mechanism defined in this document does not provide any means to detect or recover from provisioning mismatches (although the registration event package can be used as a standardized means for auditing such AORs).

3.3.5.2 Registering for Multiple Phone Numbers

3.3.5.2.1 SIP–PBX Behavior
To register for multiple AORs, the SIP–PBX sends a REGISTER request to the SSP. This REGISTER request varies from a typical REGISTER request in two important ways. First, it must contain an option tag of gin in both a Require header field and a Proxy-Require header field. (The option tag gin is an acronym for generate implicit numbers.) Second, in at least one Contact header field, it must include a Contact URI that contains the URI parameter bnc (which stands for bulk number contact) and has no user portion (hence, no “@” symbol). A URI with a bnc parameter must not contain a user portion. Except for the SIP URI user parameter, this URI may contain any other parameters that the SIP–PBX desires. These parameters will be echoed back by the SSP in any requests bound for the SIP–PBX. Because of the constraints discussed earlier, the host portion of the Contact URI will generally contain an IP address, although nothing in this mechanism enforces or relies upon that fact.
If the SIP–PBX operator chooses to maintain DNS entries that resolve to the IP address of this SIP–PBX via RFC 3263 (see Section 8.2.4) resolution procedures, then this mechanism works just fine with domain names in the Contact header field. The bnc URI parameter indicates that special interpretation of the Contact URI is necessary: instead of indicating the insertion of a single Contact URI into the location service, it indicates that multiple URIs (one for each associated AOR) should be inserted. Any SIP–PBX implementing the registration mechanism defined in this document MUST also support the path mechanism defined by RFC 3327 (see Section 2.8), and MUST include a path option tag in the Supported header field of the REGISTER request (which is a stronger requirement than imposed by the path mechanism itself). This behavior is necessary because proxies between the SIP–PBX and the registrar may need to insert Path header field values in the REGISTER request for this document’s mechanism to function properly, and, per RFC 3327 (see Section 2.8), they can only do so if the UAC inserted the option tag in the Supported header field. In accordance with the procedures defined in RFC 3327, the SIP–PBX is allowed to ignore the Path header fields returned in the REGISTER response.

3.3.5.2.2 Registrar Behavior
The registrar, upon receipt of a REGISTER request containing at least one Contact header field with a bnc parameter, will use the value in the To header field to identify the SIP–PBX for which registration is being requested. It then authenticates the SIP–PBX (e.g., using SIP digest authentication, mutual TLS [RFC 5246], or some other authentication mechanism). After the SIP–PBX is authenticated, the registrar updates its location service with a unique AOR-to-Contact mapping for each of the AORs associated with the SIP–PBX. Semantically, each of these mappings will be treated as a unique row in the location service. The actual implementation may, of course, perform internal optimizations to reduce the amount of memory used to store such information.
For each of these unique rows, the AOR will be in the format that the SSP expects to receive from external parties (e.g., sip:[email protected]). The corresponding contact will be formed by adding to the REGISTER request’s Contact URI a user portion containing the fully qualified, E.164-formatted number (including the preceding “+” symbol) and removing the bnc parameter. Aside from the initial + symbol, this E.164-formatted number must consist exclusively of digits from 0 through 9, and explicitly must not contain any visual separator symbols (e.g., “–,” “.,” “(,” or “)”). For example, if the Contact header field contains the
URI <sip:198.51.100.3:5060;bnc>, then the contact value associated with the aforementioned AOR will be <sip:+12145[email protected]:5060>.
Although the SSP treats this registration as a number of discrete rows for the purpose of retargeting incoming requests, the renewal, expiration, and removal of these rows are bound to the registered contact. In particular, this means that REGISTER requests that attempt to deregister a single AOR that has been implicitly registered must not remove that AOR from the bulk registration. In this circumstance, the registrar simply acts as if the UA attempted to unregister a contact that was not actually registered (e.g., return the list of presently registered contacts in a success response). A further implication of this property is that an individual extension that is implicitly registered may also be explicitly registered using a normal, nonbulk registration (subject to SSP policy). If such a registration exists, it is refreshed independently of the bulk registration and is not removed when the bulk registration is removed.
A registrar that receives a REGISTER request containing a Contact URI with both a bnc parameter and a user portion MUST NOT send a 200-class (Success) response. If no other error is applicable, the registrar can use a 400 Bad Request response to indicate this error condition. Note that the preceding paragraph is talking about the user portion of a URI:

   sip:[email protected]
       ^^^^^^^^^^^^

A registrar compliant with this document MUST support the path mechanism defined in RFC 3327 (see Section 2.8). The rationale for the support of this mechanism is given earlier. Aside from the bnc parameter, all URI parameters present on the Contact URI in the REGISTER request must be copied to the contact value stored in the location service. If the SSP servers perform processing based on UA capabilities (as defined in RFC 3840, see Sections 2.11 and 3.4), they will treat any feature tags present on a Contact header field with a bnc parameter in its URI as applicable to all of the resulting AOR-to-Contact mappings. Similarly, any option tags present on the REGISTER request that indicate special handling for any subsequent requests are also applicable to all of the AOR-to-Contact mappings.

3.3.5.2.3 SIP URI User Parameter Handling
This specification (RFC 6140) does not modify the behavior specified in RFC 3261 (see Section 3.3.1) for inclusion of the user parameter on Request-URIs. However, to avoid any ambiguity in handling at the SIP–PBX, the following normative behavior is imposed on its interactions with the SSP. When a SIP–PBX registers with an SSP using a Contact URI containing a bnc parameter, that Contact URI must not include a user parameter. A registrar that receives a REGISTER request containing a Contact URI with both a bnc parameter and a user parameter must not send a 200-class (success) response. If no other error is applicable, the registrar can use a 400 Bad Request response to indicate this error condition. Note that the preceding paragraph is talking about the user parameter of a URI:

   sip:[email protected];user=phone
                         ^^^^^^^^^^

When a SIP–PBX receives a request from an SSP, and the Request-URI contains a user portion corresponding to an AOR registered using a Contact URI containing a bnc parameter, then the SIP–PBX must not reject the request (or otherwise cause the request to fail) due to the absence, presence, or value of a user parameter on the Request-URI.

3.3.5.3 SSP Processing of Inbound Requests
In general, after processing the AOR-to-Contact mapping described in the preceding section, the SSP proxy/registrar (or equivalent entity) performs traditional proxy/registrar behavior, based on the mapping. For any inbound SIP requests whose AOR indicates an E.164 number assigned to one of the SSP’s customers, this will generally involve setting the target set to the registered contacts associated with that AOR, and performing request forwarding as described in RFC 3261 (see Section 3.11.6). An SSP using the mechanism defined in this document must perform such processing for inbound INVITE requests and SUBSCRIBE requests to the reg event package, and should perform such processing for all other method types, including unrecognized SIP methods.

3.3.5.4 Interaction with Other Mechanisms
The following sections describe the means by which this mechanism interacts with relevant REGISTER-related extensions currently defined by the Internet Engineering Task Force (IETF). To enable advanced services to work with UAs behind a SIP–PBX, it is important that the GRUU mechanism defined by RFC 5627 (see Section 4.3) work correctly with the mechanism defined by this document; that is, UAs served by the SIP–PBX can acquire and use GRUUs for their own use. Neither the SSP nor the SIP–PBX is required to support the registration event package defined by RFC 3680. However, if they do support the registration event package, they must conform to the behavior described in this section and its subsections. As this mechanism inherently deals with REGISTER transaction behavior, it is imperative to consider its impact on the registration event package defined by RFC 3680. In practice, there will be two main use cases for subscribing to registration data: learning about the overall
registration state for the SIP–PBX and learning about the E.164 number associated with the UA, and return the result
registration state for a single SIP–PBX AOR. to the UA as its public GRUU. The resulting Contact header
field sent from the SIP–PBX to the registering UA would
look something like this:
3.3.5.5 Interaction with Public GRUUs
Support of public GRUUs is optional in SSPs and SIP– <allOneLine>
PBXs. When a SIP–PBX registers a bulk number contact Contact: <sip:[email protected]>;
pub-gruu="sip:+12145550102@ssp.
(a contact with a bnc parameter), and also invokes GRUU
example.com;gr=urn:
procedures for that contact during registration, then the SSP uuid:f81d4fae-7dec-11d0-a765-00a0c91e6
will assign a public GRUU to the SIP–PBX in the normal bf6;sg=00:05:03:5e:70:a6";
fashion. Because the URI being registered contains a bnc +sip.
parameter, the GRUU will also contain a bnc parameter. In instance="<urn:uuid:d0e2f290-104b-11df-8a39-
particular, this means that the GRUU will not contain a user 0800200c9a66>"
;expires=3600
portion. When a UA registers a contact with the SIP–PBX
</allOneLine>
using GRUU procedures, the SIP–PBX provides to the UA
a public GRUU formed by adding an sg parameter to the
When an incoming request arrives at the SSP for a
GRUU parameter it received from the SSP. This sg param-
GRUU corresponding to a bnc, the SSP performs slightly
eter contains a disambiguation token that the SIP–PBX can
different processing for the GRUU than it would for a URI
use to route inbound requests to the proper UA. Thus, for
without a bnc parameter. When the GRUU is retargeted to
example, when the SIP–PBX registers with the following
Contact header field

Contact: <sip:198.51.100.3;bnc>;
+sip.instance="<urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6>"

the SSP may choose to respond with a Contact header field that looks like this (<allOneLine> definition per RFC 4475):

<allOneLine>
Contact: <sip:198.51.100.3;bnc>;
pub-gruu="sip:ssp.example.com;bnc;gr=urn:
uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6";
+sip.instance="<urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6>"
;expires=7200
</allOneLine>

When its own UAs register using GRUU procedures, the SIP–PBX can then add whatever device identifier it feels appropriate in an sg parameter, and present this value to its own UAs. For example, assume the UA associated with the AOR +12145550102 sent the following Contact header field in its REGISTER request:

Contact: <sip:[email protected]>;
+sip.instance="<urn:uuid:d0e2f290-104b-11df-8a39-0800200c9a66>"

The SIP–PBX will add an sg parameter to the pub-gruu it received from the SSP with a token that uniquely identifies the device (possibly the URN itself, or possibly some other identifier), insert a user portion containing the fully qualified

the registered bnc, the SSP must copy the sg parameter from the GRUU to the new target. The SIP–PBX can then use this sg parameter to determine to which UA the request should be routed. For example, the first line of an INVITE request that has been retargeted to the SIP–PBX for the UA shown above would look like this:

INVITE sip:[email protected];sg=00:05:03:5e:70:a6 SIP/2.0

3.3.5.6 Interaction with Temporary GRUUs

To provide support for privacy, the SSP should implement the temporary GRUU mechanism described in this section. Reasons for not doing so would include systems with an alternative privacy mechanism that maintains the integrity of public GRUUs (i.e., if public GRUUs are anonymized, then the anonymizer function would need to be capable of providing, as the anonymized URI, a globally routable URI that routes back only to the target identified by the original public GRUU). Temporary GRUUs are used to provide anonymity for the party creating and sharing the GRUU. Being able to correlate two temporary GRUUs as having originated from behind the same SIP–PBX violates this principle of anonymity. Consequently, rather than relying upon a single, invariant identifier for the SIP–PBX in its UA’s temporary GRUUs, we define a mechanism whereby the SSP provides the SIP–PBX with sufficient information for the SIP–PBX to mint unique temporary GRUUs. These GRUUs have the property that the SSP can correlate them to the proper SIP–PBX, but no other party can do so. To achieve this goal, we use a slight modification of the procedure described in RFC 5627 (see Section 4.3).
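As a preview of the nonnormative cookie scheme that Section 3.3.5.6.1 describes (a 48-bit counter I_i followed by an HMAC-SHA256-80 tag, base64-encoded with trailing "=" characters dropped), the following Python sketch may help. The function names, the big-endian encoding of the counter, and the sample key are assumptions made for illustration; they are not taken from RFC 6140, and a real SSP would also need the persistent counter and mapping store discussed in that section.

```python
import base64
import hashlib
import hmac

COUNTER_BYTES = 6   # 48-bit counter I
TAG_BYTES = 10      # HMAC-SHA256 output truncated to 80 bits


def b64enc(data: bytes) -> str:
    """RFC 4648 base64 with trailing '=' padding discarded."""
    return base64.b64encode(data).decode("ascii").rstrip("=")


def b64dec(text: str) -> bytes:
    """Inverse of b64enc: restore the padding, then decode."""
    return base64.b64decode(text + "=" * (-len(text) % 4))


def hmac_sha256_80(key: bytes, msg: bytes) -> bytes:
    """HMAC-SHA256, keeping only the leading 80 bits of the result."""
    return hmac.new(key, msg, hashlib.sha256).digest()[:TAG_BYTES]


def make_cookie(sk_a: bytes, counter: int) -> str:
    """temp-gruu-cookie = base64enc(I_i || HMAC(SK_a, I_i))."""
    i_bytes = counter.to_bytes(COUNTER_BYTES, "big")  # byte order is an assumption
    return b64enc(i_bytes + hmac_sha256_80(sk_a, i_bytes))


def validate_cookie(sk_a: bytes, cookie: str):
    """Return the counter value if the tag verifies, otherwise None."""
    try:
        raw = b64dec(cookie)
    except ValueError:
        return None
    if len(raw) != COUNTER_BYTES + TAG_BYTES:
        return None
    i_bytes, tag = raw[:COUNTER_BYTES], raw[COUNTER_BYTES:]
    if not hmac.compare_digest(tag, hmac_sha256_80(sk_a, i_bytes)):
        return None
    return int.from_bytes(i_bytes, "big")


sk_a = bytes(range(32))            # sample key, for illustration only
cookie = make_cookie(sk_a, 42)
print(len(cookie), validate_cookie(sk_a, cookie))  # prints: 22 42
```

Because the counter and tag have fixed sizes, the cookie always decodes to 16 bytes, which is what lets the SSP strip a distinguisher that the SIP–PBX later appends to it.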
The SIP–PBX needs to be able to construct a temp-gruu in a way that the SSP can decode. To ensure that the SSP can decode GRUUs, we need to standardize the algorithm for creation of temp-gruus at the SIP–PBX. This allows the SSP to reverse the algorithm in order to identify the registration entry that corresponds to the GRUU. It is equally important that no party other than the SSP be capable of decoding a temporary GRUU, including other SIP–PBXs serviced by the SSP. To achieve this property, an SSP that supports temporary GRUUs MUST create and store an asymmetric key pair: {K_e1,K_e2}. K_e1 is kept secret by the SSP, while K_e2 is shared with the SIP–PBXs via provisioning.

All base64 encoding discussed in the following sections must use the character set and encoding defined in RFC 4648, except that any trailing “=” characters are discarded on encoding and added as necessary to decode.

The following sections make use of the term HMAC-SHA256-80 to describe a particular Hashed Message Authentication Code (HMAC) algorithm. In this specification (RFC 6140), HMAC-SHA256-80 is defined as the application of the SHA-256 [5] secure hashing algorithm, truncating the results to 80 bits by discarding the trailing (least-significant) bits.

3.3.5.6.1 Generation of Temp-gruu-cookie by the SSP

An SSP that supports temporary GRUUs must include a temp-gruu-cookie parameter on all Contact header fields containing a bnc parameter in a 200-class REGISTER response. This temp-gruu-cookie must have the following properties:

◾◾ It can be used by the SSP to uniquely identify the registration to which it corresponds.
◾◾ It is encoded using base64. This allows the SIP–PBX to decode it in as compact a form as possible for use in its calculations.
◾◾ It is of a fixed length. This allows for its extraction once the SIP–PBX has concatenated a distinguisher onto it.
◾◾ The temp-gruu-cookie must not be forgeable by any party. In other words, the SSP needs to be able to examine the cookie and validate that it was generated by the SSP.
◾◾ The temp-gruu-cookie must be invariant during the course of a registration, including any refreshes to that registration. This property is important, as it allows the SIP–PBX to examine the temp-gruu-cookie to determine whether the temp-gruus it has issued to its UAs are still valid.

The above properties can be met using the following algorithm, which is nonnormative. Implementers may choose to implement any algorithm of their choosing for generation of the temp-gruu-cookie, as long as it fulfills the five properties listed above.

The registrar maintains a counter, I. This counter is 48 bits long and initialized to zero. This counter is persistently stored, using a back-end database or similar technique. When the registrar creates the first temporary GRUU for a particular SIP–PBX and instance ID (as defined by RFC 5627, see Section 4.3), the registrar notes the current value of the counter, I_i, and increments the counter in the database. The registrar then maps I_i to the contact and instance ID using the database, a persistent hash map, or similar technology. If the registration expires such that there are no longer any contacts with that particular instance ID bound to the GRUU, the registrar removes the mapping.

Similarly, if the temporary GRUUs are invalidated due to a change in Call-ID, the registrar removes the current mapping from I_i to the AOR and instance ID, notes the current value of the counter I_j, and stores a mapping from I_j to the contact containing a bnc parameter and instance ID. On the basis of these rules, the hash map will contain a single mapping for each contact containing a bnc parameter and instance ID for which there is a currently valid registration.

The registrar maintains a symmetric key SK_a, which is regenerated every time the counter rolls over or is reset. When the counter rolls over or is reset, the registrar remembers the old value of SK_a for a while. To generate a temp-gruu-cookie, the registrar computes

SA = HMAC(SK_a, I_i)
temp-gruu-cookie = base64enc(I_i || SA)

where || denotes concatenation. HMAC represents any suitably strong HMAC algorithm (see RFC 2104 for a discussion of HMAC algorithms). One suitable HMAC algorithm for this purpose is HMAC-SHA256-80.

3.3.5.6.2 Generation of Temp-gruu by the SIP–PBX

According to RFC 5627 (see Section 4.3), every registration refresh generates a new temp-gruu that is valid for as long as the contact remains registered. This property is both critical for the privacy properties of temp-gruu and is expected by UAs that implement the temp-gruu procedures. Nothing in this document should be construed as changing this fundamental temp-gruu property in any way. SIP–PBXs that implement temporary GRUUs must generate a new temp-gruu according to the procedures in this section for every registration or registration refresh from GRUU-supporting UAs attached to the SIP–PBX.

Similarly, if the registration that a SIP–PBX has with its SSP expires or is terminated, then the temp-gruu cookie it maintains with the SSP will change. This change will invalidate all the temp-gruus the SIP–PBX has issued to
its UAs. If the SIP–PBX tracks this information (e.g., to include <temp-gruu> elements in registration event bodies, as described in RFC 5628), it can determine that previously issued temp-gruus are invalid by observing a change in the temp-gruu-cookie provided to it by the SSP. A SIP–PBX that issues temporary GRUUs to its UAs must maintain an HMAC key: PK_a. This value is used to validate that incoming GRUUs were generated by the SIP–PBX.

To generate a new temporary GRUU for use by its own UAs, the SIP–PBX must generate a random distinguisher value: D. The length of this value is up to implementers, but it must be long enough to prevent collisions among all the temporary GRUUs issued by the SIP–PBX. A size of 80 bits or longer is recommended. RFC 4086 describes in detail considerations on the generation of random numbers in a security context. After generating the distinguisher D, the SIP–PBX must calculate

M = base64dec(SSP-cookie) || D
E = RSA-Encrypt(K_e2, M)
PA = HMAC(PK_a, E)
Temp-Gruu-userpart = "tgruu." || base64(E) || "." || base64(PA)

where || denotes concatenation. HMAC represents any suitably strong HMAC algorithm (see RFC 2104 for a discussion of HMAC algorithms). One suitable HMAC algorithm for this purpose is HMAC-SHA256-80.

Finally, the SIP–PBX adds a gr parameter to the temporary GRUU that can be used to uniquely identify the UA registration record to which the GRUU corresponds. The means of generation of the gr parameter are left to the implementer, as long as they satisfy the properties of a GRUU as described in RFC 5627 (see Section 4.3). One valid approach for generation of the gr parameter is calculation of E and A as described in RFC 5627 and forming the gr parameter as

gr = base64enc(E) || base64enc(A)

Using this procedure may result in a temporary GRUU returned to the registering UA by the SIP–PBX that looks similar to this:

<allOneLine>
Contact: <sip:[email protected]>
;temp-gruu="sip:tgruu.MQyaRiLEd78RtaWkcP7N8Q.5qVbsasdo2pkKw@
ssp.example.com;gr=YZGSCjKD42ccxO08pA7HwAM4XNDIlMSL0HlA"
;+sip.instance="<urn:uuid:d0e2f290-104b-11df-8a39-0800200c9a66>"
;expires=3600
</allOneLine>

3.3.5.6.3 Decoding of Temp-gruu by the SSP

When the SSP proxy receives a request in which the user part begins with "tgruu.", it extracts the remaining portion and splits it at the “.” character into E and PA. It discards PA. It then computes E by performing a base64 decode of E. Next, it computes

M = RSA-Decrypt(K_e1, E)

The SSP proxy extracts the fixed-length temp-gruu-cookie information from the beginning of this M and discards the remainder (which will be the distinguisher added by the SIP–PBX). It then validates this temp-gruu-cookie. If valid, it uses it to locate the corresponding SIP–PBX registration record and routes the message appropriately. If the nonnormative, exemplary algorithm described earlier is used to generate the temp-gruu-cookie, then this identification is performed by splitting the temp-gruu-cookie information into its 48-bit counter I and 80-bit HMAC. It validates that the HMAC matches the counter I and then uses counter I to locate the SIP–PBX registration record in its map. If the counter has rolled over or reset, this computation is performed with the current and previous SK_a.

3.3.5.6.4 Decoding of Temp-gruu by the SIP–PBX

When the SIP–PBX receives a request in which the user part begins with "tgruu.", it extracts the remaining portion and splits it at the “.” character into E and PA. It then computes E and PA by performing a base64 decode of E and PA, respectively. Next, it computes

PAc = HMAC(PK_a, E)

where HMAC is the HMAC algorithm used for the steps described earlier. If this computed value for PAc does not match the value of PA extracted from the GRUU, then the GRUU is rejected as invalid. The SIP–PBX then uses the value of the gr parameter to locate the UA registration to which the GRUU corresponds, and routes the message accordingly.

3.3.5.7 Interaction with SIP–PBX Aggregate Registration State

If the SIP–PBX (or another interested and authorized party) wishes to monitor or audit the registration state for all of the AORs currently registered to that SIP–PBX, it can subscribe to the SIP registration event package at the SIP–PBX’s main URI, that is, the URI used in the To header field of the REGISTER request. The NOTIFY messages for such a subscription will contain a body that contains one record for each AOR associated with the SIP–PBX. The AORs will be in the format expected to be received by the SSP (e.g.,
sip:[email protected]), and the contacts will correspond to the mapped contact created by the registration (e.g., sip:[email protected]). In particular, the bnc parameter is forbidden from appearing in the body of a reg-event NOTIFY request unless the subscriber has indicated knowledge of the semantics of the bnc parameter. The means for indicating this support are beyond the scope of this document. Because the SSP does not necessarily know which GRUUs have been issued by the SIP–PBX to its associated UAs, these records will not generally contain the <temp-gruu> or <pub-gruu> elements defined in RFC 5628. This information can be learned, if necessary, by subscribing to the individual AOR registration state, as described below.

3.3.5.8 Interaction with Individual AOR Registration State

As described in Section 2.2, the SSP will generally retarget all requests addressed to an AOR owned by a SIP–PBX to that SIP–PBX according to the mapping established at registration time. Although policy at the SSP may override this generally expected behavior, proper behavior of the registration event package requires that all reg event SUBSCRIBE requests are processed by the SIP–PBX. As a consequence, the requirements on an SSP for processing registration event package SUBSCRIBE requests are not left to policy. If the SSP receives a SUBSCRIBE request for the registration event package with a Request-URI that indicates an AOR registered via the bnc mechanism defined in this document, then the SSP must proxy that SUBSCRIBE to the SIP–PBX in the same way that it would proxy an INVITE bound for that AOR, unless the SSP has and can maintain a copy of complete, accurate, and up-to-date information from the SIP–PBX (e.g., through an active back-end subscription).

If the Request-URI in a SUBSCRIBE request for the registration event package indicates a contact that is registered by more than one SIP–PBX, then the SSP proxy will fork the SUBSCRIBE request to all the applicable SIP–PBXs. Similarly, if the Request-URI corresponds to a contact that is both implicitly registered by a SIP–PBX and explicitly registered directly with the SSP proxy, then the SSP proxy will semantically fork the SUBSCRIBE request to the applicable SIP–PBX or SIP–PBXs and to the registrar function (which will respond with registration data corresponding to the explicit registrations at the SSP). The forking in both of these cases can be avoided if the SSP has and can maintain a copy of up-to-date information from the PBXs. RFC 3680 indicates that “a subscriber must not create multiple dialogs as a result of a single registration event subscription request.” Consequently, subscribers who are not aware of the extension described by this document will accept only one dialog in response to such requests.

In the case described in the preceding paragraph, this behavior will result in such clients receiving accurate but incomplete information about the registration state of an AOR. As an explicit change to the normative behavior of RFC 3680, this document stipulates that subscribers to the registration event package may create multiple dialogs as the result of a single subscription request. This will allow subscribers to create a complete view of an AOR’s registration state. Defining the behavior as described above is important, since the reg-event subscriber is interested in finding out about the comprehensive list of devices associated with the AOR. Only the SIP–PBX will have authoritative access to this information. For example, if the user has registered multiple UAs with differing capabilities, the SSP will not know about the devices or their capabilities. By contrast, the SIP–PBX will. If the SIP–PBX is not registered with the SSP when a registration event subscription for a contact that would be implicitly registered if the SIP–PBX were registered is received, then the SSP should accept the subscription and indicate that the user is not currently registered. Once the associated SIP–PBX is registered, the SSP should use the subscription migration mechanism defined in RFC 6665 (see Section 4.5) to migrate the subscription to the SIP–PBX.

When a SIP–PBX receives a registration event subscription addressed to an AOR that has been registered using the bulk registration mechanism described in this document, then each resulting registration information document SHOULD contain an aor attribute in its <registration/> element that corresponds to the AOR at the SSP. For example, consider a SIP–PBX that has registered with an SSP that has a domain of ssp.example.com. The SIP–PBX used a Contact URI of sip:198.51.100.3:5060;bnc. After such registration is complete, a registration event subscription arriving at the SSP with a Request-URI of sip:[email protected] will be retargeted to the SIP–PBX, with a Request-URI of sip:[email protected]:5060. The resulting registration document created by the SIP–PBX would contain a <registration/> element with an aor attribute of sip:[email protected]. This behavior ensures that subscribers external to the system (and unaware of GIN [generate implicit numbers] procedures) will be able to find the relevant information in the registration document (since they will be looking for the publicly visible AOR, not the address used for sending information from the SSP to the SIP–PBX). A SIP–PBX that supports both GRUU procedures and the registration event packages should implement the extension defined in RFC 5628.

3.3.5.9 Interaction with Client-Initiated (Outbound) Connections

RFC 5626 (see Section 13.2) defines a mechanism that allows UAs to establish long-lived TCP connections or UDP associations with a proxy in a way that allows bidirectional traffic between the proxy and the UA. This behavior is
particularly important in the presence of NATs, and whenever TLS (RFC 5246) security is required. Neither the SSP nor the SIP–PBX is required to support client-initiated connections. Generally, the outbound mechanism works with the solution defined in this document, without any modifications. Implementers should note that the instance ID used between the SIP–PBX and the SSP’s registrar identifies the SIP–PBX itself, and not any of the UAs registered with the SIP–PBX. As a consequence, any attempts to use caller preferences (defined in RFC 3841, see Section 9.9) to target a specific instance are likely to fail. This should not be an issue, as the preferred mechanism for targeting specific instances of a UA is GRUU.

3.3.5.10 Interaction with Nonadjacent Contact Registration (Path) and Service-Route Discovery

RFC 3327 (see Section 2.8) defines a means by which a registrar and its associated proxy can be informed of a route that is to be used between the proxy and the registered UA. The scope of the route created by a Path header field is contact specific; if an AOR has multiple contacts associated with it, the routes associated with each contact may be different from each other. Support for nonadjacent contact registration is required in all SSPs and SIP–PBXs implementing the multiple-AOR-registration protocol described in this document. At registration time, any proxies between the UA and the registrar may add themselves to the Path header field. By doing so, they request that any requests destined to the UA as a result of the associated registration include them as part of the Route toward the UA. Although the path mechanism does deliver the final path value to the registering UA, UAs typically ignore the value of the path. To provide similar functionality in the opposite direction (that is, to establish a route for requests sent by a registering UA), RFC 3608 (see Section 2.8) defines a means by which a UA can be informed of a route that is to be used by the UA to route all outbound requests associated with the AOR used in the registration.

This information is scoped to the AOR within the UA, and is not specific to the contact (or contacts) in the REGISTER request. Support of service route discovery is optional in SSPs and SIP–PBXs. The registrar unilaterally generates the values of the service route using whatever local policy it wishes to apply. Although it is common to use the Path or Route header field information in the request in composing the service route, registrar behavior is not constrained in any way that requires it to do so. In considering the interaction between these mechanisms and the registration of multiple AORs in a single request, implementers of proxies, registrars, and intermediaries must keep in mind the following issues, which stem from the fact that GIN effectively registers multiple AORs and multiple contacts.

First, all location service records that result from expanding a single Contact URI containing a bnc parameter will necessarily share a single path. Proxies will be unable to make policy decisions on a contact-by-contact basis regarding whether to include themselves in the path. Second, and similarly, all AORs on the SIP–PBX that are registered with a common REGISTER request will be forced to share a common service route. One interesting technique that the path and service route mechanisms enable is the inclusion of a token or cookie in the user portion of the service route or path entries. This token or cookie may convey information to proxies about the identity, capabilities, or policies associated with the user. Since this information will be shared among several AORs and several contacts when multiple AOR registration is employed, care should be taken to ensure that doing so is acceptable for all AORs and all contacts registered in a single REGISTER request.

3.3.5.11 Examples

Note that the following examples elide any steps related to authentication. This is done for the sake of clarity. Actual deployments will need to provide a level of authentication appropriate to their system.

3.3.5.11.1 Usage Scenario: Basic Registration

This example shows the message flows for a basic bulk REGISTER transaction (Figure 3.3), followed by an INVITE addressed to one of the registered UAs. Example messages are shown after the sequence diagram (Figure 3.3).

F1: The SIP–PBX registers with the SSP for a range of AORs.

REGISTER sip:ssp.example.com SIP/2.0
Via: SIP/2.0/UDP 198.51.100.3:5060;branch=z9hG4bKnashds7
Max-Forwards: 70
To: <sip:[email protected]>
From: <sip:[email protected]>;tag=a23589
Call-ID: 843817637684230@998sdasdh09
CSeq: 1826 REGISTER
Proxy-Require: gin
Require: gin
Supported: path
Contact: <sip:198.51.100.3:5060;bnc>
Expires: 7200
Content-Length: 0

F3: The SSP receives a request for an AOR assigned to the SIP–PBX.

INVITE sip:[email protected] SIP/2.0
F1. REGISTER
    Contact: <sip:198.51.100.4;bnc>
F2. 200 OK
F3. INVITE sip:[email protected]
F4. INVITE sip:[email protected]

Figure 3.3 Basic bulk SIP REGISTER transaction. (Copyright IETF. Reproduced with permission.)

F4: The SSP retargets the incoming request according to the information received from the SIP–PBX at registration time.

INVITE sip:[email protected] SIP/2.0
Via: SIP/2.0/UDP ssp.example.com;branch=z9hG4bKa45cd5c52a6dd50
Via: SIP/2.0/UDP foo.example;branch=z9hG4bKa0bc7a0131f0ad

3.3.5.11.2 Usage Scenario: Using Path to Control Request-URI

This example shows a bulk REGISTER transaction with the SSP making use of the Path header field (Figure 3.4) extension (RFC 3327, see Section 2.8). This allows the SSP to designate a domain on the incoming Request-URI that does not necessarily resolve to the SIP–PBX when the SSP applies RFC 3263 (see Section 8.2.4) procedures to it.

F1. REGISTER
    Path: <sip:[email protected];lr>
    Contact: <sip:pbx.example;bnc>
F2. 200 OK
F3. INVITE sip:[email protected]
F4. INVITE sip:[email protected]
    Route: <sip:[email protected];lr>

Figure 3.4 Bulk REGISTER transaction with SSP making use of path header.
F1: The SIP–PBX registers with the SSP for a range of AORs. It includes the form of the URI it expects to receive in the Request-URI in its Contact header field, and it includes information that routes to the SIP–PBX in the Path header field.

Contact: <sip:[email protected]:2081>
Content-Type: application/sdp
Content-Length:...
<sdp body here>
described earlier. The detail of all ABNF syntaxes can be seen in Section 2.4.1.2.

3.4.2 Capability Expression Using Media Feature Tag

The UA capabilities that are used in the Contact header field using the URI for supporting different media are specified in RFC 3840 (see Section 2.11) and other RFCs. Table 2.12 (Section 2.11) shows those SIP media feature tags, their description, and example usage.

3.4.3 Usage of the Content Negotiation Framework

RFC 3840 (also see Section 2.11) makes heavy use of the terminology and concepts in the content negotiation work carried out within the IETF, and documented in several RFCs, such as RFC 2506 (which provides a template for registering media feature tags), RFC 2533 (which presents a syntax and matching algorithm for media feature sets), RFC 2738 (which provides a minor update to RFC 2533), and RFC 2703 (which provides a general framework for content negotiation). A feature collection represents a single point in this space. It represents a particular rendering or instance of an entity (in our case, a UA). For example, a rendering of a UA would define an instantaneous mode of operation that it can support. One such rendering would be processing the INVITE method, which carried the application/Session Description Protocol (SDP) MIME type, sent to a UA for a user that is speaking English.

RFC 2533 describes syntax for writing down these N-dimensional Boolean functions, borrowed from LDAP defined in RFC 4515. It uses a prolog-style syntax that is fairly self-explanatory. This representation is called a feature set predicate. The base unit of the predicate is a filter, which is a Boolean expression encased in round brackets. A filter can be complex, where it contains conjunctions and disjunctions of other filters, or it can be simple. A simple filter is one that expresses a comparison operation on a single media feature tag. For example, consider the feature set predicate:

(& (foo=A)
   (bar=B)
   (| (baz=C) (& (baz=D) (bif=E))))

This defines a function over four media features: foo, bar, baz, and bif. Any point in feature space with foo equal to A, bar equal to B, and either baz equal to C, or baz equal to D together with bif equal to E, is in the feature set defined by this feature set predicate. Note that the predicate does not say anything about the number of dimensions in feature space. The predicate operates on a feature space of any number of dimensions, but only those dimensions labeled foo, bar, baz, and bif matter. The result is that values of other media features do not matter. The feature collection {foo=A,bar=B,baz=C,bop=F} is in the feature set described by the predicate, even though the media feature tag bop is not mentioned.

Feature set predicates are therefore inclusive by default. A feature collection is present unless the Boolean predicate rules it out. This was a conscious design choice in RFC 2533. RFC 2533 also talks about matching a preference with a capability set. This is accomplished by representing both with a feature set. A preference is a feature set: it is a specification of a number of feature collections, any one of which would satisfy the requirements of the sender. A capability is also a feature set: it is a specification of the feature collections that the recipient supports. There is a match when the spaces defined by both feature sets overlap. When there is overlap, there exists at least one feature collection that exists in both feature sets, and therefore a modality or rendering desired by the sender that is supported by the recipient. This leads directly to the definition of a match. Two feature sets match if there exists at least one feature collection present in both feature sets.

Computing a match for two general feature set predicates is not easy. RFC 2533 presents an algorithm for doing it by expanding an arbitrary expression into disjunctive normal form. However, the feature set predicates used by this specification are constrained. They are always in conjunctive normal form, with each term in the conjunction describing values for different media features. This makes computation of a match easy. It is computed independently for each media feature, and then the feature sets overlap if media features specified in both sets overlap. Computing the overlap of a single media feature is very straightforward, and is a simple matter of computing whether two finite sets overlap. Since the content negotiation work was primarily meant to apply to documents or other resources with a set of possible renderings, it is not immediately apparent how it is used to model SIP UAs.

A feature set is composed of a set of feature collections, each of which represents a specific rendering supported by the entity described by the feature set. In the context of a SIP UA, a feature collection represents an instantaneous modality. That is, if we look at the run time processing of a SIP UA and take a snapshot in time, the feature collection describes what it is doing at that very instant. This model is important, since it provides guidance on how to determine whether something is a value for a particular feature tag, or a feature tag by itself. If two properties can be exhibited by a UA simultaneously so that both are present in an instantaneous modality, they need to be represented by separate media feature tags.

For example, a UA may be able to support some number of media types: audio, video, and control. Should each of
these be different values for a single media-types feature tag, or should each of them be a separate Boolean feature tag? The model provides the answer. Since, at any instant in time, a UA could be handling both audio and video, they need to be separate media feature tags. However, the SIP methods supported by a UA can each be represented as different values for the same media feature tag (the sip.methods tag), because fundamentally, a UA processes a single request at a time. It may be multithreading, so that it appears that this is not so, but at a purely functional level, it is true. Clearly, there are weaknesses in this model; however, it serves as a useful guideline for applying the concepts of RFC 2533 to the problem at hand.

To construct a set of Contact header field parameters that indicate capabilities, a UA constructs a feature predicate for that contact. This process is described in terms of RFC 2533 and its minor update (RFC 2738), syntax and constructs, followed by a conversion to the syntax used in this specification. However, this represents a logical flow of processing. There is no requirement that an implementation actually use RFC 2533 syntax as an intermediate step. When using the sip.methods feature tag, a UA must not include values that correspond to methods not standardized in IETF Standards Track RFCs. When using the sip.events feature tag, a UA must not include values that correspond to event packages not standardized in IETF Standards Track RFCs. When using the sip.schemes feature tag, a UA must not include values that correspond to schemes not standardized in IETF Standards Track RFCs. When using the sip.extensions feature tag, a UA must not include values that correspond to option tags not standardized in IETF Standards Track RFCs.

Note that the sip.schemes feature tag does not indicate the scheme of the registered URI. Rather, it indicates schemes that a UA is capable of sending requests to, should such a URI be received in a web page or Contact header field of a redirect response. It is recommended that a UA provide complete information in its contact predicate. That is, it should provide information on as many feature tags as possible. The mechanisms in this specification work best when UAs register complete feature sets. Furthermore, when a UA registers values for a particular feature tag, it MUST list all values that it supports. For example, when including the sip.methods feature tag, a UA must list all methods it supports.

The contact predicate constructed by a UA must be an AND of terms (called a conjunction). Each term is either an OR (called a disjunction) of simple filters or negations of simple filters, or a single simple filter or negation of a single filter. In the case of a disjunction, each filter in the disjunction

If a string (as defined in RFC 2533) is used as the value of a simple filter, that value must not include the < or > characters, the simple filter must not be negated, and it must be the only simple filter for that particular feature tag. This contact predicate is then converted to a list of feature parameters, following the procedure outlined below.

The contact predicate is a conjunction of terms. Each term indicates constraints on a single feature tag, and each term is represented by a separate feature parameter that will be present in the Contact header field. The syntax of this parameter depends on the feature tag. Each forward slash in the feature tag is converted to a single quote, and each colon is converted to an exclamation point. For the base tags (that is, those feature tags documented in this specification: sip.audio, sip.automata, sip.class, sip.duplex, sip.data, sip.control, sip.mobility, sip.description, sip.events, sip.priority, sip.methods, sip.extensions, sip.schemes, sip.application, sip.video, language, type, sip.isfocus, sip.actor, and sip.text), the leading sip., if present, is stripped. For feature tags not in this list, the leading sip. must not be stripped if present, and indeed, a plus sign (“+”) must be added as the first character of the Contact header field parameter. The result is the feature parameter name. As a result of these rules, the base tags appear naked in the Contact header field; they have neither a + nor a sip. prefix. All other tags will always have a leading + when present in the Contact header field, and will additionally have a sip. if the tag is in the SIP tree.

The value of the feature parameter depends on the term of the conjunction. If the term is a Boolean expression with a value of true, that is, (sip.audio=TRUE), the contact parameter has no value. If the term of the conjunction is a disjunction, the value of the contact parameter is a quoted string. The quoted string is a comma-separated list of strings, each one derived from one of the terms in the disjunction. If the term of the conjunction is a negation, the value of the contact parameter is a quoted string. The quoted string begins with an exclamation point (!), and the remainder is constructed from the expression being negated.

The remaining operation is to compute a string from a primitive filter. If the filter is a simple filter that is performing a numeric comparison, the string starts with an octothorpe (#), followed by the comparator in the filter (=, >=, or <=), followed by the value from the filter. If the value from the filter is expressed in rational form (X/Y), then X and Y are divided, yielding a decimal number, and this decimal number is output to the string. RFC 2533 uses a fractional notation to describe rational numbers. This specification uses a decimal form. The above text merely converts between the
must indicate feature values for the same feature tag (i.e., the two representations. Practically speaking, this conversion is
disjunction represents a set of values for a particular feature not needed since the numbers are the same in either case.
tag), while each element of the conjunction must be for a However, it is described in case implementations wish to
different feature tag. Each simple filter can be an equality, or directly plug the predicates generated by the rules in this sec-
in the case of numeric feature tags, an inequality or range. tion into an RFC 2533 implementation.
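The naming rule described above (base tags lose their leading sip. and gain no +, while every other tag keeps its name and gains a leading +) can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the names BASE_TAGS and feature_param_name are assumptions introduced here, and the set of base tags is copied from the list given in the text.

```python
# Base feature tags listed in the text; these appear "naked" in the
# Contact header field (no "+" and no "sip." prefix).
BASE_TAGS = frozenset({
    "sip.audio", "sip.automata", "sip.class", "sip.duplex", "sip.data",
    "sip.control", "sip.mobility", "sip.description", "sip.events",
    "sip.priority", "sip.methods", "sip.extensions", "sip.schemes",
    "sip.application", "sip.video", "language", "type", "sip.isfocus",
    "sip.actor", "sip.text",
})


def feature_param_name(tag: str) -> str:
    """Map a feature tag to its Contact parameter name (hypothetical helper)."""
    if tag in BASE_TAGS:
        # Base tag: strip the leading "sip." if present, add no "+".
        return tag[len("sip."):] if tag.startswith("sip.") else tag
    # Any other tag: keep a leading "sip." if present and prepend "+".
    return "+" + tag


if __name__ == "__main__":
    for tag in ("sip.mobility", "language", "sip.newparam", "rangeparam"):
        print(tag, "->", feature_param_name(tag))
```

Under these rules a base tag such as sip.mobility is encoded as mobility, whereas a tag outside the list, such as sip.newparam, is encoded as +sip.newparam.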
If the filter is a range (foo=X..Y), the string is equal to X:Y, where X and Y have been converted from fractional numbers (A/B) to their decimal equivalent. If the filter is an equality over a token or Boolean, then that token or Boolean value (TRUE or FALSE) is output to the string. If the filter is an equality over a quoted string, the output is a less than (<), followed by the quoted string, followed by a greater than (>). As an example, this feature predicate:
(& (sip.mobility=fixed)
   (| (! (sip.events=presence)) (sip.events=message-summary))
   (| (language=en) (language=de))
   (sip.description="PC")
   (sip.newparam=TRUE)
   (rangeparam=-4..5125/1000))
would be converted into the following feature parameters:
mobility="fixed";events="!presence,message-summary";language="en,de";
  description="<PC>";+sip.newparam;+rangeparam="#-4:+5.125"
These feature tags would then appear as part of the Contact header field:
Contact: <sip:[email protected]>;
  mobility="fixed";events="!presence,message-summary";
  language="en,de";description="<PC>";
  +sip.newparam;+rangeparam="#-4:+5.125"
Notice how the leading sip. was stripped from the sip.mobility, sip.events, and sip.description feature tags before encoding them in the Contact header field. This is because these feature tags are among the base tags listed above. It is for this reason that these feature tags were not encoded with a leading + either. However, the sip.newparam feature tag was encoded with both the + and its leading sip., and the rangeparam was also encoded with a leading +. This is because neither of these feature tags is defined in this specification. As such, the leading sip. is not stripped off, and a + is added.
3.4.3.1 Expressing Capabilities in a Registration
When a UA registers, it can choose to indicate a feature set associated with a registered contact. Whether or not a UA does so depends on what the registered URI represents. If the registered URI represents a UA instance (the common case in registrations), a UA compliant to this specification should indicate a feature set using the mechanisms described here. If, however, the registered URI represents an AOR, or some other resource that is not representable by a single feature set, it should not include a feature set. As an example, if a user wishes to forward calls from sip:[email protected] to sip:[email protected], it could generate a registration that looks like, in part
REGISTER sip:example.com SIP/2.0
To: sip:[email protected]
Contact: sip:[email protected]
In this case, the registered contact is not identifying a UA, but rather, another AOR. In such a case, the registered contact would not indicate a feature set. However, in some cases, a UA may wish to express feature parameters for an AOR. One example is an AOR that represents a multiplicity of devices in a home network, and routes to a proxy server in the user’s home. Since all devices in the home are for personal use, the AOR itself can be described with the ;class="personal" feature parameter. A registration that forwards calls to this home AOR could make use of that feature parameter.
Generally speaking, a feature parameter can only be associated with an AOR if all devices bound to that AOR share the exact same set of values for that feature parameter. Similarly, in some cases, a UA can exhibit one characteristic or another; however, the characteristic is not known in advance. For example, a UA could represent a device that is a phone with an embedded answering machine. The ideal way to treat such devices is to model them as if they were actually a proxy fronting two devices: a phone (which is never an answering machine) and an answering machine (which is never a phone). The registration from this device would be constructed as if it were an AOR, as per the procedures above. Generally, this means that, unless the characteristic is identical between the logical devices, that characteristic will not be present in any registration generated by the actual device.
This feature set that a UA would like to associate with a contact that it is registering is constructed and converted to a series of Contact header field parameters, as described earlier, and those feature parameters are added to the Contact header field value containing the URI to which the parameters apply. The Allow, Accept, Accept-Language, and Allow-Events (defined in RFC 6665, see Sections 5.1 and 2.8.2) header fields are allowed in REGISTER requests, and also indicate capabilities. However, their semantics in REGISTER are different: they indicate capabilities, used by the registrar, for generation of the response. As such, they are not a substitute or an alternate for the Contact feature parameters, which indicate the capabilities of the UA generally speaking.
The REGISTER request may contain a Require header field with the value pref if the client wants to be sure that the registrar understands the extensions defined in this specification. This means that the registrar will store the feature parameters, and make them available to elements accessing
the location service within the domain. In the absence of the Require header field, a registrar that does not understand this extension will simply ignore the Contact header field parameters. If a UA registers against multiple separate AORs, and the contacts registered for each have different capabilities, a UA must use different URIs in each registration. This allows the UA to uniquely determine the feature set that is associated with the Request-URI of an incoming request. As an example, a voice-mail server that is a UA that supports audio and video media types and is not mobile would construct a feature predicate like this:
(& (sip.audio=TRUE)
   (sip.video=TRUE)
   (sip.actor=msg-taker)
   (sip.automata=TRUE)
   (sip.mobility=fixed)
   (| (sip.methods=INVITE) (sip.methods=BYE)
      (sip.methods=OPTIONS)
      (sip.methods=ACK) (sip.methods=CANCEL)))
These would be converted into feature parameters and included in the REGISTER request:
REGISTER sip:example.com SIP/2.0
From: sip:[email protected];tag=asd98
To: sip:[email protected]
Call-ID: [email protected]
CSeq: 9987 REGISTER
Max-Forwards: 70
Via: SIP/2.0/UDP host.example.com;branch=z9hG4bKnashds8
Contact: <sip:[email protected]>;
  audio;video;
  actor="msg-taker";automata;mobility="fixed";
  methods="INVITE,BYE,OPTIONS,ACK,CANCEL"
Content-Length: 0
Note that a voice-mail server is usually an automata and a message taker. When a UAC refreshes its registration, it must include its feature parameters in that refresh if it wishes for them to remain active. Furthermore, when a registrar returns a 200 OK response to a REGISTER request, each Contact header field value must include all the feature parameters associated with that URI.
description provided earlier. These are then added as Contact header field parameters in the request or response. The feature parameters can be included in both initial requests and mid-dialog requests, and may change mid-dialog to signal a change in UA capabilities.
There is overlap in the callee capabilities mechanism with the Allow, Accept, Accept-Language, and Allow-Events (defined in RFC 6665, see Section 2.8.2) header fields, which can also be used in target refresh requests. Specifically, the Allow header field and sip.methods feature tag indicate the same information. The Accept header field and the type feature tag indicate the same information. The Accept-Language header field and the language feature tag indicate the same information. The Allow-Events header field and the sip.events feature tag indicate the same information. It is possible that other header fields and feature tags defined in the future may also overlap. When there exists a feature tag that describes a capability that can also be represented with a SIP header field, a UA must use the header field to describe the capability. A UA receiving a message that contains both the header field and the feature tag must use the header field, and not the feature tag.
3.4.5 OPTIONS Processing
When a UAS compliant to this specification receives an OPTIONS request, it may add feature parameters to the Contact header field in the OPTIONS response for the purpose of indicating the capabilities of the UA. To do that, it constructs a set of feature parameters according to the description provided earlier. These are then added as Contact header field parameters in the OPTIONS response. Indeed, if feature parameters were included in the registration generated by that UA, those same parameters should be used in the OPTIONS response. The guidelines regarding the overlap of the various callee capabilities feature tags with SIP header fields described earlier apply to the generation of OPTIONS responses as well. In particular, they apply when a Contact header field is describing the UA that generated the OPTIONS response. When a Contact header field in the OPTIONS response is identifying a different UA, there is no overlap.
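The rules given earlier for deriving a feature parameter value from a term of the conjunction (no value for a Boolean TRUE, a quoted comma-separated list for a disjunction, a leading ! for a negation, # for numeric filters, and < > around quoted strings) can be sketched in Python as follows. This is a hedged illustration, not a reference implementation: the tuple representation of filters and the names number_to_string, filter_to_string, and term_to_value are assumptions introduced here. Ranges are written with a leading #, following the worked example in the text, and positive numbers are printed without an explicit plus sign, which is a simplification of this sketch.

```python
from fractions import Fraction
from typing import List, Optional


def number_to_string(value: Fraction) -> str:
    """Divide X by Y and print a decimal; whole numbers get no fraction part."""
    if value.denominator == 1:
        return str(value.numerator)
    return str(float(value))  # float precision is an assumption of this sketch


def filter_to_string(flt: tuple) -> str:
    """Encode one primitive filter (hypothetical tuple representation)."""
    kind = flt[0]
    if kind == "not":      # ("not", inner_filter)
        return "!" + filter_to_string(flt[1])
    if kind == "cmp":      # ("cmp", "=" | ">=" | "<=", Fraction)
        return "#" + flt[1] + number_to_string(flt[2])
    if kind == "range":    # ("range", low, high), written "#X:Y" as in the example
        return "#" + number_to_string(flt[1]) + ":" + number_to_string(flt[2])
    if kind == "string":   # ("string", "PC") becomes <PC>
        return "<" + flt[1] + ">"
    if kind == "bool":     # ("bool", True) becomes TRUE
        return "TRUE" if flt[1] else "FALSE"
    if kind == "token":    # ("token", "fixed") is emitted unchanged
        return flt[1]
    raise ValueError("unknown filter kind: %r" % (kind,))


def term_to_value(filters: List[tuple]) -> Optional[str]:
    """Encode one term: a single filter, or several filters of a disjunction.

    Returns None when the parameter carries no value (Boolean TRUE).
    """
    if filters == [("bool", True)]:
        return None
    return '"' + ",".join(filter_to_string(f) for f in filters) + '"'


if __name__ == "__main__":
    events = [("not", ("token", "presence")), ("token", "message-summary")]
    print("events=" + term_to_value(events))
```

With these assumptions, the sip.events term of the earlier example, (| (! (sip.events=presence)) (sip.events=message-summary)), yields the value "!presence,message-summary".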
target of the OPTIONS request is identified by the Request-URI, which could identify another UA or a SIP server. If the OPTIONS method is addressed to a proxy server, the Request-URI is set without a user part, similar to the way a Request-URI is set for a REGISTER request.
Alternatively, a server receiving an OPTIONS request with a Max-Forwards header field value of 0 may respond to the request regardless of the Request-URI. This behavior is common with HTTP/1.1. This behavior can be used as a traceroute functionality to check the capabilities of individual hop servers by sending a series of OPTIONS requests with incremented Max-Forwards values. As is the case for general UA behavior, the transaction layer can return a timeout error if the OPTIONS yields no response. This may indicate that the target is unreachable and hence unavailable. An OPTIONS request may be sent as part of an established dialog to query the peer on capabilities that may be utilized later in the dialog.
3.5.1 OPTIONS Request
A Contact header field may be present in an OPTIONS method. An Accept header field should be included to indicate the type of message body the UAC wishes to receive in the response. Typically, this is set to a format that is used to describe the media capabilities of a UA, such as SDP (application/SDP). The response to an OPTIONS request is assumed to be scoped to the Request-URI in the original request. However, only when an OPTIONS method is sent as part of an established dialog is it guaranteed that future requests will be received by the server that generated the OPTIONS response. An example OPTIONS request is shown below:
OPTIONS sip:[email protected] SIP/2.0
Via: SIP/2.0/UDP pc33.atlanta.com;branch=z9hG4bKhjhs8ass877
Max-Forwards: 70
To: <sip:[email protected]>
From: Alice <sip:[email protected]>;tag=1928301774
Call-ID: a84b4c76e66710
CSeq: 63104 OPTIONS
Contact: <sip:[email protected]>
Accept: application/SDP
Content-Length: 0
3.5.2 Response to OPTIONS Request
The response code chosen to an OPTIONS method must be the same as that which would have been chosen had the request been an INVITE. For example, a 200 OK would be returned if the UAS is ready to accept a call, and a 486 Busy Here would be returned if the UAS is busy. This allows an OPTIONS request to be used to determine the basic state of a UAS, which can be an indication of whether the UAS will accept an INVITE request. An OPTIONS request received within a dialog generates a 200 OK response that is identical to one constructed outside a dialog and does not have any impact on the dialog. This use of OPTIONS has limitations due to the differences in proxy handling of OPTIONS and INVITE requests. While a forked INVITE can result in multiple 200 OK responses being returned, a forked OPTIONS method will only result in a single 200 OK response, since it is treated by proxies using the non-INVITE handling.
If the response to an OPTIONS method is generated by a proxy server, the proxy returns a 200 OK, listing the capabilities of the server. The response does not contain a message body. Allow, Accept, Accept-Encoding, Accept-Language, and Supported header fields should be present in a 200 OK response to an OPTIONS request. If the response is generated by a proxy, the Allow header field should be omitted as it is ambiguous since a proxy is method agnostic. Contact header fields may be present in a 200 OK response and have the same semantics as in a 3xx response. That is, they may list a set of alternative names and methods of reaching the user. A Warning header field may be present.
A message body may be sent, the type of which is determined by the Accept header field in the OPTIONS request (application/SDP is the default if the Accept header field is not present). If the types include one that can describe media capabilities, the UAS should include a body in the response for that purpose. Details on the construction of such a body in the case of application/SDP are described in RFC 3264 (see Section 3.8.4).
SIP/2.0 200 OK
Via: SIP/2.0/UDP pc33.atlanta.com;branch=z9hG4bKhjhs8ass877;
  received=192.0.2.4
To: <sip:[email protected]>;tag=93810874
From: Alice <sip:[email protected]>;tag=1928301774
Call-ID: a84b4c76e66710
CSeq: 63104 OPTIONS
Contact: <sip:[email protected]>
Contact: <mailto:[email protected]>
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE
Accept: application/SDP
Accept-Encoding: gzip
Accept-Language: en
Supported: foo
Content-Type: application/SDP
Content-Length: 274
(SDP not shown)
3.6 Dialogs
A dialog represents a peer-to-peer SIP relationship between two UAs that persists for some time. The dialog facilitates sequencing of messages between the UAs and proper routing of requests
between both of them. The dialog represents a context in which to interpret SIP messages. We will describe how the SIP requests and responses are used to construct a dialog, and then how subsequent requests and responses are sent within a dialog. A dialog is identified at each UA with a dialog ID, which consists of a Call-ID value, a local tag, and a remote tag. The dialog ID at each UA involved in the dialog is not the same. Specifically, the local tag at one UA is identical to the remote tag at the peer UA. The tags are opaque tokens that facilitate the generation of cryptographically random unique dialog IDs.
A dialog ID is also associated with all responses and with any request that contains a tag in the To field. The rules for computing the dialog ID of a message depend on whether the SIP element is a UAC or UAS. For a UAC, the Call-ID value of the dialog ID is set to the Call-ID of the message, the remote tag is set to the tag in the To field of the message, and the local tag is set to the tag in the From field of the message (these rules apply to both requests and responses). As one would expect, for a UAS, the Call-ID value of the dialog ID is set to the Call-ID of the message, the remote tag is set to the tag in the From field of the message, and the local tag is set to the tag in the To field of the message.
A dialog contains certain pieces of state needed for further message transmissions within the dialog. This state consists of the dialog ID, a local sequence number (used to order requests from the UA to its peer), a remote sequence number (used to order requests from its peer to the UA), a local URI, a remote URI, remote target, a Boolean flag called secure, and a route set, which is an ordered list of URIs. The route set is the list of servers that need to be traversed to send a request to the peer. A dialog can also be in the early state, which occurs when it is created with a provisional response, and it then transitions to the confirmed state when a 2xx final response arrives. For other responses, or if no response arrives at all on that dialog, the early dialog terminates.
3.6.1 Creation of a Dialog
Dialogs are created through the generation of nonfailure responses to requests with specific methods. Within this specification, only 2xx and 101–199 responses with a To tag, where the request was INVITE, will establish a dialog. A dialog established by a nonfinal response to a request is in the early state, and it is called an early dialog. Here, we describe the process for the creation of a dialog state that is not dependent on the method. UAs must assign values to the dialog ID components as described below.
into the response (including the URIs, URI parameters, and any Record-Route header field parameters, whether they are known or unknown to the UAS) and must maintain the order of those values. The UAS must add a Contact header field to the response. The Contact header field contains an address where the UAS would like to be contacted for subsequent requests in the dialog (which includes the ACK for a 2xx response in the case of an INVITE).
Generally, the host portion of this URI is the IP address or FQDN of the host. The URI provided in the Contact header field must be a SIP or SIPS URI. If the request that initiated the dialog contained a SIPS URI in the Request-URI or in the top Record-Route header field value, if there was any, or the Contact header field if there was no Record-Route header field, the Contact header field in the response must be a SIPS URI. The URI should have global scope (i.e., the same URI can be used in messages outside this dialog). In the same way, the scope of the URI in the Contact header field of the INVITE is not limited to this dialog either. It can therefore be used in messages to the UAC even outside this dialog.
The UAS then constructs the state of the dialog. This state must be maintained for the duration of the dialog. If the request arrived over TLS, and the Request-URI contained a SIPS URI, the secure flag is set to TRUE. The route set must be set to the list of URIs in the Record-Route header field from the request, taken in order and preserving all URI parameters. If no Record-Route header field is present in the request, the route set must be set to the empty set. This route set, even if empty, overrides any preexisting route set for future requests in this dialog. The remote target must be set to the URI from the Contact header field of the request.
The remote sequence number must be set to the value of the sequence number in the CSeq header field of the request. The local sequence number must be empty. The call identifier component of the dialog ID must be set to the value of the Call-ID in the request. The local tag component of the dialog ID must be set to the tag in the To field in the response to the request (which always includes a tag), and the remote tag component of the dialog ID must be set to the tag from the From field in the request. A UAS must be prepared to receive a request without a tag in the From field, in which case the tag is considered to have a value of null. This is to maintain backwards compatibility with RFC 2543 (obsoleted by RFC 3261), which did not mandate From tags. The remote URI must be set to the URI in the From field, and the local URI must be set to the URI in the To field.
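The dialog ID rules and the UAS-side construction of dialog state described in this section can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a SIP stack: messages are modeled as plain dictionaries whose keys (call_id, from_tag, to_tag, cseq, and so on) are hypothetical, as are the names DialogId, DialogState, dialog_id_of, and uas_create_dialog. A tag of None stands for the null tag of RFC 2543 peers.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class DialogId:
    call_id: str
    local_tag: Optional[str]    # None models the "null" tag
    remote_tag: Optional[str]


def dialog_id_of(msg: dict, role: str) -> DialogId:
    """Compute the dialog ID of a message as seen by a UAC or a UAS."""
    if role == "UAC":   # local tag = From tag, remote tag = To tag
        return DialogId(msg["call_id"], msg.get("from_tag"), msg.get("to_tag"))
    if role == "UAS":   # local tag = To tag, remote tag = From tag
        return DialogId(msg["call_id"], msg.get("to_tag"), msg.get("from_tag"))
    raise ValueError("role must be 'UAC' or 'UAS'")


@dataclass
class DialogState:
    dialog_id: DialogId
    local_seq: Optional[int]    # None models an empty sequence number
    remote_seq: Optional[int]
    local_uri: str
    remote_uri: str
    remote_target: str
    secure: bool
    route_set: List[str] = field(default_factory=list)


def uas_create_dialog(request: dict, response_to_tag: str,
                      over_tls: bool) -> DialogState:
    """Build UAS dialog state from the dialog-creating request."""
    return DialogState(
        dialog_id=DialogId(request["call_id"], response_to_tag,
                           request.get("from_tag")),
        local_seq=None,                       # empty until the UAS sends a request
        remote_seq=request["cseq"],
        local_uri=request["to_uri"],
        remote_uri=request["from_uri"],
        remote_target=request["contact_uri"],
        # secure only if the request arrived over TLS with a SIPS Request-URI
        secure=over_tls and request["request_uri"].lower().startswith("sips:"),
        # Record-Route values taken in order; empty list if none were present
        route_set=list(request.get("record_route", [])),
    )
```

The local tag computed by one UA equals the remote tag computed by its peer for the same message, which is why dialog_id_of swaps the From and To tags between the two roles.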
outside this dialog) in the Contact header field of the request. If the request has a Request-URI or a topmost Route header field value with a SIPS URI, the Contact header field must contain a SIPS URI. When a UAC receives a response that establishes a dialog, it constructs the state of the dialog. This state must be maintained for the duration of the dialog. If the request was sent over TLS, and the Request-URI contained a SIPS URI, the secure flag is set to TRUE. The route set must be set to the list of URIs in the Record-Route header field from the response, taken in reverse order and preserving all URI parameters. If no Record-Route header field is present in the response, the route set must be set to the empty set. This route set, even if empty, overrides any preexisting route set for future requests in this dialog. The remote target must be set to the URI from the Contact header field of the response.
The local sequence number must be set to the value of the sequence number in the CSeq header field of the request. The remote sequence number must be empty (it is established when the remote UA sends a request within the dialog). The call identifier component of the dialog ID must be set to the value of the Call-ID in the request. The local tag component of the dialog ID must be set to the tag in the From field in the request, and the remote tag component of the dialog ID must be set to the tag in the To field of the response. A UAC must be prepared to receive a response without a tag in the To field, in which case the tag is considered to have a value of null. This is to maintain backwards compatibility with RFC 2543 (obsoleted by RFC 3261), which did not mandate To tags. The remote URI must be set to the URI in the To field, and the local URI must be set to the URI in the From field.
3.6.2 Requests within a Dialog
Once a dialog has been established between two UAs, either of them may initiate new transactions as needed within the dialog. The UA sending the request will take the UAC role for the transaction. The UA receiving the request will take the UAS role. Note that these may be different roles than the UAs held during the transaction that established the dialog. Requests within a dialog may contain Record-Route and Contact header fields. However, these requests do not cause the dialog’s route set to be modified, although they may modify the remote target URI.
Specifically, requests that are not target refresh requests do not modify the dialog’s remote target URI, and requests that are target refresh requests do. For dialogs that have been established with an INVITE, the only target refresh request defined is re-INVITE. Other extensions may define different target refresh requests for dialogs established in other ways. Note that an ACK is not a target refresh request. Target refresh requests only update the dialog’s remote target URI, and not the route set formed from the Record-Route. Updating the latter would introduce severe backwards compatibility problems with systems compliant with RFC 2543 (obsoleted by RFC 3261).
3.6.2.1 UAC Behavior
3.6.2.1.1 Generating the Request
A request within a dialog is constructed by using many of the components of the state stored as part of the dialog. The URI in the To field of the request must be set to the remote URI from the dialog state. The tag in the To header field of the request must be set to the remote tag of the dialog ID. The From URI of the request must be set to the local URI from the dialog state. The tag in the From header field of the request must be set to the local tag of the dialog ID. If the value of the remote or local tags is null, the tag parameter must be omitted from the To or From header fields, respectively.
Usage of the URI from the To and From fields in the original request within subsequent requests is done for backwards compatibility with RFC 2543 (obsoleted by RFC 3261), which used the URI for dialog identification. In this specification, only the tags are used for dialog identification. It is expected that mandatory reflection of the original To and From URI in mid-dialog requests will be deprecated in a subsequent revision of this specification. The Call-ID of the request must be set to the Call-ID of the dialog. Requests within a dialog must contain strictly monotonically increasing and contiguous CSeq sequence numbers (increasing by one) in each direction (excepting ACK and CANCEL of course, whose numbers equal the requests being acknowledged or cancelled). Therefore, if the local sequence number is not empty, the value of the local sequence number must be incremented by one, and this value must be placed into the CSeq header field. The method field in the CSeq header field value must match the method of the request.
With a length of 32 bits, a client could generate, within a single call, one request a second for about 136 years before needing to wrap around. The initial value of the sequence number is chosen so that subsequent requests within the same call will not wrap around. A nonzero initial value allows clients to use a time-based initial sequence number. A client could, for example, choose the 31 most significant bits of a 32-bit second clock as an initial sequence number. The UAC uses the remote target and route set to build the Request-URI and Route header field of the request. If the route set is empty, the UAC must place the remote target URI into the Request-URI. The UAC must not add a Route header field to the request.
If the route set is not empty, and the first URI in the route set contains the lr parameter, the UAC must place the remote target URI into the Request-URI and must include a Route
header field containing the route set values in order, including all parameters. If the route set is not empty, and its first URI does not contain the lr parameter, the UAC must place the first URI from the route set into the Request-URI, stripping any parameters that are not allowed in a Request-URI. The UAC must add a Route header field containing the remainder of the route set values in order, including all parameters. The UAC must then place the remote target URI into the Route header field as the last value. For example, if the remote target is sip:user@remoteua and the route set contains
<sip:proxy1>,<sip:proxy2>,<sip:proxy3;lr>,<sip:proxy4>
the request will be formed with the following Request-URI and Route header field:
METHOD sip:proxy1
Route: <sip:proxy2>,<sip:proxy3;lr>,<sip:proxy4>,<sip:user@remoteua>
If the first URI of the route set does not contain the lr parameter, the proxy indicated does not understand the routing mechanisms described in this document and will act as specified in RFC 2543 (obsoleted by RFC 3261), replacing the Request-URI with the first Route header field value it receives while forwarding the message. Placing the Request-URI at the end of the Route header field preserves the information in that Request-URI across the strict router (it will be returned to the Request-URI when the request reaches a loose router).
A UAC should include a Contact header field in any target refresh requests within a dialog, and unless there is a need to change it, the URI should be the same as used in previous requests within the dialog. If the secure flag is true, that URI must be a SIPS URI. A Contact header field in a target refresh request updates the remote target URI. This allows a UA to provide a new contact address, should its address change during the duration of the dialog. However, requests that are not target refresh requests do not affect the remote target URI for the dialog.
Once the request has been constructed, the address of the server is computed and the request is sent, using the same procedures for requests outside of a dialog. The procedures will normally result in the request being sent to the address indicated by the topmost Route header field value or the Request-URI if no Route header field is present. Subject to certain restrictions, they allow the request to be sent to an alternate address (such as a default outbound proxy not represented in the route set).
3.6.2.1.2 Processing the Responses
The UAC will receive responses to the request from the transaction layer. If the client transaction returns a timeout, this is treated as a 408 Request Timeout response. The behavior of a UAC that receives a 3xx response for a request sent within a dialog is the same as if the request had been sent outside a dialog. Note, however, that when the UAC tries alternative locations, it still uses the route set for the dialog to build the Route header of the request. When a UAC receives a 2xx response to a target refresh request, it must replace the dialog’s remote target URI with the URI from the Contact header field in that response, if present. If the response for a request within a dialog is a 481 Call/Transaction Does Not Exist or a 408 Request Timeout, the UAC should terminate the dialog. A UAC should also terminate a dialog if no response at all is received for the request (the client transaction would inform the TU about the timeout). For INVITE-initiated dialogs, terminating the dialog consists of sending a BYE.
3.6.2.2 UAS Behavior
Requests sent within a dialog, as any other requests, are atomic. If a particular request is accepted by the UAS, all the state changes associated with it are performed. If the request is rejected, none of the state changes are performed. Note that some requests, such as INVITEs, affect several pieces of state. The UAS will receive the request from the transaction layer. If the request has a tag in the To header field, the UAS core computes the dialog identifier corresponding to the request and compares it with existing dialogs. If there is a match, this is a mid-dialog request. If the request has a tag in the To header field, but the dialog identifier does not match any existing dialogs, the UAS may have crashed and restarted, or it may have received a request for a different (possibly failed) UAS (the UASs can construct the To tags so that a UAS can identify that the tag was for a UAS for which it is providing recovery).
Another possibility is that the incoming request has been simply misrouted. On the basis of the To tag, the UAS may either accept or reject the request. Accepting the request for acceptable To tags provides robustness, so that dialogs can persist even through crashes. UAs wishing to support this capability must take into consideration some issues such as choosing monotonically increasing CSeq sequence numbers even across reboots, reconstructing the route set, and accepting out-of-range Real-Time Transport Protocol (RTP) time stamps and sequence numbers. If the UAS wishes to reject the request because it does not wish to recreate the dialog, it must respond to the request with a 481 Call/Transaction Does Not Exist status code and pass that to the server transaction. Requests that do not change in any way the state of a dialog may be received within a dialog (e.g., an OPTIONS request). They are processed as if they had been received outside the dialog.
If the remote sequence number is empty, it must be set to the value of the sequence number in the CSeq header field value in the request. If the remote sequence number was not
empty, but the sequence number of the request is lower than the remote sequence number, the request is out of order and must be rejected with a 500 Server Internal Error response. If the remote sequence number was not empty, and the sequence number of the request is greater than the remote sequence number, the request is in order. It is possible for the CSeq sequence number to be higher than the remote sequence number by more than one. This is not an error condition, and a UAS should be prepared to receive and process requests with CSeq values more than one higher than the previously received request. The UAS must then set the remote sequence number to the value of the sequence number in the CSeq header field value in the request. If a proxy challenges a request generated by the UAC, the UAC has to resubmit the request with credentials. The resubmitted request will have a new CSeq number. The UAS will never see the first request, and thus, it will notice a gap in the CSeq number space. Such a gap does not represent any error condition. When a UAS receives a target refresh request, it must replace the dialog’s remote target URI with the URI from the Contact header field in that request, if present.
3.6.3 Termination of a Dialog
Independent of the method, if a request outside of a dialog generates a non-2xx final response, any early dialogs created through provisional responses to that request are terminated. The mechanism for terminating confirmed dialogs is method specific. In this specification, the BYE method terminates a session and the dialog associated with it.
3.6.4 Example of Dialog State
Returning to the earlier example of SIP trapezoid operations shown in Figure 3.1a, Bob’s UA sends the 180 Ringing (F11) message to proxy 2, proxy 2 sends 180 Ringing (F12) to proxy 1, and proxy 1 forwards the 180 Ringing message to Alice’s UA. The early dialog state of Alice’s and Bob’s UAs is shown in Figure 3.5.
“Early” dialog state at Alice (U1) and Bob (U2) (F13)
Field            Alice (U1)                                  Bob (U2)
Dialog ID
  Call-ID        [email protected]                            [email protected]
  Local tag      9fxced76sl                                  314159
  Remote tag     314159                                      9fxced76sl
Local seqnum     2                                           -
Remote seqnum    -                                           2
Local URI        sip:[email protected]                        sip:[email protected]
Remote URI       sip:[email protected]                        sip:[email protected]
Remote target    sip:[email protected];transport=tcp          sip:[email protected]
Secure flag      -                                           -
Route set        sip:ss1.atlanta.com;lr                      sip:ss2.biloxi.com;lr
                 sip:ss2.biloxi.com;lr                       sip:ss1.atlanta.com;lr
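The remote sequence number rules and the target refresh rule described in this section can be sketched in Python, using Bob’s (U2) tags, sequence number, and route set from the dialog state above. This is a minimal illustration rather than a complete UAS: the `DialogState` class, the `process_in_dialog_request` function, and the Call-ID and Contact strings are hypothetical names chosen for the example. A request whose CSeq equals the stored value is simply accepted here, which is an assumption, since the text only rejects lower values.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class DialogState:
    """Per-dialog state kept by a UA (field names are illustrative)."""
    call_id: str
    local_tag: str
    remote_tag: str
    local_seq: Optional[int] = None    # None models an "empty" sequence number
    remote_seq: Optional[int] = None
    remote_target: Optional[str] = None
    route_set: List[str] = field(default_factory=list)

    def dialog_id(self) -> Tuple[str, str, str]:
        return (self.call_id, self.local_tag, self.remote_tag)


def process_in_dialog_request(dialog, cseq, contact=None, target_refresh=False):
    """Apply the UAS sequence-number rules; return 500 to reject, else None."""
    if dialog.remote_seq is not None and cseq < dialog.remote_seq:
        return 500  # out of order: 500 Server Internal Error
    # Empty, equal (assumption), or higher: accept. A gap of more than one
    # is not an error condition.
    dialog.remote_seq = cseq
    if target_refresh and contact is not None:
        dialog.remote_target = contact  # target refresh updates the remote target
    return None


bob = DialogState(
    call_id="hypothetical-call-id",          # placeholder, not from the figure
    local_tag="314159",
    remote_tag="9fxced76sl",
    remote_seq=2,
    route_set=["sip:ss2.biloxi.com;lr", "sip:ss1.atlanta.com;lr"],
)
rejected = process_in_dialog_request(bob, 1)   # lower than remote seqnum 2
accepted = process_in_dialog_request(
    bob, 5, contact="sip:alice@new-host.example", target_refresh=True)
print(rejected, accepted, bob.remote_seq, bob.remote_target)
```

Keeping the empty sequence number as `None` rather than 0 keeps the "empty" case distinct from a genuine CSeq value.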
3.6.5 Multiple Dialogs
RFC 5057 explains in great detail that handling multiple usages within a single dialog is complex and introduces scenarios where the right thing to do is not clear. Implementations should avoid entering into multiple usages whenever possible. New applications should be designed to never introduce multiple usages. There are some accepted SIP practices, including transfer, that currently require multiple usages. Recent work, most notably GRUU (see Section 4.3), makes those practices unnecessary. The standardization of those practices and the implementations should be revised as soon as possible to use only single-usage dialogs. More about multiple dialogs is discussed in Section 16.2 with respect to call transfer using the REFER method.
3.6.6 Early Dialog Termination Indication
A SIP early dialog is created when a non-100 provisional response is sent to the initial dialog initiation request (e.g., INVITE, outside an existing dialog). The dialog is considered to be in early state until a final response is sent. When a proxy receives an initial dialog initiation request, it can forward the request toward multiple remote destinations. When the proxy does that, it performs forking (RFC 3261; see Sections 2.2, 2.8, and 3.7). When a forking proxy receives a non-100 provisional response, or a 2xx final response, it forwards the response upstream toward the sender of the associated request. After a forking proxy has forwarded a 2xx final response, it normally generates and sends CANCEL requests downstream toward all remote destinations where it previously forked the request associated with the 2xx final response, and from which it has still not received a final response. The CANCEL requests are sent in order to terminate any outstanding early dialogs associated with the request. Upstream SIP entities might receive multiple 2xx final responses.
When a SIP entity receives the first 2xx final response, and it does not intend to accept any subsequent 2xx final responses, it will automatically terminate any other outstanding early dialog associated with the request. If the SIP entity receives a subsequent 2xx final response, it will normally generate and send an ACK request, followed by a BYE request, using the dialog identifier retrieved from the 2xx final response. A UAC can use the Request-Disposition header field (RFC 3841, see Section 9.9) to request that proxies do not generate and send CANCEL requests downstream once they have received the first 2xx final response. When a forking proxy receives a non-2xx final response, it does not always immediately forward the response upstream toward the sender of the associated request. Instead, the proxy stores the response and waits for subsequent final responses from other remote destinations where the associated request was forked. At some point, the proxy uses a specified mechanism to determine the best final response code, and forwards a final response using that response code upstream toward the sender of the associated request. When an upstream SIP entity receives the non-2xx final response, it will release resources associated with the session. The UAC will terminate, or retry, the session setup.
Since the forking proxy does not always immediately forward non-2xx final responses, upstream SIP entities (including the UAC that initiated the request) are not immediately informed that an early dialog has been terminated, and will therefore maintain resources associated with the early dialog reserved until a final response is sent by the proxy, even if the early dialog has already been terminated. A SIP entity could use the resources for other things, for example, to accept subsequent early dialogs that it otherwise would reject. RFC 6228, described here, defines a new SIP response code, 199 Early Dialog Terminated, that a SIP forking proxy and a UAS can use to indicate to upstream SIP entities, including the UAC, that an early dialog has been terminated, before a final response is sent toward the SIP entities. A UAS can send a 199 response code, before sending a non-2xx final response, for the same purpose. SIP entities that receive the 199 Early Dialog Terminated response can use it to trigger the release of resources associated with the terminated early dialog. In addition, SIP entities might also use the 199 response to make policy decisions related to early dialogs. For example, a SIP entity that controls a media gate might use the 199 Early Dialog Terminated response when deciding for which early dialogs media will be passed.
3.6.6.1 Applicability and Limitation
The 199 response code is an optimization, and it only optimizes how quickly recipients might be informed about terminated early dialogs. The achieved optimization is limited. Since the response is normally not sent reliably by a UAS, and cannot be sent reliably when generated and sent by a proxy, it is possible that some or all of the 199 Early Dialog Terminated responses will get lost before they reach the recipients. In such cases, recipients will behave the same as if the 199 response code were not used at all. One example for which a UAC could use the 199 Early Dialog Terminated response is that when it receives a 199 response, it releases resources associated with the terminated early dialog. The UAC could also use the 199 response to make policy decisions related to early dialogs. For example, if a UAC is playing media associated with an early dialog, and it then receives a 199 Early Dialog Terminated response indicating the early dialog has been terminated, it could start playing media associated with a different early dialog. Application designers utilizing the 199 response code must ensure that the application’s user experience is acceptable if all 199 responses are lost and not delivered to the recipients.
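Because 199 responses can be lost, a recipient has to release early-dialog resources on the final response as well, and may treat the 199 response only as a faster path to the same result. The following Python sketch illustrates that design point under simplifying assumptions: `EarlyDialogTracker` and its method names are hypothetical, early dialogs are keyed by their To tag alone, and "releasing resources" is modeled as appending the tag to a list.

```python
class EarlyDialogTracker:
    """Tracks the early dialogs created by one INVITE (illustrative only)."""

    def __init__(self):
        self.early = set()     # To tags of early dialogs that are still alive
        self.released = []     # order in which resources were released

    def on_provisional(self, status, to_tag):
        if status == 199:
            self._release(to_tag)        # optimization: release right away
        elif status != 100 and to_tag:
            self.early.add(to_tag)       # a non-100 1xx creates an early dialog

    def on_final_non_2xx(self):
        # Fallback path: gives the same end state even if every 199 was lost.
        for tag in sorted(self.early):
            self._release(tag)

    def _release(self, to_tag):
        # A 199 for an unknown dialog is silently ignored.
        if to_tag in self.early:
            self.early.remove(to_tag)
            self.released.append(to_tag)


tracker = EarlyDialogTracker()
tracker.on_provisional(180, "a")
tracker.on_provisional(183, "b")
tracker.on_provisional(199, "a")   # dialog "a" is released early
tracker.on_final_non_2xx()         # dialog "b" is released by the final response
print(tracker.released)
```

Running the same sequence without the 199 call releases both dialogs at the final response, which is the behavior the text requires when 199 responses are not delivered.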
3.6.6.2 UAC Behavior
When a UAC sends an initial dialog initiation request, and if it is willing to receive 199 responses, it must insert a 199 option tag in the Supported header field (RFC 3261, see Section 2.8) of the request. The option tag indicates that the UAC supports, and is willing to receive, 199 responses. A UAC should not insert a 199 option tag in the Require or the Proxy-Require header field (RFC 3261, see Section 2.8) of the request, since in many cases it would result in unnecessary session establishment failures. The UAC always needs to insert a 199 option tag in the Supported header field, in order to indicate that it supports, and is willing to receive, 199 responses, even if it also inserts the option tag in the Require or Proxy-Require header field. It is recommended that a UAC not insert a 100rel option tag (RFC 3262, see Section 2.8) in the Require header field when it also indicates support for 199 responses, unless the UAC also uses some other SIP extension or procedure that mandates it to do so. The reason is that proxies are not allowed to generate and send 199 responses when the UAC has required provisional responses to be sent reliably.
When a UAC receives a 199 response, it might release resources associated with the terminated early dialog. A UAC might also use the 199 response to make policy decisions related to early dialogs. The 199 response indicates that the early dialog has been terminated, so there is no need for the UAC to send a BYE request in order to terminate the early dialog when it receives the 199 response. The 199 response does not affect other early dialogs associated with the session establishment. For those dialogs, the normal SIP rules regarding transaction timeout, etc., still apply. Once a UAC has received and accepted a 199 Early Dialog Terminated response, it must not send any media associated with the early dialog. In addition, if the UAC is able to associate received media with early dialogs, it must not process any received media associated with the early dialog that was terminated. If multiple usages (RFC 5057, see Sections 3.6.5 and 16.2) are used within an early dialog, and it is not clear which dialog usage the 199 response terminates, SIP entities that keep dialog state shall not release resources associated with the early dialog when they receive the 199 response.
If a UAC receives an unreliably sent 199 response on a dialog that has not previously been established (this can happen if a 199 response reaches the client before the 18x response that would establish the early dialog), it shall discard the 199 response. If a UAC receives a reliably sent 199 response on a dialog that has not previously been created, it must acknowledge the 199 response, as described in RFC 3262 (see Section 2.8). If a UAC has received a 199 response for all early dialogs, and no early dialogs associated with the session establishment remain, it maintains the Proceeding state (RFC 3261, Section 3.6.2) and waits for possible subsequent early dialogs to be established, and eventually for a final response to be received.
3.6.6.3 UAS Behavior
If a UAS receives an initial dialog initiation request with a Supported header field that contains a 199 option tag, it should not send a 199 response on an early dialog associated with the request before it sends a non-2xx final response. Cases where a UAS might send a 199 response are if it has been configured to do so due to lack of support for the 199 response code by forking proxies or other intermediate SIP entities, or if it is used in an environment that specifies that it shall send a 199 response before sending a non-2xx response. If a UAS has created multiple early dialogs associated with an initial dialog initiation request (the UAS is acting similarly to a forking proxy), it does not always intend to send a final response on all of those early dialogs. If the Require header field of an initial dialog initiation request contains a 100rel option tag, proxies will not be able to generate and send 199 responses. In such cases, the UAS might choose to send a 199 response on an early dialog before it sends a non-2xx final response, even if it would not do so in other cases. If the Supported header field of an initial dialog initiation request does not contain a 199 option tag, the UAS must not send a 199 response on any early dialog associated with the request. When a UAS generates a 199 response, the response must contain a To header field tag parameter (RFC 3261, see Section 2.9), in order for other entities to identify the early dialog that has been terminated.
The UAS must also insert a Reason header field (RFC 3326, see Section 2.8) that contains a response code describing the reason why the early dialog was terminated. The UAS must not insert a 199 option tag in the Supported, Require, or Proxy-Require header field of the 199 response. If a UAS intends to send 199 responses, and if it supports the procedures defined in RFC 3840 (see Sections 2.11 and 3.4), it may, during the registration procedure, use the sip.extensions feature tag (RFC 3840, see Section 2.11) to indicate support for the 199 response code. A 199 response should not contain an SDP offer–answer message body, unless required by the rules in RFC 3264. According to RFC 3264, if an INVITE request does not contain an SDP offer, and the 199 response is the first reliably sent response associated with the request, the 199 response is required to contain an SDP offer. In this case, the UAS should send the 199 response unreliably, or send the 199 response reliably and include an SDP offer with no m= lines in the response. Since a 199 response is only used for information purposes, the UAS should send it unreliably, unless the 100rel option tag is present in the Require header field of the associated request.
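The UAS rules of Section 3.6.6.3 can be sketched as three small checks on the header fields of the initial request. This is only an illustration under stated assumptions: headers are modeled as a plain dictionary of comma-separated strings, option tags are compared case-insensitively, and the helper names (`option_tags`, `uas_may_send_199`, `uas_sends_199_reliably`, `build_199_headers`) and the example URI, tag, and cause value are hypothetical.

```python
def option_tags(headers, name):
    """Return the set of option tags in a comma-separated header field."""
    value = headers.get(name, "")
    return {tag.strip().lower() for tag in value.split(",") if tag.strip()}


def uas_may_send_199(request_headers):
    # Without a 199 option tag in Supported, the UAS must not send a 199.
    return "199" in option_tags(request_headers, "Supported")


def uas_sends_199_reliably(request_headers):
    # Unreliable by default; reliable only when the request requires 100rel.
    return "100rel" in option_tags(request_headers, "Require")


def build_199_headers(to_uri, to_tag, cause):
    """Header fields a UAS adds to a 199 response (dict layout is assumed)."""
    return {
        # The To tag identifies the early dialog that has been terminated.
        "To": f"<{to_uri}>;tag={to_tag}",
        # The Reason header carries the response code that ended the dialog.
        "Reason": f"SIP;cause={cause}",
        # No 199 option tag is placed in Supported, Require, or Proxy-Require.
    }


invite = {"Supported": "199, timer", "Require": ""}
may_send = uas_may_send_199(invite)
reliable = uas_sends_199_reliably(invite)
headers_199 = build_199_headers("sip:callee@example.com", "314159", 480)
print(may_send, reliable, headers_199)
```

The check only says whether a 199 response is permitted. Whether the UAS actually sends one remains a configuration or environment decision, as described above.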
3.6.6.4 Proxy Behavior
When a proxy receives a 199 response to an initial dialog initiation request, it must process the response as any other non-100 provisional response. The proxy will forward the response upstream toward the sender of the associated request. The proxy may release resources it has reserved associated with the early dialog that is terminated. If a proxy receives a 199 response out of dialog, it must process it as other non-100 provisional responses received out of dialog. When a forking proxy receives a non-2xx final response to an initial dialog initiation request that it recognizes as terminating one or more early dialogs associated with the request, it must generate and send a 199 response upstream for each of the terminated early dialogs that satisfy each of the following conditions:
◾◾ The forking proxy does not intend to forward the final response immediately (in accordance with rules for a forking proxy).
◾◾ The UAC has indicated support (by inserting the 199 option tag in a Supported header field) for the 199 response code in the associated request.
◾◾ The UAC has not required provisional responses to be sent reliably, that is, has not inserted the 100rel option tag in a Require or Proxy-Require header field, in the associated request.
◾◾ The forking proxy has not already received and forwarded a 199 response for the early dialog.
◾◾ The forking proxy has not already sent a final response for any of the early dialogs.
As a consequence, once a final response to an initial dialog initiation request has been issued by the proxy, no further 199 responses associated with the request will be generated or forwarded by the proxy. When a forking proxy forks an initial dialog initiation request, it generates a unique Via header branch parameter value for each forked leg. A proxy can determine whether additional forking has occurred downstream of the proxy by storing the top Via branch value from each response that creates an early dialog. If the same top Via branch value is received for multiple early dialogs, the proxy knows that additional forking has occurred downstream of the proxy. A non-2xx final response received for a specific early dialog also terminates all other early dialogs for which the same top Via branch value was received in the responses that created those early dialogs.
On the basis of the implementation policy, a forking proxy may wait before sending the 199 response, for example, if it expects to receive a 2xx final response on another dialog shortly after it received the non-2xx final response that triggered the 199 response. When a forking proxy generates a 199 response, the response must contain a To header field tag parameter that identifies the terminated early dialog. A proxy must also insert a Reason header field that contains the SIP response code of the response that triggered the 199 response. The SIP response code in the Reason header field informs the receiver of the 199 response about the SIP response code that was used by the UAS to terminate the early dialog, and the receiver might use that information for triggering different types of actions and procedures. The proxy must not insert a 199 option tag in the Supported, Require, or Proxy-Require header field of the 199 response.
A forking proxy that supports the generation of 199 responses must keep track of early dialogs, in order to determine whether to generate a 199 response when the proxy receives a non-2xx final response. In addition, a proxy must keep track of the early dialogs on which it has received and forwarded 199 responses, in order to not generate additional 199 responses for those early dialogs. If a forking proxy receives a reliably sent 199 response for a dialog for which it has previously generated and sent a 199 response, it must forward the 199 response. If a proxy receives an unreliably sent 199 response for which it has previously generated and sent a 199 response, it may forward the response, or it may discard it. When a forking proxy generates and sends a 199 response, the response should not contain a Contact header field or a Record-Route header field (RFC 3261, see Section 2.8). If the Require header field of an initial dialog initiation request contains a 100rel option tag, a proxy must not generate and send 199 responses associated with that request. The reason is that a proxy is not allowed to generate and send 199 responses reliably.
3.6.6.5 Backwards Compatibility
Since all SIP entities involved in a session setup do not necessarily support the specific meaning of the 199 Early Dialog Terminated provisional response, the sender of the response must be prepared to receive SIP requests and responses associated with the dialog for which the 199 response was sent (a proxy can receive SIP messages from either direction). If such a request is received by a UA, it must act in the same way as if it had received the request after sending the final non-2xx response to the INVITE request, as specified in RFC 3261. A UAC that receives a 199 response for an early dialog must not send any further requests on that dialog, except for requests that acknowledge reliable responses. A proxy must forward requests according to RFC 3261, even if the proxy has knowledge that the early dialog has been terminated. A 199 response does not replace a final response. RFC 3261 specifies when a final response is sent.
3.6.6.6 Usage with SDP Offer–Answer
A 199 response should not contain an SDP offer–answer (RFC 3264, see Section 3.8.4) message body, unless required by the rules in RFC 3264. If an INVITE request does not contain an SDP offer, and the 199 response is the first reliably sent response, the 199 response is required to contain an SDP offer. In this case, the UAS should send the 199 response unreliably, or include an SDP offer with no m= lines in a reliable 199 response.
3.6.6.7 Message Flow Examples
3.6.6.7.1 Example with a Forking Proxy that Generates 199
Figure 3.6 shows an example where a proxy P1 forks an INVITE received from a UAC. The forked INVITE reaches UAS 2, UAS 3, and UAS 4, which send 18x provisional responses in order to establish early dialogs between themselves and the UAC. UAS 2 and UAS 3 each reject the INVITE by sending a 4xx error response. When P1 receives the 4xx responses, it immediately sends 199 Early Dialog Terminated responses toward the UAC, to indicate that the early dialogs for which it received the 4xx responses have been terminated. The early dialog leg is shown in parentheses.
Figure 3.6 Forking proxy generating 199 Early Dialog Terminated. (Copyright IETF. Reproduced with permission.)
3.6.6.7.2 Example with a Forking Proxy that Receives 200 OK
Figure 3.7 shows an example where a proxy P1 forks an INVITE request received from a UAC. The forked request reaches UAS 2, UAS 3, and UAS 4, all of which send 18x provisional responses in order to establish early dialogs between themselves and the UAC. Later, UAS 4 accepts the session and sends a 200 OK final response. When P1 receives the 200 OK response, it immediately forwards it toward the UAC. Proxy P1 does not send 199 responses for the early dialogs from UAS 2 and UAS 3, since P1 has still not received any final responses on those early dialogs (even if proxy P1 sends CANCEL requests to UAS 2 and UAS 3, proxy P1 may still receive a 200 OK final response from UAS 2 or UAS 3, which proxy P1 would have to forward toward the UAC). The early dialog leg is shown in parentheses.
Figure 3.7 Forking proxy receiving 200 OK. (Copyright IETF. Reproduced with permission.)
to establish early dialogs between themselves and the UAC.
Later, UAS 3 and UAS 4 each reject the INVITE request by sending a 4xx error response. Proxy P2 does not support the 199 response code and forwards a single 4xx response. Proxy P1 supports the 199 response code, and when it receives the 4xx response from proxy P2, it also manages to associate the early dialogs from both UAS 3 and UAS 4 with the response. Therefore, proxy P1 generates and sends two 199 responses to indicate that the early dialogs from UAS 3 and UAS 4 have been terminated. The early dialog leg is shown in parentheses.
Figure 3.8 Two forking proxies, one of them generating 199. (Copyright IETF. Reproduced with permission.)
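The five conditions of Section 3.6.6.4 and the top Via branch bookkeeping that lets proxy P1 detect downstream forking, as in the two-proxy example, can be sketched in Python. The sketch makes several assumptions: `proxy_should_generate_199` and `BranchTracker` are hypothetical names, option tags are passed in as already-parsed sets, early dialogs are identified by To tag only, and the branch values and tags in the usage lines are invented for illustration.

```python
def proxy_should_generate_199(forwards_final_immediately, supported, require,
                              proxy_require, already_forwarded_199,
                              final_response_sent):
    """Check the five conditions for one terminated early dialog."""
    return (not forwards_final_immediately
            and "199" in supported
            and "100rel" not in require
            and "100rel" not in proxy_require
            and not already_forwarded_199
            and not final_response_sent)


class BranchTracker:
    """Maps the top Via branch of dialog-creating responses to To tags."""

    def __init__(self):
        self.dialogs_by_branch = {}

    def on_dialog_creating_response(self, top_via_branch, to_tag):
        self.dialogs_by_branch.setdefault(top_via_branch, set()).add(to_tag)

    def downstream_forking(self, top_via_branch):
        # Several early dialogs on one forked leg imply forking further down.
        return len(self.dialogs_by_branch.get(top_via_branch, ())) > 1

    def terminated_by_final(self, top_via_branch):
        # A non-2xx final response ends every early dialog on that branch.
        return sorted(self.dialogs_by_branch.pop(top_via_branch, set()))


p1 = BranchTracker()
p1.on_dialog_creating_response("z9hG4bK-leg-uas2", "tag-uas2")
p1.on_dialog_creating_response("z9hG4bK-leg-p2", "tag-uas3")
p1.on_dialog_creating_response("z9hG4bK-leg-p2", "tag-uas4")
forked_below = p1.downstream_forking("z9hG4bK-leg-p2")
ended = p1.terminated_by_final("z9hG4bK-leg-p2")   # single 4xx from P2
send_199 = proxy_should_generate_199(False, {"199"}, set(), set(), False, False)
print(forked_below, ended, send_199)
```

With these inputs the single 4xx response received on the P2 leg maps to two terminated early dialogs, which matches the two 199 responses that P1 generates in the example.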
    INVITE sip:[email protected] SIP/2.0
    Via: SIP/2.0/UDP pc33.atlanta.com;branch=z9hG4bK776asdhds
    Max-Forwards: 70
    To: Bob <sip:[email protected]>
    From: Alice <sip:[email protected]>;tag=1928301774
    Call-ID: [email protected]
    CSeq: 314159 INVITE
    Contact: <sip:[email protected]>
    Content-Type: application/sdp
    Content-Length: 142

    (Alice’s SDP not shown)
This request may be forwarded by proxies, eventually arriving at one or more UAS that can potentially accept the invitation. These UASs will frequently need to query the user about whether to accept the invitation. After some time, those UASs can accept the invitation (meaning the session is to be established) by sending a 2xx response. If the invitation is not accepted, a 3xx, 4xx, 5xx, or 6xx response is sent, depending on the reason for the rejection. Before sending a final response, the UAS can also send provisional responses (1xx) to advise the UAC of progress in contacting the called user. After possibly receiving one or more provisional responses, the UAC will get one or more 2xx responses or one non-2xx final response. Because of the protracted amount of time it can take to receive final responses to INVITE, the reliability mechanisms for INVITE transactions differ from those of other requests (like OPTIONS).
Once it receives a final response, the UAC needs to send an ACK for every final response it receives. The procedure for sending this ACK depends on the type of response. For final responses between 300 and 699, the ACK processing is done in the transaction layer and follows one set of rules (see Section 3.12). For 2xx responses, the ACK is generated by the UAC core. A 2xx response to an INVITE establishes a session, and it also creates a dialog between the UA that issued the INVITE and the UA that generated the 2xx response. Therefore, when multiple 2xx responses are received from different remote UAs (because the INVITE forked), each 2xx establishes a different dialog. All these dialogs are part of the same call. This section provides details on the establishment of a session using INVITE. A UA that supports INVITE must also support ACK, CANCEL, and BYE.
[Messages in the figure: F1 INVITE, F2 INVITE, F3 100 Trying, F4 INVITE, F5 100 Trying, F6 180 Ringing, F7 180 Ringing, F8 180 Ringing, F9 200 OK, F10 200 OK, F11 200 OK, F12 ACK, media session, F13 BYE, F14 200 OK]
Figure 3.9 SIP session setup. (Copyright IETF. Reproduced with permission.)
3.7.2 UAC Processing
3.7.2.1 Creating the Initial INVITE
Since the initial INVITE represents a request outside of a dialog, its construction follows the procedures of Section 3.6.2.1.1. Additional processing is required for the specific case of INVITE. An Allow header field (see Section 2.8.2) should be present in the INVITE. It indicates what methods can be invoked within a dialog, on the UA sending the INVITE, for the duration of the dialog. For example, a UA capable of receiving INFO requests within a dialog described in RFC 6086 (see Section 16.8) should include an Allow header field listing the INFO method. A Supported header field (Section 2.8.2) should be present in the INVITE. It enumerates all the extensions understood by the UAC. An Accept (see Section 2.8.2) header field may be present in the INVITE. It indicates which Content-Types are acceptable to the UA, in both the response received by it, and in any subsequent requests sent to it within dialogs established by the INVITE. The Accept header field is especially useful for indicating support of various session description formats. The UAC may add an Expires header field (see Section 2.8.2) to limit the validity of the invitation. If the time indicated in the Expires header field is reached and no final answer for the INVITE has been received, the UAC core should generate a CANCEL request for the INVITE, as per Section 3.2.
A UAC may also find it useful to add, among others, Subject (see Section 2.8.2), Organization (see Section 2.8.2), and User-Agent (Section 2.8.2) header fields. They all contain information related to the INVITE. The UAC may choose to add a message body to the INVITE. We have described earlier how to construct the header fields—Content-Type
among others—needed to describe the message body. There are special rules for message bodies that contain a session description; their corresponding Content-Disposition is session. SIP uses an offer–answer model where one UA sends a session description, called the offer, which contains a proposed description of the session. The offer indicates the desired communications means (audio, video, games), parameters of those means (such as codec types), and addresses for receiving media from the answerer. The other UA responds with another session description, called the answer, which indicates which communications means are accepted, the parameters that apply to those means, and addresses for receiving media from the offerer.
An offer–answer exchange is within the context of a dialog, so that if a SIP INVITE results in multiple dialogs, each is a separate offer–answer exchange. The offer–answer model defines restrictions on when offers and answers can be made (e.g., you cannot make a new offer while one is in progress). This results in restrictions on where the offers and answers can appear in SIP messages. In this specification, offers and answers can only appear in INVITE requests and responses, and ACK. The usage of offers and answers is further restricted. For the initial INVITE transaction, the rules are as follows:
◾◾ The initial offer must be in either an INVITE or, if not there, in the first reliable nonfailure message from the UAS back to the UAC. In this specification, that is the final 2xx response.
◾◾ If the initial offer is in an INVITE, the answer must be in a reliable nonfailure message from the UAS back to the UAC that is correlated to that INVITE. For this specification, that is only the final 2xx response to that INVITE. That same exact answer may also be placed in any provisional responses sent before the answer. The UAC must treat the first session description it receives as the answer, and must ignore any session descriptions in subsequent responses to the initial INVITE.
◾◾ If the initial offer is in the first reliable nonfailure message from the UAS back to the UAC, the answer must be in the acknowledgement for that message (in this specification, ACK for a 2xx response).
◾◾ After having sent or received an answer to the first offer, the UAC may generate subsequent offers in requests based on rules specified for that method, but only if it has received answers to any previous offers and has not sent any offers to which it has not gotten an answer.
◾◾ Once the UAS has sent or received an answer to the initial offer, it must not generate subsequent offers in any responses to the initial INVITE. This means that a UAS based on this specification alone can never generate subsequent offers until completion of the initial transaction.
Concretely, the above rules specify two exchanges for UAs compliant with this specification alone: the offer is in the INVITE, and the answer in the 2xx (and possibly in a 1xx as well, with the same value), or the offer is in the 2xx, and the answer is in the ACK. All UAs that support INVITE must support these two exchanges. The SDP described in RFC 4566 (see Section 7.7) must be supported by all UAs as a means to describe sessions, and its usage for constructing offers and answers must follow the procedures defined in RFC 3264 (see Section 3.8.4). The restrictions of the offer–answer model just described only apply to bodies whose Content-Disposition header field value is session. Therefore, it is possible that both the INVITE and the ACK contain a message body (e.g., the INVITE carries a photo [Content-Disposition: render] and the ACK a session description [Content-Disposition: session]). If the Content-Disposition header field is missing, bodies of Content-Type application/sdp imply the disposition session, while other content types imply render. Once the INVITE has been created, the UAC follows the procedures defined for sending requests outside of a dialog (Sections 3.6 and 3.7.2). This results in the construction of a client transaction that will ultimately send the request and deliver responses to the UAC.
3.7.2.2 Processing INVITE Responses
Once the INVITE has been passed to the INVITE client transaction, the UAC waits for responses for the INVITE. If the INVITE client transaction returns a timeout rather than a response, the TU acts as if a 408 Request Timeout response had been received.
3.7.2.2.1 1xx Responses
Zero, one, or multiple provisional responses may arrive before one or more final responses are received. Provisional responses for an INVITE request can create early dialogs. If a provisional response has a tag in the To field, and if the dialog ID of the response does not match an existing dialog, one is constructed using the procedures defined in Section 3.6. The early dialog will only be needed if the UAC needs to send a request to its peer within the dialog before the initial INVITE transaction completes. Header fields present in a provisional response are applicable as long as the dialog is in the early state (e.g., an Allow header field in a provisional response contains the methods that can be used in the dialog while this is in the early state).
3.7.2.2.2 3xx Responses
A 3xx response may contain one or more Contact header field values providing new addresses where the callee might be reachable. Depending on the status code of the 3xx
response (see Section 2.6), the UAC may choose to try those new addresses.
3.7.2.2.3 4xx, 5xx, and 6xx Responses
A single non-2xx final response may be received for the INVITE. 4xx, 5xx, and 6xx responses may contain a Contact header field value indicating the location where additional information about the error can be found. Subsequent final responses (which would only arrive under error conditions) must be ignored. All early dialogs are considered terminated upon reception of the non-2xx final response. After having received the non-2xx final response, the UAC core considers the INVITE transaction completed. The INVITE client transaction handles the generation of ACKs for the response (see Section 3.12).
to determine the destination address, port, and transport. However, the request is passed to the transport layer directly for transmission, rather than a client transaction. This is because the UAC core handles retransmissions of the ACK, not the transaction layer. The ACK must be passed to the client transport every time a retransmission of the 2xx final response that triggered the ACK arrives. The UAC core considers the INVITE transaction completed 64*T1 seconds after the reception of the first 2xx response. At this point, all the early dialogs that have not transitioned to established dialogs are terminated. Once the INVITE transaction is considered completed by the UAC core, no more new 2xx responses are expected to arrive. If, after acknowledging any 2xx response to an INVITE, the UAC does not want to continue with that dialog, then the UAC must terminate the dialog by sending a BYE request as described in Section 3.10.
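The ACK handling for 2xx responses described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from any SIP stack: the class name `Invite2xxTracker`, the `send_ack` callback, and the use of caller-supplied timestamps are hypothetical, and T1 is assumed to be the RFC 3261 default of 500 ms.

```python
T1 = 0.5  # seconds; assumed RFC 3261 default value of timer T1


class Invite2xxTracker:
    """Hypothetical UAC-core helper for the 2xx responses to one INVITE."""

    def __init__(self, send_ack, t1=T1):
        self.send_ack = send_ack    # callable(dialog_id): hands an ACK to the transport
        self.window = 64 * t1       # INVITE considered completed after 64*T1 seconds
        self.first_2xx_at = None    # arrival time of the first 2xx response

    def is_completed(self, now):
        """True once 64*T1 seconds have passed since the first 2xx."""
        return (self.first_2xx_at is not None
                and now - self.first_2xx_at >= self.window)

    def on_2xx(self, dialog_id, now):
        """Handle a 2xx (new or retransmitted); return True if an ACK was sent."""
        if self.first_2xx_at is None:
            self.first_2xx_at = now
        if self.is_completed(now):
            return False            # no new 2xx responses are expected any more
        # The ACK goes to the transport for every arrival, including
        # retransmissions of the same 2xx response.
        self.send_ack(dialog_id)
        return True
```

With T1 at 0.5 s the window is 32 s, so in this sketch a 2xx arriving 40 s after the first one is not acknowledged.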
to the same multicast conference by multiple other participants. If desired, the UAS may use identifiers within the session description to detect this duplication. For example, SDP contains a session id and version number in the origin (o) field. If the user is already a member of the session, and the session parameters contained in the session description have not changed, the UAS may silently accept the INVITE (i.e., send a 2xx response without prompting the user). If the INVITE does not contain a session description, the UAS is being asked to participate in a session, and the UAC has asked that the UAS provide the offer of the session. It must provide the offer in its first nonfailure reliable message back to the UAC. In this specification, that is a 2xx response to the INVITE. The UAS can indicate progress, accept, redirect, or reject the invitation. In all of these cases, it formulates a response using the procedures described in Section 3.1.3.6.
3.7.3.1.1 Progress
If the UAS is not able to answer the invitation immediately, it can choose to indicate some kind of progress to the UAC (e.g., an indication that a phone is ringing). This is accomplished with a provisional response between 101 and 199. These provisional responses establish early dialogs and therefore follow the procedures of Section 3.6.1.1 in addition to those of Section 3.1.3.6. A UAS may send as many provisional responses as it likes. Each of these must indicate the same dialog ID. However, these will not be delivered reliably. If the UAS desires an extended period of time to answer the INVITE, it will need to ask for an extension in order to prevent proxies from canceling the transaction. A proxy has the option of canceling a transaction when there is a gap of 3 minutes between responses in a transaction. To prevent cancellation, the UAS must send a non-100 provisional response at every minute, to handle the possibility of lost provisional responses. An INVITE transaction can go on for extended durations when the user is placed on hold, or when interworking with PSTN systems that allow communications to take place without answering the call. The latter is common in Interactive Voice Response systems.
3.7.3.1.2 INVITE Is Redirected
If the UAS decides to redirect the call, a 3xx response is sent. A 300 Multiple Choices, 301 Moved Permanently, or 302 Moved Temporarily response should contain a Contact header field containing one or more URIs of new addresses to be tried. The response is passed to the INVITE server transaction, which will deal with its retransmissions.
3.7.3.1.3 INVITE Is Rejected
A common scenario occurs when the callee is currently not willing or able to take additional calls at this end system. A 486 Busy Here should be returned in such a scenario. If the UAS knows that no other end system will be able to accept this call, a 600 Busy Everywhere response should be sent instead. However, it is unlikely that a UAS will be able to know this in general, and thus this response will not usually be used. The response is passed to the INVITE server transaction, which will deal with its retransmissions. A UAS rejecting an offer contained in an INVITE should return a 488 Not Acceptable Here response. Such a response should include a Warning header field value explaining why the offer was rejected.
3.7.3.1.4 INVITE Is Accepted
The UAS core generates a 2xx response. This response establishes a dialog, and therefore follows the procedures of Section 3.6.1.1 in addition to those of Section 3.1.3.6. A 2xx response to an INVITE should contain the Allow header field and the Supported header field, and may contain the Accept header field. Including these header fields allows the UAC to determine the features and extensions supported by the UAS for the duration of the call, without probing. If the INVITE request contained an offer, and the UAS had not yet sent an answer, the 2xx must contain an answer. If the INVITE did not contain an offer, the 2xx must contain an offer if the UAS had not yet sent an offer.
Once the response has been constructed, it is passed to the INVITE server transaction. To ensure reliable end-to-end transport of the response, it is necessary to periodically pass the response directly to the transport until the ACK arrives. The 2xx response is passed to the transport with an interval that starts at T1 seconds and doubles for each retransmission until it reaches T2 seconds (T1 and T2 are defined in Section 3.12). Response retransmissions cease when an ACK request for the response is received. This is independent of whatever transport protocols are used to send the response. (Note that this paragraph has been updated/modified per RFC 6026.)
Since 2xx is retransmitted end-to-end, there may be hops between UAS and UAC that are UDP. To ensure reliable delivery across these hops, the response is retransmitted periodically even if the transport at the UAS is reliable. If the server retransmits the 2xx response for 64*T1 seconds without receiving an ACK, the dialog is confirmed, but the session should be terminated.
3.8 Modifying an Existing Session
A successful INVITE request (see Section 3.7) establishes both a dialog between two UAs and a session using the offer–answer model. Section 3.6 explains how to modify an existing dialog using a target refresh request (e.g., changing the remote target URI of the dialog). This section describes how
to modify the actual session. This modification can involve changing addresses or ports, adding a media stream, deleting a media stream, and so on. This is accomplished by sending a new INVITE request within the same dialog that established the session. An INVITE request sent within an existing dialog is known as a re-INVITE. Note that a single re-INVITE can modify the dialog and the parameters of the session at the same time. Either the caller or callee can modify an existing session.
The behavior of a UA on detection of media failure is a matter of local policy. However, automated generation of re-INVITE or BYE is not recommended to avoid flooding the network with traffic when there is congestion. In any case, if these messages are sent automatically, they should be sent after some randomized interval.
Note that the paragraph above refers to automatically generated BYEs and re-INVITEs. If the user hangs up upon media failure, the UA would send a BYE request as usual.
3.8.1 UAC Behavior
The same offer–answer model that applies to session descriptions in INVITEs (Section 3.7.2.1) applies to re-INVITEs. As a result, a UAC that wants to add a media stream, for example, will create a new offer that contains this media stream, and send that in an INVITE request to its peer. It is important to note that the full description of the session, not just the change, is sent. This supports stateless session processing in various elements, and supports failover and recovery capabilities. Of course, a UAC may send a re-INVITE with no session description, in which case the first reliable nonfailure response to the re-INVITE will contain the offer (in this specification, that is a 2xx response). If the session description format has the capability for version numbers, the offerer should indicate that the version of the session description has changed. The To, From, Call-ID, CSeq, and Request-URI of a re-INVITE are set following the same rules as for regular requests within an existing dialog, described in Section 3.6.
A UAC may choose not to add an Alert-Info header field or a body with Content-Disposition alert to re-INVITEs because UASs do not typically alert the user upon reception of a re-INVITE. Unlike an INVITE, which can fork, a re-INVITE will never fork, and therefore, only ever generate a single final response. The reason a re-INVITE will never fork is that the Request-URI identifies the target as the UA instance it established the dialog with, rather than identifying an AOR for the user. Note that a UAC must not initiate a new INVITE transaction within a dialog while another INVITE transaction is in progress in either direction.
1. If there is an ongoing INVITE client transaction, the TU must wait until the transaction reaches the completed or terminated state before initiating the new INVITE.
2. If there is an ongoing INVITE server transaction, the TU must wait until the transaction reaches the confirmed or terminated state before initiating the new INVITE.
However, a UA may initiate a regular transaction while an INVITE transaction is in progress. A UA may also initiate an INVITE transaction while a regular transaction is in progress. If a UA receives a non-2xx final response to a re-INVITE, the session parameters must remain unchanged, as if no re-INVITE had been issued. Note that, as stated in Section 3.6.2.1.2, if the non-2xx final response is a 481 Call/Transaction Does Not Exist, or a 408 Request Timeout, or no response at all is received for the re-INVITE (i.e., a timeout is returned by the INVITE client transaction), the UAC will terminate the dialog. If a UAC receives a 491 response to a re-INVITE, it should start a timer with a value T chosen as follows:
1. If the UAC is the owner of the Call-ID of the dialog ID (meaning it generated the value), T has a randomly chosen value between 2.1 and 4 seconds in units of 10 milliseconds.
2. If the UAC is not the owner of the Call-ID of the dialog ID, T has a randomly chosen value of between 0 and 2 seconds in units of 10 milliseconds.
When the timer fires, the UAC should attempt the re-INVITE once more, if it still desires for that session modification to take place. For example, if the call was already hung up with a BYE, the re-INVITE would not take place. The rules for transmitting a re-INVITE and for generating an ACK for a 2xx response to re-INVITE are the same as for the initial INVITE (Section 3.7.2.1).
3.8.2 UAS Behavior
Section 3.7.3.1 describes the procedure for distinguishing incoming re-INVITEs from incoming initial INVITEs and handling a re-INVITE for an existing dialog. A UAS that receives a second INVITE before it sends the final response to a first INVITE with a lower CSeq sequence number on the same dialog must return a 500 Server Internal Error response to the second INVITE and must include a Retry-After header field with a randomly chosen value of between 0 and 10 seconds. A UAS that receives an INVITE on a dialog while an INVITE it had sent on that dialog is in progress must return a 491 Request Pending response to the received INVITE. If a UA receives a re-INVITE for an existing dialog, it must check any version identifiers in the session description or, if there are no version identifiers, the content of the session description to see if it has changed. If the session description has changed, the UAS must adjust the
session parameters accordingly, possibly after asking the user for confirmation. Versioning of the session description can be used to accommodate the capabilities of new arrivals to a conference, add or delete media, or change from a unicast to a multicast conference.
If the new session description is not acceptable, the UAS can reject it by returning a 488 Not Acceptable Here response for the re-INVITE. This response should include a Warning header field. If a UAS generates a 2xx response and never receives an ACK, it should generate a BYE to terminate the dialog. A UAS may choose not to generate 180 Ringing responses for a re-INVITE because UACs do not typically render this information to the user. For the same reason, UASs may choose not to use an Alert-Info header field or a body with Content-Disposition alert in responses to a re-INVITE. A UAS providing an offer in a 2xx (because the INVITE did not contain an offer) should construct the offer as if the UAS was making a brand new call, subject to the constraints of sending an offer that updates an existing session, as described in RFC 3264 (see Section 3.8.4) in the case of SDP. Specifically, this means that it should include as many media formats and media types that the UA is willing to support. The UAS must ensure that the session description overlaps with its previous session description in media formats, transports, or other parameters that require support from the peer. This is to avoid the need for the peer to reject the session description. If, however, it is unacceptable to the UAC, the UAC should generate an answer with a valid session description, and then send a BYE to terminate the session.
3.8.3 UPDATE
3.8.3.1 Overview
The SIP specified in RFC 3261 defines the INVITE method for the initiation and modification of sessions. However, this method actually affects two important pieces of state. It affects the session (the media streams SIP sets up) and also the dialog (the state that SIP itself defines). While this is reasonable in many cases, there are important scenarios in which this coupling causes complications. The primary difficulty is when aspects of the session need to be modified before the initial INVITE has been answered. An example of this situation is early media, a condition where the session is established, for the purpose of conveying the progress of the call, but before the INVITE itself is accepted. It is important that either caller or callee be able to modify the characteristics of that session (e.g., putting the early media on hold), before the call is answered. However, a re-INVITE cannot be used for this purpose because the re-INVITE has an impact on the state of the dialog, in addition to the session.
As a result, a solution is needed that allows the caller or callee to provide updated session information before a final response to the initial INVITE request is generated. The UPDATE method, defined in RFC 3311, fulfills that need. It can be sent by a UA within a dialog (early or confirmed) to update session parameters without affecting the dialog state itself. UPDATE allows a client to update parameters of a session (such as the set of media streams and their codecs) but has no impact on the state of a dialog. In that sense, it is like a re-INVITE, but unlike re-INVITE, it can be sent before the initial INVITE has been completed. This makes it very useful for updating session parameters within early dialogs.
3.8.3.2 Overview of Operation
The operation of this extension is straightforward. The caller begins with an INVITE transaction, which proceeds normally. Once a dialog is established, either early or confirmed, the caller can generate an UPDATE method that contains an SDP offer defined in RFC 3264 (see Section 3.8.4) for the purposes of updating the session. The response to the UPDATE method contains the answer. Similarly, once a dialog is established, the callee can send an UPDATE with an offer, and the caller places its answer in the 2xx to the UPDATE. The Allow header field is used to indicate support for the UPDATE method. There are additional constraints on when UPDATE can be used, based on the restrictions of the offer–answer model.
3.8.3.3 Determining Support for This Extension
A UAC compliant to this specification should also include an Allow header field in the INVITE request, listing the method UPDATE, to indicate its ability to receive an UPDATE request. When a UAS compliant to this specification receives an INVITE request for a new dialog, and generates a reliable provisional response containing SDP, that response should contain an Allow header field that lists the UPDATE method. This informs the caller that the callee is capable of receiving an UPDATE request at any time. An unreliable provisional response may contain an Allow header field listing the UPDATE method, and a 2xx response should contain an Allow header field listing the UPDATE method. Responses are processed normally as per RFC 3261, and in the case of reliable provisional responses, according to RFC 3262 (see Sections 9, 2.8.2, and 14). It is important to note that a reliable provisional response will always create an early dialog at the UAC. Creation of this dialog is necessary in order to receive UPDATE requests from the callee. If the response contains an Allow header field containing the value UPDATE, the UAC knows that the callee supports UPDATE, and the UAC is allowed to follow the procedures described in the next section.
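As a rough illustration of the Allow-based check described above, the following Python sketch decides whether a peer has advertised the UPDATE method. The function name `supports_update` and the representation of a message's header fields as (name, value) pairs are assumptions made for this example; a real SIP stack would work on its own parsed header objects.

```python
def supports_update(headers):
    """Return True if any Allow header field lists the UPDATE method.

    `headers` is a hypothetical list of (name, value) tuples taken from
    an INVITE request or from a response to it.
    """
    methods = set()
    for name, value in headers:
        # Header field names are case-insensitive; Allow may appear
        # several times, each with a comma-separated list of methods.
        if name.strip().lower() == "allow":
            methods.update(m.strip() for m in value.split(",") if m.strip())
    # Method names are compared exactly, since SIP methods are case-sensitive.
    return "UPDATE" in methods
```

A UAC could call this on a reliable provisional response or on the 2xx, and a UAS could call it on the incoming INVITE, before deciding to send an UPDATE.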
response for the UPDATE. This response should include a Warning header field.
[Figure: message flow between Caller and Callee, beginning with F1 INVITE with offer 1]
by each participant. As a result, even though SDP has the expressiveness to describe unicast sessions, it is missing the semantics and operational details of how it is actually done.
RFC 3264, described here, defines a mechanism by which two entities can make use of SDP to arrive at a common view of a multimedia session between them. In this model, one participant offers the other a description of the desired session from their perspective, and the other participant answers with the desired session from their perspective. This offer–answer model is most useful in unicast sessions where information from both participants is needed for the complete view of the session. The offer–answer model is used by protocols like SIP. More specifically, one participant in the session generates an SDP message that constitutes the offer: the set of media streams and codecs the offerer wishes to use, along with the IP addresses and ports the offerer would like to use to receive the media. The offer is conveyed to the other participant, called the answerer.
The answerer generates an answer, which is an SDP message that responds to the offer provided by the offerer. The answer has a matching media stream for each stream in the offer, indicating whether the stream is accepted or not, along with the codecs that will be used and the IP addresses and ports that the answerer wants to use to receive media. It is also possible for a multicast session to work similar to a unicast one; its parameters are negotiated between a pair of users as in the unicast case, but both sides send packets to the same multicast address, rather than unicast ones. This document also discusses the application of the offer–answer model to multicast streams. We also define guidelines for how the offer–answer model is used to update a session after an initial offer–answer exchange. The means by which the offers and answers are conveyed are outside the scope of this document. The offer–answer model defined here is the mandatory baseline mechanism used by SIP.
3.8.4.2 Protocol Operation
The offer–answer exchange assumes the existence of a higher-layer protocol (such as SIP) that is capable of exchanging SDP for the purposes of session establishment between agents. Protocol operation begins when one agent sends an initial offer to another agent. An offer is initial if it is outside of any context that may have already been established through the higher-layer protocol. It is assumed that the higher-layer protocol provides maintenance of some kind of context that allows the various SDP exchanges to be associated together. The agent receiving the offer may generate an answer, or it may reject the offer. The means for rejecting an offer are dependent on the higher-layer protocol. The offer–answer exchange is atomic; if the answer is rejected, the session reverts to the state before the offer (which may be absence of a session).
At any time, either agent may generate a new offer that updates the session. However, it must not generate a new offer if it has received an offer that it has not yet answered or rejected. Furthermore, it must not generate a new offer if it has generated a prior offer for which it has not yet received an answer or a rejection. If an agent receives an offer after having sent one, but before receiving an answer to it, this is considered a glare condition. The term glare was originally used in circuit switched telecommunications networks to describe the condition where two switches both attempt to seize the same available circuit on the same trunk at the same time. Here, it means both agents have attempted to send an updated offer at the same time. The higher-layer protocol needs to provide a means for resolving such conditions. The higher-layer protocol will need to provide a means for ordering of messages in each direction and SIP meets these requirements.
3.8.4.3 Generating the Initial Offer
The offer (and answer) must be a valid SDP message, as defined by RFC 4566 (see Section 7.7), with one exception. RFC 4566 mandates that either an e or a p line is present in the SDP message. This specification relaxes that constraint; an SDP formulated for an offer–answer application may omit both the e and p lines. The numeric value of the session id and version in the o line must be representable with a 64-bit signed integer. The initial value of the version must be less than (2**62) − 1, to avoid rollovers. Although the SDP specification allows for multiple session descriptions to be concatenated together into a large SDP message, an SDP message used in the offer–answer model must contain exactly one session description. The SDP s= line conveys the subject of the session, which is reasonably defined for multicast but ill defined for unicast. For unicast sessions, it is recommended that it consist of a single space character (0x20) or a dash (–).
Unfortunately, SDP does not allow the s= line to be empty. The SDP t= line conveys the time of the session. Generally, streams for unicast sessions are created and destroyed through external signaling means, such as SIP. In that case, the t= line should have a value of 0 0. The offer will contain zero or more media streams (each media stream is described by an m= line and its associated attributes). Zero media streams implies that the offerer wishes to communicate, but that the streams for the session will be added at a later time through a modified offer. The streams may be for a mix of unicast and multicast; the latter obviously implies a multicast address in the relevant c= line(s). Construction of each offered stream depends on whether the stream is multicast or unicast.
3.8.4.3.1 Unicast Streams
If the offerer wishes to only send media on a stream to its peer, it must mark the stream as sendonly with the a=sendonly
attribute. We refer to a stream as being marked with a certain direction if a direction attribute was present as either a media stream attribute or a session attribute. If the offerer wishes to only receive media from its peer, it must mark the stream as recvonly. If the offerer wishes to communicate, but wishes to neither send nor receive media at this time, it must mark the stream with an a=inactive attribute. The inactive direction attribute is specified in RFC 3108. Note that in the case of RTP specified in RFC 3550 (see Section 7.2), Real-Time Transport Control Protocol (RTCP) is still sent and received for sendonly, recvonly, and inactive streams. That is, the directionality of the media stream has no impact on the RTCP usage. If the offerer wishes to both send and receive media with its peer, it may include an a=sendrecv attribute, or it may omit it, since sendrecv is the default. For recvonly and sendrecv streams, the port number and address in the offer indicate where the offerer would like to receive the media stream. For sendonly RTP streams, the address and port number indirectly indicate where the offerer wants to receive RTCP reports.
Unless there is an explicit indication otherwise, reports are sent to the port number one higher than the number indicated. The IP address and port present in the offer indicate nothing about the source IP address and source port of RTP and RTCP packets that will be sent by the offerer. A port number of zero in the offer indicates that the stream is offered but must not be used. This has no useful semantics in an initial offer, but is allowed for reasons of completeness, since the answer can contain a zero port indicating a rejected stream. Furthermore, existing streams can be terminated by setting the port to zero. In general, a port number of zero indicates that the media stream is not wanted.
The list of media formats for each media stream conveys two pieces of information, namely the set of formats (codecs and any parameters associated with the codec, in the case of RTP) that the offerer is capable of sending and/or receiving (depending on the direction attributes), and, in the case of RTP, the RTP payload type numbers used to identify those formats. If multiple formats are listed, it means that the offerer is capable of making use of any of those formats during the session. In other words, the answerer may change formats in the middle of the session, making use of any of the formats listed, without sending a new offer. For a sendonly stream, the offer should indicate those formats the offerer is willing to send for this stream. For a recvonly stream, the offer should indicate those formats the offerer is willing to receive for this stream. For a sendrecv stream, the offer should indicate those codecs that the offerer is willing to send and receive with.
For recvonly RTP streams, the payload type numbers indicate the value of the payload type field in RTP packets the offerer is expecting to receive for that codec. For sendonly RTP streams, the payload type numbers indicate the value of the payload type field in RTP packets the offerer is planning to send for that codec. For sendrecv RTP streams, the payload type numbers indicate the value of the payload type field the offerer expects to receive, and would prefer to send. However, for sendonly and sendrecv streams, the answer might indicate different payload type numbers for the same codecs, in which case, the offerer must send with the payload type numbers from the answer. Different payload type numbers may be needed in each direction because of interoperability concerns with H.323. As per RFC 4566 (see Section 7.7), fmtp parameters may be present to provide additional parameters of the media format. In the case of RTP streams, all media descriptions should contain a=rtpmap mappings from RTP payload types to encodings. If there is no a=rtpmap, the default payload type mapping, as defined by the current profile in use, for example, RFC 3551, is to be used. This allows easier migration away from static payload types.
In all cases, the formats in the m= line must be listed in order of preference, with the first format listed being preferred. In this case, preferred means that the recipient of the offer should use the format with the highest preference that is acceptable to it. If the ptime attribute is present for a stream, it indicates the desired packetization interval that the offerer would like to receive. The ptime attribute must be greater than zero. If the bandwidth attribute is present for a stream, it indicates the desired bandwidth that the offerer would like to receive. A value of zero is allowed, but discouraged. It indicates that no media should be sent. In the case of RTP, it would also disable all RTCP. If multiple media streams of different types are present, it means that the offerer wishes to use those streams at the same time. A typical case is an audio and a video stream as part of a videoconference. If multiple media streams of the same type are present in an offer, it means that the offerer wishes to send (and/or receive) multiple streams of that type at the same time. When sending multiple streams of the same type, it is a matter of local policy as to how each media source of that type (e.g., a video camera and VCR in the case of video) is mapped to each stream.
When a user has a single source for a particular media type, only one policy makes sense: the source is sent to each stream of the same type. Each stream may use different encodings. When receiving multiple streams of the same type, it is a matter of local policy as to how each stream is mapped to the various media sinks for that particular type (e.g., speakers or a recording device in the case of audio). There are a few constraints on the policies, however. First, when receiving multiple streams of the same type, each stream must be mapped to at least one sink for the purpose of presentation to the user. In other words, the intent of receiving multiple streams of the same type is that they should all be presented in parallel, rather than choosing just one. Another constraint
is that when multiple streams are received and sent to the same sink, they must be combined in some media-specific way. For example, in the case of two audio streams, the received media from each might be mapped to the speakers. In that case, the combining operation would be to mix them. In the case of multiple instant messaging streams, where the sink is the screen, the combining operation would be to present all of them to the user interface. The third constraint is that if multiple sources are mapped to the same stream, those sources must be combined in some media-specific way before they are sent on the stream.
Although policies beyond these constraints are flexible, an agent would not generally want a policy that will copy media from its sinks to its sources unless it is a conference server (i.e., do not copy received media on one stream to another stream). A typical usage example for multiple media streams of the same type is a prepaid calling card application, where the user can press and hold the pound (#) key at any time during a call to hang up and make a new call on the same card. This requires media from the user to two destinations: the remote gateway and the DTMF processing application that looks for the pound. This could be accomplished with two media streams, one sendrecv to the gateway, and the other sendonly (from the perspective of the user) to the DTMF application. Once the offerer has sent the offer, it must be prepared to receive media for any recvonly streams described by that offer. It must be prepared to send and receive media for any sendrecv streams in the offer, and send media for any sendonly streams in the offer (of course, it cannot actually send until the peer provides an answer with the needed address and port information). In the case of RTP, even though it may receive media before the answer arrives, it will not be able to send RTCP receiver reports until the answer arrives.
3.8.4.3.2 Multicast Streams
If a session description contains a multicast media stream that is listed as receive (send) only, it means that the participants, including the offerer and answerer, can only receive (send) on that stream. This differs from the unicast view, where the directionality refers to the flow of media between offerer and answerer. Beyond that clarification, the semantics of an offered multicast stream are exactly as described in RFC 4566 (see Section 7.7).
3.8.4.4 Generating the Answer
The answer to an offered session description is based on the offered session description. If the answer is different from the offer in any way (different IP addresses, ports, etc.), the origin line must be different in the answer, since the answer is generated by a different entity. In that case, the version number in the o= line of the answer is unrelated to the version number in the o= line of the offer. For each m= line in the offer, there must be a corresponding m= line in the answer. The answer must contain exactly the same number of m= lines as the offer. This allows for streams to be matched up on the basis of their order. This implies that if the offer contained zero m= lines, the answer must contain zero m= lines. The t= line in the answer MUST equal that of the offer. The time of the session cannot be negotiated. An offered stream may be rejected in the answer, for any reason. If a stream is rejected, the offerer and answerer must not generate media (or RTCP packets) for that stream. To reject an offered stream, the port number in the corresponding stream in the answer must be set to zero. Any media formats listed are ignored. At least one must be present, as specified by SDP. Constructing an answer for each offered stream differs for unicast and multicast.
3.8.4.4.1 Unicast Streams
If a stream is offered with a unicast address, the answer for that stream must contain a unicast address. The media type of the stream in the answer must match that of the offer. If a stream is offered as sendonly, the corresponding stream must be marked as recvonly or inactive in the answer. If a media stream is listed as recvonly in the offer, the corresponding stream must be marked as sendonly or inactive in the answer. If an offered media stream is listed as sendrecv (or if there is no direction attribute at the media or session level, in which case the stream is sendrecv by default), the corresponding stream in the answer may be marked as sendonly, recvonly, sendrecv, or inactive. If an offered media stream is listed as inactive, it must be marked as inactive in the answer. For streams marked as recvonly in the answer, the m= line must contain at least one media format the answerer is willing to receive with from among those listed in the offer. The stream may indicate additional media formats, not listed in the corresponding stream in the offer, that the answerer is willing to receive. For streams marked as sendonly in the answer, the m= line must contain at least one media format the answerer is willing to send from among those listed in the offer. For streams marked as sendrecv in the answer, the m= line must contain at least one codec the answerer is willing to both send and receive, from among those listed in the offer. The stream may indicate additional media formats, not listed in the corresponding stream in the offer, that the answerer is willing to send or receive (of course, it will not be able to send them at this time, since they were not listed in the offer). For streams marked as inactive in the answer, the list of media formats is constructed on the basis of the offer. If the offer was sendonly, the list is constructed as if the answer was recvonly. Similarly, if the offer was recvonly, the list is constructed as if the answer was sendonly, and if the offer was sendrecv, the list is constructed as if the answer was
sendrecv. If the offer was inactive, the list is constructed as if the offer was actually sendrecv and the answer was sendrecv. The connection address and port in the answer indicate the address where the answerer wishes to receive media (in the case of RTP, RTCP will be received on the port that is one higher unless there is an explicit indication otherwise). This address and port MUST be present even for sendonly streams; in the case of RTP, the port one higher is still used to receive RTCP. In the case of RTP, if a particular codec was referenced with a specific payload type number in the offer, that same payload type number should be used for that codec in the answer. Even if the same payload type number is used, the answer must contain rtpmap attributes to define the payload type mappings for dynamic payload types, and should contain mappings for static payload types.
The media formats in the m= line must be listed in order of preference, with the first format listed being preferred. In this case, preferred means that the offerer should use the format with the highest preference from the answer. Although the answerer may list the formats in their desired order of preference, it is recommended that unless there is a specific reason, the answerer lists formats in the same relative order they were present in the offer. In other words, if a stream in the offer lists audio codecs 8, 22, and 48 (see Section 7.2.2.4), in that order, and the answerer only supports codecs 8 and 48, it is recommended that, if the answerer has no reason to change it, the ordering of codecs in the answer should be 8, 48, and not 48, 8 (see Section 7.2.2.4). This helps assure that the same codec is used in both directions.
The interpretation of fmtp parameters in an offer depends on the parameters. In many cases, those parameters describe specific configurations of the media format, and should therefore be processed as the media format value itself would be. This means that the same fmtp parameters with the same values must be present in the answer if the media format they describe is present in the answer. Other fmtp parameters are more like parameters, for which it is perfectly acceptable for each agent to use different values. In that case, the answer may contain fmtp parameters, and those may have the same values as those in the offer, or they may be different. SDP extensions that define new parameters should specify the proper interpretation in offer–answer.
The answerer may include a nonzero ptime attribute for any media stream; this indicates the packetization interval that the answerer would like to receive. There is no requirement that the packetization interval be the same in each direction for a particular stream. The answerer may include a bandwidth attribute for any media stream; this indicates the bandwidth that the answerer would like the offerer to use when sending media. The value of zero is allowed, interpreted as described earlier.
If the answerer has no media formats in common for a particular offered stream, the answerer must reject that media stream by setting the port to zero. If there are no media formats in common for all streams, the entire offered session is rejected. Once the answerer has sent the answer, it must be prepared to receive media for any recvonly streams described by that answer. It must be prepared to send and receive media for any sendrecv streams in the answer, and it may send media immediately. The answerer must be prepared to receive media for recvonly or sendrecv streams using any media formats listed for those streams in the answer, and it may send media immediately. When sending media, it should use a packetization interval equal to the value of the ptime attribute in the offer, if any was present. It should send media using a bandwidth no higher than the value of the bandwidth attribute in the offer, if any was present. The answerer must send using a media format in the offer that is also listed in the answer, and should send using the most preferred media format in the offer that is also listed in the answer. In the case of RTP, it must use the payload type numbers from the offer, even if they differ from those in the answer.
3.8.4.4.2 Multicast Streams
Unlike unicast, where there is a two-sided view of the stream, there is only a single view of the stream for multicast. As such, generating an answer to a multicast offer generally involves modifying a limited set of aspects of the stream. If a multicast stream is accepted, the address and port information in the answer must match that of the offer. Similarly, the directionality information in the answer (sendonly, recvonly, or sendrecv) must equal that of the offer. This is because all participants in a multicast session need to have equivalent views of the parameters of the session, an underlying assumption of the multicast bias of RFC 4566 (see Section 7.7). The set of media formats in the answer must be equal to or be a subset of those in the offer. Removing a format is a way for the answerer to indicate that the format is not supported. The ptime and bandwidth attributes in the answer must equal the ones in the offer, if present. If not present, a nonzero ptime may be added to the answer.
3.8.4.5 Offerer Processing of the Answer
When the offerer receives the answer, it may send media on the accepted stream(s) (assuming it is listed as sendrecv or recvonly in the answer). It must send using a media format listed in the answer, and it should use the first media format listed in the answer when it does send. The reason this is a SHOULD, and not a MUST (it is also a SHOULD, and not a MUST, for the answerer), is because there will oftentimes be a need to change codecs on the fly. For example, during silence periods, an agent might like to switch to a comfort noise codec. Or, if the user presses a number on the keypad,
the agent might like to send that using RFCs 4733, 4734, and 5244 (see Section 16.10). Congestion control might necessitate changing to a lower rate codec based on feedback. The offerer should send media according to the value of any ptime and bandwidth attribute in the answer. The offerer may immediately cease listening for media formats that were listed in the initial offer, but not present in the answer.
3.8.4.6 Modifying the Session
At any point during the session, either participant may issue a new offer to modify the characteristics of the session. It is fundamental to the operation of the offer–answer model that the exact same offer–answer procedure defined above is used for modifying the parameters of an existing session. The offer may be identical to the last SDP provided to the other party (which may have been provided in an offer or an answer), or it may be different. We refer to the last SDP provided as the previous SDP. If the offer is the same, the answer may be the same as the previous SDP from the answerer, or it may be different. If the offered SDP is different from the previous SDP, some constraints are placed on its construction, discussed below. Note that nearly all characteristics of a media stream, and nearly all aspects of the session, can be modified. New streams can be added, existing streams can be deleted, and parameters of existing streams can change. When issuing an offer that modifies the session, the o= line of the new SDP must be identical to that in the previous SDP, except that the version in the origin field must increment by one from the previous SDP.
If the version in the origin line does not increment, the SDP must be identical to the SDP with that version number. The answerer must be prepared to receive an offer that contains SDP with a version that has not changed; this is effectively a no-op. However, the answerer must generate a valid answer (which may be the same as the previous SDP from the answerer, or may be different), according to the procedures defined here. If an SDP is offered that is different from the previous SDP, the new SDP must have a matching media stream for each media stream in the previous SDP. In other words, if the previous SDP had N m= lines, the new SDP must have at least N m= lines. The i-th media stream in the previous SDP, counting from the top, matches the i-th media stream in the new SDP, counting from the top. This matching is necessary in order for the answerer to determine which stream in the new SDP corresponds to a stream in the previous SDP. Because of these requirements, the number of m= lines in the SDP never decreases, but either stays the same or increases. Deleted media streams from a previous SDP must not be removed in a new SDP; however, attributes for these streams need not be present.
3.8.4.6.1 Adding a Media Stream
New media streams are created by adding new media descriptions below the existing ones, or by reusing the slot used by an old media stream that had been disabled by setting its port to zero. Reusing its slot means that the new media description replaces the old one, but retains its positioning relative to other media descriptions in the SDP. New media descriptions must appear below any existing media sections. The rules for formatting these media descriptions are identical to those described earlier. When the answerer receives an SDP with more media descriptions than the previous SDP from the offerer, or it receives an SDP with a media stream in a slot where the port was previously zero, the answerer knows that new media streams are being added. These can be rejected or accepted by placing an appropriately structured media description in the answer. The procedures for constructing the new media description in the answer are described here.
3.8.4.6.2 Removing a Media Stream
Existing media streams are removed by creating a new SDP with the port number for that stream set to zero. The stream description may omit all attributes present previously, and may list just a single media format. A stream that is offered with a port of zero must be marked with port zero in the answer. Like the offer, the answer may omit all attributes present previously, and may list just a single media format from among those in the offer. Removal of a media stream implies that media is no longer sent for that stream, and any media that is received is discarded. In the case of RTP, RTCP transmission also ceases, as does processing of any received RTCP packets. Any resources associated with it can be released. The user interface might indicate that the stream has terminated, by closing the associated window on a personal computer (PC), for example.
3.8.4.6.3 Modifying Address, Port, or Transport of Media
The port number for a stream may be changed. To do this, the offerer creates a new media description, with the port number in the m= line different from the corresponding stream in the previous SDP. If only the port number is to be changed, the rest of the media stream description should remain unchanged. The offerer must be prepared to receive media on both the old and new ports as soon as the offer is sent. The offerer should not cease listening for media on the old port until the answer is received and media arrives on the new port. Doing so could result in loss of media during the transition. Received, in this case, means that the media is passed to a media sink. This means that if there is a playout
buffer, the agent would continue to listen on the old port until the media on the new port reached the top of the playout buffer. At that time, it may cease listening for media on the old port. The corresponding media stream in the answer may be the same as the stream in the previous SDP from the answerer, or it may be different. If the updated stream is accepted by the answerer, the answerer should begin sending traffic for that stream to the new port immediately.
If the answerer changes the port from the previous SDP, it must be prepared to receive media on both the old and new ports as soon as the answer is sent. The answerer must not cease listening for media on the old port until media arrives on the new port. At that time, it may cease listening for media on the old port. The same is true for an offerer that sends an updated offer with a new port; it must not cease listening for media on the old port until media arrives on the new port. Of course, if the offered stream is rejected, the offerer can cease being prepared to receive using the new port as soon as the rejection is received. To change the IP address where media is sent, the same procedure is followed as for changing the port number. The only difference is that the connection line is updated, not the port number. The transport for a stream may be changed. The process for doing this is identical to changing the port, except the transport is updated, not the port.
3.8.4.6.4 Changing the Set of Media Formats
The list of media formats used in the session may be changed. To do this, the offerer creates a new media description, with the list of media formats in the m= line different from the corresponding media stream in the previous SDP. This list may include new formats, and may remove formats present in the previous SDP. However, in the case of RTP, the mapping from a particular dynamic payload type number to a particular codec within that media stream must not change for the duration of a session. For example, if A generates an offer with G.711 [6] assigned to dynamic payload type number 46, payload type number 46 must refer to G.711 from that point forward in any offers or answers for that media stream within the session. However, it is acceptable for multiple payload type numbers to be mapped to the same codec, so that an updated offer could also use payload type number 72 for G.711. The mappings need to remain fixed for the duration of the session because of the loose synchronization between signaling exchanges of SDP and the media stream. The corresponding media stream in the answer is formulated as described here, and may result in a change in media formats as well.
Similarly, as described here, as soon as it sends its answer, the answerer must begin sending media using any formats in the offer that were also present in the answer, and should use the most preferred format in the offer that was also listed in the answer (assuming the stream allows for sending), and must not send using any formats that are not in the offer, even if they were present in a previous SDP from the peer. Similarly, when the offerer receives the answer, it must begin sending media using any formats in the answer, and should use the most preferred one (assuming the stream allows for sending), and must not send using any formats that are not in the answer, even if they were present in a previous SDP from the peer. When an agent ceases using a media format (by not listing that format in an offer or answer, even though it was in a previous SDP), the agent will still need to be prepared to receive media with that format for a brief time. How does it know when it can stop being prepared to receive with that format? If it needs to know, there are three techniques that can be applied. First, the agent can change ports in addition to changing formats. When media arrives on the new port, it knows that the peer has ceased sending with the old format, and it can cease being prepared to receive with it. This approach has the benefit of being media format independent. However, changes in ports may require changes in resource reservation or rekeying of security protocols. The second approach is to use a totally new set of dynamic payload types for all codecs when one is discarded. When media is received with one of the new payload types, the agent knows that the peer has ceased sending with the old format. This approach does not affect reservations or security contexts, but it is RTP specific and wasteful of a very small payload type space. A third approach is to use a timer. When the SDP from the peer is received, the timer is set. When it fires, the agent can cease being prepared to receive with the old format. A value of 1 minute would typically be more than sufficient. In some cases, an agent may not care and thus continually be prepared to receive with the old formats. Nothing needs to be done in this case. Of course, if the offered stream is rejected, the offerer can cease being prepared to receive using any new formats as soon as the rejection is received.
3.8.4.6.5 Changing Media Types
The media type (audio, video, etc.) for a stream may be changed. It is recommended that the media type be changed (as opposed to adding a new stream), when the same logical data is being conveyed, but just in a different media format. This is particularly useful for changing between voiceband fax and fax in a single stream, which are both separate media types [7]. To do this, the offerer creates a new media description, with a new media type, in place of the description in the previous SDP that is to be changed. The corresponding media stream in the answer is formulated as described here. Assuming the stream is acceptable, the answerer should begin sending with the new media type and formats as soon as it receives the offer. The offerer must be prepared to receive
media with both the old and new types until the answer is received, and media with the new type is received and reaches the top of the playout buffer.
3.8.4.6.6 Changing Media Attributes
Any other attributes in a media description may be updated in an offer or answer. Generally, an agent must send media (if the directionality of the stream allows) using the new parameters once the SDP with the change is received.
3.8.4.6.7 Putting a Unicast Media Stream on Hold
If a party in a call wants to put the other party on hold, that is, request that it temporarily stops sending one or more unicast media streams, a party offers the other an updated SDP. If the stream to be placed on hold was previously a sendrecv media stream, it is placed on hold by marking it as sendonly. If the stream to be placed on hold was previously a recvonly media stream, it is placed on hold by marking it inactive. This means that a stream is placed on hold separately in each direction. Each stream is placed on hold independently. The recipient of an offer for a stream on hold should not automatically return an answer with the corresponding stream on hold. An SDP with all streams on hold is referred to as held SDP.
Certain third-party call control scenarios do not work when an answerer responds to held SDP with held SDP. Typically, when a user presses hold, the agent will generate an offer with all streams in the SDP indicating a direction of sendonly, and it will also locally mute, so that no media is sent to the far end, and no media is played out. RFC 2543 (obsoleted by RFC 3261) specified that placing a user on hold was accomplished by setting the connection address to 0.0.0.0. Its usage for putting a call on hold is no longer recommended, since it does not allow for RTCP to be used with held streams, does not work with IPv6, and breaks with connection-oriented media. However, it can be useful in an initial offer when the offerer knows it wants to use a particular set of media streams and formats, but does not know the addresses and ports at the time of the offer. Of course, when used, the port number must not be zero, which would specify that the stream has been disabled. An agent must be capable of receiving SDP with a connection address of 0.0.0.0, in which case it means that neither RTP nor RTCP should be sent to the peer.
3.8.4.7 Indicating Capabilities
Before an agent sends an offer, it is helpful to know if the media formats in that offer would be acceptable to the answerer. Certain protocols, like SIP, provide a means to query for such capabilities. SDP can be used in responses to such queries to indicate capabilities. This section describes how such an SDP message is formatted. Since SDP has no way to indicate that the message is for the purpose of capability indication, this is determined from the context of the higher-layer protocol. The ability of baseline SDP to indicate capabilities is very limited. It cannot express allowed parameter ranges or values, and cannot be done in parallel with an offer–answer itself. Extensions might address such limitations in the future. An SDP constructed to indicate media capabilities is structured as follows. It must be a valid SDP, except that it may omit both e= and p= lines. The t= line must be equal to 0 0. For each media type supported by the agent, there must be a corresponding media description of that type. The session ID in the origin field must be unique for each SDP constructed to indicate media capabilities. The port must be set to zero, but the connection address is arbitrary. The usage of port zero makes sure that an SDP formatted for capabilities does not cause media streams to be established if it is interpreted as an offer or answer. The transport component of the m= line indicates the transport for that media type. For each media format of that type supported by the agent, there should be a media format listed in the m= line. In the case of RTP, if dynamic payload types are used, an rtpmap attribute must be present to bind the type to a specific format. There is no way to indicate constraints, such as how many simultaneous streams can be supported for a particular codec, and so on.
The SDP of Figure 3.11 indicates that the agent can support three audio codecs (PCMU, 1016, and GSM) and two video codecs (H.261 and H.263).

v=0
o=carol 28908764872 28908764872 IN IP4 100.3.6.6
s=-
t=0 0
c=IN IP4 192.0.2.4
m=audio 0 RTP/AVP 0 1 3
a=rtpmap:0 PCMU/8000
a=rtpmap:1 1016/8000
a=rtpmap:3 GSM/8000
m=video 0 RTP/AVP 31 34
a=rtpmap:31 H261/90000
a=rtpmap:34 H263/90000

Figure 3.11 SDP indicating capabilities.
3.8.4.8 Example Offer–Answer Exchanges
This section provides example offer–answer exchanges.
3.8.4.8.1 Basic Exchange
Assume that the caller, Alice, has included the following description in her offer. It includes a bidirectional audio stream and two bidirectional video streams, using H.261
(payload type 31) and MPEG (payload type 32). The offered SDP is

v=0
o=alice 2890844526 2890844526 IN IP4 host.anywhere.com
s=
c=IN IP4 host.anywhere.com
t=0 0
m=audio 49170 RTP/AVP 0
a=rtpmap:0 PCMU/8000
m=video 51372 RTP/AVP 31
a=rtpmap:31 H261/90000
m=video 53000 RTP/AVP 32
a=rtpmap:32 MPV/90000

The callee, Bob, does not want to receive or send the first video stream, so he returns the SDP below as the answer:

v=0
o=bob 2890844730 2890844730 IN IP4 host.example.com
s=
c=IN IP4 host.example.com
t=0 0
m=audio 49920 RTP/AVP 0
a=rtpmap:0 PCMU/8000
m=video 0 RTP/AVP 31
m=video 53000 RTP/AVP 32
a=rtpmap:32 MPV/90000

At some point later, Bob decides to change the port where he will receive the audio stream (from 49920 to 65422), and at the same time, add an additional audio stream as receive only, using the RTP payload format for events (RFC 4733). Bob offers the following SDP in the offer:

v=0
o=bob 2890844730 2890844731 IN IP4 host.example.com
s=
c=IN IP4 host.example.com
t=0 0
m=audio 65422 RTP/AVP 0
a=rtpmap:0 PCMU/8000
m=video 0 RTP/AVP 31
m=video 53000 RTP/AVP 32
a=rtpmap:32 MPV/90000
m=audio 51434 RTP/AVP 110
a=rtpmap:110 telephone-events/8000
a=recvonly

Alice accepts the additional media stream, and so generates the following answer:

v=0
o=alice 2890844526 2890844527 IN IP4 host.anywhere.com
s=
c=IN IP4 host.anywhere.com
t=0 0
m=audio 49170 RTP/AVP 0
a=rtpmap:0 PCMU/8000
m=video 0 RTP/AVP 31
a=rtpmap:31 H261/90000
m=video 53000 RTP/AVP 32
a=rtpmap:32 MPV/90000
m=audio 53122 RTP/AVP 110
a=rtpmap:110 telephone-events/8000
a=sendonly

3.8.4.9 One of N Codec Selection
A common occurrence in embedded phones is that the Digital Signal Processor (DSP) used for compression can support multiple codecs at a time; however, once that codec is selected, it cannot be readily changed on the fly. This example shows how a session can be set up using an initial offer–answer exchange, followed immediately by a second one to lock down the set of codecs. The initial offer from Alice to Bob indicates a single audio stream with the three audio codecs that are available in the DSP. The stream is marked as inactive, since media cannot be received until a codec is locked down:

v=0
o=alice 2890844526 2890844526 IN IP4 host.anywhere.com
s=
c=IN IP4 host.anywhere.com
t=0 0
m=audio 62986 RTP/AVP 0 4 18
a=rtpmap:0 PCMU/8000
a=rtpmap:4 G723/8000
a=rtpmap:18 G729/8000
a=inactive

Bob can support dynamic switching between PCMU and G.723. So, he sends the following answer:

v=0
o=bob 2890844730 2890844731 IN IP4 host.example.com
s=
c=IN IP4 host.example.com
t=0 0
m=audio 54344 RTP/AVP 0 4
a=rtpmap:0 PCMU/8000
a=rtpmap:4 G723/8000
a=inactive

Alice can then select any one of these two codecs. So, she sends an updated offer with a sendrecv stream:

v=0
o=alice 2890844526 2890844527 IN IP4 host.anywhere.com
s=
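
The unicast answer rules of Section 3.8.4.4.1 lend themselves to a mechanical check. The following is a minimal Python sketch, not taken from the RFCs or from any SIP library, applied to the offer and answer of the basic exchange in Section 3.8.4.8.1. The names ALLOWED_ANSWER_DIRECTIONS, parse_media, answer_formats, and check_answer are hypothetical, and the parser assumes well-formed SDP and looks only at m= lines and direction attributes.

```python
# Sketch of the unicast answer rules; every helper name here is hypothetical.
ALLOWED_ANSWER_DIRECTIONS = {
    "sendonly": {"recvonly", "inactive"},
    "recvonly": {"sendonly", "inactive"},
    "sendrecv": {"sendonly", "recvonly", "sendrecv", "inactive"},
    "inactive": {"inactive"},
}


def parse_media(sdp):
    """Return one dict per m= line with its type, port, formats, and direction."""
    streams = []
    session_direction = "sendrecv"  # default when no direction attribute is given
    for raw in sdp.strip().splitlines():
        line = raw.strip()
        if line.startswith("m="):
            media, port, _proto, *formats = line[2:].split()
            streams.append({
                "type": media,
                "port": int(port.split("/")[0]),  # tolerate the "port/count" form
                "formats": formats,
                "direction": None,
            })
        elif line.startswith("a=") and line[2:] in ALLOWED_ANSWER_DIRECTIONS:
            if streams:
                streams[-1]["direction"] = line[2:]  # media-level attribute
            else:
                session_direction = line[2:]         # session-level attribute
    for stream in streams:
        if stream["direction"] is None:
            stream["direction"] = session_direction
    return streams


def answer_formats(offered, supported):
    """Keep the offer's relative order for the formats the answerer supports."""
    return [fmt for fmt in offered if fmt in supported]


def check_answer(offer_sdp, answer_sdp):
    """Return a list of rule violations (empty when none were found)."""
    offer, answer = parse_media(offer_sdp), parse_media(answer_sdp)
    if len(offer) != len(answer):
        return ["answer must contain the same number of m= lines as the offer"]
    problems = []
    for i, (o, a) in enumerate(zip(offer, answer)):
        if o["port"] == 0 and a["port"] != 0:
            problems.append(f"stream {i}: offered with port zero, answer must use port zero")
            continue
        if a["port"] == 0:
            continue  # stream rejected; any formats listed are ignored
        if a["type"] != o["type"]:
            problems.append(f"stream {i}: media type differs from the offer")
        if a["direction"] not in ALLOWED_ANSWER_DIRECTIONS[o["direction"]]:
            problems.append(f"stream {i}: {a['direction']} not allowed for a {o['direction']} offer")
        if not set(a["formats"]) & set(o["formats"]):
            problems.append(f"stream {i}: no media format from the offer")
    return problems


OFFER = """\
v=0
o=alice 2890844526 2890844526 IN IP4 host.anywhere.com
s=
c=IN IP4 host.anywhere.com
t=0 0
m=audio 49170 RTP/AVP 0
a=rtpmap:0 PCMU/8000
m=video 51372 RTP/AVP 31
a=rtpmap:31 H261/90000
m=video 53000 RTP/AVP 32
a=rtpmap:32 MPV/90000
"""

ANSWER = """\
v=0
o=bob 2890844730 2890844730 IN IP4 host.example.com
s=
c=IN IP4 host.example.com
t=0 0
m=audio 49920 RTP/AVP 0
a=rtpmap:0 PCMU/8000
m=video 0 RTP/AVP 31
m=video 53000 RTP/AVP 32
a=rtpmap:32 MPV/90000
"""

print(check_answer(OFFER, ANSWER))                                       # []
print([i for i, s in enumerate(parse_media(ANSWER)) if s["port"] == 0])  # [1]
print(answer_formats(["8", "22", "48"], {"8", "48"}))                    # ['8', '48']
```

The second printed list identifies the rejected first video stream by its position, which is why the answer has to keep the same number of m= lines as the offer. The last call reproduces the ordering recommendation for codecs 8, 22, and 48.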
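
The constraints that Section 3.8.4.6 places on an offer that modifies a session (same o= line apart from a version incremented by one, and no fewer m= lines than before) can be sketched in the same way, together with the hold rule of Section 3.8.4.6.7. This is an illustrative Python sketch under stated assumptions: the names origin_fields, media_lines, check_modified_offer, and hold_direction are hypothetical, and leaving sendonly and inactive streams unchanged on hold is an assumption, since the text only covers sendrecv and recvonly.

```python
# Sketch of the session-modification constraints; helper names are hypothetical.
def origin_fields(sdp):
    """Split the o= line into (username, session id, version, remaining fields)."""
    for raw in sdp.strip().splitlines():
        line = raw.strip()
        if line.startswith("o="):
            username, sess_id, version, *rest = line[2:].split()
            return username, sess_id, int(version), tuple(rest)
    raise ValueError("SDP has no o= line")


def media_lines(sdp):
    """Return the m= lines of an SDP, in order."""
    return [l.strip() for l in sdp.strip().splitlines() if l.strip().startswith("m=")]


def check_modified_offer(previous_sdp, new_sdp):
    """Return a list of violations for an offer that modifies a session."""
    problems = []
    p_user, p_id, p_ver, p_rest = origin_fields(previous_sdp)
    n_user, n_id, n_ver, n_rest = origin_fields(new_sdp)
    if (p_user, p_id, p_rest) != (n_user, n_id, n_rest):
        problems.append("o= line may differ from the previous SDP only in its version")
    if n_ver == p_ver:
        if new_sdp.strip() != previous_sdp.strip():
            problems.append("version unchanged, so the SDP must be identical")
    elif n_ver != p_ver + 1:
        problems.append("version must increment by one")
    if len(media_lines(new_sdp)) < len(media_lines(previous_sdp)):
        problems.append("m= lines may be disabled with port zero but never removed")
    return problems


# Hold rule from the text; other directions are left unchanged (an assumption).
HOLD_DIRECTION = {"sendrecv": "sendonly", "recvonly": "inactive"}


def hold_direction(current):
    return HOLD_DIRECTION.get(current, current)


# Bob's answer and his later offer from the basic exchange (rtpmap lines omitted).
PREVIOUS = """\
v=0
o=bob 2890844730 2890844730 IN IP4 host.example.com
s=
c=IN IP4 host.example.com
t=0 0
m=audio 49920 RTP/AVP 0
m=video 0 RTP/AVP 31
m=video 53000 RTP/AVP 32
"""

NEW = """\
v=0
o=bob 2890844730 2890844731 IN IP4 host.example.com
s=
c=IN IP4 host.example.com
t=0 0
m=audio 65422 RTP/AVP 0
m=video 0 RTP/AVP 31
m=video 53000 RTP/AVP 32
m=audio 51434 RTP/AVP 110
a=recvonly
"""

print(check_modified_offer(PREVIOUS, NEW))  # []
print(hold_direction("sendrecv"))           # sendonly
```

The disabled video stream stays in its slot with port zero in both descriptions, and the added audio stream appears below the existing media descriptions, so the positional matching of streams is preserved.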
3261 (see Section 3.3.1.3), if a re-INVITE is rejected, no state changes are performed. These state changes include state changes associated to the re-INVITE transaction and all other transactions within the re-INVITE (this section deals with changes to the session state; target refreshes are discussed in the next section). That is, the session state is the same as before the re-INVITE was received. The example in Figure 3.12 illustrates this point.

Message flow between UAC and UAS: F1. INVITE SDP1; F2. 200 OK SDP2; F3. ACK; F4. INVITE SDP3; F5. 4xx; F6. ACK
Figure 3.12 Rejection of a re-INVITE. (Copyright IETF. Reproduced with permission.)

The UAs perform an offer–answer exchange to establish an audio-only session:

SDP1:
m=audio 30000 RTP/AVP 0

SDP2:

At a later point, the UAC sends a re-INVITE (4) in order to add a video stream to the session.

SDP3:

there are situations where a UAS wants to accept some but not all the changes requested in a re-INVITE. In these cases, the UAS generates a 200 OK response with a SDP indicating which changes were accepted and which were not. The example in Figure 3.13 illustrates this point.

The UAs perform an offer–answer exchange to establish an audio-only session:

SDP1:
m=audio 30000 RTP/AVP 0
c=IN IP4 192.0.2.1

SDP2:
m=audio 31000 RTP/AVP 0
c=IN IP4 192.0.2.5

At a later point, the UAC moves to an access that provides a higher bandwidth. Therefore, the UAC sends a re-INVITE (F4) in order to change the IP address where it receives the audio stream to its new IP address and add a video stream to the session.

SDP3:
m=audio 30000 RTP/AVP 0
c=IN IP4 192.0.2.2
m=video 30002 RTP/AVP 31
c=IN IP4 192.0.2.2

The UAS is automatically configured to reject video streams. However, the UAS needs to accept the change of the audio stream’s remote IP address. Consequently, the UAS returns a 200 OK response and sets the port of the video stream to zero in its SDP.
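The partial acceptance described for Figure 3.13 (keep the audio stream, refuse the video stream by setting its port to zero) can be illustrated with a minimal sketch. The function name `answer_partial` and its arguments are hypothetical and only the m= lines are handled; a real answer would also carry the session-level lines and the c= lines. The local audio port 31000 is taken from SDP2 above.

```python
def answer_partial(offer_media, accepted_types, local_ports):
    """Build answer m= lines: keep accepted media types, reject the rest.

    The answer keeps one m= line per offered m= line, in the same order.
    """
    answer = []
    for line in offer_media:
        # An m= line has the form: m=<media> <port> <proto> <fmt> ...
        media, _offered_port, proto, *fmts = line[len("m="):].split()
        if media in accepted_types:
            port = local_ports[media]
        else:
            port = 0  # port zero marks the stream as rejected
        answer.append(f"m={media} {port} {proto} {' '.join(fmts)}")
    return answer


offer = ["m=audio 30000 RTP/AVP 0", "m=video 30002 RTP/AVP 31"]
answer = answer_partial(offer, {"audio"}, {"audio": 31000})
print("\n".join(answer))
```

With this input the sketch keeps the audio line with the answerer's own port and returns the video line with port 0, which is how the UAS in the example signals that the video stream was not accepted.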
3.8.5.1.3 UAS Behavior

UASs should only return an error response to a re-INVITE if no changes to the session state have been executed since the re-INVITE was received. Such an error response indicates that no changes have been executed as a result of the re-INVITE or any other transaction within it. If any of the changes requested in a re-INVITE or in any transaction within it have already been executed, the UAS should return a 2xx response. A change to the session state is considered to have been executed if an offer–answer without preconditions (RFC 4032, see Sections 3.8.3 and 15.4.12) for the stream has completed successfully, or the UA has sent or received media using the new parameters. Connection establishment messages (e.g., TCP SYN), connectivity checks, for example, when using Interactive Connectivity Establishment (ICE) (RFC 5245, see Section 14.3), and any other messages used in the process of meeting the preconditions for a stream are not considered media. Normally, a UA receiving media can easily detect when the new parameters for the media stream are used (e.g., media is received on a new port). However, in some scenarios, the UA will have to process incoming media packets to detect whether they use the old or new parameters.

The successful completion of an offer–answer exchange without preconditions indicates that the new parameters for the media stream are already considered to be in use. The successful completion of an offer–answer exchange with preconditions means something different. The fact that all mandatory preconditions for the stream are met indicates that the new parameters for the media stream are ready to be used. However, they will not actually be used until the UAS decides to use them. During a session establishment, the UAS can wait before using the media parameters until the callee starts being alerted or until the callee accepts the session. During a session modification, the UAS can wait until its user accepts the changes to the session. When dealing with streams where the UAS sends media more or less continuously, the UAC notices that the new parameters are in use because the UAC receives media that uses the new parameters. However, this mechanism does not work with other types of streams. Therefore, it is recommended that when a UAS decides to start using the new parameters for a stream for which all mandatory preconditions have been met, the UAS either sends media using the new parameters or sends a new offer where the precondition-related attributes for the stream have been removed. As indicated above, the successful completion of an offer–answer exchange without preconditions indicates that the new parameters for the media stream are already considered to be in use.

3.8.5.1.4 UAC Behavior

A UAC that receives an error response to a re-INVITE that undoes already executed changes within the re-INVITE may be facing a legacy UAS that does not support this specification (i.e., a UAS that does not follow the guidelines described earlier). There are also certain race condition situations that get both UAs out of synchronization. To cope with these race condition situations, a UAC that receives an error response to a re-INVITE for which changes have been already executed should generate a new re-INVITE or UPDATE request in order to make sure that both UAs have a common view of the state of the session (the UAC uses the criteria described in the earlier section in order to decide whether or not changes have been executed for a particular stream). The purpose of this new offer–answer exchange is to synchronize both UAs, not to request changes that the UAS may choose to reject. Therefore, session parameters in the offer–answer exchange should be as close to those in the pre-re-INVITE state as possible.

3.8.5.1.5 Glare Situations

RFC 3264 (see Section 3.8.4) defines glare conditions as a UA receiving an offer after having sent one but before having received an answer to it. That section specifies rules to avoid glare situations in most cases. When, despite following those rules, a glare condition occurs (as a result of a race condition), it is handled as specified in RFC 3261 (see Sections 3.8.1 and 3.8.2). The UAS returns a 491 Request Pending response and the UAC retries the offer after a randomly selected time, which depends on which UA is the owner of the Call-ID of the dialog. The rules in RFC 3261 not only cover collisions between re-INVITEs that contain offers, but they also cover collisions between two re-INVITEs in general, even if they do not contain offers. RFC 3311 (see Section 3.8.3) extends those rules to also cover collisions between an UPDATE request carrying an offer and another message (UPDATE, PRACK, or INVITE) also carrying an offer. The rules in RFC 3261 do not cover collisions between an UPDATE request and a non-2xx final response to a re-INVITE. Since both the UPDATE request and the reliable response could be requesting changes to the session state, it would not be clear which changes would need to be executed first. However, the procedures discussed in the earlier section already cover this type of situation. Therefore, there is no need to specify further rules here.

3.8.5.1.6 Example of UAS Behavior

This section contains an example of a UAS that implements this specification using an UPDATE request and a 2xx response to a re-INVITE in order to revert to the pre-re-INVITE state. The example shown in Figure 3.15 assumes that the UAS requires its user’s input in order to accept or reject the addition of a video stream and uses reliable provisional responses (RFC 3262, see Sections 2.5, 2.8, and 2.10) (PRACK transactions are not shown for clarity).
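The retry timing after a 491 Request Pending response, mentioned in Section 3.8.5.1.5, can be sketched as follows. The numeric ranges are not given in the text above; they are taken from RFC 3261, Section 14.1 (2.1 to 4 seconds for the owner of the Call-ID, 0 to 2 seconds otherwise, both in units of 10 ms). The function name `glare_retry_delay` is hypothetical.

```python
import random


def glare_retry_delay(owns_call_id, rng=random):
    """Return the wait, in seconds, before retrying an offer after a 491.

    owns_call_id is True when this UA generated the Call-ID of the dialog.
    """
    if owns_call_id:
        # 2.1 s to 4.0 s, chosen in 10 ms steps
        return rng.randint(210, 400) / 100
    # 0 s to 2.0 s, chosen in 10 ms steps
    return rng.randint(0, 200) / 100


print(glare_retry_delay(True))
print(glare_retry_delay(False))
```

Because the two ranges do not overlap, the two UAs will not normally retry at the same moment, so the glare is not simply repeated.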
At a later point, the UAC sends a re-INVITE (4) in order to add a new codec to the audio stream and to add a video stream to the session.

SDP3:
m=audio 30000 RTP/AVP 0 3
c=IN IP4 192.0.2.1
m=video 30002 RTP/AVP 31
c=IN IP4 192.0.2.1

In (F5), the UAS accepts the addition of the audio codec but does not accept the video stream yet (it provides a null

The UAC accepts the change in the audio codec in its 200 OK response (F9) to the UPDATE request.

SDP8:
m=audio 30000 RTP/AVP 0
c=IN IP4 192.0.2.1
m=video 0 RTP/AVP 31
c=IN IP4 192.0.2.1

The UAS now returns a 200 OK response (F10) to the re-INVITE. Note that the media state after this 200 (OK) response is the same as the pre-re-INVITE media state.
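The example follows the UAS rule from Section 3.8.5.1.3: once any requested change has been executed, the re-INVITE is answered with a 2xx response, and any unwanted change is undone by a separate offer–answer exchange. A minimal sketch of that decision is shown below. The function names and the three-flag representation of a stream are assumptions made for illustration only.

```python
def change_executed(offer_answer_done, uses_preconditions, media_with_new_params):
    """Apply the 'executed' criteria to one stream.

    A change counts as executed if an offer-answer exchange without
    preconditions completed, or media was sent or received using the
    new parameters.
    """
    return (offer_answer_done and not uses_preconditions) or media_with_new_params


def final_response_class(wants_to_reject, streams):
    """Return 'error' or '2xx' as the class of final response to the re-INVITE.

    streams is a list of (offer_answer_done, uses_preconditions,
    media_with_new_params) tuples, one per stream.
    """
    any_executed = any(change_executed(*stream) for stream in streams)
    if wants_to_reject and not any_executed:
        return "error"
    # Either the UAS accepts, or something was already executed and the
    # unwanted part has to be reverted with a new offer-answer exchange.
    return "2xx"


# Audio codec change already completed, video stream refused by the user.
print(final_response_class(True, [(True, False, False), (False, False, False)]))
```

In the flow above this corresponds to the UAS first sending an UPDATE request to revert the session and only then returning the 200 OK response (F10).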
Figure 3.16 Message flow with race condition. (Copyright IETF. Reproduced with permission.)
address), the UA simply communicates its new local target to the remote UA (e.g., the UA communicates its new IP address to the remote UA to remain reachable by the remote UA). UAs need to follow the behavior specified in the next few sections of this specification (RFC 6141) instead of that specified in RFC 3261, which was discussed in the earlier section. The new behavior regarding target refresh requests implies that a target refresh request can, in some cases, update the remote target even if the request is responded to with a final error response. This means that target refresh requests are not atomic.

3.8.5.2.4 UA Updating the Dialog’s Local Target in a Request

To update its local target, a UA can send a target refresh request. If the UA receives an error response to the target refresh request, the remote UA has not updated its remote target. This allows UASs to authenticate target refresh requests (RFC 3261, see Section 19.4). If the UA receives a reliable provisional response or a 2xx response to the target refresh request, or the UA receives an in-dialog request on the new local target, the remote UA has updated its remote target. The UA can consider the target refresh operation completed. Even if the target request was a re-INVITE and the final response to the re-INVITE was an error response, the UAS would not revert to the pre-re-INVITE remote target.

A UA should not use the same target refresh request to refresh the target and to make session changes unless the session changes can be trivially accepted by the remote UA (e.g., an IP address change). Piggybacking a target refresh with more complicated session changes would make it unnecessarily complicated for the remote UA to accept the target refresh while rejecting the session changes. Only in case the target refresh request is a re-INVITE and the UAS supports reliable provisional response or UPDATE requests, the UAC may piggyback session changes and a target refresh in the same re-INVITE.

3.8.5.2.5 UA Updating the Dialog’s Local Target in a Response

A UA processing an incoming target refresh request can update its local target by returning a reliable provisional response or a 2xx response to the target refresh request. The response needs to contain the updated local target URI in its Contact header field. On sending the response, the UA can consider the target refresh operation completed.

3.8.5.2.6 Request Updating the Dialog’s Remote Target

The behavior of a UA after having received a target refresh request updating the remote target is as follows: if the UA receives a target refresh request that has been properly authenticated (RFC 3261, see Section 19.4), the UA should generate a reliable provisional response or a 2xx response to the target refresh request. If generating such responses is not possible (e.g., the UA does not support reliable provisional responses and needs user input before generating a final response), the UA should send an in-dialog request to the remote UA using the new remote target (if the UA does not need to send a request for other reasons, the UAS can send an UPDATE request). On sending a reliable provisional response or a 2xx response to the target refresh request, or a request to the new remote target, the UA must replace the dialog’s remote target URI with the URI from the Contact header field in the target refresh request.

Reliable provisional responses in SIP are specified in RFC 3262 (see Sections 2.5, 2.8, and 2.10). In this document, reliable provisional responses are those that use the mechanism defined in RFC 3262. Other specifications may define ways to send provisional responses reliably using non-SIP mechanisms (e.g., using media-level messages to acknowledge the reception of the SIP response). For the purposes of this document, provisional responses using those non-SIP mechanisms are considered unreliable responses. Note that non-100 provisional responses are only applicable to INVITE transactions (RFC 4320, see Section 3.12.2.5). If instead of sending a reliable provisional response or a 2xx response to the target refresh request, or a request to the new target, the UA generates an error response to the target refresh request, the UA must not update its dialog’s remote target.

3.8.5.2.7 Response Updating the Dialog’s Remote Target

If a UA receives a reliable provisional response or a 2xx response to a target refresh request, the UA must replace the dialog’s remote target URI with the URI from the Contact header field in that response, if present. If a UA receives an unreliable provisional response to a target refresh request, the UA must not refresh the dialog’s remote target.

3.8.5.2.8 Race Conditions and Target Refreshes

SIP provides request ordering by using the CSeq header field. That is, a UA that receives two requests at roughly the same time can know which one is newer. However, SIP does not provide ordering between responses and requests. For example, if a UA receives a 200 OK response to an UPDATE request and an UPDATE request at roughly the same time, the UA cannot know which one was sent last. Since both messages can refresh the remote target, the UA needs to know which message was sent last in order to know which remote target needs to be used.
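As a rough illustration of why the order matters, the sketch below applies the remote-target update rules of Sections 3.8.5.2.6 and 3.8.5.2.7 to the same two messages in both processing orders. The dictionary layout, the function names, and the Contact URIs are hypothetical; the `accepted` flag stands for "the UA answered the target refresh request with a reliable provisional or 2xx response."

```python
def refreshes_remote_target(msg):
    """Return True if this message updates the dialog's remote target."""
    if msg["kind"] == "request":
        # A target refresh request takes effect once the UA answers it with
        # a reliable provisional or 2xx response (modelled as 'accepted').
        return msg["accepted"]
    status = msg["status"]
    if 200 <= status < 300:
        return True
    # Provisional responses count only when they are sent reliably.
    return 100 < status < 200 and msg["reliable"]


def final_remote_target(current, messages):
    """Apply messages in processing order and return the resulting target."""
    for msg in messages:
        if refreshes_remote_target(msg) and msg.get("contact"):
            current = msg["contact"]
    return current


ok_200 = {"kind": "response", "status": 200, "reliable": False,
          "contact": "sip:bob@192.0.2.10"}
update = {"kind": "request", "accepted": True,
          "contact": "sip:bob@192.0.2.20"}

print(final_remote_target("sip:bob@192.0.2.1", [ok_200, update]))
print(final_remote_target("sip:bob@192.0.2.1", [update, ok_200]))
```

The two calls end with different remote targets, and nothing in the messages themselves tells the UA which processing order matches the order in which they were sent.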
RFC 6141 specifies the following rule to avoid the situation just described. If the protocol allows a UA to use a target refresh request at the point in time that the UA wishes to refresh its local target, the UA must use a target refresh request instead of a response to refresh its local target. This rule implies that a UA only uses a response (i.e., a reliable provisional response or a 2xx response to a target refresh request) to refresh its local target if the UA is unable to use a target refresh request at that point in time (e.g., the UAS of an ongoing re-INVITE without support for UPDATE).

3.8.5.2.9 Early Dialogs

The rules given in this section about which messages can refresh the target of a dialog also apply to early dialogs created by an initial INVITE transaction. Additionally, as specified in RFC 3261 (see Section 3.7.2.2.4), on receiving a 2xx response to the initial INVITE, the UAC recomputes the whole route set of the dialog, which transitions from the early state to the confirmed state. RFC 3261 (see Section 3.6.1) allows unreliable provisional responses to create early dialogs. However, per the rules given in this section, unreliable provisional responses cannot refresh the target of a dialog. Therefore, the UAC of an initial INVITE transaction will not perform any target refresh as a result of the reception of an unreliable provisional response with an updated Contact value on an (already established) early dialog. Note also that a given UAS can establish additional early dialogs, which can have different targets, by returning additional unreliable provisional responses with different To tags.

3.8.5.3 A UA Losing Its Contact

The following sections discuss the case where a UA loses its transport address during an ongoing re-INVITE transaction. Such a UA will refresh the dialog’s local target so that it reflects its new transport address. Note that target refreshes that do not involve changes in the UA’s transport address are beyond the scope of this section. Also, UAs losing their transport address during a non-re-INVITE transaction (e.g., a UA losing its transport address right after having sent an UPDATE request before having received a response to it) are beyond the scope as well. The rules given in this section are also applicable to initial INVITE requests that have established early dialogs.

3.8.5.3.1 Background on Re-INVITE Transaction Routing

Re-INVITEs are routed using the dialog’s route set, which contains all the proxy servers that need to be traversed by requests sent within the dialog. Responses to the re-INVITE are routed using the Via entries in the re-INVITE. ACK requests for 2xx responses and for non-2xx final responses are generated in different ways. As specified in RFC 3261 (see Sections 3.8.1 and 3.7.2.1), ACK requests for 2xx responses are generated by the UAC core and are routed using the dialog’s route set. As specified in RFC 3261 (see Section 3.12), ACK requests for non-2xx final responses are generated by the INVITE client transaction (i.e., they are generated in a hop-by-hop fashion by the proxy servers in the path) and are sent to the same transport address as the re-INVITE.

3.8.5.3.2 Problems with UAs Losing Their Contact

Refreshing the dialog’s remote target during a re-INVITE transaction as described earlier presents some issues because of the fact that re-INVITE transactions can be long lived. As described in the previous section, the way responses to the re-INVITE and ACKs for non-2xx final responses are routed is fixed once the re-INVITE is sent. The routing of these messages does not depend on the dialog’s route set and, thus, target refreshes within an ongoing re-INVITE do not affect their routing. A UA that changes its location (i.e., performs a target refresh) but is still reachable at its old location will be able to receive those messages (which will be sent to the old location). However, a UA that cannot be reachable at its old location any longer will not be able to receive them.

The following sections describe the errors UAs face when they lose their transport address during a re-INVITE. On detecting some of these errors, UAs following the rules specified in RFC 3261 will terminate the dialog. When the dialog is terminated, the only option for the UAs is to establish a new dialog. The following sections change the requirements RFC 3261 places on UAs when certain errors occur so that the UAs can recover from those errors. In short, the UAs generate a new re-INVITE transaction to synchronize both UAs. Note that there are existing UA implementations deployed that already implement this behavior.

3.8.5.3.3 UAS Losing Its Contact: UAC Behavior

When a UAS that moves to a new contact and loses its old contact generates a non-2xx final response to the re-INVITE, it will not be able to receive the ACK request. The entity receiving the response, and thus generating the ACK request, will either get a transport error or a timeout error, which, as described in Section 8.1.3.1 of RFC 3261 (see Section 3.1.2.3.1), will be treated as a 503 (Service Unavailable) response and as a 408 Request Timeout response, respectively. If the sender of the ACK request is a proxy server, it will typically ignore this error. If the sender of the ACK request is the UAC, according to RFC 3261 (see Section 3.6.2.1.2), it is supposed to (at the should level) terminate the dialog by sending a BYE request.
However, because of the special properties of ACK requests for non-2xx final responses, most existing UACs do not terminate the dialog when an ACK request fails, which is fortunate. A UAC that accepts a target refresh within a re-INVITE MUST ignore transport and timeout errors when generating an ACK request for a non-2xx final response. Additionally, the UAC should generate a new re-INVITE in order to make sure that both UAs have a common view of the state of the session. It is possible that the errors ignored by the UAC were not related to the target refresh operation. If that was the case, the second re-INVITE would fail and the UAC would terminate the dialog because, per the rules above, UACs only ignore errors when they accept a target refresh within the re-INVITE.

3.8.5.3.4 UAC Losing Its Contact: UAS Behavior

When a UAC moves to a new contact and loses its old contact, it will not be able to receive responses to the re-INVITE. Consequently, it will never generate an ACK request. As described in RFC 3261 (see Section 3.11.9), a proxy server that gets an error when forwarding a response does not take any measures. Consequently, proxy servers relaying responses will effectively ignore the error. If there are no proxy servers in the dialog’s route set, the UAS will get an error when sending a non-2xx final response. The UAS core will be notified of the transaction failure, as described in RFC 3261 (see Section 3.12). Most existing UASs do not terminate the dialog on encountering this failure, which is fortunate. Regardless of the presence or absence of proxy servers in the dialog’s route set, a UAS generating a 2xx response to the re-INVITE will never receive an ACK request for it. According to RFC 3261 (see Section 3.8.2), such a UAS is supposed to (at the SHOULD level) terminate the dialog by sending a BYE request. A UAS that accepts a target refresh within a re-INVITE and never receives an ACK request after having sent a final response to the re-INVITE should not terminate the dialog if the UA has received a new re-INVITE with a higher CSeq sequence number than the original one.

3.8.5.3.5 UAC Losing Its Contact: UAC Behavior

When a UAC moves to a new contact and loses its old contact, it will not be able to receive responses to the re-INVITE. Consequently, it will never generate an ACK request. Such a UAC should generate a CANCEL request to cancel the re-INVITE and cause the INVITE client transaction corresponding to the re-INVITE to enter the Terminated state. The UAC should also send a new re-INVITE in order to make sure that both UAs have a common view of the state of the session. Per RFC 3261 (see Section 3.8.2), the UAS will accept new incoming re-INVITEs as soon as it has generated a final response to the previous INVITE request, which had a lower CSeq sequence number.

3.9 Handling Message Body

3.9.1 Objective

Message-body handling in SIP was originally specified in RFC 3261, which relied on earlier specifications (e.g., MIME) to describe some areas. RFC 5621, which is described here, contains background material on how bodies are handled in SIP, and normative material on areas that had not been specified before or whose specifications needed to be completed. Sections containing background material are clearly identified as such by their titles. The material in the normative sections is based on experience gained since RFC 3261 was written. Implementers need to implement what is specified in RFC 3261 and its references in addition to what is specified in RFC 5621.

3.9.2 Message-Body Encoding

This section deals with the encoding of message bodies in SIP.

3.9.2.1 Background on Message-Body Encoding

SIP (RFC 3261, see Section 2.4.2.4) messages consist of an initial line (request line in requests and status line in responses), a set of header fields, and an optional message body. The message body is described using header fields such as Content-Disposition, Content-Encoding, and Content-Type, which provide information on its contents. Figure 3.17 shows a SIP message that carries a body. Some of the header fields are not shown for simplicity.

The message body of a SIP message can be divided into various body parts. Multipart message bodies are encoded using the MIME (RFC 2045) format. Body parts are also described using header fields such as Content-Disposition, Content-Encoding, and Content-Type, which provide information on the contents of a particular body part. Figure 3.18

INVITE sip:[email protected] SIP/2.0
Content-Type: application/sdp
Content-Length: 192

v=0
o=alice 2890844526 2890842807 IN IP4 atlanta.example.com
s=-
c=IN IP4 192.0.2.1
t=0 0
m=audio 20000 RTP/AVP 0
a=rtpmap:0 PCMU/8000
m=video 20002 RTP/AVP 31
a=rtpmap:31 H261/90000

Figure 3.17 SIP message carrying a body.
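The layout shown in Figure 3.17 (initial line, header fields, an empty line, then the body) can be illustrated with a minimal parsing sketch. It is an illustration only and not a complete SIP parser: header folding, compact header names, and repeated header fields are ignored. The function name `split_sip_message` and the Request-URI in the sample message are hypothetical.

```python
def split_sip_message(raw: bytes):
    """Split a SIP message into its initial line, header fields, and body."""
    # The first empty line (CRLF CRLF) separates the header fields from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("utf-8").split("\r\n")
    start_line, header_lines = lines[0], lines[1:]
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


sdp = b"v=0\r\ns=-\r\nt=0 0\r\n"
raw = (b"INVITE sip:bob@example.com SIP/2.0\r\n"
       b"Content-Type: application/sdp\r\n"
       b"Content-Length: " + str(len(sdp)).encode("ascii") + b"\r\n"
       b"\r\n" + sdp)

start_line, headers, body = split_sip_message(raw)
print(start_line)
print(headers["content-type"], headers["content-length"], len(body))
```

The Content-Type header field tells the receiver how to interpret the body, and the Content-Length value matches the number of bytes that follow the empty line.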
communicating with a legacy SIP UA that predates this specification. It has been observed in the field that a number of legacy SIP UAs without support for multipart bodies simply ignored those bodies when they were received. These UAs did not return any error response. Unsurprisingly, SIP UAs not being able to report this type of error have caused serious interoperability problems in the past.

3.9.3.3 UA Behavior to Generate Multipart Message Bodies

UAs should avoid unnecessarily nesting body parts because doing so would, unnecessarily, make processing the body more laborious for the receiver. However, RFC 2046 states that a multipart media type with a single body part is useful in some circumstances (e.g., for sending nontext media types). In any case, UAs should not nest one multipart/mixed within another unless there is a need to reference the nested one (i.e., using the Content-ID of the nested body part). Moreover, UAs should not nest one multipart/alternative within another. Note that UAs receiving unnecessarily nested body parts treat them as if they were not nested.

3.9.4 Message Bodies: Multipart/Mixed

This section does not specify any additional behavior regarding how to generate and process multipart/mixed bodies. This section is simply included for completeness.

mapping between any two parts is not necessarily without information loss. For example, information can be lost when translating text/html to text/plain. RFC 2046 recommends that each part should have a different Content-ID value in the case where the information content of the two parts is not identical.

3.9.5.2 UA Behavior to Generate Multipart/Alternative Message Bodies

RFC 5621 mandates all the top-level body parts within a multipart/alternative to have the same disposition type. The session and early-session (RFC 3959, see Section 11.4) disposition types require that all the body parts of a multipart/alternative body have different content types. Consequently, for the session and early-session disposition types, UAs must not place more than one body part with a given content type in a multipart/alternative body. That is, for session and early-session, no body part within a multipart/alternative can have the same content type as another body part within the same multipart/alternative.

3.9.5.3 UA Behavior to Process Multipart/Alternative Message Bodies

RFC 5621 does not specify any additional behavior regarding how to process multipart/alternative bodies. We have included this section simply for completeness.
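The content-type rule from Section 3.9.5.2 can be checked mechanically. The sketch below assumes that the caller already knows the disposition type shared by the body parts and has the Content-Type value of each top-level part. The function name `alternative_parts_ok` is hypothetical, and `application/x-example` is a made-up content type used only to have a second, different value.

```python
def alternative_parts_ok(disposition, part_content_types):
    """Check the content-type rule for a multipart/alternative body.

    For the session and early-session disposition types, no two body
    parts may share a content type. Other disposition types are not
    restricted by this particular rule.
    """
    if disposition not in ("session", "early-session"):
        return True
    # Compare media types only: drop parameters and ignore letter case.
    media_types = [ct.split(";")[0].strip().lower() for ct in part_content_types]
    return len(media_types) == len(set(media_types))


print(alternative_parts_ok("session", ["application/sdp", "application/x-example"]))
print(alternative_parts_ok("session", ["application/sdp", "Application/SDP; charset=utf-8"]))
```

The first call passes because the two parts differ in content type; the second fails because both parts are application/sdp once parameters and letter case are ignored.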
3.9.6.2 UA Behavior to Generate Multipart/Related Message Bodies

RFC 5621 does not specify any additional behavior regarding how to generate multipart/related bodies. We have included this section simply for completeness.

3.9.6.3 UA Behavior to Process Multipart/Related Message Bodies

Per RFC 2387, a UA processing a multipart/related body processes the body as a compound object ignoring the disposition types of the body parts within it. Ignoring the disposition types of the individual body parts makes sense in the context in which multipart/related was originally specified. For instance, in the example of the web page, the implicit disposition type for the images would be inline, since the images are displayed as indicated by the root html file. However, in SIP, the disposition types of the individual body parts within a multipart/related play an important role and, thus, need to be considered by the UA processing the multipart/related.

Different SIP extensions that use the same disposition type for the multipart/related body can be distinguished by the disposition types of the individual body parts within the multipart/related. Consequently, SIP UAs processing a multipart/related body with a given disposition type must process the disposition types of the body parts within it according to the SIP extension making use of the disposition type of the multipart/related body. Note that UAs that do not understand multipart/related will treat multipart/related bodies as multipart/mixed bodies. These UAs will not be able to process a given body as a compound object. Instead, they will process the body parts according to their disposition type as if each body part was independent from each other.

3.9.7 Disposition Types

This section deals with disposition types in message bodies.

3.9.7.1 Background on Content and Disposition Types in SIP

The Content-Disposition header field (see Section 2.8.2), defined in RFC 2183 and extended by RFC 3261, describes how to handle a SIP message’s body or an individual body part. Examples of disposition types used in SIP in the Content-Disposition header field are session and render. RFCs 3204 and 3459 define the handling parameter for the Content-Disposition header field. This parameter describes how a UAS reacts if it receives a message body whose content type or disposition type it does not understand. If the parameter has the value optional, the UAS ignores the message body; if the parameter has the value required, the UAS returns a 415 Unsupported Media Type response. The default value for the handling parameter is required. The following is an example of a Content-Disposition header field:

Content-Disposition: signal; handling=optional

RFC 3204 identifies two situations where a UAS needs to reject a request with a body part whose handling is required:

◾ If it has an unknown content type
◾ If it has an unknown disposition type

If the UAS did not understand the content type of the body part, the UAS can add an Accept header field to its 415 Unsupported Media Type response listing the content types that the UAS does understand. Nevertheless, there is no mechanism for a UAS that does not understand the disposition type of a body part to inform the UAC about which disposition type was not understood or about the disposition types that are understood by the UAS. The reason for not having such a mechanism is that disposition types are typically supported within a context. Outside that context, a UA need not support the disposition type. For example, a UA can support the session disposition type for body parts in INVITE and UPDATE requests and their responses. However, the same UA would not support the session disposition type in MESSAGE requests.

In another example, a UA can support the render disposition type for text/plain and text/html body parts in MESSAGE requests. In addition, the UA can support the session disposition type for application/sdp body parts in INVITE and UPDATE requests and their responses. However, the UA might not support the render disposition type for application/sdp body parts in MESSAGE requests, even if, in different contexts, the UA supported all of the following: the render disposition type, the application/sdp content type, and the MESSAGE method.

A given context is generally (but not necessarily) defined by a method, a disposition type, and a content type. Support for a specific context is usually defined within an extension. For example, the extension for instant messaging in SIP RFC 3428 (see Sections 2.5 and 6.3.1) mandates support for the MESSAGE method, the render disposition type, and the text/plain content type. Note that, effectively, content types are also supported within a context. Therefore, the use of the Accept header field in a 415 Unsupported Media Type response is not enough to describe in which contexts a particular content type is supported. Therefore, support for a particular disposition type within a given context is typically signaled by the use of a particular method or an option tag in a Supported or a Require header field. When
support for a particular disposition type within a context is mandated, support for a default content type is also mandated (e.g., a UA that supports the session disposition type in an INVITE request needs to support the application/sdp content type).

3.9.7.2 UA Behavior to Set the Handling Parameter

As stated earlier, the handling Content-Disposition parameter can take two values: required or optional. While it is typically easy for a UA to decide which type of handling an individual body part requires, setting the handling parameter of multipart bodies requires extra considerations.

If the handling of a multipart/mixed body as a whole is required for processing its enclosing body part or message, the UA must set the handling parameter of the multipart/mixed body to required. Otherwise, the UA must set it to optional. The handling parameters of the top-level body parts within the multipart/mixed body are set independently from the handling parameter of the multipart/mixed body. If the handling of a particular top-level body part is required, the UA must set the handling parameter of that body part to required. Otherwise, the UA must set it to optional.

Per the previous rules, a multipart/mixed body whose handling is optional can contain body parts whose handling is required. In such a case, the receiver is required to process the body parts whose handling is required if and only if the receiver decides to process the optional multipart/mixed body. Also per the previous rules, a multipart/mixed body whose handling is required can contain only body parts whose handling is optional. In such a case, the receiver is required to process the body as a whole; however, when processing it, the receiver may decide (on the basis of its local policy) not to process any of the body parts. The handling parameter is a Content-Disposition parameter.

Therefore, to set this parameter, it is necessary to provide the multipart/mixed body with a disposition type. Per RFC 3261, the default disposition type for application/sdp is session and for other bodies, it is render. UAs should assign multipart/mixed bodies a disposition type of render. Note that the fact that multipart/mixed bodies have a default disposition type of render does not imply that they will be rendered to the user. The way the body parts within the multipart/mixed are handled depends on the disposition types of the individual body parts. The actual disposition type of the whole multipart/mixed is irrelevant. The render disposition type has been chosen for multipart/mixed bodies simply because render is the default disposition type in SIP.

If the handling of a multipart/alternative body as a whole is required for processing its enclosing body part or message, the UA must set the handling parameter of the multipart/alternative body to required. Otherwise, the UA must set it to optional. The UA should also set the handling parameter of all the top-level body parts within the multipart/alternative to optional. The receiver will process the body parts based on the handling parameter of the multipart/alternative body. The receiver will ignore the handling parameters of the body parts. That is why setting them to optional is at the SHOULD level and not at the MUST level: their value is irrelevant.

The UA must use the same disposition type for the multipart/alternative body and all its top-level body parts.

If the handling of a multipart/related body as a whole is required for processing its enclosing body part or message, the UA must set the handling parameter of the multipart/related body to required. Otherwise, the UA must set it to optional. The handling parameters of the top-level body parts within the multipart/related body are set independently from the handling parameter of the multipart/related body. If the handling of a particular top-level body part is required, the UA must set the handling parameter of that body part to required. Otherwise, the UA must set it to optional. If at least one top-level body part within a multipart/related body has a handling parameter of required, the UA should set the handling parameter of the root body part to required.

3.9.7.3 UA Behavior to Process Multipart/Alternative

The receiver of a multipart/alternative body must process the body based on its handling parameter. The receiver should ignore the handling parameters of the body parts within the multipart/alternative.

3.9.7.4 UAS Behavior to Report Unsupported Message Bodies

If a UAS cannot process a request because, in the given context, the UAS does not support the content type or the disposition type of a body part whose handling is required, the UAS should return a 415 Unsupported Media Type response even if the UAS supported the content type, the disposition type, or both in a different context. Consequently, it is possible to receive a 415 Unsupported Media Type response with an Accept header field containing all the content types used in the request. If a UAS receives a request with a body part whose disposition type is not compatible with the way the body part is supposed to be handled according to other parts of the SIP message (e.g., a Refer-To header field with a Content-ID URL pointing to a body part whose disposition type is session), the UAS should return a 415 Unsupported Media Type response.
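The reporting rule for unsupported message bodies can be illustrated with a minimal Python sketch. It is not taken from any SIP stack: the BodyPart class, the check_body_parts function, and the modeling of a "context" as a set of (content type, disposition type) pairs are all hypothetical simplifications. The default of required for an absent handling parameter is an assumption carried over from RFC 3261, not something stated in this section.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class BodyPart:
    content_type: str            # e.g. "application/sdp"
    disposition: str             # e.g. "session" or "render"
    handling: str = "required"   # assumed default when the parameter is absent (RFC 3261)


def check_body_parts(
    parts: List[BodyPart], supported: Set[Tuple[str, str]]
) -> Tuple[Optional[int], List[str], List[BodyPart]]:
    """Decide how a UAS treats the body parts of a request in one context.

    `supported` holds the (content type, disposition type) pairs the UAS
    supports in this context (hypothetical model of a "context").
    Returns (status, accept, to_process): status is 415 when the request
    must be rejected, otherwise None.
    """
    to_process: List[BodyPart] = []
    for part in parts:
        if (part.content_type, part.disposition) in supported:
            to_process.append(part)
        elif part.handling == "required":
            # Accept lists content types only, so it may well contain the
            # very content type that was just rejected in this context.
            accept = sorted({ctype for ctype, _ in supported})
            return 415, accept, []
        # Unsupported part whose handling is optional: ignored.
    return None, [], to_process


# text/plain is supported for render, but arrives here with disposition session.
ctx = {("application/sdp", "session"), ("text/plain", "render")}
status, accept, _ = check_body_parts([BodyPart("text/plain", "session")], ctx)
print(status, accept)
```

In this example the Accept list of the 415 response still contains text/plain, which shows why the Accept header field alone cannot describe the contexts in which a content type is supported.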
3.9.8 Message-Body Processing

This section deals with the processing of message bodies and how that processing is influenced by the presence of references to them.

3.9.8.1 Background on References to Message-Body Parts

Content-ID URLs allow creating references to body parts. A given Content-ID URL (RFC 2392, see Section 2.8.2), which can appear in a header field or within a body part (e.g., in an SDP attribute), points to a particular body part. The way to handle that body part is defined by the field where the Content-ID URL appears. For example, the extension to refer to multiple resources in SIP (RFC 5368, see Section 16.4) places a Content-ID URL in a Refer-To header field. Such a Content-ID URL points to a body part that carries a URI list. In another example, the extension for file transfer in SDP (RFC 5547) places a Content-ID URL in a fileicon SDP attribute. This Content-ID URL points to a body part that carries a (typically small) picture.

3.9.8.2 UA Behavior to Generate References to Message Bodies

UAs must only include forward references in the SIP messages they generate. That is, an element in a SIP message can reference a body part only if the body part appears after the element. Consequently, a given body part can only be referenced by another body part that appears before it or by a header field. Having only forward references allows recipients to process body parts as they parse them. They do not need to parse the remainder of the message in order to process a body part. Restricting (forward) references to body parts that belong to the same multipart/related (RFC 2387) wrapper was considered. However, it was finally decided that this extra constraint was not necessary.

3.9.8.3 UA Behavior to Process Message Bodies

To process a message body or a body part, a UA needs to know whether a SIP header field or another body part contains a reference to the message body or body part (e.g., a Content-ID URL pointing to it). If the body part is not referenced in any way (e.g., there are no header fields or other body parts with a Content-ID URL pointing to it), the UA processes the body part as indicated by its disposition type and the context in which the body part was received. If the SIP message contains a reference to the body part, the UA processes the body part according to the reference. If the SIP message contains more than one reference to the body part (e.g., two header fields contain Content-ID URLs pointing to the body part), the UA processes the body part as many times as the number of references. Note that, following the rules in RFC 3204, if a UA does not understand a body part whose handling is optional, the UA ignores it. Also note that the content indirection mechanism in SIP (RFC 4483, see Section 16.6) allows UAs to point to external bodies. Therefore, a UA receiving a SIP message that uses content indirection could need to fetch a body part (e.g., using HTTP (RFCs 7230–7235)) in order to process it.

3.9.8.4 Disposition Type: By-Reference

Per the rules described above, if a SIP message contains a reference to a body part, the UA processes the body part according to the reference. Since the reference provides the context in which the body part needs to be processed, the disposition type of the body part is irrelevant. However, a UA that missed a reference to a body part (e.g., because the reference was in a header field the UA did not support) would attempt to process the body part according to its disposition type alone. To keep this from happening, we define a new disposition type for the Content-Disposition header field: by-reference.

A body part whose disposition type is by-reference needs to be handled according to a reference to the body part that is located in the same SIP message as the body part (given that SIP only allows forward references, the reference will appear in the same SIP message before the body part). A recipient of a body part whose disposition type is by-reference that cannot find any reference to the body part (e.g., the reference was in a header field the recipient does not support and, thus, did not process) must not process the body part. Consequently, if the handling of the body part was required, the UA needs to report an error. Note that extensions that predate this specification use references to body parts whose disposition type is not by-reference. Those extensions use option tags to make sure the recipient understands the whole extension and, thus, cannot miss the reference and attempt to process the body part according to its disposition type alone.

3.9.9 Future SIP Extensions

These guidelines are intended for authors of SIP extensions that involve, in some way, message bodies or body parts. These guidelines discuss aspects that authors of such extensions need to consider when designing them. This specification mandates support for multipart/mixed and multipart/alternative. At present, there are no SIP extensions that use different multipart subtypes such as parallel (RFC 2046) or digest (RFC 2046). If such extensions were to be defined in the future, their authors would need to make sure (e.g., by
using an option tag or by other means) that entities receiving those multipart subtypes were able to process them. As stated earlier, UAs treat unknown multipart subtypes as multipart/mixed.

Authors of SIP extensions making use of multipart/related bodies have to explicitly address the handling of the disposition types of the body parts within the multipart/related body. Authors wishing to make use of multipart/related bodies should keep in mind that UAs that do not understand multipart/related will treat it as multipart/mixed. If such treatment by a recipient is not acceptable for a particular extension, the authors of such extension would need to make sure (e.g., by using an option tag or by other means) that entities receiving the multipart/related body were able to correctly process it.

As stated earlier, SIP extensions can also include multipart MIME bodies in responses. Hence, a response can be extremely complex and the UAC receiving the response might not be able to process it correctly. Because UACs receiving a response cannot report errors to the UAS that generated the response (i.e., error responses can only be generated for requests), authors of SIP extensions need to make sure that requests clearly indicate (e.g., by using an option tag or by other means) the capabilities of the UAC so that UASs can decide what to include in their responses.

3.10 Terminating a Session

This section describes the procedures for terminating a session established by SIP. The state of the session and the state of the dialog are very closely related. When a session is initiated with an INVITE, each 1xx or 2xx response from a distinct UAS creates a dialog, and if that response completes the offer–answer exchange, it also creates a session. As a result, each session is associated with a single dialog: the one that resulted in its creation. If an initial INVITE generates a non-2xx final response, that terminates all sessions (if any) and all dialogs (if any) that were created through responses to the request. By virtue of completing the transaction, a non-2xx final response also prevents further sessions from being created as a result of the INVITE. The BYE request is used to terminate a specific session or attempted session. In this case, the specific session is the one with the peer UA on the other side of the dialog. When a BYE is received on a dialog, any session associated with that dialog should terminate. A UA must not send a BYE outside of a dialog. The caller’s UA may send a BYE for either confirmed or early dialogs, and the callee’s UA may send a BYE on confirmed dialogs, but must not send a BYE on early dialogs.

However, the callee’s UA must not send a BYE on a confirmed dialog until it has received an ACK for its 2xx response or until the server transaction times out. If no SIP extensions have defined other application layer states associated with the dialog, the BYE also terminates the dialog. The impact of a non-2xx final response to INVITE on dialogs and sessions makes the use of CANCEL attractive. The CANCEL attempts to force a non-2xx response to the INVITE (in particular, a 487).

Therefore, if a UAC wishes to give up on its call attempt entirely, it can send a CANCEL. If the INVITE results in 2xx final response(s) to the INVITE, this means that a UAS accepted the invitation while the CANCEL was in progress. The UAC may continue with the sessions established by any 2xx responses, or may terminate them with BYE. The notion of hanging up is not well defined within SIP. It is specific to a particular, albeit common, user interface. Typically, when the user hangs up, it indicates a desire to terminate the attempt to establish a session, and to terminate any sessions already created. For the caller’s UA, this would imply a CANCEL request if the initial INVITE has not generated a final response and a BYE to all confirmed dialogs after a final response. For the callee’s UA, it would typically imply a BYE; presumably, when the user picked up the phone, a 2xx was generated, and so hanging up would result in a BYE after the ACK is received. This does not mean a user cannot hang up before receipt of the ACK; it just means that the software in the user’s phone needs to maintain state for a short while in order to clean up properly. If the particular user interface allows for the user to reject a call before it is answered, a 403 Forbidden is a good way to express that. As per the rules above, a BYE cannot be sent.

3.10.1 Terminating a Session with a BYE Request

3.10.1.1 UAC Behavior

A BYE request is constructed as would any other request within a dialog, as described in Section 3.6. Once the BYE is constructed, the UAC core creates a new non-INVITE client transaction, and passes it the BYE request. The UAC must consider the session terminated (and therefore stop sending or listening for media) as soon as the BYE request is passed to the client transaction. If the response for the BYE is a 481 Call/Transaction Does Not Exist or a 408 Request Timeout, or no response at all is received for the BYE (i.e., a timeout is returned by the client transaction), the UAC must consider the session and the dialog terminated.

3.10.1.2 UAS Behavior

A UAS first processes the BYE request according to the general UAS processing described in Section 3.1.3. A UAS core receiving a BYE request checks if it matches an existing dialog. If the BYE does not match an existing dialog, the
UAS core should generate a 481 Call/Transaction Does Not Exist response and pass that to the server transaction. This rule means that a BYE sent without tags by a UAC will be rejected. This is a change from RFC 2543, which allowed BYE without tags.

A UAS core receiving a BYE request for an existing dialog must follow the procedures of Section 3.6.2.2 to process the request. Once done, the UAS should terminate the session (and therefore stop sending and listening for media). The only case where it can elect not to do so is a multicast session, where participation is possible even if the other participant in the dialog has terminated its involvement in the session. Whether or not it ends its participation in the session, the UAS core must generate a 2xx response to the BYE, and must pass that to the server transaction for transmission. The UAS must still respond to any pending requests received for that dialog. It is recommended that a 487 Request Terminated response be generated to those pending requests.

3.11 Proxy Behavior

3.11.1 Overview

SIP proxies are elements that route SIP requests to UASs and SIP responses to UACs. A request may traverse several proxies on its way to a UAS. Each will make routing decisions, modifying the request before forwarding it to the next element. Responses will route through the same set of proxies traversed by the request in the reverse order. Being a proxy is a logical role for a SIP element. When a request arrives, an element that can play the role of a proxy first decides if it needs to respond to the request on its own. For instance, the request may be malformed or the element may need credentials from the client before acting as a proxy. The element may respond with any appropriate error code. When responding directly to a request, the element is playing the role of a UAS and must behave as described in Section 3.1.3.

A proxy can operate in either a stateful or stateless mode for each new request. When stateless, a proxy acts as a simple forwarding element. It forwards each request downstream to a single element determined by making a targeting and routing decision based on the request. It simply forwards every response it receives upstream. A stateless proxy discards information about a message once the message has been forwarded. A stateful proxy remembers information (specifically, transaction state) about each incoming request and any requests it sends as a result of processing the incoming request. It uses this information to affect the processing of future messages associated with that request. A stateful proxy may choose to fork a request, routing it to multiple destinations. Any request that is forwarded to more than one location must be handled statefully.

In some circumstances, a proxy may forward requests using stateful transports (such as TCP) without being transaction stateful. For instance, a proxy may forward a request from one TCP connection to another transaction statelessly as long as it places enough information in the message to be able to forward the response down the same connection the request arrived on. Requests forwarded between different types of transports where the proxy’s TU must take an active role in ensuring reliable delivery on one of the transports must be forwarded transaction statefully.

A stateful proxy may transition to stateless operation at any time during the processing of a request, so long as it did not do anything that would otherwise prevent it from being stateless initially (forking, for example, or generation of a 100 response). When performing such a transition, all state is simply discarded. The proxy should not initiate a CANCEL request. Much of the processing involved when acting statelessly or statefully for a request is identical. The next several subsections are written from the point of view of a stateful proxy. The last section calls out those places where a stateless proxy behaves differently.

3.11.2 Stateful Proxy

When stateful, a proxy is purely a SIP transaction-processing engine. Its behavior is modeled here in terms of the server and client transactions defined in Section 3.12. A stateful proxy has a server transaction associated with one or more client transactions by a higher-layer proxy-processing component (see Figure 3.19), known as a proxy core. An incoming request is processed by a server transaction. Requests from the server transaction are passed to a proxy core. The proxy core determines where to route the request, choosing one or more next-hop locations. An outgoing request for each next-hop location is processed by its own associated client transaction. The proxy core collects the responses from the client transactions and uses them to send responses to the server transaction.

Figure 3.19 Stateful proxy model, showing the proxy “higher” layer with its server transaction (ST) and client transactions (CT). CT = Client transaction; ST = Server transaction. (Copyright IETF. Reproduced with permission.)

A stateful proxy creates a new server transaction for each new request received. Any retransmissions of the request will then be handled by that server transaction
per Section 3.12. The proxy core must behave as a UAS with respect to sending an immediate provisional on that server transaction (such as 100 Trying), as described in Section 3.1.3.6. Thus, a stateful proxy should not generate 100 Trying responses to non-INVITE requests. This is a model of proxy behavior, not of software. An implementation is free to take any approach that replicates the external behavior this model defines. For all new requests, including any with unknown methods, an element intending to proxy the request must

1. Validate the request (Section 3.11.3)
2. Preprocess routing information (Section 3.11.4)
3. Determine target(s) for the request (Section 3.11.5)
4. Forward the request to each target (Section 3.11.6)
5. Process all responses (Section 3.11.7)

3.11.3 Request Validation

Before an element can proxy a request, it must verify the message’s validity. A valid message must pass the following checks:

1. Reasonable syntax
2. URI scheme
3. Max-Forwards
4. (Optional) loop detection
5. Proxy-Require
6. Proxy-Authorization

If any of these checks fail, the element must behave as a UAS (see Section 3.1.3) and respond with an error code. Notice that a proxy is not required to detect merged requests and must not treat merged requests as an error condition. The end points receiving the requests will resolve the merge as described in Section 3.1.3.2.2.

3.11.3.2 URI Scheme Check

If the Request-URI has a URI whose scheme is not understood by the proxy, the proxy should reject the request with a 416 Unsupported URI Scheme response.

3.11.3.3 Max-Forwards Check

The Max-Forwards header field (Section 2.8.2) is used to limit the number of elements a SIP request can traverse. If the request does not contain a Max-Forwards header field, this check is passed. If the request contains a Max-Forwards header field with a field value greater than zero, the check is passed. If the request contains a Max-Forwards header field with a field value of zero (0), the element must not forward the request. If the request was for OPTIONS, the element may act as the final recipient and respond per Section 3.5. Otherwise, the element must return a 483 Too Many Hops response.

3.11.3.4 Optional Loop-Detection Check

3.11.3.4.1 Per RFC 3261

An element may check for forwarding loops before forwarding a request. If the request contains a Via header field with a sent-by value that equals a value placed into previous requests by the proxy, the request has been forwarded by this element before. The request has either looped or is legitimately spiraling through the element. To determine if the request has looped, the element may perform the branch parameter calculation described in Section 3.11.6 on this message and compare it with the parameter received in that Via header field. If the parameters match, the request has looped. If they differ, the request is spiraling, and processing continues. If a loop is detected, the element may return a 482 Loop Detected response.
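The URI scheme, Max-Forwards, and RFC 3261 loop-detection checks described above can be sketched as follows. This is an illustrative Python fragment under stated assumptions, not part of any SIP library: the function name, the representation of Via entries as (sent-by, branch) pairs, the set of known schemes, and the precomputed expected_branch argument (standing in for the branch parameter calculation of Section 3.11.6) are all hypothetical. The sketch also chooses to reject a detected loop, whereas the text only says the element may do so.

```python
from typing import Iterable, Optional, Tuple


def validate_for_proxying(
    method: str,
    uri_scheme: str,
    max_forwards: Optional[int],            # None when the header field is absent
    vias: Iterable[Tuple[str, str]],        # hypothetical (sent-by, branch) pairs
    own_sent_by: str,
    expected_branch: str,                   # assumed result of the Section 3.11.6 calculation
    known_schemes: Tuple[str, ...] = ("sip", "sips"),   # assumption
) -> Tuple[str, Optional[int]]:
    """Return (action, status) for three of the request-validation checks."""
    # URI scheme check: an unknown scheme is rejected with 416.
    if uri_scheme.lower() not in known_schemes:
        return ("reject", 416)

    # Max-Forwards check: absent or greater than zero passes; zero stops here.
    if max_forwards is not None and max_forwards == 0:
        if method == "OPTIONS":
            # The element may act as the final recipient instead.
            return ("answer_locally", None)
        return ("reject", 483)

    # Optional loop detection: same sent-by AND same branch means a loop;
    # same sent-by with a different branch is a legitimate spiral.
    for sent_by, branch in vias:
        if sent_by == own_sent_by and branch == expected_branch:
            return ("reject", 482)

    return ("pass", None)


print(validate_for_proxying("INVITE", "sip", 0, [], "proxy.example.com", "z9hG4bK-abc"))
```

The remaining checks (reasonable syntax, Proxy-Require, Proxy-Authorization) are left out of the sketch to keep it short.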
3.11.3.4.2 Per RFC 5393
in Section 4.2.1 of RFC 5393 (see Section 9.11). This second part will not be present if the message was not forked when that Via header field value was added. If the second field is present, the proxy must perform the second-part calculation described in Section 4.2.1 of RFC 5393 (see Section 9.11) on this request and compare the result to the value from the Via header field. If these values are equal, the request has looped and the proxy must reject the request with a 482 Loop Detected response. If the values differ, the request is spiraling and processing continues to the next step.

3.11.3.5 Proxy-Require Check

Future extensions to this protocol may introduce features that require special handling by proxies. End points will include a Proxy-Require header field in requests that use these features, telling the proxy not to process the request unless the feature is understood. If the request contains a Proxy-Require header field (Section 2.8.2) with one or more option tags this element does not understand, the element must return a 420 Bad Extension response. The response must include an Unsupported (Section 2.8.2) header field listing those option tags the element did not understand.

3.11.3.6 Proxy-Authorization Check

If an element requires credentials before forwarding a request, the request must be inspected as described in Section 19.6.3. That section also defines what the element must do if the inspection fails.

3.11.4 Route Information Preprocessing

The proxy must inspect the Request-URI of the request. If the Request-URI of the request contains a value this proxy previously placed into a Record-Route header field (see Section 3.11.6), the proxy must replace the Request-URI in the request with the last value from the Route header field, and remove that value from the Route header field. The proxy must then proceed as if it received this modified request. This will only happen when the element sending the request to the proxy (which may have been an end point) is a strict router. This rewrite on receive is necessary to enable backwards compatibility with those elements. It also allows elements following this specification to preserve the Request-URI through strict-routing proxies (see Section 3.6.2.1.1). This requirement does not obligate a proxy to keep state in order to detect URIs it previously placed in Record-Route header fields. Instead, a proxy need only place enough information in those URIs to recognize them as values it provided when they later appear. If the Request-URI contains a maddr parameter, the proxy must check to see if its value is in the set of addresses or domains the proxy is configured to be responsible for. If the Request-URI has a maddr parameter with a value the proxy is responsible for, and the request was received using the port and transport indicated (explicitly or by default) in the Request-URI, the proxy must strip the maddr and any nondefault port or transport parameter and continue processing as if those values had not been present in the request. A request may arrive with a maddr matching the proxy, but on a port or transport different from that indicated in the URI. Such a request needs to be forwarded to the proxy using the indicated port and transport. If the first value in the Route header field indicates this proxy, the proxy must remove that value from the request.

3.11.5 Determining Request Targets

Next, the proxy calculates the target(s) of the request. The set of targets will either be predetermined by the contents of the request or will be obtained from an abstract location service. Each target in the set is represented as a URI. If the Request-URI of the request contains a maddr parameter, the Request-URI must be placed into the target set as the only target URI, and the proxy must proceed to Section 3.11.6. If the domain of the Request-URI indicates a domain this element is not responsible for, the Request-URI must be placed into the target set as the only target, and the element must proceed to the task of Request Forwarding (Section 3.11.6).

There are many circumstances in which a proxy might receive a request for a domain it is not responsible for. A firewall proxy handling outgoing calls (the way HTTP proxies handle outgoing requests) is an example of where this is likely to occur. If the target set for the request has not been predetermined as described above, this implies that the element is responsible for the domain in the Request-URI, and the element may use whatever mechanism it desires to determine where to send the request. Any of these mechanisms can be modeled as accessing an abstract location service. This may consist of obtaining information from a location service created by a SIP Registrar, reading a database, consulting a presence server, utilizing other protocols, or simply performing an algorithmic substitution on the Request-URI. When accessing the location service constructed by a registrar, the Request-URI must first be canonicalized as described in Section 3.3 before being used as an index. The output of these mechanisms is used to construct the target set. If the Request-URI does not provide sufficient information for the proxy to determine the target set, it should return a 485 Ambiguous response. This response should contain a Contact header field containing URIs of new addresses to be
tried. For example, an INVITE to sip:John.Smith@company.com may be ambiguous at a proxy whose location service has multiple John Smiths listed. See Section 2.8.2 for details.

Any information in or about the request or the current environment of the element may be used in the construction of the target set. For instance, different sets may be constructed depending on contents or the presence of header fields and bodies, the time of day of the request’s arrival, the interface on which the request arrived, failure of previous requests, or even the element’s current level of utilization.

As potential targets are located through these services, their URIs are added to the target set. Targets can only be placed in the target set once. If a target URI is already present in the set (based on the definition of equality for the URI type), it must not be added again. A proxy must not add additional targets to the target set if the Request-URI of the original request does not indicate a resource this proxy is responsible for. A proxy can only change the Request-URI of a request during forwarding if it is responsible for that URI. If the proxy is not responsible for that URI, it will not recurse on 3xx or 416 responses as described below. If the Request-URI of the original request indicates a resource this proxy is responsible for, the proxy may continue to add targets to the set after beginning Request Forwarding. It may use any information obtained during that processing to determine new targets.

For instance, a proxy may choose to incorporate contacts obtained in a redirect response (3xx) into the target set. If a proxy uses a dynamic source of information while building the target set (for instance, if it consults a SIP Registrar), it should monitor that source for the duration of processing the request. New locations should be added to the target set as they become available. As above, any given URI must not be added to the set more than once. Allowing a URI to be added to the set only once reduces unnecessary network traffic, and, in the case of incorporating contacts from redirect requests, prevents infinite recursion. For example, a trivial location service is a no-op, where the target URI is equal to the incoming Request-URI. The request is sent to a specific next-hop proxy for further processing. During request forwarding of Section 3.11.6, the identity of that next hop, expressed as a SIP or SIPS URI, is inserted as the topmost Route header field value into the request. If the Request-URI indicates a resource at this proxy that does not exist, the proxy must return a 404 Not Found response. If the target set remains empty after applying all of the above, the proxy must return an error response, which should be the 480 Temporarily Unavailable response.

3.11.6 Request Forwarding

in any order. It may process multiple targets serially, allowing each client transaction to complete before starting the next. It may start client transactions with every target in parallel. It also may arbitrarily divide the set into groups, processing the groups serially and processing the targets in each group in parallel. A common ordering mechanism is to use the q-value parameter of targets obtained from Contact header fields (see Section 2.8.2). Targets are processed from the highest q-value to the lowest. Targets with equal q-values may be processed in parallel.

A stateful proxy must have a mechanism to maintain the target set as responses are received and associate the responses to each forwarded request with the original request. For the purposes of this model, this mechanism is a response context created by the proxy layer before forwarding the first request. For each target, the proxy forwards the request following these steps:

1. Make a copy of the received request.
2. Update the Request-URI.
3. Update the Max-Forwards header field.
4. Optionally add a Record-Route header field value.
5. Optionally add additional header fields.
6. Postprocess routing information.
7. Determine the next-hop address, port, and transport.
8. Add a Via header field value.
9. Add a Content-Length header field if necessary.
10. Forward the new request.
11. Set timer C.

Each of these steps is detailed in the following sections.

3.11.6.1 Copy Request

The proxy starts with a copy of the received request. The copy must initially contain all of the header fields from the received request. Fields not detailed in the processing described below must not be removed. The copy should maintain the ordering of the header fields as in the received request. The proxy must not reorder field values with a common field name (see Section 2.8). The proxy must not add to, modify, or remove the message body. An actual implementation need not perform a copy; the primary requirement is that the processing for each next hop begins with the same request.

3.11.6.2 Request-URI

The Request-URI in the copy’s start line must be replaced with the URI for this target. If the URI contains any parameters not allowed in a Request-URI, they must be removed. This is the essence of a proxy’s role. This is the mechanism
As soon as the target set is nonempty, a proxy may begin through which a proxy routes a request toward its destina-
forwarding the request. A stateful proxy may process the set tion. In some circumstances, the received Request-URI is
placed into the target set without being modified. For that target, the replacement above is effectively a no-op.

3.11.6.3 Max-Forwards
If the copy contains a Max-Forwards header field, the proxy must decrement its value by 1. If the copy does not contain a Max-Forwards header field, the proxy must add one with a field value, which should be 70. Some existing UAs will not provide a Max-Forwards header field in a request.

3.11.6.4 Record-Route
If this proxy wishes to remain on the path of future requests in a dialog created by this request (assuming the request creates a dialog), it must insert a Record-Route header field value into the copy before any existing Record-Route header field values, even if a Route header field is already present. Requests establishing a dialog may contain a preloaded Route header field. If this request is already part of a dialog, the proxy should insert a Record-Route header field value if it wishes to remain on the path of future requests in the dialog. In normal end-point operation as described in Section 3.6, these Record-Route header field values will not have any effect on the route sets used by the end points.

The proxy will remain on the path if it chooses to not insert a Record-Route header field value into requests that are already part of a dialog. However, it would be removed from the path when an end point that has failed reconstitutes the dialog. A proxy may insert a Record-Route header field value into any request. If the request does not initiate a dialog, the end points will ignore the value. See Section 3.6 for details on how end points use the Record-Route header field values to construct Route header fields. Each proxy in the path of a request chooses whether to add a Record-Route header field value independently; the presence of a Record-Route header field in a request does not obligate this proxy to add a value.

The URI placed in the Record-Route header field value must be a SIP or SIPS URI. This URI must contain an lr parameter (see Section 4.2.1). This URI may be different for each destination the request is forwarded to. The URI should not contain the transport parameter unless the proxy has knowledge (such as in a private network) that the next downstream element that will be in the path of subsequent requests supports that transport. The URI this proxy provides will be used by some other element to make a routing decision. This proxy, in general, has no way of knowing the capabilities of that element, so it must restrict itself to the mandatory elements of a SIP implementation: SIP URIs and either the TCP or UDP transports.

The URI placed in the Record-Route header field MUST resolve to the element inserting it (or a suitable stand-in) when the server location procedures of RFC 3263 (see Section 8.2.4) are applied to it, so that subsequent requests reach the same SIP element. If the Request-URI contains a SIPS URI, or the topmost Route header field value contains a SIPS URI, the URI placed into the Record-Route header field must be a SIPS URI. Furthermore, if the request was not received over TLS, the proxy must insert a Record-Route header field. In a similar fashion, a proxy that receives a request over TLS, but generates a request without a SIPS URI in the Request-URI or topmost Route header field, must insert a Record-Route header field that is not a SIPS URI.

A proxy at a security perimeter must remain on the perimeter throughout the dialog. If the URI placed in the Record-Route header field needs to be rewritten when it passes back through in a response, the URI must be distinct enough to locate at that time. (The request may spiral through this proxy, resulting in more than one Record-Route header field value being added.) Section 3.11.7 recommends a mechanism to make the URI sufficiently distinct. The proxy may include parameters in the Record-Route header field value. These will be echoed in some responses to the request such as the 200 OK responses to INVITE. Such parameters may be useful for keeping state in the message rather than the proxy.

If a proxy needs to be in the path of any type of dialog (such as one straddling a firewall), it should add a Record-Route header field value to every request with a method it does not understand since that method may have dialog semantics. The URI a proxy places into a Record-Route header field is only valid for the lifetime of any dialog created by the transaction in which it occurs. A dialog-stateful proxy, for example, may refuse to accept future requests with that value in the Request-URI after the dialog has terminated. Nondialog stateful proxies, of course, have no concept of when the dialog has terminated; however, they may encode enough information in the value to compare it against the dialog identifier of future requests and may reject requests not matching that information.

End points must not use a URI obtained from a Record-Route header field outside the dialog in which it was provided. See Section 3.6 for more information on an end point’s use of Record-Route header fields. Record-routing may be required by certain services where the proxy needs to observe all messages in a dialog. However, it slows down processing and impairs scalability, and thus proxies should only Record-Route if required for a particular service. The Record-Route process is designed to work for any SIP request that initiates a dialog. INVITE is the only such request in this specification, but extensions to the protocol may define others.

3.11.6.5 Add Additional Header Fields
The proxy may add any other appropriate header fields to the copy at this point.
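The Max-Forwards and Record-Route steps of request forwarding can be illustrated with a minimal Python sketch. It is not taken from any SIP stack: the header representation (a dict mapping a header name to an ordered list of values), the function names, and the host names are all assumptions made for illustration. The sketch also assumes that a request whose Max-Forwards value has already reached 0 was rejected earlier during request validation.

```python
def update_max_forwards(headers):
    """Decrement Max-Forwards by 1, or add it with the suggested value 70.

    `headers` is a hypothetical representation: header name -> list of values.
    """
    values = headers.get("Max-Forwards")
    if values:
        headers["Max-Forwards"] = [str(int(values[0]) - 1)]
    else:
        headers["Max-Forwards"] = ["70"]


def add_record_route(headers, proxy_host, sips=False):
    """Insert this proxy's URI before any existing Record-Route values.

    The URI always carries the lr parameter; no transport parameter is added.
    """
    scheme = "sips" if sips else "sip"
    value = f"<{scheme}:{proxy_host};lr>"
    headers["Record-Route"] = [value] + headers.get("Record-Route", [])
    return value


# Usage with made-up values
copy = {"Max-Forwards": ["10"], "Record-Route": ["<sip:p1.example.com;lr>"]}
update_max_forwards(copy)
add_record_route(copy, "p2.example.com")
print(copy["Max-Forwards"], copy["Record-Route"])
```

Keeping each header as an ordered list makes the "before any existing values" rule a simple prepend, and it leaves the order of the other values untouched.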
3.11.6.6 Postprocessing Routing Information
A proxy may have a local policy that mandates that a request visit a specific set of proxies before being delivered to the destination. A proxy must ensure that all such proxies are loose routers. Generally, this can only be known with certainty if the proxies are within the same administrative domain. This set of proxies is represented by a set of URIs (each of which contains the lr parameter). This set must be pushed into the Route header field of the copy ahead of any existing values, if present. If the Route header field is absent, it must be added, containing that list of URIs.

If the proxy has a local policy that mandates that the request visit one specific proxy, an alternative to pushing a Route value into the Route header field is to bypass the forwarding logic of item 10 below, and instead just send the request to the address, port, and transport for that specific proxy. If the request has a Route header field, this alternative must not be used unless it is known that the next-hop proxy is a loose router. Otherwise, this approach may be used; however, the Route insertion mechanism above is preferred for its robustness, flexibility, generality, and consistency of operation.

Furthermore, if the Request-URI contains a SIPS URI, TLS must be used to communicate with that proxy. If the copy contains a Route header field, the proxy must inspect the URI in its first value. If that URI does not contain an lr parameter, the proxy must modify the copy as follows:

◾◾ The proxy must place the Request-URI into the Route header field as the last value.
◾◾ The proxy must then place the first Route header field value into the Request-URI and remove that value from the Route header field.

Appending the Request-URI to the Route header field is part of a mechanism used to pass the information in that Request-URI through strict-routing elements. Popping the first Route header field value into the Request-URI formats the message the way a strict-routing element expects to receive it (with its own URI in the Request-URI and the next location to visit in the first Route header field value).

3.11.6.7 Determine Next-Hop Address, Port, and Transport
The proxy may have a local policy to send the request to a specific IP address, port, and transport, independent of the values of the Route and Request-URI. Such a policy must not be used if the proxy is not certain that the IP address, port, and transport correspond to a server that is a loose router. However, this mechanism for sending the request through a specific next hop is not recommended; instead, a Route header field should be used for that purpose as described above. In the absence of such an overriding mechanism, the proxy applies the procedures listed in RFC 3263 (see Section 8.2.4) as follows to determine where to send the request.

If the proxy has reformatted the request to send to a strict-routing element, the proxy must apply those procedures to the Request-URI of the request. Otherwise, the proxy must apply the procedures to the first value in the Route header field, if present, else the Request-URI. The procedures will produce an ordered set of (address, port, transport) tuples. Independently of which URI is being used as input to the procedures of RFC 3263 (see Section 8.2.4), if the Request-URI specifies a SIPS resource, the proxy must follow the procedures of RFC 3263 as if the input URI were a SIPS URI.

As described in RFC 3263 (see Section 8.2.4), the proxy must attempt to deliver the message to the first tuple in that set, and proceed through the set in order until the delivery attempt succeeds. For each tuple attempted, the proxy must format the message as appropriate for the tuple and send the request using a new client transaction. Since each attempt uses a new client transaction, it represents a new branch. Thus, the branch parameter provided with the Via header field must be different for each attempt. If the client transaction reports failure to send the request or a timeout from its state machine, the proxy continues to the next address in that ordered set. If the ordered set is exhausted, the request cannot be forwarded to this element in the target set. The proxy does not need to place anything in the response context, but otherwise acts as if this element of the target set returned a 408 Request Timeout final response.

3.11.6.8 Add a Via Header Field Value
3.11.6.8.1 Per RFC 3261
The proxy must insert a Via header field value into the copy before the existing Via header field values. The construction of this value follows the same guidelines of Section 3.1.2.1.7. This implies that the proxy will compute its own branch parameter, which will be globally unique for that branch, and contain the requisite magic cookie. Note that this implies that the branch parameter will be different for different instances of a spiraled or looped request through a proxy.

Proxies choosing to detect loops have an additional constraint in the value they use for construction of the branch parameter. A proxy choosing to detect loops should create a branch parameter separable into two parts by the implementation. The first part must satisfy the constraints of Section 3.1.2.1.7 as described above. The second is used to perform loop detection and distinguish loops from spirals.

Loop detection is performed by verifying that, when a request returns to a proxy, those fields having an impact on the processing of the request have not changed. The value
placed in this part of the branch parameter should reflect all of those fields (including any Route, Proxy-Require, and Proxy-Authorization header fields). This is to ensure that if the request is routed back to the proxy and one of those fields changes, it is treated as a spiral and not a loop (see Section 3.11.3). A common way to create this value is to compute a cryptographic hash of the To tag, From tag, Call-ID header field, the Request-URI of the request received (before translation), the topmost Via header, and the sequence number from the CSeq header field, in addition to any Proxy-Require and Proxy-Authorization header fields that may be present. The algorithm used to compute the hash is implementation dependent; however, MD5 (defined in RFCs 1321 and 6151), expressed in hexadecimal, is a reasonable choice. (Base64 is not permissible for a token.)

If a proxy wishes to detect loops, the branch parameter it supplies must depend on all information affecting processing of a request, including the incoming Request-URI and any header fields affecting the request’s admission or routing. This is necessary to distinguish looped requests from requests whose routing parameters have changed before returning to this server. The request method must not be included in the calculation of the branch parameter. In particular, CANCEL and ACK requests (for non-2xx responses) must have the same branch value as the corresponding request they cancel or acknowledge. The branch parameter is used in correlating those requests at the server handling them (see Sections 3.2 and 3.12.2.3).

3.11.6.8.2 Per RFC 5393
Section 4.2.1 of RFC 5393: Update to Section 16.6 of RFC 3261:

9.11.1.2) have an additional constraint on the value they place in the Via header field. Such proxies create a branch value separable into two parts in any implementation-dependent way. The remainder of this section’s description assumes the existence of these two parts. If a proxy chooses to employ some other mechanism, it is the implementer’s responsibility to verify that the detection properties defined by the requirements placed on these two parts are achieved. The first part of the branch value must satisfy the constraints of Section 8.1.1.7 (see Section 3.1.2.1.7) of RFC 3261. The second part is used to perform loop detection and distinguish loops from spirals. This second part must vary with any field used by the location service logic in determining where to retarget or forward this request. This is necessary to distinguish looped requests from spirals by allowing the proxy to recognize if none of the values affecting the processing of the request have changed. Hence, the second part must depend at least on the received Request-URI and any Route header field values used when processing the received request. Implementers need to take care to include all fields used by the location service logic in that particular implementation. This second part must not vary with the request method. CANCEL and non-200 ACK requests must have the same branch parameter value as the corresponding request they cancel or acknowledge. This branch parameter value is used in correlating those requests.
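The two-part branch value described above can be sketched in Python as follows. This is one possible layout, not a prescribed one: the "-" separator, the function names, the choice of a random UUID for the unique part, and the exact set of hashed fields (received Request-URI, Route, Proxy-Require, and Proxy-Authorization values) are assumptions. The magic cookie z9hG4bK is the value defined in RFC 3261, and MD5 in hexadecimal follows the suggestion in the text. The request method is deliberately left out of the hash.

```python
import hashlib
import uuid

MAGIC_COOKIE = "z9hG4bK"  # branch prefix defined by RFC 3261


def loop_detection_part(request_uri, route_values=(),
                        proxy_require=(), proxy_authorization=()):
    """Hash the fields that influence routing; the method is not included."""
    material = "\n".join([
        "uri:" + request_uri,
        *("route:" + v for v in route_values),
        *("require:" + v for v in proxy_require),
        *("auth:" + v for v in proxy_authorization),
    ])
    return hashlib.md5(material.encode("utf-8")).hexdigest()


def make_branch(request_uri, route_values=(),
                proxy_require=(), proxy_authorization=()):
    """Build '<cookie><unique part>-<loop detection part>' (hypothetical layout)."""
    unique_part = uuid.uuid4().hex  # hex digits only, so '-' is a safe separator
    second = loop_detection_part(request_uri, route_values,
                                 proxy_require, proxy_authorization)
    return f"{MAGIC_COOKIE}{unique_part}-{second}"


def same_routing_inputs(branch_a, branch_b):
    """Compare only the loop-detection parts of two branch values."""
    return branch_a.rpartition("-")[2] == branch_b.rpartition("-")[2]
```

When a request returns to the proxy, comparing the second part of the branch in its own earlier Via value with a freshly computed hash indicates a loop if they match and a spiral if they differ. For CANCEL and non-2xx ACK, a proxy would reuse the stored branch of the original request rather than call `make_branch` again.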
INVITE request is proxied. The timer must be larger than 3 minutes. Section 3.11.7 discusses how this timer is updated with provisional responses, and Section 3.11.8 discusses processing when it fires.

3.11.7 Response Processing
When a response is received by an element, it first tries to locate a client transaction (Section 3.12.1.3) matching the response. If a transaction is found, the response is handed to the client transaction. If none is found, the element must not forward the response. (Note that these two sentences have been updated/modified per RFC 6026.) As client transactions pass responses to the proxy layer, the following processing must take place:

1. Find the appropriate response context.
2. Update timer C for provisional responses.
3. Remove the topmost Via.
4. Add the response to the response context.
5. Check to see if this response should be forwarded immediately.
6. When necessary, choose the best final response from the response context.

If no final response has been forwarded after every client transaction associated with the response context has been terminated, the proxy must choose and forward the best response from those it has seen thus far. The following processing must be performed on each response that is forwarded. It is likely that more than one response to each request will be forwarded: at least each provisional and one final response.

7. Aggregate authorization header field values if necessary.
8. Optionally rewrite Record-Route header field values.
9. Forward the response.
10. Generate any necessary CANCEL requests.

Each of the above steps is detailed in the following sections.

3.11.7.1 Find Context
The proxy locates the response context it created before forwarding the original request using the key described in Section 3.11.6. The remaining processing steps take place in this context.

3.11.7.2 Update Timer C for Provisional Responses
For an INVITE transaction, if the response is a provisional response with status codes 101 to 199 inclusive (i.e., anything but 100), the proxy must reset timer C for that client transaction. The timer may be reset to a different value, but this value must be greater than 3 minutes.

3.11.7.3 Via
The proxy removes the topmost Via header field value from the response. If no Via header field values remain in the response, the response was meant for this element and must not be forwarded. The remainder of the processing described in this section is not performed on this message; the UAC processing rules described in Section 3.1.2 are followed instead (transport layer processing has already occurred). This will happen, for instance, when the element generates CANCEL requests as described in Section 3.3.

3.11.7.4 Add Response to Context
Final responses received are stored in the response context until a final response is generated on the server transaction associated with this context. The response may be a candidate for the best final response to be returned on that server transaction. Information from this response may be needed in forming the best response, even if this response is not chosen. If the proxy chooses to recurse on any contacts in a 3xx response by adding them to the target set, it must remove them from the response before adding the response to the response context. However, a proxy should not recurse to a non-SIPS URI if the Request-URI of the original request was a SIPS URI. If the proxy recurses on all of the contacts in a 3xx response, the proxy should not add the resulting contactless response to the response context.

Removing the contact before adding the response to the response context prevents the next element upstream from retrying a location this proxy has already attempted. 3xx responses may contain a mixture of SIP, SIPS, and non-SIP URIs. A proxy may choose to recurse on the SIP and SIPS URIs and place the remainder into the response context to be returned, potentially in the final response. If a proxy receives a 416 Unsupported URI Scheme response to a request whose Request-URI scheme was not SIP, but the scheme in the original received request was SIP or SIPS (i.e., the proxy changed the scheme from SIP or SIPS to something else when it proxied a request), the proxy should add a new URI to the target set. This URI should be a SIP URI version of the non-SIP URI that was just tried. In the case of the tel URL, this is accomplished by placing the telephone-subscriber part of the tel URL into the user part of the SIP URI, and setting the host part to the domain where the prior request was sent. See Section 4.2.2.1 for more detail on forming SIP URIs from tel URLs. As with a 3xx response, if a proxy recurses on the 416 by trying a SIP or SIPS URI instead, the 416 response should not be added to the response context.
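The tel URL conversion used when recursing on a 416 response can be sketched in Python. The function name is hypothetical, and the sketch makes two assumptions beyond the paragraph above: it appends the user=phone parameter that RFC 3261 uses when a SIP URI is formed from a tel URL, and it performs no escaping or parameter reordering, which a real implementation would need to consider.

```python
def tel_to_sip_uri(tel_url, domain):
    """Form a SIP URI from a tel URL (simplified, hypothetical helper).

    The telephone-subscriber part becomes the user part, and `domain`
    (where the prior request was sent) becomes the host part.
    """
    scheme, sep, subscriber = tel_url.partition(":")
    if scheme.lower() != "tel" or not sep or not subscriber:
        raise ValueError(f"not a tel URL: {tel_url!r}")
    return f"sip:{subscriber}@{domain};user=phone"


# Usage with the example values from RFC 3261
print(tel_to_sip_uri("tel:+358-555-1234567;postd=pp22", "foo.com"))
# prints sip:+358-555-1234567;postd=pp22@foo.com;user=phone
```

The resulting URI would then be added to the target set as a new target, subject to the rule that a URI is added at most once.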
3.11.7.5 Check Response for Forwarding
Until a final response has been sent on the server transaction, the following responses must be forwarded immediately:

◾◾ Any provisional response other than 100 Trying
◾◾ Any 2xx response

If a 6xx response is received, it is not immediately forwarded, but the stateful proxy should cancel all pending client transactions as described in Section 3.3, and it must not create any new branches in this context. This is a change from RFC 2543 (obsoleted by RFC 3261), which mandated that the proxy was to forward the 6xx response immediately. For an INVITE transaction, this approach had the problem that a 2xx response could arrive on another branch, in which case the proxy would have to forward the 2xx. The result was that the UAC could receive a 6xx response followed by a 2xx response, which should never be allowed to occur. Under the new rules, upon receiving a 6xx, a proxy will issue a CANCEL request, which will generally result in 487 responses from all outstanding client transactions, and then at that point the 6xx is forwarded upstream. After a final response has been sent on the server transaction, the following responses must be forwarded immediately:

◾◾ Any 2xx response to an INVITE request

A stateful proxy must not immediately forward any other responses. In particular, a stateful proxy must not forward any 100 Trying response. Those responses that are candidates for forwarding later as the best response have been gathered as described in step Add Response to Context. Any response chosen for immediate forwarding must be processed as described in steps Aggregate Authorization Header Field Values through Record-Route. This step, combined with the next, ensures that a stateful proxy will forward exactly one final response to a non-INVITE request, and either exactly one non-2xx response or one or more 2xx responses to an INVITE request.

3.11.7.6 Choosing the Best Response
A stateful proxy must send a final response to a response context’s server transaction if no final responses have been immediately forwarded by the above rules and all client transactions in this response context have been terminated. The stateful proxy must choose the best final response among those received and stored in the response context. If there are no final responses in the context, the proxy must send a 408 Request Timeout response to the server transaction. Otherwise, the proxy must forward a response from the responses stored in the response context. It must choose from the 6xx class responses if any exist in the context. If no 6xx class responses are present, the proxy should choose from the lowest response class stored in the response context. The proxy may select any response within that chosen class. The proxy should give preference to responses that provide information affecting resubmission of this request, such as 401, 407, 415, 420, and 484 if the 4xx class is chosen. A proxy that receives a 503 Service Unavailable response should not forward it upstream unless it can determine that any subsequent requests it might proxy will also generate a 503. In other words, forwarding a 503 means that the proxy knows it cannot service any requests, not just the one for the Request-URI in the request that generated the 503.

If the only response that was received is a 503, the proxy should generate a 500 response and forward that upstream. The forwarded response must be processed as described in steps Aggregate Authorization Header Field Values through Record-Route. For example, if a proxy forwarded a request to four locations, and received 503, 407, 501, and 404 responses, it may choose to forward the 407 Proxy Authentication Required response.

1xx and 2xx responses may be involved in the establishment of dialogs. When a request does not contain a To tag, the To tag in the response is used by the UAC to distinguish multiple responses to a dialog-creating request. A proxy must not insert a tag into the To header field of a 1xx or 2xx response if the request did not contain one. A proxy must not modify the tag in the To header field of a 1xx or 2xx response. Since a proxy may not insert a tag into the To header field of a 1xx response to a request that did not contain one, it cannot issue non-100 provisional responses on its own. However, it can branch the request to a UAS sharing the same element as the proxy.

This UAS can return its own provisional responses, entering into an early dialog with the initiator of the request. The UAS does not have to be a discrete process from the proxy. It could be a virtual UAS implemented in the same code space as the proxy. 3-6xx responses are delivered hop by hop. When issuing a 3-6xx response, the element is effectively acting as a UAS, issuing its own response, usually based on the responses received from downstream elements. An element should preserve the To tag when simply forwarding a 3-6xx response to a request that did not contain a To tag. A proxy must not modify the To tag in any forwarded response to a request that contains a To tag. While it makes no difference to the upstream elements if the proxy replaced the To tag in a forwarded 3-6xx response, preserving the original tag may assist with debugging. When the proxy is aggregating information from several responses, choosing a To tag from among them is arbitrary, and generating a new To tag may make debugging easier. This happens, for instance, when combining 401 Unauthorized and 407 Proxy Authentication Required challenges, or combining Contact values from unencrypted and unauthenticated 3xx responses.
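The selection rules of Section 3.11.7.6 can be sketched as a small Python function over status codes. It is a simplified illustration under stated assumptions: the response context is reduced to a list of final status codes in the range 300 to 699 (2xx responses are forwarded immediately and never stored for selection), the function name is hypothetical, and the order in which the preferred 4xx codes are tried is arbitrary, since the text gives no ranking among them.

```python
# 4xx codes that carry information affecting resubmission of the request
PREFERRED_4XX = (401, 407, 415, 420, 484)


def choose_best_response(final_codes):
    """Return the status code to send upstream for a finished response context."""
    if not final_codes:
        return 408  # no final response was received on any branch
    if all(code == 503 for code in final_codes):
        return 500  # a lone 503 is replaced by a locally generated 500
    six_xx = [code for code in final_codes if 600 <= code <= 699]
    if six_xx:
        return six_xx[0]  # 6xx responses take precedence when present
    lowest_class = min(code // 100 for code in final_codes)
    candidates = [code for code in final_codes if code // 100 == lowest_class]
    if lowest_class == 4:
        for code in PREFERRED_4XX:
            if code in candidates:
                return code
    # Any response in the chosen class is acceptable; avoid 503 when possible.
    non_503 = [code for code in candidates if code != 503]
    return (non_503 or candidates)[0]


# The example from the text: four branches answered 503, 407, 501, and 404
print(choose_best_response([503, 407, 501, 404]))  # prints 407
```

A fuller implementation would return the stored response object rather than a bare code, so that the later aggregation and Record-Route steps can operate on its header fields.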
3.11.7.7 Aggregate Authorization Header Field Values
If the selected response is a 401 Unauthorized or 407 Proxy Authentication Required, the proxy must collect any WWW-Authenticate and Proxy-Authenticate header field values from all other 401 Unauthorized and 407 Proxy Authentication Required responses received thus far in this response context, and add them to this response without modification before forwarding. The resulting 401 Unauthorized or 407 Proxy Authentication Required response could have several WWW-Authenticate and Proxy-Authenticate header field values. This is necessary because any or all of the destinations the request was forwarded to may have requested credentials. The client needs to receive all of those challenges and supply credentials for each of them when it retries the request. Motivation for this behavior is provided in Section 19.12.

3.11.7.8 Record-Route
If the selected response contains a Record-Route header field value originally provided by this proxy, the proxy may choose to rewrite the value before forwarding the response. This allows the proxy to provide different URIs for itself to the next upstream and downstream elements. A proxy may choose to use this mechanism for any reason. For instance, it is useful for multihomed hosts. If the proxy received the request over TLS, and sent it out over a non-TLS connection, the proxy must rewrite the URI in the Record-Route header field to be a SIPS URI. If the proxy received the request over a non-TLS connection, and sent it out over TLS, the proxy must rewrite the URI in the Record-Route header field to be a SIP URI. The new URI provided by the proxy must satisfy the same constraints on URIs placed in Record-Route header fields in requests (see Section 3.11.6) with the following modifications: the URI should not contain the transport parameter unless the proxy has knowledge that the next upstream (as opposed to downstream) element that will be in the path of subsequent requests supports that transport.

When a proxy does decide to modify the Record-Route header field in the response, one of the operations it performs is locating the Record-Route value that it had inserted. If the request spiraled, and the proxy inserted a Record-Route value in each iteration of the spiral, locating the correct value in the response (which must be the proper iteration in the reverse direction) is tricky. The rules above recommend that a proxy wishing to rewrite Record-Route header field values insert sufficiently distinct URIs into the Record-Route header field so that the right one may be selected for rewriting. A recommended mechanism to achieve this is for the proxy to append a unique identifier for the proxy instance to the user portion of the URI. When the response arrives, the proxy modifies the first Record-Route whose identifier matches the proxy instance. The modification results in a URI without this piece of data appended to the user portion of the URI. Upon the next iteration, the same algorithm (find the topmost Record-Route header field value with the parameter) will correctly extract the next Record-Route header field value inserted by that proxy. Not every response to a request to which a proxy adds a Record-Route header field value will contain a Record-Route header field. If the response does contain a Record-Route header field, it will contain the value the proxy added.

3.11.7.9 Forward Response
After performing the processing described in steps Aggregate Authorization Header Field Values through Record-Route, the proxy may perform any feature-specific manipulations on the selected response. The proxy must not add to, modify, or remove the message body. Unless otherwise specified, the proxy must not remove any header field values other than the Via header field value discussed in Section 3.11.7. In particular, the proxy must not remove any received parameter it may have added to the next Via header field value while processing the request associated with this response. The proxy must pass the response to the server transaction associated with the response context. This will result in the response being sent to the location now indicated in the topmost Via header field value. If the server transaction is no longer available to handle the transmission, the element must forward the response statelessly by sending it to the server transport. The server transaction might indicate failure to send the response or signal a timeout in its state machine. These errors would be logged for diagnostic purposes as appropriate; however, the protocol requires no remedial action from the proxy. The proxy must maintain the response context until all of its associated transactions have been terminated, even after forwarding a final response.

3.11.7.10 Generate CANCELs
If the forwarded response was a final response, the proxy must generate a CANCEL request for all pending client transactions associated with this response context. A proxy should also generate a CANCEL request for all pending client transactions associated with this response context when it receives a 6xx response. A pending client transaction is one that has received a provisional response, but no final response (it is in the proceeding state), and has not had an associated CANCEL generated for it. Generating CANCEL requests is described in Section 3.2. The requirement to CANCEL pending client transactions upon forwarding a final response does not guarantee that an end point will not receive multiple 200 OK responses to an INVITE. 200 OK responses
on more than one branch may be generated before the CANCEL requests can be sent and processed. Furthermore, it is reasonable to expect that a future extension may override this requirement to issue CANCEL requests.

3.11.8 Processing Timer C
If timer C should fire, the proxy must either reset the timer with any value it chooses, or terminate the client transaction. If the client transaction has received a provisional response, the proxy must generate a CANCEL request matching that transaction. If the client transaction has not received a provisional response, the proxy must behave as if the transaction received a 408 Request Timeout response. Allowing the proxy to reset the timer allows the proxy to dynamically extend the transaction’s lifetime based on current conditions (such as utilization) when the timer fires.

3.11.9 Handling Transport Errors
If the transport layer notifies a proxy of an error when it tries to forward a request (see Section 3.13.4), the proxy must behave as if the forwarded request received a 503 Service Unavailable response. If the proxy is notified of an error when forwarding a response, it drops the response. The proxy should not cancel any outstanding client transactions associated with this response context due to this notification. If a proxy cancels its outstanding client transactions, a single malicious or misbehaving client can cause all transactions to fail through its Via header field.

3.11.10 CANCEL Processing
A stateful proxy may generate a CANCEL to any other request it has generated at any time (subject to receiving a provisional response to that request as described in Section 3.2). A proxy must cancel any pending client transactions associated with a response context when it receives a matching CANCEL request. A stateful proxy may generate CANCEL requests for pending INVITE client transactions based on the period specified in the INVITE’s Expires header field elapsing. However, this is generally unnecessary since the end points involved will take care of signaling the end of the transaction.
While a CANCEL request is handled in a stateful proxy by its own server transaction, a new response context is not created for it. Instead, the proxy layer searches its existing response contexts for the server transaction handling the request associated with this CANCEL. If a matching response context is found, the element must immediately return a 200 OK response to the CANCEL request. In this case, the element is acting as a UAS as defined in Section 3.1.3. Furthermore, the element must generate CANCEL requests for all pending client transactions in the context as described in Section 3.11.7. If a response context is not found, the element does not have any knowledge of the request to apply the CANCEL to. It must statelessly forward the CANCEL request (it may have statelessly forwarded the associated request previously).

3.11.11 Stateless Proxy
When acting statelessly, a proxy is a simple message forwarder. Much of the processing performed when acting statelessly is the same as when behaving statefully. The differences are detailed here. A stateless proxy does not have any notion of a transaction, or of the response context used to describe stateful proxy behavior. Instead, the stateless proxy takes messages, both requests and responses, directly from the transport layer (see Section 3.13). As a result, stateless proxies do not retransmit messages on their own. They do, however, forward all retransmissions they receive (they do not have the ability to distinguish a retransmission from the original message). Furthermore, when handling a request statelessly, an element must not generate its own 100 Trying or any other provisional response. A stateless proxy must validate a request as described in Section 3.11.3.
A stateless proxy must follow the request processing steps described in Sections 3.11.4 and 3.11.5 with the following exception:

◾◾ A stateless proxy must choose one and only one target from the target set. This choice must only rely on fields in the message and time-invariant properties of the server. In particular, a retransmitted request must be forwarded to the same destination each time it is processed. Furthermore, CANCEL and non-Routed ACK requests must generate the same choice as their associated INVITE.

A stateless proxy must follow the request processing steps described in Section 3.11.6 with the following exceptions:

◾◾ The requirement for unique branch IDs across space and time applies to stateless proxies as well. However, a stateless proxy cannot simply use a random number generator to compute the first component of the branch ID, as described in Section 3.11.6. This is because retransmissions of a request need to have the same value, and a stateless proxy cannot tell a retransmission from the original request. Therefore, the component of the branch parameter that makes it unique must be the same each time a retransmitted request is forwarded. Thus, for a stateless proxy, the branch parameter must be computed as a combinatoric function of message parameters that are invariant on retransmission.
The stateless proxy may use any technique it likes to guarantee the uniqueness of its branch IDs across transactions. However, the following procedure is recommended. The proxy examines the branch ID in the topmost Via header field of the received request. If it begins with the magic cookie, the first component of the branch ID of the outgoing request is computed as a hash of the received branch ID. Otherwise, the first component of the branch ID is computed as a hash of the topmost Via, the tag in the To header field, the tag in the From header field, the Call-ID header field, the CSeq number (but not method), and the Request-URI from the received request. One of these fields will always vary across two different transactions.
◾◾ All other message transformations specified in Section 3.11.6 must result in the same transformation of a retransmitted request. In particular, if the proxy inserts a Record-Route value or pushes URIs into the Route header field, it must place the same values in retransmissions of the request. As for the Via branch parameter, this implies that the transformations must be based on time-invariant configuration or retransmission-invariant properties of the request.
◾◾ A stateless proxy determines where to forward the request as described for stateful proxies in Section 3.11.2. The request is sent directly to the transport layer instead of through a client transaction. Since a stateless proxy must forward retransmitted requests to the same destination and add identical branch parameters to each of them, it can only use information from the message itself and time-invariant configuration data for those calculations. If the configuration state is not time invariant (e.g., if a routing table is updated), any requests that could be affected by the change may not be forwarded statelessly during an interval equal to the transaction timeout window before or after the change. The method of processing the affected requests in that interval is an implementation decision. A common solution is to forward the transaction statefully. Stateless proxies must not perform special processing for CANCEL requests. They are processed by the above rules as any other requests. In particular, a stateless proxy applies the same Route header field processing to CANCEL requests that it applies to any other request.

Response processing as described in Section 3.11.7 does not apply to a proxy behaving statelessly. When a response arrives at a stateless proxy, the proxy must inspect the sent-by value in the first (topmost) Via header field value. If that address matches the proxy (it equals a value this proxy has inserted into previous requests), the proxy must remove that header field value from the response and forward the result to the location indicated in the next Via header field value. The proxy must not add to, modify, or remove the message body. Unless specified otherwise, the proxy must not remove any other header field values. If the address does not match the proxy, the message must be silently discarded.

3.11.12 Summary of Proxy Route Processing
In the absence of local policy to the contrary, the processing that a proxy performs on a request containing a Route header field can be summarized in the following steps:

1. The proxy will inspect the Request-URI. If it indicates a resource owned by this proxy, the proxy will replace it with the results of running a location service. Otherwise, the proxy will not change the Request-URI.
2. The proxy will inspect the URI in the topmost Route header field value. If it indicates this proxy, the proxy removes it from the Route header field (this route node has been reached).
3. The proxy will forward the request to the resource indicated by the URI in the topmost Route header field value or in the Request-URI if no Route header field is present. The proxy determines the address, port, and transport to use when forwarding the request by applying the procedures in RFC 3263 (see Section 8.2.4) to that URI. If no strict-routing elements are encountered on the path of the request, the Request-URI will always indicate the target of the request.

3.11.12.1 Examples
3.11.12.1.1 Basic SIP Trapezoid
This scenario is the basic SIP trapezoid, U1 -> P1 -> P2 -> U2, with both proxies record-routing. Here is the flow.

U1 sends

    INVITE sip:[email protected] SIP/2.0
    Contact: sip:[email protected]

to P1. P1 is an outbound proxy. P1 is not responsible for domain.com, so it looks it up in DNS and sends it there. It also adds a Record-Route header field value:

    INVITE sip:callee@domain.com SIP/2.0
    Contact: sip:[email protected].com
    Record-Route: <sip:p1.example.com;lr>

P2 gets this. It is responsible for domain.com, so it runs a location service and rewrites the Request-URI. It also adds a Record-Route header field value. There is no Route header field, so it resolves the new Request-URI to determine where to send the request:
rooted in the importance of delivering all 200 OK responses to an INVITE to the UAC. To deliver them all to the UAC, the UAS alone takes responsibility for retransmitting them, and the UAC alone takes responsibility for acknowledging them with ACK. Since this ACK is retransmitted only by the UAC, it is effectively considered its own transaction.
However, RFC 6026 normatively updates RFC 3261, the SIP, to address an error in the specified handling of success (2xx class) responses to INVITE requests. Elements following RFC 3261 (see Sections 2.5 and 2.6) exactly will misidentify retransmissions of the request as a new, unassociated request. The correction involves modifying the INVITE transaction state machines. The correction also changes the way responses that cannot be matched to an existing transaction are handled to address a security risk. This specification (RFC 6026) describes an essential correction to the SIP, defined in RFC 3261. The change addresses an error in the handling of 2xx class responses to INVITE requests that leads to retransmissions of the INVITE being treated as new requests and forbids forwarding stray INVITE responses.
Figure 3.20 SIP transaction: (a) SIP network, (b) SIP transaction relationships, (c) INVITE client transaction state machine (including updates from RFC 6026), and (d) non-INVITE client transaction state machine. CT = client transaction; ST = server transaction. Note: transitions are labeled with the event over the action to take. (Copyright IETF. Reproduced with permission.)
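The INVITE client transaction of panel (c) can also be read as a transition table. The following Python sketch is illustrative only: the names `State`, `event_for_status`, `TRANSITIONS`, and `step` are hypothetical, the action strings are informal summaries, and the timers are represented only as named events rather than real clocks. It assumes the RFC 6026 behavior, in which a 2xx response leads to the Accepted state instead of terminating the transaction.

```python
from enum import Enum


class State(Enum):
    CALLING = "Calling"
    PROCEEDING = "Proceeding"
    COMPLETED = "Completed"
    ACCEPTED = "Accepted"
    TERMINATED = "Terminated"


def event_for_status(status: int) -> str:
    """Map a SIP status code to the response class used as an event name."""
    if 100 <= status <= 199:
        return "1xx"
    if 200 <= status <= 299:
        return "2xx"
    if 300 <= status <= 699:
        return "300-699"
    raise ValueError(f"not a SIP status code: {status}")


# (current state, event) -> (next state, informal description of the action)
TRANSITIONS = {
    (State.CALLING, "timer_a"): (State.CALLING, "retransmit INVITE, double timer A"),
    (State.CALLING, "timer_b"): (State.TERMINATED, "inform TU of timeout"),
    (State.CALLING, "transport_error"): (State.TERMINATED, "inform TU"),
    (State.CALLING, "1xx"): (State.PROCEEDING, "pass 1xx to TU"),
    (State.CALLING, "2xx"): (State.ACCEPTED, "pass 2xx to TU, start timer M"),
    (State.CALLING, "300-699"): (State.COMPLETED, "send ACK, pass response to TU, start timer D"),
    (State.PROCEEDING, "1xx"): (State.PROCEEDING, "pass 1xx to TU"),
    (State.PROCEEDING, "2xx"): (State.ACCEPTED, "pass 2xx to TU, start timer M"),
    (State.PROCEEDING, "300-699"): (State.COMPLETED, "send ACK, pass response to TU, start timer D"),
    (State.COMPLETED, "300-699"): (State.COMPLETED, "resend ACK, do not pass response to TU"),
    (State.COMPLETED, "transport_error"): (State.TERMINATED, "inform TU"),
    (State.COMPLETED, "timer_d"): (State.TERMINATED, "destroy transaction"),
    (State.ACCEPTED, "2xx"): (State.ACCEPTED, "pass 2xx to TU"),
    (State.ACCEPTED, "timer_m"): (State.TERMINATED, "destroy transaction"),
}


def step(state: State, event: str):
    """Return (next_state, action); events with no table entry leave the state unchanged."""
    return TRANSITIONS.get((state, event), (state, "ignore"))


if __name__ == "__main__":
    state = State.CALLING
    for event in ("timer_a", event_for_status(180), event_for_status(200), "timer_m"):
        state, action = step(state, event)
        print(f"{event:>8} -> {state.value}: {action}")
```

Keeping the transitions in a dictionary makes it easy to compare the sketch against the figure one arrow at a time; a real transaction layer would attach timers and transport callbacks to each action.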
Timer   | Value                                                | Section (RFC 3261)      | Meaning
Timer C | More than 3 minutes                                  | Section 16.6, bullet 11 | Proxy INVITE transaction timeout
Timer D | More than 32 seconds for UDP, 0 seconds for TCP/SCTP | Section 17.1.1.2        | Wait time for response retransmits
Timer I | T4 for UDP, 0 seconds for TCP/SCTP                   | Section 17.2.1          | Wait time for ACK retransmits
Timer J | 64*T1 for UDP, 0 seconds for TCP/SCTP                | Section 17.2.2          | Wait time for non-INVITE request retransmits
Timer K | T4 for UDP, 0 seconds for TCP/SCTP                   | Section 17.1.2.2        | Wait time for response retransmits
Timer L | 64*T1                                                | Section 17.2.1          | Wait time for accepted INVITE request retransmits
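The timer values in these rows follow mechanically from T1, T4, and the transport type. As a minimal sketch, assuming the defaults stated in this chapter (T1 = 500 milliseconds, T4 = 5 seconds), the hypothetical helper `timer_defaults` below returns the values in seconds. For timers C and D the table only gives a lower bound, so the function returns that bound, not a recommended setting.

```python
def timer_defaults(t1: float = 0.5, t4: float = 5.0, reliable: bool = False) -> dict:
    """Return timer values in seconds for the rows listed in the table.

    `reliable` is True for TCP/SCTP and False for UDP.
    """

    def udp_only(value: float) -> float:
        # These timers only absorb retransmissions, so they are 0 on reliable transports.
        return 0.0 if reliable else value

    return {
        "C": 180.0,                # lower bound: must be more than 3 minutes
        "D": udp_only(32.0),       # lower bound: must be at least 32 seconds for UDP
        "I": udp_only(t4),
        "J": udp_only(64 * t1),
        "K": udp_only(t4),
        "L": 64 * t1,              # same value on every transport
    }


if __name__ == "__main__":
    print("UDP:", timer_defaults())
    print("TCP:", timer_defaults(reliable=True))
```

Passing a larger `t1` (for example on a high-latency access link) scales timers J and L, while D stays at its fixed minimum because it is not derived from T1.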
reliance on a two-way handshake, TUs should respond immediately to non-INVITE requests.

3.12.1.1 INVITE Client Transaction
3.12.1.1.1 Overview
The INVITE transaction consists of a three-way handshake. The client transaction sends an INVITE, the server transaction sends responses, and the client transaction sends an ACK. For unreliable transports (such as UDP), the client transaction retransmits requests at an interval that starts at T1 seconds and doubles after every retransmission. T1 is an estimate of the round-trip time (RTT), and it defaults to 500 milliseconds. Nearly all of the transaction timers described here scale with T1, and changing T1 adjusts their values. The request is not retransmitted over reliable transports. After receiving a 1xx response, any retransmissions cease altogether, and the client waits for further responses. The server transaction can send additional 1xx responses, which are not transmitted reliably by the server transaction. Eventually, the server transaction decides to send a final response. For unreliable transports, that response is retransmitted periodically, and for reliable transports, it is sent once. For each final response that is received at the client transaction, the client transaction sends an ACK, the purpose of which is to quench retransmissions of the response.

3.12.1.1.2 Description
The state machine for the INVITE client transaction is shown in Figure 3.20c. The initial state, Calling, must be entered when the TU initiates a new client transaction with an INVITE request. The client transaction must pass the request to the transport layer for transmission. If an unreliable transport is being used, the client transaction must start timer A with a value of T1. If a reliable transport is being used, the client transaction should not start timer A (timer A controls request retransmissions). For any transport, the client transaction must start timer B with a value of 64*T1 seconds (timer B controls transaction timeouts). When timer A fires, the client transaction must retransmit the request by passing it to the transport layer. It then must reset the timer with a value of 2*T1. The formal definition of retransmit within the context of the transaction layer is to take the message previously sent to the transport layer and pass it to the transport layer once more. When timer A fires 2*T1 seconds later, the request must be retransmitted again (assuming the client transaction is still in this state). This process must continue so that the request is retransmitted with intervals that double after each transmission. These retransmissions should only be done while the client transaction is in the Calling state.
The default value for T1 is 500 milliseconds. T1 is an estimate of the RTT between the client and server transactions. Elements may (though it is not recommended) use smaller values of T1 within closed, private networks that do not permit general Internet connection. T1 may be chosen larger, and this is recommended if it is known in advance (such as on high latency access links) that the RTT is larger. Whatever the value of T1, the exponential back-offs on retransmissions described in this section must be used.
When timer B fires and if the client transaction is still in the Calling state, the client transaction should inform the TU that a timeout has occurred. The client transaction must not generate an ACK. The value of 64*T1 is equal to the amount of time required to send seven requests in the case of an unreliable transport. If the client transaction receives a provisional response while in the Calling state, it transitions to the Proceeding state. In the Proceeding state, the client transaction should not retransmit the request any longer. Furthermore, the provisional response must be passed to the TU. Any further provisional responses must be passed up to the TU while in the Proceeding state. (Note that the next three paragraphs have been updated/modified in accordance with RFC 6026.)
When in either the Calling or Proceeding states, reception of a response with status code from 300 to 699 must cause the client transaction to transition to Completed. The client transaction must pass the received response up to the TU, and the client transaction must generate an ACK request, even if the transport is reliable (guidelines for constructing the ACK from the response are given in the next section), and then pass the ACK to the transport layer for transmission. The ACK must be sent to the same address, port, and transport to which the original request was sent. The client transaction must start timer D when it enters the Completed state for any reason, with a value of at least 32 seconds for unreliable transports, and a value of 0 seconds for reliable transports. Timer D reflects the amount of time that the server transaction can remain in the Completed state when unreliable transports are used. This is equal to timer H in the INVITE server transaction, whose default is 64*T1, and is also equal to the time a UAS core will wait for an ACK once it sends a 2xx response.
However, the client transaction does not know the value of T1 in use by the server transaction or any downstream UAS cores, so an absolute minimum of 32 seconds is used instead of basing timer D on T1. Any retransmissions of a response with status code 300–699 that are received while in the Completed state must cause the ACK to be repassed to the transport layer for retransmission; however, the newly received response must not be passed up to the TU. A retransmission of the response is defined as any response that would match the same client transaction based on the rules of matching responses described later. If timer D fires
while the client transaction is in the Completed state, the client transaction must move to the Terminated state.
When a 2xx response is received while in either the Calling or Proceeding states, the client transaction must transition to the Accepted state, and timer M must be started with a value of 64*T1. The 2xx response must be passed up to the TU.
The client transaction must not generate an ACK to the 2xx response; its handling is delegated to the TU. A UAC core will send an ACK to the 2xx response using a new transaction. A proxy core will always forward the 2xx response upstream. The purpose of the Accepted state is to allow the client transaction to continue to exist to receive, and pass to the TU, any retransmissions of the 2xx response and any additional 2xx responses from other branches of the INVITE if it forked downstream. Timer M reflects the amount of time that the TU will wait for such messages. Any 2xx responses that match this client transaction and that are received while in the Accepted state must be passed up to the TU. The client transaction must not generate an ACK to the 2xx response. The client transaction takes no further action. If timer M fires while the client transaction is in the Accepted state, the client transaction must move to the Terminated state. The client transaction must be destroyed the instant it enters the Terminated state.
If timer D fires while the client transaction is in the Completed state, the client transaction must move to the Terminated state. When in either the Calling or Proceeding states, reception of a 2xx response must cause the client transaction to enter the Terminated state, and the response must be passed up to the TU. The handling of this response depends on whether the TU is a proxy core or a UAC core. A UAC core will handle generation of the ACK for this response, while a proxy core will always forward the 200 OK upstream. The differing treatment of 200 OK between proxy and UAC is the reason that handling of it does not take place in the transaction layer. The client transaction must be destroyed the instant it enters the Terminated state. This is actually necessary to guarantee correct operation. The reason is that 2xx responses to an INVITE are treated differently; each one is forwarded by proxies, and the ACK handling in a UAC is different. Thus, each 2xx needs to be passed to a proxy core (so that it can be forwarded) and to a UAC core (so it can be acknowledged). No transaction layer processing takes place. Whenever a response is received by the transport, if the transport layer finds no matching client transaction, the response is passed directly to the core. Since the matching client transaction is destroyed by the first 2xx, subsequent 2xx will find no match and therefore be passed to the core.

generates an ACK for 2xx must instead follow the rules as follows:

◾◾ The ACK must be passed to the client transport every time a retransmission of the 2xx final response that triggered the ACK arrives.
◾◾ Response retransmissions cease when an ACK request for the response is received. This is independent of whatever transport protocols are used to send the response.

The ACK request constructed by the client transaction must contain values for the Call-ID, From, and Request-URI that are equal to the values of those header fields in the request passed to the transport by the client transaction (let us call this the original request). The To header field in the ACK must equal the To header field in the response being acknowledged, and therefore will usually differ from the To header field in the original request by the addition of the tag parameter. The ACK must contain a single Via header field, and this must be equal to the top Via header field of the original request. The CSeq header field in the ACK must contain the same value for the sequence number as was present in the original request, but the method parameter must be equal to ACK. If the INVITE request whose response is being acknowledged had Route header fields, those header fields must appear in the ACK. This is to ensure that the ACK can be routed properly through any downstream stateless proxies. Although any request may contain a body, a body in an ACK is special since the request cannot be rejected if the body is not understood. Therefore, placement of bodies in ACK for non-2xx is not recommended, but if done, the body types are restricted to any that appeared in the INVITE, assuming that the response to the INVITE was not 415. If it was, the body in the ACK may be any type listed in the Accept header field in the 415.
For example, consider the following request:

    INVITE sip:[email protected] SIP/2.0
    Via: SIP/2.0/UDP pc33.atlanta.com;branch=z9hG4bKkjshdyff
    To: Bob <sip:[email protected]>
    From: Alice <sip:[email protected]>;tag=88sja8x
    Max-Forwards: 70
    Call-ID: 987asjd97y7atg
    CSeq: 986759 INVITE

The ACK request for a non-2xx final response to this request would look like this:
    From: Alice <sip:[email protected]>;tag=88sja8x
    Max-Forwards: 70
    Call-ID: 987asjd97y7atg
    CSeq: 986759 ACK

3.12.1.2 Non-INVITE Client Transaction
The non-INVITE client transaction is depicted in Figure 3.20d. Non-INVITE transactions do not make use of ACK. They are simple request–response interactions. For unreliable transports, requests are retransmitted at an interval that starts at T1 and doubles until it hits T2. If a provisional response is received, retransmissions continue for unreliable transports, but at an interval of T2. The state machine for the non-INVITE client transaction is shown in Figure 3.20d. It is very similar to the state machine for INVITE. The Trying state is entered when the TU initiates a new client transaction with a request. When entering this state, the client transaction should set timer F to fire in 64*T1 seconds. The request must be passed to the transport layer for transmission. If an unreliable transport is in use, the client transaction must set timer E to fire in T1 seconds. If timer E fires while still in this state, the timer is reset, this time with a value of MIN(2*T1, T2).
When the timer fires again, it is reset to MIN(4*T1, T2). This process continues so that retransmissions occur with an exponentially increasing interval that caps at T2. The default value of T2 is 4 seconds, and it represents the amount of time a non-INVITE server transaction will take to respond to a request, if it does not respond immediately. For the default values of T1 and T2, this results in intervals of 500 milliseconds, 1 second, 2 seconds, 4 seconds, 4 seconds, etc. If timer F fires while the client transaction is still in the Trying state, the client transaction should inform the TU about the timeout, and then it should enter the Terminated state. If a provisional response is received while in the Trying state, the response must be passed to the TU, and then the client transaction should move to the Proceeding state. If a final response (status codes 200–699) is received while in the Trying state, the response must be passed to the TU, and the client transaction must transition to the Completed state.
If timer E fires while in the Proceeding state, the request must be passed to the transport layer for retransmission and timer E must be reset with a value of T2 seconds. If timer F fires while in the Proceeding state, the TU must be informed of a timeout, and the client transaction must transition to the Terminated state. If a final response (status codes 200–699) is received while in the Proceeding state, the response must be passed to the TU, and the client transaction must transition to the Completed state. Once the client transaction enters the Completed state, it must set timer K to fire in T4 seconds for unreliable transports, and 0 seconds for reliable transports. The Completed state exists to buffer any additional response retransmission that may be received (which is why the client transaction remains there only for unreliable transports). T4 represents the amount of time the network will take to clear messages between client and server transactions. The default value of T4 is 5 seconds. A response is a retransmission when it matches the same transaction, using the rules specified earlier. If timer K fires while in this state, the client transaction must transition to the Terminated state. Once the transaction is in the Terminated state, it must be destroyed immediately.

3.12.1.3 Matching Responses to Client Transactions
When the transport layer in the client receives a response, it has to determine which client transaction will handle the response, so that the processing described earlier can take place. The branch parameter in the top Via header field is used for this purpose. A response matches a client transaction under two conditions:

◾◾ If the response has the same value of the branch parameter in the top Via header field as the branch parameter in the top Via header field of the request that created the transaction.
◾◾ If the method parameter in the CSeq header field matches the method of the request that created the transaction. The method is needed since a CANCEL request constitutes a different transaction, but shares the same value of the branch parameter.

If a request is sent via multicast, it is possible that it will generate multiple responses from different servers. These responses will all have the same branch parameter in the topmost Via, but vary in the To tag. The first response received, based on the rules above, will be used, and others will be viewed as retransmissions. That is not an error; multicast SIP provides only a rudimentary single-hop-discovery-like service that is limited to processing a single response.

3.12.1.4 Handling Transport Errors
When the client transaction sends a request to the transport layer to be sent, the following procedures are followed if the transport layer indicates a failure. The client transaction should inform the TU that a transport failure has occurred, and the client transaction should transition directly to the Terminated state. The TU will handle the failover mechanisms described in RFC 3263.
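Two rules from Sections 3.12.1.2 and 3.12.1.3 are compact enough to sketch in code: the timer E retransmission schedule in the Trying state, and the two-condition test for matching a response to a client transaction. The Python below is a minimal illustration under stated assumptions. The names `timer_e_intervals`, `ClientTransactionKey`, and `response_matches` are hypothetical, message parsing is omitted, and the schedule does not model the switch to a fixed T2 interval once a provisional response arrives.

```python
from itertools import islice
from typing import Iterator, NamedTuple


def timer_e_intervals(t1: float = 0.5, t2: float = 4.0) -> Iterator[float]:
    """Yield successive timer E values in seconds for the Trying state.

    The interval starts at T1 and doubles on each firing, capped at T2.
    """
    interval = t1
    while True:
        yield interval
        interval = min(2 * interval, t2)


class ClientTransactionKey(NamedTuple):
    """Identity of a client transaction, taken from the request that created it."""
    branch: str   # branch parameter of the top Via header field
    method: str   # request method, e.g. "INVITE" or "CANCEL"


def response_matches(key: ClientTransactionKey, via_branch: str, cseq_method: str) -> bool:
    """Apply both matching conditions: same top Via branch and same CSeq method."""
    return key.branch == via_branch and key.method == cseq_method


if __name__ == "__main__":
    print(list(islice(timer_e_intervals(), 6)))

    invite = ClientTransactionKey(branch="z9hG4bKkjshdyff", method="INVITE")
    # A response to a CANCEL shares the branch but must not match the INVITE transaction.
    print(response_matches(invite, "z9hG4bKkjshdyff", "INVITE"))
    print(response_matches(invite, "z9hG4bKkjshdyff", "CANCEL"))
```

Using the (branch, method) pair as a dictionary key is one straightforward way for a transport layer to look up the owning transaction; whether a real stack does so is an implementation choice.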
Figure 3.21 State machines for server transactions: (a) INVITE server transaction (including updates from RFC 6026) and (b) non-INVITE server transaction. (Copyright IETF. Reproduced with permission.)
request. If timer G fires, the response is passed to the transport layer once more for
retransmission, and timer G is set to fire in min(2*T1, T2) seconds. From then on, when timer G
fires, the response is passed to the transport again for transmission, and timer G is reset with
a value that doubles, unless that value exceeds T2, in which case it is reset with the value of
T2. This is identical to the retransmit behavior for requests in the Trying state of the
non-INVITE client transaction. Furthermore, while in the Completed state, if a request
retransmission is received, the server should pass the response to the transport for
retransmission.

If an ACK is received while the server transaction is in the Completed state, the server
transaction must transition to the Confirmed state. As timer G is ignored in this state, any
retransmissions of the response will cease. If timer H fires while in the Completed state, it
implies that the ACK was never received. In this case, the server transaction must transition to
the Terminated state, and must indicate to the TU that a transaction failure has occurred. The
purpose of the Confirmed state is to absorb any additional ACK messages that arrive, triggered
from retransmissions of the final response. When this state is entered, timer I is set to fire in
T4 seconds for unreliable transports, and 0 seconds for reliable transports. Once timer I fires,
the server must transition to the Terminated state. (Note that the next two paragraphs have been
updated/modified per RFC 6026.)

The purpose of the Accepted state is to absorb retransmissions of an accepted INVITE request. Any
such retransmissions are absorbed entirely within the server transaction. They are not passed up
to the TU since any downstream UAS cores that accepted the request have taken responsibility for
reliability and will already retransmit their 2xx responses if necessary. While in the Accepted
state, if the TU passes a 2xx response, the server transaction must pass the response to the
transport layer for transmission.

When the INVITE server transaction enters the Accepted state, timer L must be set to fire in
64*T1 for all transports. This value matches both timer B in the next upstream client state
machine (the amount of time the previous hop will wait for a response when no provisionals have
been sent) and the amount of time this (or any downstream) UAS core might be retransmitting the
2xx while waiting for an ACK. If an ACK is received while the INVITE server transaction is in the
Accepted state, then the ACK must be passed up to the TU. If timer L fires while the INVITE
server transaction is in the Accepted state, the transaction must transition to the Terminated
state. Once the transaction is in the Terminated state, it must be destroyed immediately.

3.12.2.2 Non-INVITE Server Transaction

The state machine for the non-INVITE server transaction is shown in Figure 3.21b. The state
machine is initialized in the Trying state and is passed a request other than INVITE or ACK when
initialized. This request is passed up to the TU. Once in the Trying state, any further request
retransmissions are discarded. A request is a retransmission if it matches the same server
transaction, using the rules specified earlier. While in the Trying state, if the TU passes a
provisional response to the server transaction, the server transaction must enter the Proceeding
state. The response must be passed to the transport layer for transmission. Any further
provisional responses that are received from the TU while in the Proceeding state must be passed
to the transport layer for transmission. If a retransmission of the request is received while in
the Proceeding state, the most recently sent provisional response must be passed to the transport
layer for retransmission.

If the TU passes a final response (status codes 200–699) to the server while in the Proceeding
state, the transaction must enter the Completed state, and the response must be passed to the
transport layer for transmission. When the server transaction enters the Completed state, it must
set timer J to fire in 64*T1 seconds for unreliable transports, and 0 seconds for reliable
transports. While in the Completed state, the server transaction must pass the final response to
the transport layer for retransmission whenever a retransmission of the request is received. Any
other final responses passed by the TU to the server transaction must be discarded while in the
Completed state. The server transaction remains in this state until timer J fires, at which point
it must transition to the Terminated state. The server transaction must be destroyed the instant
it enters the Terminated state.

3.12.2.3 Matching Requests to Server Transactions

When a request is received from the network by the server, it has to be matched to an existing
transaction. This is accomplished in the following manner. The branch parameter in the topmost
Via header field of the request is examined. If it is present and begins with the magic cookie
z9hG4bK, the request was generated by a client transaction compliant to this specification.
Therefore, the branch parameter will be unique across all transactions sent by that client. The
request matches a transaction if

◾◾ The branch parameter in the request is equal to the one in the top Via header field of the
   request that created the transaction.
◾◾ The sent-by value in the top Via of the request is equal to the one in the request that
   created the transaction.
◾◾ The method of the request matches the one that created the transaction, except for ACK, where
   the method of the request that created the transaction is INVITE.
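The three matching conditions above can be sketched as a single predicate. This is a minimal illustration under stated assumptions: the `RequestKey` type and `matches_transaction` function are hypothetical names, the sent-by value is compared as a plain string, and requests without the magic cookie are rejected rather than routed to the RFC 2543 fallback rules.

```python
from dataclasses import dataclass

MAGIC_COOKIE = "z9hG4bK"


@dataclass(frozen=True)
class RequestKey:
    """The fields of a request that take part in matching (hypothetical type)."""
    branch: str    # branch parameter of the topmost Via
    sent_by: str   # sent-by value (host[:port]) of the topmost Via
    method: str    # request method, e.g. "INVITE", "ACK", "BYE"


def matches_transaction(request: RequestKey, creator: RequestKey) -> bool:
    """Return True if `request` matches the transaction created by `creator`."""
    if not request.branch.startswith(MAGIC_COOKIE):
        # Assumption of this sketch: the RFC 2543 fallback is handled elsewhere.
        raise ValueError("branch lacks the magic cookie; use fallback matching")
    # An ACK matches a transaction that was created by an INVITE.
    method_ok = request.method == creator.method or (
        request.method == "ACK" and creator.method == "INVITE"
    )
    return (
        request.branch == creator.branch
        and request.sent_by == creator.sent_by
        and method_ok
    )
```

For example, an ACK carrying the same branch and sent-by as an earlier INVITE matches that INVITE transaction, whereas the same branch arriving with a different sent-by value does not.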
This matching rule applies to both INVITE and non-INVITE transactions alike. The sent-by value is
used as part of the matching process because there could be accidental or malicious duplication
of branch parameters from different clients. If the branch parameter in the top Via header field
is not present, or does not contain the magic cookie, the following procedures are used. These
exist to handle backwards compatibility with implementations compliant with RFC 2543 (obsoleted
by RFC 3261). The INVITE request matches a transaction if the Request-URI, To tag, From tag,
Call-ID, CSeq, and top Via header field match those of the INVITE request that created the
transaction. In this case, the INVITE is a retransmission of the original one that created the
transaction. The ACK request matches a transaction if the Request-URI, From tag, Call-ID, CSeq
number (not the method), and top Via header field match those of the INVITE request that created
the transaction, and the To tag of the ACK matches the To tag of the response sent by the server
transaction. Matching is done based on the matching rules defined for each of those header
fields.

Inclusion of the tag in the To header field in the ACK matching process helps disambiguate ACK
for 2xx from ACK for other responses at a proxy, which may have forwarded both responses. This
can occur in unusual conditions. Specifically, when a proxy forks a request and then crashes, the
responses may be delivered to another proxy, which might end up forwarding multiple responses
upstream. An ACK request that matches an INVITE transaction matched by a previous ACK is
considered a retransmission of that previous ACK. For all other request methods, a request is
matched to a transaction if the Request-URI, To tag, From tag, Call-ID, CSeq (including the
method), and top Via header field match those of the request that created the transaction.
Matching is done on the basis of the matching rules defined for each of those header fields. When
a non-INVITE request matches an existing transaction, it is a retransmission of the request that
created that transaction. Because the matching rules include the Request-URI, the server cannot
match a response to a transaction. When the TU passes a response to the server transaction, it
must pass it to the specific server transaction for which the response is targeted.

3.12.2.4 Handling Transport Errors

When the server transaction sends a response to the transport layer to be sent, the following
procedures are followed if the transport layer indicates a failure. First, the procedures in RFC
3263 (see Section 8.2.4) are followed, which attempt to deliver the response to a backup. If
those should all fail, based on the definition of failure in RFC 3263 (see Section 8.2.4), the
server transaction should inform the TU that a failure has occurred, and must remain in the
current state. (Note that the last two sentences have been updated/modified per RFC 6026.)

The original RFC 3261 wording that this update replaces read as follows: First, the procedures in
RFC 3263 (see Section 8.2.4) are followed, which attempt to deliver the response to a backup. If
those should all fail, based on the definition of failure in RFC 3263 (see Section 8.2.4), the
server transaction should inform the TU that a failure has occurred, and should transition to the
terminated state.

3.12.2.5 Non-INVITE Transactions

With the procedures described in RFC 3261, messages have a high probability of losing the race
condition inherent in non-INVITE transactions, and an unnecessary storm of network traffic may
result, as documented in RFC 4321. To address this problem, RFC 4320 provides the following
normative updates to RFC 3261 for SIP non-INVITE transactions:

◾◾ Make the best use of provisional responses
   – A SIP element must not send any provisional response with a Status-Code other than 100 to a
     non-INVITE request.
   – A SIP element must not respond to a non-INVITE request with a Status-Code of 100 over any
     unreliable transport, such as UDP, before the amount of time it takes a client transaction’s
     timer E to be reset to T2.
   – A SIP element may respond to a non-INVITE request with a Status-Code of 100 over a reliable
     transport at any time.
   – Without regard to transport, a SIP element must respond to a non-INVITE request with a
     Status-Code of 100 if it has not otherwise responded after the amount of time it takes a
     client transaction’s timer E to be reset to T2.
◾◾ Remove the useless late-response storm
   – A transaction-stateful SIP element must not send a response with Status-Code of 408 to a
     non-INVITE request. As a consequence, an element that cannot respond before the transaction
     expires will not send a final response at all.
   – A transaction-stateful SIP proxy must not send any response to a non-INVITE request unless
     it has a matching server transaction that is not in the Terminated state. As a consequence,
     this proxy will not forward any late non-INVITE responses.

3.13 Transport

The transport layer is responsible for the actual transmission of requests and responses over
network transports. This includes determination of the connection to use for a request
or response in the case of connection-oriented transports. The transport layer is responsible for
managing persistent connections for transport protocols like TCP and SCTP, or TLS over those,
including ones opened to the transport layer. This includes connections opened by the client or
server transports, so that connections are shared between client and server transport functions.
These connections are indexed by the tuple formed from the address, port, and transport protocol
at the far end of the connection. When a connection is opened by the transport layer, this index
is set to the destination IP, port, and transport. When the connection is accepted by the
transport layer, this index is set to the source IP address, port number, and transport. Note
that, because the source port is often ephemeral, but it cannot be known whether it is ephemeral
or selected through procedures in RFC 3263 (see Section 8.2.4), connections accepted by the
transport layer will frequently not be reused. The result is that two proxies in a peering
relationship using a connection-oriented transport frequently will have two connections in use,
one for transactions initiated in each direction.

It is recommended that connections be kept open for some implementation-defined duration after
the last message was sent or received over that connection. This duration should at least equal
the longest amount of time the element would need in order to bring a transaction from
instantiation to the terminated state. This is to make it likely that transactions are completed
over the same connection on which they are initiated (e.g., request, response, and in the case of
INVITE, ACK for non-2xx responses). This usually means at least 64*T1 (see Section 3.12.1.1.1 for
a definition of T1). However, it could be larger in an element that has a TU using a large value
for timer C (Section 3.11.6), for example. All SIP elements must implement UDP and TCP. SIP
elements may implement other protocols. Making TCP mandatory for the UA is a substantial change
from RFC 2543 obsoleted by RFC 3261. It has arisen out of the need to handle larger messages,
which must use TCP, as discussed below. Thus, even if an element never sends large messages, it
may receive one and needs to be able to handle them.

3.13.1 Clients

3.13.1.1 Sending Requests

The client side of the transport layer is responsible for sending the request and receiving
responses. The user of the transport layer passes the client transport the request, an IP
address, port, transport, and possibly time to live (TTL) for multicast destinations. If a
request is within 200 bytes of the path MTU, or if it is larger than 1300 bytes and the path MTU
is unknown, the request must be sent using an RFC 2914 congestion controlled transport protocol,
such as TCP. If this causes a change in the transport protocol from the one indicated in the top
Via, the value in the top Via must be changed. This prevents fragmentation of messages over UDP
and provides congestion control for larger messages. However, implementations must be able to
handle messages up to the maximum datagram packet size. For UDP, this size is 65,535 bytes,
including IP and UDP headers. The 200 byte buffer between the message size and the MTU
accommodates the fact that the response in SIP can be larger than the request. This happens due
to the addition of Record-Route header field values to the responses to INVITE, for example. With
the extra buffer, the response can be about 170 bytes larger than the request, and still not be
fragmented on IPv4 (about 30 bytes is consumed by IP/UDP, assuming no IPSec). 1300 is chosen when
path MTU is not known based on the assumption of a 1500 byte Ethernet MTU.

If an element sends a request over TCP because of these message size constraints, and that
request would have otherwise been sent over UDP, if the attempt to establish the connection
generates either an ICMP Protocol Not Supported, or results in a TCP reset, the element should
retry the request, using UDP. This is only to provide backwards compatibility with RFC 2543
(obsoleted by RFC 3261)–compliant implementations that do not support TCP. It is anticipated that
this behavior will be deprecated in a future revision of this specification. A client that sends
a request to a multicast address must add the maddr parameter to its Via header field value
containing the destination multicast address, and for IPv4, should add the ttl parameter with a
value of 1. Usage of IPv6 multicast is not defined in this specification, and will be a subject
of future standardization when the need arises. These rules result in a purposeful limitation of
multicast in SIP. Its primary function is to provide a single-hop-discovery-like service,
delivering a request to a group of homogeneous servers, where it is only required to process the
response from any one of them. This functionality is most useful for registrations. In fact,
based on the transaction processing rules in Section 3.12.1.3, the client transaction will accept
the first response, and view any others as retransmissions because they all contain the same Via
branch identifier.

Before a request is sent, the client transport must insert a value of the sent-by field into the
Via header field. This field contains an IP address or host name, and port. The usage of an FQDN
is recommended. This field is used for sending responses under certain conditions, described
below. If the port is absent, the default value depends on the transport. It is 5060 for UDP,
TCP, and SCTP, and 5061 for TLS. For reliable transports, the response is normally sent on the
connection on which the request was received. Therefore, the client transport must be prepared to
receive the response on the same connection used to send the request. Under error conditions, the
server may attempt to open a new connection to send the response. To handle this case, the
transport layer must also be prepared to receive an incoming connection on
the source IP address from which the request was sent and port number in the sent-by field. It
also MUST be prepared to receive incoming connections on any address and port that would be
selected by a server based on the procedures described in RFC 3263 (see Section 8.2.4).

For unreliable unicast transports, the client transport must be prepared to receive responses on
the source IP address from which the request is sent (as responses are sent back to the source
address) and the port number in the sent-by field. Furthermore, as with reliable transports, in
certain cases the response will be sent elsewhere. The client must be prepared to receive
responses on any address and port that would be selected by a server based on the procedures
described in RFC 3263 (see Section 8.2.4). For multicast, the client transport must be prepared
to receive responses on the same multicast group and port to which the request is sent (i.e., it
needs to be a member of the multicast group it sent the request to.) If a request is destined to
an IP address, port, and transport to which an existing connection is open, it is recommended
that this connection be used to send the request, but another connection may be opened and used.
If a request is sent using multicast, it is sent to the group address, port, and TTL provided by
the transport user. If a request is sent using unicast unreliable transports, it is sent to the
IP address and port provided by the transport user.

3.13.1.2 Receiving Responses

When a response is received, the client transport examines the top Via header field value. If the
value of the sent-by parameter in that header field value does not correspond to a value that the
client transport is configured to insert into requests, the response must be silently discarded.
The client transport uses the matching procedures of Section 3.12.1.3 to attempt to match the
response to an existing transaction. If there is a match, the response must be passed to that
transaction. Otherwise, any element other than a stateless proxy must silently discard the
response. (Note that the last three sentences have been updated/modified per RFC 6026.)

3.13.2 Servers

3.13.2.1 Receiving Requests

A server should be prepared to receive requests on any IP address, port, and transport
combination that can be the result of a DNS lookup on a SIP or SIPS URI specified in RFC 3263
(see Section 8.2.4) that is handed out for the purposes of communicating with that server. In
this context, handing out includes placing a URI in a Contact header field in a REGISTER request
or a redirect response, or in a Record-Route header field in a request or response. A URI can
also be handed out by placing it on a web page or business card. It is also recommended that a
server listen for requests on the default SIP ports (5060 for TCP and UDP, 5061 for TLS over TCP)
on all public interfaces. The typical exception would be private networks, or when multiple
server instances are running on the same host. For any port and interface that a server listens
on for UDP, it MUST listen on that same port and interface for TCP. This is because a message may
need to be sent using TCP, rather than UDP, if it is too large. As a result, the converse is not
true. A server need not listen for UDP on a particular address and port just because it is
listening on that same address and port for TCP. There may, of course, be other reasons why a
server needs to listen for UDP on a particular address and port.

When the server transport receives a request over any transport, it must examine the value of the
sent-by parameter in the top Via header field value. If the host portion of the sent-by parameter
contains a domain name, or if it contains an IP address that differs from the packet source
address, the server must add a received parameter to that Via header field value. This parameter
MUST contain the source address from which the packet was received. This is to assist the server
transport layer in sending the response, since it must be sent to the source IP address from
which the request came. Consider a request received by the server transport that looks like, in
part:

    INVITE sip:[email protected] SIP/2.0
    Via: SIP/2.0/UDP bobspc.biloxi.com:5060

The request is received with a source IP address of 192.0.2.4. Before passing the request up, the
transport adds a received parameter, so that the request would look like, in part:

    INVITE sip:[email protected] SIP/2.0
    Via: SIP/2.0/UDP bobspc.biloxi.com:5060;received=192.0.2.4

Next, the server transport attempts to match the request to a server transaction. It does so
using the matching rules described in Section 3.12.2.3. If a matching server transaction is
found, the request is passed to that transaction for processing. If no match is found, the
request is passed to the core, which may decide to construct a new server transaction for that
request. (Note that the last three sentences have been updated/modified per RFC 6026.)

3.13.2.2 Sending Responses

The server transport uses the value of the top Via header field in order to determine where to
send a response. It must follow the following process:
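As a rough sketch of such a process, the decision order given in RFC 3261, Section 18.2.2 can be written as a small function. The `TopVia` class, the `response_target` name, and the returned action labels are hypothetical. The failover procedures of RFC 3263 are reduced to a label rather than implemented, and falling back to the sent-by host when a reliable connection has closed and no received parameter exists is a simplification of this sketch.

```python
from dataclasses import dataclass
from typing import Optional

RELIABLE = {"TCP", "SCTP", "TLS"}
DEFAULT_PORT = {"UDP": 5060, "TCP": 5060, "SCTP": 5060, "TLS": 5061}


@dataclass
class TopVia:
    """Relevant parts of the topmost Via header field value (hypothetical type)."""
    transport: str
    sent_by_host: str
    sent_by_port: Optional[int] = None
    maddr: Optional[str] = None
    received: Optional[str] = None


def response_target(via: TopVia, connection_open: bool = False):
    """Return (action, address, port) describing where the response should go."""
    transport = via.transport.upper()
    port = via.sent_by_port or DEFAULT_PORT[transport]
    if transport in RELIABLE:
        if connection_open:
            # Reuse the connection on which the request arrived.
            return ("existing-connection", None, None)
        # Otherwise open a connection, preferring the received address.
        return ("new-connection", via.received or via.sent_by_host, port)
    if via.maddr:
        # maddr (for example a multicast group) takes precedence.
        return ("send", via.maddr, port)
    if via.received:
        # Unreliable unicast: reply to the packet's source address.
        return ("send", via.received, port)
    # No received parameter: resolve the sent-by value via RFC 3263.
    return ("rfc3263-lookup", via.sent_by_host, port)
```

Applied to the Via example in Section 3.13.2.1, `response_target(TopVia("UDP", "bobspc.biloxi.com", 5060, received="192.0.2.4"))` selects 192.0.2.4 on port 5060.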
When does a registration expire? How does a user refresh and cancel the registration?
3. What are the benefits of doing SIP Registration with GRUU and managing client-initiated
   connection?
4. Which field can a UA use to indicate its capabilities in a SIP message? What are the specific
   capabilities that a UA can indicate per RFC 3840?
5. What is a SIP media feature tag? Describe SIP media tags that are being registered by IANA.
   Explain briefly what the sip.isfocus media feature tag is.
6. Describe briefly how UA capabilities can be registered in SIP using the detailed call flows.
   Explain briefly how content can be negotiated in SIP. How does the OPTIONS method help in
   discovering UA capabilities?
7. Show the detailed call flows of how a SIP entity can discover UA and proxy capabilities.
8. What is a SIP dialog? Describe UA behaviors for the creation of a dialog, requests, and
   responses within a dialog after establishment of the dialog and termination of a dialog.
9. What is the remote target URI? Show an example with call flows of how target request and
   responses are used to establish and modify the remote target URI that includes feature sets in
   a dialog.
10. Explain with detailed call flows how a session can be created with SIP. Describe in detail
    how each SIP functional entity processes the SIP INVITE and the response message in
    establishing the session, including other methods, if applicable.
11. How can an existing SIP session be modified using the SIP UPDATE method? How does each
    functional entity of the source–destination path handle the UPDATE method? Explain in detail
    using the call flows.
12. How does the SDP offer and answer play a role in modifying the existing session using SDP
    along with the UPDATE method?
13. Describe in detail how the SDP is used in generating the answer for the unicast and multicast
    streams in view of offer and answer. How does the offerer process the answer?
14. Explain using examples how an existing session can be modified using SDP offer and answer for
    the following: adding and removing a media stream, and putting a unicast media stream on hold.
15. Explain briefly how capabilities are expressed in SDP. Explain with call flows how the basic
    SDP offer–answer can be exchanged.
16. How can a SIP message body be encoded? How does a UA behave with a message body using the
    binary encoding scheme? Explain the following types of SIP message body: multipart,
    multipart/mixed, multipart/alternative, and multipart-related. How does a UA process each
    kind of SIP message body?
17. What are different Content and Disposition types in SIP? Explain in detail including UA
    behavior. How are different kinds of message bodies processed in SIP as specified by RFC 5621?
18. How can an existing SIP session be terminated? Explain the behavior of SIP UAC, UAS, stateful
    proxy, and stateless proxy. Explain in detail how the route information is handled by both the
    stateful and stateless proxies in the context of terminating the session, including the
    processing of the Record-Route field. How does SIP timer C play a role in terminating the
    session?
19. What constitutes a transaction in SIP? Explain the INVITE and non-INVITE client transaction
    including their differences. How can the matching requests be created for the existing
    transaction by the client?
20. What is the SIP server transaction? Explain the INVITE and non-INVITE server transaction
    including their differences. How can the matching requests be created for the existing
    transaction by the server?
21. How are the transport errors handled by the SIP client and server?
22. Explain in detail how RFC 4320 updates RFC 3261 in handling the non-INVITE transactions.
23. Explain how the SIP timers and transport protocols’ (TCP, UDP, SCTP, and TLS) timers
    complement one another.
24. Describe in detail how the SIP client handles transport protocols (TCP, UDP, SCTP, and TLS)
    in sending SIP requests and receiving SIP responses.
25. Describe in detail how the SIP server handles transport protocols (TCP, UDP, SCTP, and TLS)
    in receiving SIP requests and sending SIP responses.
26. How do the SIP client and server handle framing and errors for transport protocols (TCP, UDP,
    SCTP, and TLS)?

References
1. International Telecommunication Union, “Procedures for real-time Group 3 facsimile
   communication over IP Networks,” ITU-T Recommendation T.38, October 2010.
2. International Telecommunication Union, “Procedures for document facsimile transmission in the
   general switched telephone network,” ITU-T Recommendation T.30, September 2005.
3. International Telecommunication Union, “Standardization of Group 3 facsimile terminals for
   document transmission,” ITU-T Recommendation T.4, July 2003.
4. Kaplan, H., “GIN with Literal AORs for SIP in SSPs (GLASS),” IETF draft, Work in Progress,
   2014.
5. National Institute of Standards and Technology, “Secure Hash Standard (SHS),” FIPS PUB 180-3,
   October 2008. Available at http://csrc.nist.gov/publications/fips/fips180-3/fips180-3_final.pdf.
6. International Telephone and Telegraph Consultative Committee, “Pulse code modulation (PCM) of
   voice frequencies,” CCITT Recommendation G.711, 1972.
7. International Telecommunication Union, “Procedures for supporting voice-band data over IP
   networks,” ITU-T Recommendation V.152, September 2010.
Chapter 4
Addressing in SIP
Abstract

The Request-Uniform Resource Identifier (Request-URI) of the Session Initiation Protocol (SIP)
carries the logical public address for reachability in establishing the session between the
communicating parties. The different kinds of URIs, such as SIP, SIP Security, and telephone
(tel) URI, that the SIP messages can use are described here. The rules for the formation of SIP
Request-URI with different kinds of URIs, the relationship between these different URI types, and
their mapping among themselves are explained. The creation and registration of the unique
globally routable user agent (UA) URI out of multiple URIs for the same SIP address-of-record for
reaching the SIP UA is also described in detail. Finally, some service URIs that are created
using informational Requests for Comment (RFC) and may be useful for using or offering
SIP-related specific services are discussed. In addition to RFC 3261, the material from many
other RFCs that have enhanced SIP is included here.

4.1 Introduction

Session Initiation Protocol (SIP) addresses are used for uniquely identifying SIP entities such
as SIP user agents (UAs) and SIP servers. The users, services, and automata that want to
communicate either as SIP UAs or SIP servers need to use the SIP address (Uniform Resource
Identifier [URI]). However, SIP also has a formally defined public address for the user, known as
the address-of-record (AOR), expressed in terms of a SIP or SIP Security (SIPS) URI. Like the
telephone number and e-mail address, the AOR can be used by a user publicly in the business card,
web page, and pocketbook for calling the user in a variety of ways. The E.164 telephone address
defined by ITU-T can also be used in SIP by using tel-URI; however, a mapping is needed between
E.164 and SIP AOR. As a result, E.164 number mapping (ENUM) and the domain name system (DNS) are
used for address resolution. Some SIP service URIs have also been defined for invoking multimedia
services such as voice mail, multimedia mail, and media services using SIP.

4.2 SIP Public Address

The SIP public address, as explained, is an AOR expressed in SIP or SIPS URI to be used publicly
for calling the user. Thus, an AOR is also thought of as the public address of the user. An
example of SIP AOR can be as follows: sip:smith@food.net. However, an AOR points to a domain with
a location service that can map the URI to another URI where the user might be available. In SIP,
a location service is assumed to be populated through registrations; however, location services
are not required to provide a SIPS binding for a SIPS Request-URI. As location services are out
of scope in SIP, various other protocols and interfaces could conceivably supply contact
addresses for an AOR, and these tools are free to map SIPS URIs to SIP URIs as appropriate. It is
envisioned that a location service returns its contact addresses without regard for whether it
received a request with a SIPS Request-URI when queries are made for bindings by SIP servers such
as proxies and redirect servers. If a redirect server is accessing the location service, it is up
to the entity that processes the Contact header field of a redirection to determine the propriety
of the contact addresses.
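The role of a location service can be illustrated with a toy in-memory table that maps an AOR to the contact URIs registered for it. This is only a sketch under stated assumptions: the `LocationService` class and its methods are hypothetical, real location services are outside the scope of SIP, and the contact address used in the usage note is an invented example. Consistent with the text above, the lookup ignores whether the query used the sip or sips scheme.

```python
from collections import defaultdict


class LocationService:
    """Toy AOR-to-contact table, populated through registrations (hypothetical)."""

    def __init__(self):
        self._bindings = defaultdict(list)

    @staticmethod
    def _key(uri):
        """Reduce a SIP or SIPS URI to a scheme-independent (user, host) key."""
        scheme, _, rest = uri.partition(":")
        if scheme.lower() not in ("sip", "sips"):
            raise ValueError("not a SIP or SIPS URI: " + uri)
        user, at, host = rest.rpartition("@")
        # Host names compare case-insensitively; the user part is left as is.
        return (user if at else "", host.lower())

    def register(self, aor, contact):
        """Bind a contact URI to the AOR, ignoring duplicate bindings."""
        contacts = self._bindings[self._key(aor)]
        if contact not in contacts:
            contacts.append(contact)

    def lookup(self, aor):
        """Return the contacts bound to the AOR, regardless of query scheme."""
        return list(self._bindings.get(self._key(aor), []))
```

For instance, after `register("sip:smith@food.net", "sip:smith@192.0.2.10:5060")`, a lookup for `sips:smith@food.net` returns the same contact list; deciding whether that contact is acceptable for a SIPS request is left to the caller.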
4.2.1 SIP and SIPS Uniform Resource Indicators

A SIP or SIPS URI identifies a communications resource. Like all URIs, SIP and SIPS URIs may be placed in web pages, e-mail messages, or printed literature. They contain sufficient information to initiate and maintain a communication session with the resource. Examples of communications resources include the following:

◾◾ A user of an online service
◾◾ An appearance on a multiline phone
◾◾ A mailbox on a messaging system
◾◾ A public switched telephone network (PSTN) number at a gateway service
◾◾ A group (such as sales or helpdesk) in an organization

A SIPS URI specifies that the resource be contacted securely. This means, in particular, that Transport Layer Security (TLS) is to be used between the UA client (UAC) and the domain that owns the URI. From there, secure communications are used to reach the user, where the specific security mechanism depends on the policy of the domain. Any resource described by a SIP URI can be upgraded to a SIPS URI by just changing the scheme, if it is desired to communicate with that resource securely.

4.2.1.1 SIP and SIPS URI Components

The sip: and sips: schemes follow the guidelines in RFC 3986. They use a form similar to the mailto Uniform Resource Locator (URL), allowing the specification of SIP request-header fields and the SIP message body. This makes it possible to specify the subject, media type, or urgency of sessions initiated by using a URI on a web page or in an e-mail message. The formal syntax for a SIP or SIPS URI is presented in Section 2.4.1.2. Its general form, in the case of a SIP URI, is

sip:user:password@host:port;uri-parameters?headers

The format for a SIPS URI is the same, except that the scheme is sips instead of sip. These tokens, and some of the tokens in their expansions, have the following meanings:

◾◾ User: The identifier of a particular resource at the host being addressed. The term host in this context frequently refers to a domain. The userinfo of a URI consists of this user field, the password field, and the @ sign following them. The userinfo part of a URI is optional and may be absent when the destination host does not have a notion of users or when the host itself is the resource being identified. If the @ sign is present in a SIP or SIPS URI, the user field must not be empty. If the host being addressed can process telephone numbers, for instance, an Internet telephony gateway, a telephone-subscriber field defined in RFC 3966 may be used to populate the user field. There are special escaping rules for encoding telephone-subscriber fields in SIP and SIPS URIs.
◾◾ Password: A password associated with the user. While the SIP and SIPS URI syntax allows this field to be present, its use is not recommended because the passing of authentication information in clear text (such as URIs) has proven to be a security risk in almost every case where it has been used. For instance, transporting a personal identification number (PIN) in this field exposes the PIN. Note that the password field is just an extension of the user portion. Implementations not wishing to give special significance to the password portion of the field may simply treat user:password as a single string.
◾◾ Host: The host providing the SIP resource. The host part contains either a fully qualified domain name or numeric IPv4 or IPv6 address. Using the fully qualified domain name form is recommended whenever possible.
◾◾ Port: The port number where the request is to be sent.
◾◾ URI parameters: Parameters affecting a request constructed from the URI. URI parameters are added after the hostport component and are separated by semicolons. URI parameters take the form parameter-name “=” parameter-value.

Although an arbitrary number of URI parameters may be included in a URI, any given parameter name must not appear more than once. This extensible mechanism includes the transport, maddr, ttl, user, method, and lr parameters.

The transport parameter determines the transport mechanism to be used for sending SIP messages, as specified in RFC 3263 (see Section 8.2.4). SIP can use any network transport protocol. Parameter names are defined for the user datagram protocol (UDP) specified in RFC 768, transmission control protocol (TCP) described in RFC 761, and stream control transmission protocol (SCTP) defined in RFC 2960. For a SIPS URI, the transport parameter must indicate a reliable transport.

The maddr parameter indicates the server address to be contacted for this user, overriding any address derived from the host field. When an maddr parameter is present, the port and transport components of the URI apply to the address indicated in the maddr parameter value. RFC 3263 (see Section 8.2.4) describes the proper interpretation of the transport, maddr, and hostport in order to obtain the destination address, port, and transport for sending a request. The maddr field has been used as a simple form of loose source routing. It allows a URI to specify a proxy that must be traversed en route to the destination. Continuing to use
the maddr parameter this way is strongly discouraged (the mechanisms that enable it are deprecated). Implementations should instead use the Route mechanism described in this document, establishing a preexisting route set if necessary (see Section 3.1.2.1.1). This provides a full URI to describe the node to be traversed.

The ttl parameter determines the time-to-live value of the UDP multicast packet and must only be used if maddr is a multicast address and the transport protocol is UDP. For example, to specify a call to [email protected] using multicast to 239.255.255.1 with a ttl of 15, the following URI would be used:

sip:[email protected];maddr=239.255.255.1;ttl=15

The set of valid telephone-subscriber strings is a subset of valid user strings. The user URI parameter exists to distinguish telephone numbers from user names that happen to look like telephone numbers. If the user string contains a telephone number formatted as a telephone-subscriber, the user parameter value phone should be present. Even without this parameter, recipients of SIP and SIPS URIs may interpret the pre-@ part as a telephone number if local restrictions on the namespace for user name allow it. The method of the SIP request constructed from the URI can be specified with the method parameter.

The lr parameter, when present, indicates that the element responsible for this resource implements the routing mechanisms specified in this document. This parameter will be used in the URIs that proxies place into Record-Route header field values, and may appear in the URIs in a preexisting route set. This parameter is used to achieve backwards compatibility with systems implementing the strict-routing mechanisms of RFC 2543 (obsoleted by RFC 3261). An element preparing to send a request based on a URI not containing this parameter can assume that the receiving element implements strict routing and reformat the message to preserve the information in the Request-URI. Since the uri-parameter mechanism is extensible, SIP elements must silently ignore any uri-parameters that they do not understand.

◾◾ Headers: Header fields to be included in a request constructed from the URI. Header fields in the SIP request can be specified with the “?” mechanism within a URI. The header names and values are encoded in ampersand-separated hname = hvalue pairs. The special hname body indicates that the associated hvalue is the message body of the SIP request.

Table 4.1 summarizes the use of SIP and SIPS URI components based on the context in which the URI appears. The external column describes URIs appearing anywhere outside of a SIP message, for instance, on a web page or business card. Entries marked “m” are mandatory; those marked “o” are optional; and those marked “-” are not allowed. Elements processing URIs should ignore any disallowed components if they are present. The second column indicates the default value of an optional element if it is not present. “–” indicates that the element is either not optional, or has no default value. URIs in Contact header fields have different restrictions depending on the context in which the header field appears. One set applies to messages that establish and maintain dialogs (INVITE and its 200 OK response). The other applies to registration and redirection messages (REGISTER, its 200 OK response, and 3xx class responses to any method).

4.2.1.2 Character Escaping Requirements

SIP follows the requirements and guidelines of RFC 2396 (obsoleted by RFC 3986) when defining the set of characters that must be escaped in a SIP URI, and uses its % HEX HEX mechanism for escaping. From RFC 2396: The set of characters actually reserved within any given URI component is defined by that component. In general, a character is reserved if the semantics of the URI changes if the character is replaced with its escaped US-ASCII encoding specified in RFC 2396. Excluded US-ASCII characters defined in RFC 2396, such as space and control characters and characters used as URI delimiters, also must be escaped. URIs must not contain unescaped space and control characters.

For each component, the set of valid augmented Backus–Naur Form (ABNF) expansions defines exactly which characters may appear unescaped. All other characters must be escaped. For example, @ is not in the set of characters in the user component, so the user j@s0n must have at least the @ sign encoded, as in j%40s0n. Expanding the hname and hvalue tokens in Section 3.9 shows that all URI reserved characters in header field names and values must be escaped. The telephone-subscriber subset of the user component has special escaping considerations. The set of characters not reserved in the RFC 3966 description of telephone-subscriber contains a number of characters in various syntax elements that need to be escaped when used in SIP URIs. Any characters occurring in a telephone-subscriber that do not appear in an expansion of the ABNF for the user rule must be escaped.

Note that character escaping is not allowed in the host component of a SIP or SIPS URI (the % character is not valid in its expansion). This is likely to change in the future as requirements for Internationalized Domain Names (IDNs) are finalized. Current implementations must not attempt to improve robustness by treating received escaped characters in the host component as literally equivalent to their unescaped counterpart. The behavior required to meet the requirements of IDN may be significantly different.
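The escaping rule for the user component can be illustrated with a short Python sketch. It is only an illustration, not a complete implementation: the helper name escape_user is hypothetical, and the set of characters left unescaped is an assumption taken from the user rule of the RFC 3261 ABNF (alphanumerics, the mark characters, and the user-unreserved characters). It does not handle the password part or the special telephone-subscriber cases.

```python
from urllib.parse import quote

# Characters assumed to be allowed unescaped in the user component:
# the "mark" set plus the "user-unreserved" set of the RFC 3261 ABNF.
# Alphanumerics are always left alone by quote().
USER_SAFE = "-_.!~*'()" + "&=+$,;?/"

def escape_user(user: str) -> str:
    """Percent-encode every character not allowed unescaped in the user part."""
    return quote(user, safe=USER_SAFE)

print(escape_user("j@s0n"))              # the @ sign must be encoded
print(escape_user("alice;day=tuesday"))  # the semicolon may stay unescaped
```

With these assumptions, the first call yields j%40s0n, matching the example in the text, while the second string is returned unchanged.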
Component  Default  Request-URI  To  From  Registration/        Dialog Contact/  External  Remarks
                                           Redirection Contact  R-R/Route
user       –        o            o   o     o                    o                o         Identifier of a resource at the host being addressed
password   –        o            o   o     o                    o                o         Password associated with the user (use not recommended because of the security risk)
host       –        m            m   m     m                    m                m         Host providing the SIP resource (fully qualified domain name or IPv4 or IPv6 address)
port       *        o            -   -     o                    o                o         Port number where the request is to be sent

Note: “-”, not allowed; “–”, the element is either not optional, or has no default value; m, mandatory; o, optional; R-R, Record-Route.
* The default port value is transport and scheme dependent. The default is 5060 for sip: using UDP, TCP, or SCTP. The default is 5061 for sip: using TLS over TCP and sips: over TCP.
** The default transport is scheme dependent. For sip:, it is UDP. For sips:, it is TCP.
4.2.1.3 Example SIP and SIPS URIs

The following examples of SIP and SIPS URIs are taken from RFC 3261:

sip:[email protected]
sip:alice:[email protected];transport=tcp
sips:[email protected]?subject=project%20x&priority=urgent
sip:+1-212-555-1212:[email protected];user=phone
sips:[email protected]
sip:[email protected]
sip:atlanta.com;method=REGISTER?to=alice%40atlanta.com
sip:alice;[email protected]

The last sample URI above has a user field value of alice;day=tuesday. The escaping rules defined above allow a semicolon to appear unescaped in this field. For the purposes of this protocol, the field is opaque. The structure of that value is only useful to the SIP element responsible for the resource.

4.2.1.4 URI Comparison

Some operations in this specification require determining whether two SIP or SIPS URIs are equivalent. In this specification, registrars need to compare bindings in Contact URIs in REGISTER requests (see Section 3.3). SIP and SIPS URIs are compared for equality according to the following rules:

◾◾ A SIP and SIPS URI are never equivalent.
◾◾ Comparison of the userinfo of SIP and SIPS URIs is case sensitive. This includes userinfo containing passwords or formatted as telephone-subscribers. Comparison of all other components of the URI is case-insensitive unless explicitly defined otherwise.
◾◾ The ordering of parameters and header fields is not significant in comparing SIP and SIPS URIs.
◾◾ Characters other than those in the reserved set defined in RFC 2396 (obsoleted by RFC 3986) are equivalent to their % HEX HEX encoding.
◾◾ An Internet Protocol (IP) address that is the result of a DNS lookup of a host name does not match that host name.
◾◾ RFC 5954 updates the RFC 3261 rules for matching two URIs. According to RFC 5954, for two URIs to be equal, the user, password, host, and port components must match. If the host component contains a textual representation of IP addresses, then the representation of those IP addresses may vary. If so, the host components are considered to match if the different textual representations yield the same binary IP address. RFC 5954 also recommends that implementers generate IPv6 text representation as defined in RFC 5952.

According to the modified rule of RFC 5954 (updates Section 19.1.4 of RFC 3261), the URIs in each of the following pairs are equivalent because the underlying binary representation of the IP addresses is the same although their textual representations vary:

sip:bob@[::ffff:192.0.2.128]
sip:bob@[::ffff:c000:280]

sip:bob@[2001:db8::9:1]
sip:bob@[2001:db8::9:01]

sip:bob@[0:0:0:0:0:FFFF:129.144.52.38]
sip:bob@[::FFFF:129.144.52.38]

A URI omitting the user component will not match a URI that includes one. A URI omitting the password component will not match a URI that includes one. A URI omitting any component with a default value will not match a URI explicitly containing that component with its default value. For instance, a URI omitting the optional port component will not match a URI explicitly declaring port 5060. The same is true for the transport-parameter, ttl-parameter, user-parameter, and method components. Defining sip:user@host to not be equivalent to sip:user@host:5060 is a change from RFC 2543 (obsoleted by RFC 3261). When deriving addresses from URIs, equivalent addresses are expected from equivalent URIs. The URI sip:user@host:5060 will always resolve to port 5060. The URI sip:user@host may resolve to other ports through the DNS SRV mechanisms detailed in RFC 3263 (see Section 8.2.4).

◾◾ URI uri-parameter components are compared as follows:
– Any uri-parameter appearing in both URIs must match.
– A user, ttl, or method uri-parameter appearing in only one URI never matches, even if it contains the default value.
– A URI that includes an maddr parameter will not match a URI that contains no maddr parameter.
– All other uri-parameters appearing in only one URI are ignored when comparing the URIs.
◾◾ URI header components are never ignored. Any present header component must be present in both URIs and match for the URIs to match.

The URIs within each of the following sets are equivalent:

sip:%[email protected];transport=TCP
sip:[email protected];Transport=tcp

sip:[email protected]
sip:[email protected];newparam=5
sip:[email protected];security=on

sip:biloxi.com;transport=tcp;method=REGISTER?to=sip:bob%40biloxi.com
sip:biloxi.com;method=REGISTER;transport=tcp?to=sip:bob%40biloxi.com

sip:[email protected]?subject=project%20x&priority=urgent
sip:[email protected]?priority=urgent&subject=project%20x

The URIs within each of the following sets are not equivalent:

SIP:[email protected];Transport=udp (different usernames)
sip:[email protected];Transport=UDP

sip:[email protected] (can resolve to different ports)
sip:[email protected]:5060

sip:[email protected] (can resolve to different transports)
sip:[email protected];transport=udp

sip:[email protected] (can resolve to different port and transports)
sip:[email protected]:6000;transport=tcp

sip:[email protected] (different header component)
sip:[email protected]?Subject=next%20meeting

sip:[email protected] (even though that is what phone21.boxesbybob.com resolves to)
sip:[email protected]

Note that equality is not transitive:

◾◾ sip:[email protected] and sip:[email protected];security=on are equivalent.
◾◾ sip:[email protected] and sip:[email protected];security=off are equivalent.
◾◾ sip:[email protected];security=on and sip:carol@chicago.com;security=off are not equivalent.
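The comparison rules above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a conforming implementation: the function names are hypothetical, any @ or ? inside header values is assumed to be escaped, header values are compared as plain strings rather than by per-header rules, and transport is treated like user, ttl, method, and maddr (a parameter present in only one URI prevents a match), following the "can resolve to different transports" example. The sample URIs in the usage lines are illustrative.

```python
import ipaddress
import re
import string

_UNRESERVED = set(string.ascii_letters + string.digits + "-_.!~*'()")
# Parameters that prevent a match when present in only one URI (assumption:
# transport is included, per the "different transports" example).
_MUST_MATCH = {"user", "ttl", "method", "maddr", "transport"}

def _unescape(text):
    """Decode %HH only when it encodes an unreserved character."""
    def repl(m):
        ch = chr(int(m.group(1), 16))
        return ch if ch in _UNRESERVED else "%" + m.group(1).upper()
    return re.sub(r"%([0-9A-Fa-f]{2})", repl, text)

def _host_key(host):
    """Compare IP literals by binary value (RFC 5954), host names ignoring case."""
    try:
        return ipaddress.ip_address(host.strip("[]")).packed
    except ValueError:
        return host.lower()

def _pairs(text, sep):
    out = {}
    for item in filter(None, text.split(sep)):
        name, _, value = item.partition("=")
        out[name.lower()] = _unescape(value)
    return out

def parse_sip_uri(uri):
    scheme, rest = uri.split(":", 1)
    userinfo, at, rest = rest.rpartition("@")   # userinfo may contain ';'
    rest, _, headers = rest.partition("?")
    hostport, _, params = rest.partition(";")
    if hostport.startswith("["):                # bracketed IPv6 literal
        host, _, port = hostport.partition("]")
        port = port.lstrip(":")
    else:
        host, _, port = hostport.partition(":")
    return {
        "scheme": scheme.lower(),
        "userinfo": _unescape(userinfo) if at else None,  # case sensitive
        "host": _host_key(host),
        "port": int(port) if port else None,    # omitted port != explicit 5060
        "params": {k: v.lower() for k, v in _pairs(params, ";").items()},
        "headers": _pairs(headers, "&"),
    }

def sip_uris_equivalent(a, b):
    x, y = parse_sip_uri(a), parse_sip_uri(b)
    if any(x[k] != y[k] for k in ("scheme", "userinfo", "host", "port", "headers")):
        return False
    px, py = x["params"], y["params"]
    for name in px.keys() | py.keys():
        if name in px and name in py:
            if px[name] != py[name]:
                return False
        elif name in _MUST_MATCH:
            return False
    return True

base = "sip:carol@chicago.com"
print(sip_uris_equivalent(base, base + ";security=on"))
print(sip_uris_equivalent(base + ";security=on", base + ";security=off"))
```

Under these assumptions the first comparison reports a match and the second does not, which reproduces the non-transitivity noted in the text.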
4.2.1.5 Forming Requests from a URI

An implementation needs to take care when forming requests directly from a URI. URIs from business cards, web pages, and even from sources inside the protocol such as registered contacts may contain inappropriate header fields or body parts. An implementation must include any provided transport, maddr, ttl, or user parameter in the Request-URI of the formed request. If the URI contains a method parameter, its value must be used as the method of the request. The method parameter must not be placed in the Request-URI. Unknown URI parameters must be placed in the message’s Request-URI. An implementation should treat the presence of any headers or body parts in the URI as a desire to include them in the message, and choose to honor the request on a per-component basis. An implementation should not honor these obviously dangerous header fields: From, Call-ID, CSeq, Via, and Record-Route. An implementation should not honor any requested Route header field values in order to not be used as an unwitting agent in malicious attacks. An implementation should not honor requests to include header fields that may cause it to falsely advertise its location or capabilities. These include Accept, Accept-Encoding, Accept-Language, Allow, Contact (in its dialog usage), Organization, Supported, and User-Agent.

An implementation should verify the accuracy of any requested descriptive header fields, including Content-Disposition, Content-Encoding, Content-Language, Content-Length, Content-Type, Date, MIME-Version, and Timestamp. If the request formed from constructing a message from a given URI is not a valid SIP request, the URI is invalid. An implementation must not proceed with transmitting the request. It should instead pursue the course of action due an invalid URI in the context in which it occurs. The constructed request can be invalid in many ways. These include, but are not limited to, syntax errors in header fields, invalid combinations of URI parameters, or an incorrect description of the message body. Sending a request formed from a given URI may require capabilities unavailable to the implementation. The URI might indicate use of an unimplemented transport or extension, for example. An implementation should refuse to send these requests rather than modifying them to match its capabilities. An implementation must not send a request requiring an extension that it does not support. For example, such a request can be formed through the presence of a Require header parameter or a method URI parameter with an unknown or explicitly unsupported value.

4.2.1.6 Conference Factory URI

According to RFC 4579 (see Sections 2.2 and 2.4.4.1), which defines conferencing call control features for SIP, there are many ways in which a conference can be created. A conferencing server implementation is free to choose from these methods, which include nonautomated means such as an Interactive Voice Response system, SIP, or any conference control protocol. To automatically create an arbitrary number of ad hoc conferences (and subsequently their focuses) using SIP call control means, a globally routable Conference Factory URI can be allocated and published. A successful attempt to establish a call to this URI would result in the automatic creation of a new conference and its focus. As a result, note that the Conference Factory URI and the newly created focus URI may resolve to different physical devices; some examples are provided in RFC 4579.

4.2.2 Telephone URI

The tel URI (RFC 3966) scheme describes resources identified by telephone numbers. A telephone number is a string of decimal digits that uniquely indicates the network termination point. The number contains the information necessary to route the call to this point. SIP also uses the tel URI in its requests as the SIP specification inherits the subscriber part of the syntax as part of the user element in the SIP URI. The tel URI can also be used by other protocols in their URI schemes. However, the tel URI does not specify the call type, such as voice, fax, or data call, and does not provide the connection parameters for a data call. The type and parameters are assumed to be negotiated either in-band by the telephone device or through a signaling protocol such as SIP. More important, the tel URI scheme facilitates interworking between the PSTN/Integrated Services Digital Network (ISDN) and the IP network. The tel URI is expressed as follows:

telephone-uri = "tel:" telephone-subscriber

The telephone-subscriber parameter can be a global number or a local number. The tel URL telephone number is not restricted and can be in any kind of network, such as the public telephone network, a private telephone network, or the Internet. Some examples of tel URIs are as follows:

◾◾ tel:+1-908-752-5123: This tel URI points to a phone number in the United States.
◾◾ tel:8209;phone-context=example.com: The tel URI describes a local phone number valid within the context example.com.
◾◾ tel:582-5679;phone-context=+1-201-3756: The tel URI describes a local phone number that is valid within a particular phone prefix.

Telephone numbers comprise two related but distinct concepts: a canonical AOR and a dial string. Although both approaches can be expressed as a URI, the dial string approach is beyond the scope of the tel URI. The telephone
number used as the canonical AOR or identifier indicates a termination point within a specific network. E.164 rules are followed by these telephone numbers for the public network. However, the private numbers will follow the rules of the private network. Subscribers publish these identifiers so that they can be reached, regardless of the location of the caller. As a result, not all telephone numbers can be reached from any other number globally because the private network rules are proprietary, although they may use the telephone digits. The tel URI specifies the telephone number as an AOR or identifier, which can be either globally unique or only valid within a local context. The dialing application is aware of the local context, knowing, for example, whether special digits need to be dialed to seize an outside line; whether network, pulse, or tone dialing is needed; and what tones indicate call progress. The dialing application then converts the telephone number into a dial sequence and performs the necessary signaling actions. The dialer does not have to be a user application as found in traditional desktop operating systems but could well be part of an IP-to-PSTN gateway. To reach a telephone number from a phone on a Private Branch Exchange (PBX), for example, the user of that phone has to know how to convert the telephone number identifier into a dial string appropriate for that phone. The telephone number itself does not convey what needs to be done for a particular terminal. Instructions may include dialing 9 before placing a call or prepending 00 to reach a number in a foreign country. The phone may also need to strip area and country codes. The identifier approach described in this document has the disadvantage that certain services, such as electronic banking or voice mail, cannot be specified in a tel URI. In the SIP network, routing is based on the domain name and a call reaches the destination domain; the SIP server that is responsible for that domain will route the call to the user. However, unlike other URIs, the tel URI does not have a domain name or user part. As a result, there is no mechanism for routing a tel URI over the SIP network.

4.2.2.1 Relating SIP URIs and Tel URLs

When a tel URL defined in RFC 3966 is converted to a SIP or SIPS URI, the entire telephone-subscriber portion of the tel URL, including any parameters, is placed into the userinfo part of the SIP or SIPS URI. Thus,

tel:+358-555-1234567;postd=pp22

becomes

sip:+358-555-1234567;[email protected];user=phone

or

sips:+358-555-1234567;[email protected];user=phone

not

sip:[email protected];postd=pp22;user=phone

or

sips:[email protected];postd=pp22;user=phone

In general, equivalent tel URLs converted to SIP or SIPS URIs in this fashion may not produce equivalent SIP or SIPS URIs. The userinfo of SIP and SIPS URIs is compared as a case-sensitive string. Variance in case-insensitive portions of tel URLs and reordering of tel URL parameters do not affect tel URL equivalence, but do affect the equivalence of SIP URIs formed from them.

For example,

tel:+358-555-1234567;postd=pp22
tel:+358-555-1234567;POSTD=PP22

are equivalent, while

sip:+358-555-1234567;[email protected];user=phone
sip:+358-555-1234567;[email protected];user=phone

are not.

Likewise,

tel:+358-555-1234567;postd=pp22;isub=1411
tel:+358-555-1234567;isub=1411;postd=pp22

are equivalent, while

sip:+358-555-1234567;postd=pp22;isub=1411@foo.com;user=phone
sip:+358-555-1234567;isub=1411;postd=pp22@foo.com;user=phone

are not.

To mitigate this problem, elements constructing telephone-subscriber fields to place in the userinfo part of a SIP or SIPS URI should fold any case-insensitive portion of telephone-subscriber to lowercase, and order the telephone-subscriber parameters lexically by parameter name, excepting isdn-subaddress and post-dial, which occur first and in that order. (All components of a tel URL except for future extension parameters are defined to be compared case insensitive.)

Following this suggestion, both

tel:+358-555-1234567;postd=pp22
tel:+358-555-1234567;POSTD=PP22

become

sip:+358-555-1234567;[email protected];user=phone

and both

tel:+358-555-1234567;tsp=a.b;phone-context=5
tel:+358-555-1234567;phone-context=5;tsp=a.b
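The normalization suggested in this section (fold to lowercase, place isub and postd first in that order, then sort the remaining parameters by name) can be sketched in Python. The function name tel_to_sip is hypothetical and the host foo.com is used purely for illustration. As simplifying assumptions, the sketch lowercases the whole telephone-subscriber, including any future extension parameters, and does not perform the character escaping described in Section 4.2.1.2.

```python
def tel_to_sip(tel_url: str, host: str, scheme: str = "sip") -> str:
    """Convert a tel URL to a SIP/SIPS URI with a normalized userinfo part."""
    if not tel_url.lower().startswith("tel:"):
        raise ValueError("not a tel URL")
    number, *params = tel_url[4:].split(";")

    def sort_key(param: str):
        name = param.partition("=")[0]
        # isub first, postd second, everything else lexically by name.
        return ({"isub": 0, "postd": 1}.get(name, 2), name)

    # Assumption: all portions are treated as case insensitive.
    params = sorted((p.lower() for p in params), key=sort_key)
    user = ";".join([number.lower(), *params])
    return f"{scheme}:{user}@{host};user=phone"

a = tel_to_sip("tel:+358-555-1234567;postd=pp22;isub=1411", "foo.com")
b = tel_to_sip("tel:+358-555-1234567;isub=1411;POSTD=PP22", "foo.com")
print(a)
print(a == b)
```

With this normalization, tel URLs that differ only in case or parameter order map to the same SIP URI string, so the case-sensitive userinfo comparison no longer separates them.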
TLS if they can open a TLS connection, even if the route set used SIP URIs instead of SIPS URIs. The proxies can insert Record-Route header fields using SIP URIs even if they use TLS transport. RFC 3261 Section 19.12.3.2.2 explains how interdomain requests can use TLS. Some UAs, redirect servers, and proxies might have local policies that enforce TLS on all connections, independently of whether or not SIPS is used.

4.2.3.4 Usage of Transport=tls URI Parameter and TLS Via Parameter

RFC 3261 (Section 19.12.2.2) deprecated the transport=tls URI transport parameter in SIPS or SIP URIs: Note that in the SIPS URI scheme, transport is independent of TLS, and thus sips:[email protected];transport=TCP and sips:alice@atlanta.com;transport=sctp are both valid (although note that UDP is not a valid transport for SIPS). The use of transport=tls has consequently been deprecated, partly because it was specific to a single hop of the request. This is a change since RFC 2543 (obsoleted by RFC 3261). The tls parameter has not been eliminated from the ABNF in RFC 3261 (Section 3.9) since the parameter needs to remain in the ABNF for backward compatibility in order for parsers to be able to process the parameter correctly. The transport=tls parameter has never been defined in an RFC, but only in some of the Internet drafts between RFC 2543 and RFC 3261. This specification does not make use of the transport=tls parameter. The reinstatement of the transport=tls parameter, or an alternative mechanism for indicating the use of TLS on a single hop in a URI, is outside the scope of this specification. For Via header fields, the following transport protocols are defined in RFC 3261 (see Section 3.13): UDP, TCP, TLS, SCTP, and in RFC 4168, TLS-SCTP.

4.2.3.5 Detection of Hop-by-Hop Security

The presence of a SIPS Request-URI does not necessarily indicate that the request was sent securely on each hop. So how does a UAS know if SIPS was used for the entire request path to secure the request end-to-end? Effectively, the UAS cannot know for sure. However, RFC 3261 (Section 19.12.4.4) recommends how a UAS can make some checks to validate the security. Additionally, the History-Info header field (RFC 4244, see Section 2.8) could be inspected for detecting retargeting from SIP and SIPS. Retargeting from SIP to SIPS by a proxy is an issue because it can leave the receiver of the request with the impression that the request was delivered securely on each hop, while in fact, it was not.

To emphasize, all the checking can be circumvented by any proxies or B2BUAs on the path that do not follow the rules and recommendations of this specification and of RFC 3261 (see Section 4.2.1). Proxies can have their own policies regarding routing of requests to SIP or SIPS URIs. For example, some proxies in some environments can be configured to only route SIPS URIs. Some proxies can be configured to detect noncompliances and reject unsecure requests. For example, proxies could inspect Request-URIs, Path, Record-Route, To, From, Contact header fields, and Via header fields to enforce SIPS. RFC 3261 (Section 19.12.4.4) explains that S/MIME can also be used by the originating UAC to ensure that the original form of the To header field is carried end-to-end. While not specifically mentioned in RFC 3261 (Section 19.12.4.4), this is meant to imply that RFC 3893 (see Section 19.4.7) would be used to tunnel important header fields (such as To and From) in an encrypted and signed S/MIME body, replicating the information in the SIP message, and allowing the UAS to validate the content of those important header fields. While this approach is certainly legal, a preferable approach is to use the SIP Identity mechanism defined in RFC 4474 (see Sections 2.8 and 19.4.8). SIP Identity creates a signed identity digest, which includes, among other things, the AOR of the sender (from the From header field) and the AOR of the original target (from the To header field).

4.2.3.6 Problems with Meaning of SIPS in RFC 3261

RFC 3261 (Section 4.2.1) describes a SIPS URI as follows: a SIPS URI specifies that the resource be contacted securely. This means, in particular, that TLS is to be used between the UAC and the domain that owns the URI. From there, secure communications are used to reach the user, where the specific security mechanism depends on the policy of the domain. RFC 3261 (Section 19.12.2.2) reiterates it, with regard to Request-URIs: when used as the Request-URI of a request, the SIPS scheme signifies that each hop over which the request is forwarded, until the request reaches the SIP entity responsible for the domain portion of the Request-URI, must be secured with TLS; once it reaches the domain in question, it is handled in accordance with local security and routing policy, quite possibly using TLS for any last hop to a UAS. When used by the originator of a request (as would be the case if they employed a SIPS URI as the AOR of the target), SIPS dictates that the entire request path to the target domain be so secured. Let us take the classic SIP trapezoid (Figure 4.1) to explain the meaning of a sips:b@B URI. Instead of using real domain names like example.com and example.net, logical names like A and B are used, for clarity.

According to RFC 3261, if a@A is sending a request to sips:b@B, the following applies:

◾◾ TLS is required between UA a@A and proxy A.
◾◾ TLS is required between proxy A and proxy B.
◾◾ TLS is required between proxy B and UA b@B, depending on local policy.
[Figure 4.1 labels: Domain A, a.example.com (192.0.2.1), [email protected]; Domain B, b.example.net (192.0.2.128), [email protected]; links marked TLS and Policy-based.]
Figure 4.1 SIP trapezoid with last-hop exception. (Copyright IETF. Reproduced with permission.)
One can then wonder why TLS is mandatory between UA a@A and proxy A but not between proxy B and UA b@B. The main reason is that RFC 3261 was written before RFC 5626 (see Section 13.2). At that time, it was recognized that in many practical deployments, proxy B might not be able to establish a TLS connection with UA b because only proxy B would have a certificate to provide and UA b would not. Since UA b would be the TLS server, it would then not be able to accept the incoming TLS connection. The consequence is that an RFC 3261 compliant UAS b, while it might not need to support TLS for incoming requests, will nevertheless have to support TLS for outgoing requests as it takes the UAC role. Contrary to what many believed, the last-hop exception was not created to allow for using a SIPS URI to address a UAS that does not support TLS: the last-hop exception was an attempt to allow for incoming requests to not be transported over TLS when a SIPS URI is used, and it does not apply to outgoing requests.

The rationale for this was somewhat flawed, and since then, RFC 5626 (see Section 13.2) has provided a more satisfactory solution to this problem. RFC 5626 also solves the problem that if UA b is behind a NAT or firewall, proxy B would not even be able to establish a TCP session in the first place. Furthermore, consider the problem of using SIPS inside a dialog. If a@A sends a request to b@B using a SIPS Request-URI, then, according to RFC 3261 (Section 3.1.2.1.8), “the Contact header field must contain a SIPS URI as well.” This means that b@B, upon sending a new request within the dialog (e.g., a BYE or re-INVITE), will have to use a SIPS URI. If there is no Record-Route entry, or if the last Record-Route entry consists of a SIPS URI, this implies that b@B is expected to understand SIPS in the first place, and is required to also support TLS. If the last Record-Route entry, however, is a SIP URI, then b would be able to send requests without using TLS (but b would still have to be able to handle SIPS schemes when parsing the message). In either case, the Request-URI in the request from b@B to B would be a SIPS URI.

4.2.3.7 Overview of Operations

Because of all the problems described earlier, this specification deprecates the last-hop exception when forwarding a request to the last hop (Figure 4.2). This will ensure that TLS is used on all hops all the way up to the remote target.

The SIPS scheme implies transitive trust. Obviously, there is nothing that prevents proxies from cheating (RFC 3261, see Section 19.12.4.4). While SIPS is useful to request that a resource be contacted securely, it is not useful as an indication that a resource was in fact contacted securely. Therefore, it is not appropriate to infer that because an incoming request had a Request-URI (or even a To header field) containing a SIPS URI, it necessarily guarantees that the request was in fact transmitted securely on each hop. Some have been tempted to believe that the SIPS scheme was equivalent to an HTTPS scheme in the sense that one could provide a visual indication to a user (e.g., a padlock icon) to the effect that the session is secured. This is obviously not the case, and therefore the meaning of a SIPS URI is not to be oversold. There is currently no mechanism to provide an indication of end-to-end security for SIP. Other mechanisms can provide a more concrete indication of some level of security. For example, SIP Identity (RFC 4474, see Sections 2.8 and 19.4.8) provides an authenticated identity mechanism and a domain-to-domain integrity protection mechanism.
[Figure labels: Domain A, a.example.com (192.0.2.1); Domain B, b.example.net (192.0.2.128); transport labels: TLS, TLS.]
Figure 4.2 SIP trapezoid without last-hop exception. (Copyright IETF. Reproduced with permission.)
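RFC 5630 does not let a proxy change the scheme of a Request-URI in either direction when forwarding: a SIPS to SIP rewrite is a downgrade, and a SIP to SIPS rewrite is a misleading upgrade. The following is a minimal sketch of that check. The `Rejection` type and the function names are hypothetical; only the 480 status and the warn-codes 380 (SIPS Not Allowed) and 381 (SIPS Required) come from the specification. The sketch looks at the scheme only and ignores the separate question of which transport is used on the hop.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rejection:
    """Response a proxy would send instead of forwarding (illustrative type)."""
    status: int
    reason: str
    warn_code: int
    warn_text: str


def uri_scheme(uri: str) -> str:
    """Return the lowercase scheme of a URI such as 'sips:bob@example.com'."""
    scheme, separator, _ = uri.partition(":")
    if not separator:
        raise ValueError(f"not a URI: {uri!r}")
    return scheme.lower()


def check_scheme_rewrite(incoming: str, outgoing: str) -> Optional[Rejection]:
    """Return None if forwarding keeps the scheme, else the rejection to send."""
    before, after = uri_scheme(incoming), uri_scheme(outgoing)
    if before == "sips" and after == "sip":
        # Downgrade: the secure resource would be reached insecurely.
        return Rejection(480, "Temporarily Unavailable", 380, "SIPS Not Allowed")
    if before == "sip" and after == "sips":
        # Misleading upgrade: only the remaining hops would be secured.
        return Rejection(480, "Temporarily Unavailable", 381, "SIPS Required")
    return None


print(check_scheme_rewrite("sips:bob@example.com", "sips:bob@bobphone.example.com"))
print(check_scheme_rewrite("sips:bob@example.com", "sip:bob@bobphone.example.com"))
```

A result of `None` means the request can be forwarded with its scheme unchanged; any other result is the 480 response, with the matching Warning code, that the proxy would return.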
This specification (RFC 5630) mandates that when a proxy is forwarding a request, a resource described by a SIPS Request-URI cannot be downgraded to a SIP URI by changing the scheme, or by sending the associated request over a nonsecure link. If a request needs to be rejected because otherwise it would be a downgrade, the request would be rejected with a 480 (Temporarily Unavailable) response (potentially with a Warning header with warn-code 380 SIPS Not Allowed). Similarly, this specification mandates that when a proxy is forwarding a request, a resource described by a SIP Request-URI cannot be upgraded to a SIPS URI by changing the scheme (otherwise, it would be an upgrade only for that hop onwards rather than on all hops, and would therefore mislead the UAS). If a request needs to be rejected because otherwise it would be a misleading upgrade, the request would be rejected with a 480 Temporarily Unavailable response (potentially with a Warning header field with warn-code 381 SIPS Required).

For example, the sip:bob@example.com and sips:bob@example.com AORs refer to the same user Bob in the domain example.com: the first URI is the SIP version, and the second one is the SIPS version. From the point of view of routing, requests to either sip:bob@example.com or sips:bob@example.com are treated the same way. When Bob registers, it therefore does not really matter if he is using a SIP or a SIPS AOR, since they both refer to the same user. At first glance, RFC 3261 (see Section 4.2.1) seems to contradict this idea by stating that a SIP and a SIPS URI are never equivalent. Specifically, it says that they are never equivalent for the purpose of comparing bindings in Contact header field URIs in REGISTER requests. The key point is that this statement applies to the Contact header field bindings in a registration: it is the association of the Contact header field with the AOR that will determine whether or not the user is reachable with a SIPS URI.

Consider this example: if Bob (AOR bob@example.com) registers with a SIPS Contact header field (e.g., sips:bob@bobphone.example.com), the registrar and the location service then know that Bob is reachable at sips:bob@bobphone.example.com and at sip:bob@bobphone.example.com. If a request is sent to AOR sips:bob@example.com, Bob’s proxy will route it to Bob at Request-URI sips:bob@bobphone.example.com. If a request is sent to AOR sip:bob@example.com, Bob’s proxy will route it to Bob at Request-URI sip:bob@bobphone.example.com. If Bob wants to ensure that every request delivered to him will always be transported over TLS, Bob can use RFC 5626 (see Section 13.2) when registering. However, if Bob had registered with a SIP Contact header field instead of a SIPS Contact header field (e.g., sip:bob@bobphone.example.com), then a request to AOR sips:bob@example.com would not be routed to Bob, since there is no SIPS Contact header field for Bob, and downgrades from SIPS to SIP are not allowed. See Section 2.2 for illustrative call flows.

4.2.3.8 UAC Normative Behavior

When presented with a SIPS URI, a UAC must not change it to a SIP URI. For example, if a directory entry includes a SIPS AOR, the UAC is not expected to send requests to that AOR using a SIP Request-URI. Similarly, if a user reads a business card with a SIPS URI, it is not possible to infer a SIP URI. If a 3xx response includes a SIPS Contact header field, the UAC does not replace it with a SIP Request-URI (e.g., by replacing the SIPS scheme with a SIP scheme) when sending a request as a result of the redirection. As mandated by RFC 3261 (see Section 3.1.2.1.8) in a request, “if the Request-URI or top Route header field value contains a SIPS URI, the Contact header field must contain a SIPS URI as well.” Upon receiving a 416 response or a 480 Temporarily Unavailable response with a Warning header with warn-code 380 SIPS Not Allowed, a UAC must not reattempt the request by automatically replacing the SIPS scheme with a SIP scheme as described in RFC 3261 (see Section 3.1.2.3.5), as it would be a security vulnerability. If the UAC does reattempt the call with a SIP URI, the UAC should get a confirmation from the user to authorize reinitiating the session with a SIP Request-URI instead of a SIPS Request-URI.

When the route set is not empty (e.g., when a service route (RFC 3608, see Section 2.8) is returned by the registrar), it is the responsibility of the UAC to use a Route header field consisting of all SIPS URIs when using a SIPS Request-URI. Specifically, if the route set included any SIP URI, the UAC must change the SIP URIs to SIPS URIs simply by changing the scheme from sip to sips before sending the request. This allows a single service route consisting entirely of SIP URIs to be configured or discovered while still allowing requests to be sent to both SIP and SIPS URIs. When the UAC is using a SIP Request-URI, if the route set is not empty and the topmost Route header field entry is a SIPS URI with the lr parameter, the UAC must send the request over TLS (using a SIP Request-URI). If the route set is not empty and the Route header field entry is a SIPS URI without the lr parameter, the UAC MUST send the request over TLS using a SIPS Request-URI corresponding to the topmost entry in the route set. To emphasize what is already defined in RFC 3261, UAs must not use the transport=tls parameter.

4.2.3.8.1 Registration

The UAC registers Contact header fields to either a SIPS or a SIP AOR. If a UA wishes to be reachable with a SIPS URI, the UA must register with a SIPS Contact header field. Requests addressed to that UA’s AOR using either a SIP or SIPS Request-URI will be routed to that UA. This includes UAs that support both SIP and SIPS. This specification does not provide any SIP-based mechanism for a UA to provision its proxy to only forward requests using a SIPS
Request-URI. A non-SIP mechanism such as a web interface could be used to provision such a preference. A SIP mechanism for provisioning such a preference is outside the scope of this specification. If a UA does not wish to be reached with a SIPS URI, it must register with a SIP Contact header field. Because registering with a SIPS Contact header field implies a binding of both a SIPS Contact and a corresponding SIP Contact to the AOR, a UA must not include both the SIPS and the SIP versions of the same Contact header field in a REGISTER request; the UA must only use the SIPS version in this case. Similarly, a UA should not register both a SIP Contact header field and a SIPS Contact header field in separate registrations as the SIP Contact header field would be superfluous. If it does, the second registration replaces the first one (e.g., a UA could register first with a SIP Contact header field, meaning it does not support SIPS, and later register with a SIPS Contact header field, meaning it now supports SIPS). Similarly, if a UA registers first with a SIPS Contact header field and later registers with a SIP Contact header field, that SIP Contact header field replaces the SIPS Contact header field. RFC 5626 (see Section 13.2) can be used by a UA if it wants to ensure that no requests are delivered to it without using the TLS connection it used when registering.

If all the Contact header fields in a REGISTER request are SIPS, the UAC must use SIPS AORs in the From and To header fields in the REGISTER request. If at least one of the Contact header fields is not SIPS (e.g., sip, mailto, tel, http, https), the UAC must use SIP AORs in the From and To header fields in the REGISTER request. To emphasize what is already defined in RFC 3261, UACs must not use the transport=tls parameter.

4.2.3.8.2 SIPS in a Dialog

If the Request-URI in a request that initiates a dialog is a SIP URI, then the UAC needs to be careful about what to use in the Contact header field (in case Record-Route is not used for this hop). If the Contact header field was a SIPS URI, it would mean that the UAS would only accept mid-dialog requests that are sent over secure transport on each hop. Since the Request-URI in this case is a SIP URI, it is quite possible that the UA sending a request to that URI might not be able to send requests to SIPS URIs. If the top Route header field does not contain a SIPS URI, the UAC MUST use a SIP URI in the Contact header field, even if the request is sent over a secure transport (e.g., the first hop could be reusing a TLS connection to the proxy as would be the case with RFC 5626; see Section 13.2). When a target refresh occurs within a dialog (e.g., re-INVITE request, UPDATE request), the UAC must include a Contact header field with a SIPS URI if the original request used a SIPS Request-URI.

4.2.3.8.3 Derived Dialogs and Transactions

Sessions, dialogs, and transactions can be derived from existing ones. A good example of a derived dialog is one that was established as a result of using the REFER method. As a general principle, derived dialogs and transactions cannot result in an effective downgrading of SIPS to SIP, without the explicit authorization of the entities involved. For example, when a REFER request is used to perform a call transfer, it results in an existing dialog being terminated and another one being created based on the Refer-To URI. If that initial dialog was established using SIPS, then the UAC must not establish a new one using SIP, unless there is an explicit authorization given by the recipient of the REFER request. This could be a warning provided to the user. Having such a warning could be useful, for example, for a secure directory service application, to warn a user that a request may be routed to a UA that does not support SIPS. A REFER request can also be used for referring to resources that do not result in dialogs being created. In fact, a REFER request can be used to point to resources that are of a different type than the original one (i.e., not SIP or SIPS). Other examples of derived dialogs and transactions include the use of Third-Party Call Control (RFC 3725, see Section 18.3), the Replaces header field (RFC 3891, see Section 2.8), and the Join header field (RFC 3911, see Section 2.8). Again, the general principle is that these mechanisms should not result in an effective downgrading of SIPS to SIP, without the proper authorization.

4.2.3.8.4 Globally Routable UA URI (GRUU)

When a Globally Routable UA URI (GRUU) (RFC 5627, see Section 4.3) is assigned to an instance ID/AOR pair, both SIP and SIPS GRUUs will be assigned. When a GRUU is obtained through registration, if the Contact header field in the REGISTER request contains a SIP URI, the SIP version of the GRUU is returned. If the Contact header field in the REGISTER request contains a SIPS URI, the SIPS version of the GRUU is returned. If the wrong scheme is received in the GRUU (which would be an error in the registrar), the UAC should treat it as if the proper scheme was used (i.e., it should replace the scheme with the proper scheme before using the GRUU).

4.2.3.9 UAS Normative Behavior

When presented with a SIPS URI, a UAS must not change it to a SIP URI. As mandated by RFC 3261 (see Section 3.6.1.1), if the request that initiated the dialog contained a SIPS URI in the Request-URI or in the top Record-Route header field value, if there was any, or the Contact header field if there was no Record-Route header field, the Contact
header field in the response must be a SIPS URI. If a UAS does not wish to be reached with a SIPS URI but only with a SIP URI, the UAS must respond with a 480 Temporarily Unavailable response. The UAS should include a Warning header with warn-code 380 SIPS Not Allowed. RFC 3261 (see Section 3.1.3.2.1) states that UASs that do not support the SIPS URI scheme at all “should reject the request with a 416 Unsupported URI scheme response.” If a UAS does not wish to be contacted with a SIP URI but instead by a SIPS URI, it must reject a request to a SIP Request-URI with a 480 Temporarily Unavailable response. The UAS should include a Warning header with warn-code 381 SIPS Required. It is a matter of local policy for a UAS to accept incoming requests addressed to a URI scheme that does not correspond to what it used for registration.

For example, a UA with a policy of always SIPS would address the registrar using a SIPS Request-URI over TLS, would register with a SIPS Contact header field, and the UAS would reject requests using the SIP scheme with a 480 Temporarily Unavailable response with a Warning header with warn-code 381 SIPS Required. A UA with a policy of best-effort SIPS would address the registrar using a SIPS Request-URI over TLS, would register with a SIPS Contact header field, and the UAS would accept requests addressed to either SIP or SIPS Request-URIs. A UA with a policy of No SIPS would address the registrar using a SIP Request-URI, could use TLS or not, would register with a SIP AOR and a SIP Contact header field, and the UAS would accept requests addressed to a SIP Request-URI. If a UAS needs to reject a request because the URIs are used inconsistently (e.g., the Request-URI is a SIPS URI but the Contact header field is a SIP URI), the UAS must reject the request with a 400 Bad Request response. When a target refresh occurs within a dialog (e.g., re-INVITE request, UPDATE request), the UAS must include a Contact header field with a SIPS URI if the original request used a SIPS Request-URI. To emphasize what is already defined in RFC 3261, UASs must not use the transport=tls parameter.

4.2.3.10 Registrar Normative Behavior

The UAC registers Contact header fields to either a SIPS or a SIP AOR. From a routing perspective, it does not matter which one is used for registration as they identify the same resource. The registrar must consider AORs that are identical except for one having the SIP scheme and the other having the SIPS scheme to be equivalent. A registrar MUST accept a binding to a SIPS Contact header field only if all the appropriate URIs are of the SIPS scheme; otherwise, there could be an inadvertent binding of a secure resource (SIPS) to an unsecured one (SIP). This includes the Request-URI and the Contact and all the Path header fields, but does not include the From and To header fields. If the URIs are not of the proper SIPS scheme, the registrar must reject the REGISTER with a 400 Bad Request.

A registrar can return a service route (RFC 3608, see Section 2.8) and impose some constraints on whether or not TLS will be mandatory on specific hops. For example, if the topmost entry in the Path header field returned by the registrar is a SIPS URI, the registrar is telling the UAC that TLS is to be used for the first hop, even if the Request-URI is SIP. If a UA is registered with a SIPS Contact header field, the registrar returning a service route (RFC 3608) must return a service route consisting of SIP URIs if the intent of the registrar is to allow both SIP and SIPS to be used in requests sent by that client. If a UA registers with a SIPS Contact header field, the registrar returning a service route must return a service route consisting of SIPS URIs if the intent of the registrar is to allow only SIPS URIs to be used in requests sent by that UA.

4.2.3.10.1 Globally Routable UA URI

When a GRUU (RFC 5627, see Section 4.3) is assigned to an instance ID/AOR pair through registration, the registrar MUST assign both a SIP GRUU and a SIPS GRUU. If the Contact header field in the REGISTER request contains a SIP URI, the registrar MUST return the SIP version of the GRUU. If the Contact header field in the REGISTER request contains a SIPS URI, the registrar must return the SIPS version of the GRUU.

4.2.3.11 Proxy Normative Behavior

Proxies must not use the last-hop exception of RFC 3261 when forwarding or retargeting a request to the last hop. Specifically, when a proxy receives a request with a SIPS Request-URI, the proxy must only forward or retarget the request to a SIPS Request-URI. If the target UAS had registered previously using a SIP Contact header field instead of a SIPS Contact header field, the proxy must not forward the request to the URI indicated in the Contact header field. If the proxy needs to reject the request for that reason, the proxy must reject it with a 480 Temporarily Unavailable response. In this case, the proxy should include a Warning header with warn-code 380 SIPS Not Allowed. Proxies should transport requests using a SIP URI over TLS when it is possible to set up a TLS connection, or reuse an existing one. RFC 5626 (see Section 13.2), for example, allows for reusing an existing TLS connection. Some proxies could have policies that prohibit sending any request over anything but TLS. When a proxy receives a request with a SIP Request-URI, the proxy must not forward the request to a SIPS Request-URI. If the target UAS had registered previously using a SIPS Contact header field, and the proxy decides to forward the request, the proxy must replace that SIPS scheme with a SIP
scheme while leaving the rest of the URI as is, and use the resulting URI as the Request-URI of the forwarded request. The proxy must use TLS to forward the request to the UAS. Some proxies could have a policy of not forwarding requests at all using a non-SIPS Request-URI if the UAS had registered using a SIPS Contact header field. If the proxy elects to reject the request because it has such a policy or because it is not capable of establishing a TLS connection, the proxy may reject it with a 480 Temporarily Unavailable response with a Warning header with warn-code 381 SIPS Required. If a proxy needs to reject a request because the URIs are used inconsistently (e.g., the Request-URI is a SIPS URI but the Contact header field is a SIP URI), the proxy should use response code 400 Bad Request. It is recommended that the proxy use the outbound proxy procedures defined in RFC 5626 (see Section 13.2) for supporting UACs that cannot provide a certificate for establishing a TLS connection (i.e., when server-side authentication is used).

When a proxy sends a request using a SIPS Request-URI and receives a 3xx response with a SIP Contact header field, or a 416 response, or a 480 Temporarily Unavailable response with a Warning header with warn-code 380 SIPS Not Allowed, the proxy must not recurse on the response. In this case, the proxy should forward the best response instead of recursing, in order to allow for the UAC to take the appropriate action. When a proxy sends a request using a SIP Request-URI and receives a 3xx response with a SIPS Contact header field, or a 480 Temporarily Unavailable response with a Warning header with warn-code 381 SIPS Required, the proxy must not recurse on the response. In this case, the proxy should forward the best response instead of recursing, in order to allow for the UAC to take the appropriate action. To emphasize what is already defined in RFC 3261, proxies must not use the transport=tls parameter.

4.2.3.12 Redirect Server Normative Behavior

Using a redirect server with TLS instead of using a proxy has some limitations that have to be taken into account. Since there is no preestablished connection between the proxy and the UAS (such as with RFC 5626, see Section 13.2), it is only appropriate for scenarios where inbound connections are allowed. For example, it could be used in a server-to-server environment (redirect server or proxy server) where TLS mutual authentication is used, and where there are no NAT traversal issues. A redirect server would not be able to redirect to an entity that does not have a certificate. A redirect server might not be usable if there is a NAT between the server and the UAS. When a redirect server receives a request with a SIP Request-URI, the redirect server may redirect with a 3xx response to either a SIP or a SIPS Contact header field. If the target UAS had registered previously using a SIPS Contact header field, the redirect server should return a SIPS Contact header field if it is in an environment where TLS is usable (as described in the previous paragraph). If the target UAS had registered previously using a SIP Contact header field, the redirect server must return a SIP Contact header field in a 3xx response if it redirects the request.

When a redirect server receives a request with a SIPS Request-URI, the redirect server may redirect with a 3xx response to a SIP or a SIPS Contact header field. If the target UAS had registered previously using a SIPS Contact header field, the redirect server should return a SIPS Contact header field if it is in an environment where TLS is usable. If the target UAS had registered previously using a SIP Contact header field, the redirect server must return a SIP Contact header field in a 3xx response if it chooses to redirect; otherwise, the UAS may reject the request with a 480 Temporarily Unavailable response with a Warning header with warn-code 380 SIPS Not Allowed. If a redirect server redirects to a UAS that it has no knowledge of (e.g., an AOR in a different domain), the Contact header field could be of any scheme. If a redirect server needs to reject a request because the URIs are used inconsistently (e.g., the Request-URI is a SIPS URI but the Contact header field is a SIP URI), the redirect server should use response code 400 Bad Request. To emphasize what is already defined in RFC 3261, redirect servers must not use the transport=tls parameter.

4.2.3.13 Call Flows

RFC 5630 has also provided detailed call flows describing each of the SIP messages with their headers as follows: SIP UA’s contacts registration using SIPS URI, calling between two SIP UAs using SIPS AOR, one SIP UA calling another SIP UA’s SIP AOR using TCP, and one SIP UA calling another SIP UA’s SIP AOR using TLS. We have left these call flows as exercises to the reader.

4.2.3.14 Further Considerations

RFC 3261 itself introduces some complications with using SIPS, for example, when Record-Route (see Sections 2.8 and 4.2.1 and Chapter 9) is not used. When a SIPS URI is used in a Contact header field in a dialog-initiating request and Record-Route is not used, that SIPS URI might not be usable by the other end. If the other end does not support SIPS or TLS, it will not be able to use it. The last-hop exception is an example of when this can occur. In this case, using Record-Route so that the requests are sent through proxies can help in making it work. Another example is that even in a case where the Contact header field is a SIPS URI, no Record-Route is used, and the far end supports SIPS and TLS, it might still not be possible for the far end to establish a TLS connection with the SIP originating end if the certificate cannot be validated by the far end. This could
typically be the case if the originating end was using server-side authentication as described below, or if the originating end is not using a certificate that can be validated. TLS itself has a significant impact on how SIPS can be used. Server-side authentication (where the server side provides its certificate but the client side does not) is typically used between a SIP end-user device acting as the TLS client side (e.g., a phone or a personal computer) and its SIP server (proxy or registrar) acting as the TLS server side.

TLS mutual authentication (where both the client side and the server side provide their respective certificates) is typically used between SIP servers (proxies, registrars), or statically configured devices such as PSTN gateways or media servers. In the mutual authentication model, for two entities to be able to establish a TLS connection, it is required that both sides be able to validate each other’s certificates, either by static configuration or by being able to recurse to a valid root certificate. With server-side authentication, only the client side is capable of validating the server side’s certificate, as the client side does not provide a certificate. The consequences of all this are that whenever a SIPS URI is used to establish a TLS connection, it is expected to be possible for the entity establishing the connection (the client) to validate the certificate from the server side. For server-side authentication, RFC 5626 (see Section 13.2) is the recommended approach. For mutual authentication, one needs to ensure that the architecture of the network is such that connections are made between entities that have access to each other’s certificates. Record-Route (RFC 3261, see Section 2.8 and Chapter 9) and Path (RFC 3327, Section 2.8) are very useful in ensuring that previously established TLS connections can be reused. Other mechanisms might also be used in certain circumstances: for example, using root certificates that are widely recognized allows for more easily created TLS connections.

4.3 Globally Routable UA URI

4.3.1 Overview

A single user in a SIP network can have a number of UAs for a host of devices for many functions such as handsets, softphones, voice-mail accounts, and others. All of those UAs of that particular user are referenced by the same AOR. There are a number of contexts in which it is desirable to have an identifier that addresses a single UA for reaching the user rather than the group of UAs indicated by an AOR. Many of the SIP applications require a UA to construct and distribute a URI that can be used by anyone on the Internet to route a call to that specific UA instance. A URI that routes to a specific UA instance is called a GRUU; it is defined in RFC 5627 and described here. Note that the registration of multiple telephone AORs with GRUU is described in Section 3.3.5. In this context, RFC 5627 defines the following additional terms:

◾◾ Contact: The term contact, when used in all lowercase, refers to a URI that is bound to an AOR and GRUU by means of a registration. A contact is usually a SIP URI, and is bound to the AOR and GRUU through a REGISTER request by appearing as a value of the Contact header field. The contact URI identifies a specific UA.
◾◾ Remote target: The term remote target refers to a URI that a UA uses to identify itself for receipt of both mid-dialog and out-of-dialog requests. A remote target is established by placing a URI in the Contact header field of a dialog-forming request or response and is updated by target refresh requests or responses.
◾◾ Contact header field: The term Contact header field, with a capitalized C, refers to the header field that can appear in REGISTER requests and responses, redirects, or dialog-creating requests and responses. Depending on the semantics, the Contact header field sometimes conveys a contact, and sometimes conveys a remote target.

4.3.2 GRUU Grammar

RFC 5627 defines two new Contact header field parameters (temp-gruu and pub-gruu) by extending the grammar for contact-params as defined in RFC 3261. It also defines a new SIP URI parameter (gr) by extending the grammar for uri-parameter as defined in RFC 3261. The ABNF is provided here for convenience, although the detailed SIP syntax is provided in Section 2.4.1.2, as follows:

    contact-params =/ temp-gruu / pub-gruu
    temp-gruu      =  "temp-gruu" EQUAL quoted-string
    pub-gruu       =  "pub-gruu" EQUAL quoted-string
    uri-parameter  =/ gr-param
    gr-param       =  "gr" ["=" pvalue]  ; defined in RFC 3261

The quoted strings for temp-gruu and pub-gruu must contain a SIP URI. However, they are encoded like all other quoted strings and can therefore contain quoted-pair escapes when represented this way.

4.3.3 Operation

GRUUs are issued by SIP domains and always route back to a proxy in that domain. In turn, the domain maintains the binding between the GRUU and the particular UA instance. When a GRUU is dereferenced while sending a SIP request, that request arrives at the proxy. It maps the GRUU to the
contact for the particular UA instance, and sends the request there.

4.3.3.1 Structure of GRUUs

A GRUU is a SIP URI that has two properties:

GRUU. The GRUU will have the gr URI parameter, either with or without a value. To avoid creating excessive state in the registrar, it is often desirable to construct cryptographically protected stateless GRUUs using an algorithm. An example of a temporary GRUU constructed using a stateful algorithm would be
The usage of a counter in a 48-bit space with sequential assignment allows for a compact representation of the hashmap key, which is important for generating GRUUs of reasonable size. The counter starts at zero when the system is initialized. Persistent and reliable storage of the counter is required to avoid misrouting of a GRUU to the wrong AOR and instance ID. Similarly, persistent storage of the hashmap is required, even when the proxy and registrar restart. If the hashmap is reset, all previous temporary GRUUs become invalidated. This might cause dialogs in progress to fail, or future requests toward a temporary GRUU to fail, when they normally would not. The same hashmap needs to be accessible by all proxies and registrars that can field requests for a particular AOR and instance ID.

The registrar maintains a pair of local symmetric keys Ke and Ka. These are regenerated every time the counter is reset. When the counter rolls over or is reset, the registrar remembers the old values of Ke and Ka for a time. Like the hashmap itself, these keys need to be shared across all proxies and registrars that can service requests for a particular AOR and instance ID. To generate a new temporary GRUU, the registrar generates a random 80-bit distinguisher value D. It then computes

    M = D || Ii
    E = AES-ECB-Encrypt(Ke, M)
    A = HMAC-SHA256-80(Ka, E)
    Temp-Gruu-userpart = “tgruu” || base64(E) || base64(A)

If the counter has rolled over, this computation is done using the value of Ke that goes with the value of Ka, which produced a valid Ac in the previous HMAC validation. The leading 80 bits (the distinguisher D) are discarded, leaving an index Ii in the hashmap. This index is looked up. If it exists, the proxy now has the AOR and instance ID corresponding to this temporary GRUU. If there is nothing in the hashmap for the key Ii, the GRUU is no longer valid and the request is rejected. The usage of a 48-bit counter allows for the registrar to have as many as a billion AORs, with 10 instances per AOR, and cycle through 10,000 Call-ID changes for each instance through the duration of a single registration. These numbers reflect the average; the system works fine if a particular AOR has more than 10 instances or a particular instance cycles through more than 10,000 Call-IDs in its registration, as long as the average meets these constraints.

4.3.3.3.3 Network Design Considerations

The GRUU specification works properly based on logic implemented at the UAs and in the authoritative proxies on both sides of a call. Consequently, it is possible to construct network deployments in which GRUUs will not work properly. One important assumption made by the GRUU mechanism is that, if a request passes through any proxies in the originating domain before visiting the terminating domain, one of those proxies will be the authoritative proxy for the UAC. Administrators of SIP networks will need to make sure that this property is retained. There are several ways it can be accomplished:
◾◾ If the UAs support the service-route mechanism (RFC
where || denotes concatenation, and AES-ECB-Encrypt rep- 3608, see Section 2.8.2), the registrar can implement it
resents AES encryption in electronic codebook mode. M and return a service route that points to the authorita-
will be 128 bits long, producing a value of E that is 128 bits tive proxy. This will cause requests originated by the
and A that is 80 bits. This produces a user part that has 42 UA to pass through the authoritative proxy.
characters. ◾◾ The UAs can be configured to never use an outbound
When a proxy receives a request whose user part begins proxy, and send requests directly to the domain of the
with tgruu, it extracts the remaining portion and splits it into terminating party. This configuration is not practical in
22 characters (E’) and the remaining 14 characters (A’). It many use cases, but it is a solution to this requirement.
then computes A and E by performing a base64 decode of A’ ◾◾ The UAs can be configured with an outbound proxy
and E’, respectively. Next, it computes in the same domain as the authoritative proxy, and this
outbound proxy forwards requests to the authoritative
Ac = HMAC-SHA256-80(Ka, E) proxy by default. This works very well in cases where
the clients are not roaming; in such cases, the outbound
If the counter has rolled over or reset, this computation is proxy in a visited network may be discovered dynami-
performed with the current and previous Ka. If the Ac value(s) cally through the Dynamic Host Configuration
that are computed do not match the value of A extracted Protocol (DHCP) (RFC 3361).
from the GRUU, the GRUU is rejected as invalid. Next, the ◾◾ In cases where the client discovers a local outbound
proxy computes proxy via a mechanism such as DHCP, and is not
implementing the service route mechanism, the UA
M = AES-ECB-Decrypt(Ke, E) can be configured to automatically add an additional
Route header field after the outbound proxy, which points to a proxy in the home network. This has the same net effect of the service route mechanism, but is accomplished through static configuration.

<allOneLine>
Contact: <sip:[email protected]>
;pub-gruu="sip:[email protected];gr=urn:
uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6"
;temp-gruu="sip:tgruu.7hs==
[email protected];
gr"
;+sip.instance="<urn:uuid:f81d4fae-7dec-11d0
-a765-00a0c91e6bf6>"
;expires=3600
</allOneLine>

When a UA refreshes this registration before its expiration, the registrar will return back the same public GRUU but will create a new temporary GRUU. Although each refresh provides the UA with a new temporary GRUU, all of the temporary GRUUs learned from previous REGISTER responses during the lifetime of a contact remain valid as long as the following conditions are met:

◾◾ A contact with that instance ID remains registered.
◾◾ The UA does not change the Call-ID in its REGISTER request compared with previous ones for the same reg-id (RFC 5626, see Section 13.2).

4.3.6 Dereferencing a GRUU
Because a GRUU is simply a URI, a UA dereferences it in exactly the same way as it would any other URI. However, once the request has been routed to the appropriate proxy, the behavior is slightly different. The proxy will map the GRUU to the AOR and determine the set of contacts that the particular UA instance has registered. The GRUU is then mapped to those contacts, and the request is routed toward the UA.

4.3.7 UA Behavior
4.3.7.1 Generating a REGISTER Request
When a UA compliant to this specification generates a REGISTER request (initial or refresh), it must include the
288 ◾ Handbook on Session Initiation Protocol
Supported header field in the request. The value of that header field MUST include gruu as one of the option tags. This alerts the registrar for the domain that the UA supports the GRUU mechanism. Furthermore, for each contact for which the UA desires to obtain a GRUU, the UA must include a sip.instance media feature tag (see RFC 5626, Section 13.2) as a UA characteristic (see RFC 3840, Section 3.4), whose value must be the instance ID that identifies the UA instance being registered. Each such Contact header field should not contain a pub-gruu or temp-gruu header field parameter. The contact URI must not be equivalent, based on the URI equality rules in RFC 3261 (see Section 4.2), to the AOR in the To header field. If the contact URI is a GRUU, it must not be a GRUU for the AOR in the To header field.

As in RFC 3261 (see Section 3.3), the Call-ID in a REGISTER refresh should be identical to the Call-ID used to previously register a contact. With GRUU, an additional consideration applies. If the Call-ID changes in a REGISTER refresh, the server will invalidate all temporary GRUUs associated with that UA instance; the only valid one will be the new one returned in that REGISTER response. When RFC 5626 is in use, this rule applies to the reg-ids: if the Call-ID changes for the registration refresh for a particular reg-id, the server will invalidate all temporary GRUUs associated with that UA instance as a whole. Consequently, if a UA wishes its previously obtained temporary GRUUs to remain valid, it must utilize the same Call-ID in REGISTER refreshes. However, it may change the Call-ID in a refresh if invalidation is the desired objective.

Note that, if any dialogs are in progress that utilize a temporary GRUU as a remote target, and a UA performs a registration refresh with a change in Call-ID, those temporary GRUUs become invalid, and the UA will not be reachable for subsequent mid-dialog messages. If a UA instance is trying to register multiple contacts for the same instance for the purposes of redundancy, it MUST use the procedures defined in RFC 5626 (see Section 13.2). A UA utilizing GRUUs can still perform third-party registrations and can include contacts that omit the +sip.instance Contact header field parameter. If a UA wishes to guarantee that the REGISTER request is not processed unless the domain supports and uses this extension, it may include a Require header field in the request with a value that contains the gruu option tag. This is in addition to the presence of the Supported header field, also containing the gruu option tag. The use of Proxy-Require is not necessary and is not recommended.

4.3.7.2 Learning GRUUs from REGISTER Responses
If the REGISTER response is a 2xx, each Contact header field that contains the +sip.instance Contact header field parameter can also contain a pub-gruu and temp-gruu Contact header field parameters. These header field parameters convey the public and a temporary GRUU for the UA instance, respectively. A UA MUST be prepared for a Contact header field to contain just a pub-gruu, just a temp-gruu, neither, or both. The temporary GRUU will be valid for the duration of the registration (i.e., through refreshes), while the public GRUU persists across registrations. The UA will receive a new temporary GRUU in each successful REGISTER response, while the public GRUU will typically be the same. However, a UA must be prepared for the public GRUU to change from a previous one, since the persistence property is not guaranteed with complete certainty. If a UA changed its Call-ID in this REGISTER request compared with a previous REGISTER request for the same contact or reg-id, the UA MUST discard all temporary GRUUs learned through prior REGISTER responses. A UA may retain zero, one, some, or all of the temporary GRUUs that it is provided during the time over which at least one contact or reg-id remains continuously registered. If a UA stores any temporary GRUUs for use during its registration, it needs to be certain that the registration does not accidentally lapse due to clock skew between the UA and registrar.

Consequently, the UA must refresh its registration such that the REGISTER refresh transaction will either complete or time out before the expiration of the registration. For default transaction timers, this would be at least 32 seconds before expiration, assuming the registration expiration is larger than 64 seconds. If the registration expiration is less than 64 seconds, the UA should refresh its registration halfway before expiration. Note that when RFC 5626 (see Section 13.2) is in use, and the UA is utilizing multiple flows for purposes of redundancy, the temporary GRUUs remain valid as long as at least one flow is registered. Thus, even if the registration of one flow expires, the temporary GRUUs learned previously remain valid. In cases where registrars forcefully shorten registration intervals, the registration event package, RFC 3680 (see Section 5.3), is used by UAs to learn of these changes. A UA implementing both RFC 3680 and GRUU must also implement the extensions to RFC 3680 for conveying information on GRUU, as defined in RFC 5628, as these are necessary to keep the set of temporary GRUUs synchronized between the UA and the registrar. More generally, the utility of temporary GRUUs depends on the UA and registrar being in sync on the set of valid temporary GRUUs at any time. Without the support of RFC 3680 and its extension for GRUU, the client will remain in sync only as long as it always reregisters well before the registration expiration. Besides forceful de-registrations, other events (e.g., network outages, connection failures, and short refresh intervals) can lead to potential inconsistencies in the set of valid temporary GRUUs. For this reason, it is recommended that a UA that utilizes temporary GRUUs implement RFC 3680 and RFC 5628.
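
The UA-side rules of Sections 4.3.7.1 and 4.3.7.2 (refresh early enough for the REGISTER transaction to complete or time out, and discard learned temporary GRUUs when the Call-ID changes) can be sketched as follows. This is a minimal illustration under stated assumptions, not an implementation of RFC 5627: the names refresh_delay, GruuStore, and TRANSACTION_TIMEOUT are hypothetical, and keeping the previous public GRUU when a response carries none is a choice made only for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Default transaction timeout in seconds, as quoted in the text above.
TRANSACTION_TIMEOUT = 32


def refresh_delay(expires: int) -> float:
    """Seconds to wait after a successful REGISTER before refreshing it."""
    if expires > 2 * TRANSACTION_TIMEOUT:
        # Leave room for the refresh transaction to complete or time out.
        return expires - TRANSACTION_TIMEOUT
    # Short registrations: refresh halfway to expiration.
    return expires / 2


@dataclass
class GruuStore:
    """Hypothetical holder for the GRUUs a UA learns from 2xx responses."""
    call_id: Optional[str] = None
    pub_gruu: Optional[str] = None
    temp_gruus: List[str] = field(default_factory=list)

    def on_register_ok(self, call_id: str,
                       pub_gruu: Optional[str] = None,
                       temp_gruu: Optional[str] = None) -> None:
        # A changed Call-ID invalidates every temporary GRUU learned earlier.
        if self.call_id is not None and call_id != self.call_id:
            self.temp_gruus.clear()
        self.call_id = call_id
        # The response may carry a pub-gruu, a temp-gruu, neither, or both.
        if pub_gruu is not None:
            self.pub_gruu = pub_gruu  # may differ from the previous one
        if temp_gruu is not None:
            self.temp_gruus.append(temp_gruu)
```

With these definitions, a 3600-second registration would be refreshed after 3568 seconds and a 40-second registration after 20 seconds.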
A non-2xx response to the REGISTER request has no impact on any existing GRUUs previously provided to the UA. Specifically, if a previously successful REGISTER request provided the UA with a GRUU, a subsequent failed request does not remove, delete, or otherwise invalidate the GRUU. The user and host parts of the GRUU learned by the UA in the REGISTER response must be treated opaquely by the UA. That is, the UA must not modify them in any way. A UA must not modify or remove URI parameters it does not recognize. Furthermore, the UA must not add, remove, or modify URI parameters relevant for receipt and processing of the request at the proxy, including the transport, lr, maddr, ttl, user, and comp (see RFC 3486, Section 15.7) URI parameters. The other URI parameter defined in RFC 3261 (see Section 4.2), method, would not typically be present in a GRUU delivered from a registrar, and a UA may add a method URI parameter to the GRUU before handing it out to another entity. Similarly, the URI parameters defined in RFCs 4240 and 4458 (see Section 4.4) are meant for consumption by the UA. These would not be included in the GRUU returned by a registrar and may be added by a UA wishing to provide services associated with those URI parameters. Note, however, that should another UA dereference the GRUU, the parameters will be lost at the proxy when the Request-URI is translated into the registered contact, unless some other means is provided for the attributes to be delivered to the UA.

4.3.7.3 Constructing a Self-Made GRUU
Many UAs (e.g., gateways to the PSTN, conferencing servers, and media servers) do not perform registrations and cannot obtain GRUUs through that mechanism. These types of UAs can be publicly reachable. This would mean that the policy of the domain is that requests can come from anywhere on the public Internet and be delivered to the UA without requiring processing by intervening proxies within the domain. Furthermore, firewall and NAT policies administered by the domain would allow such requests into the network. When a UA is certain that these conditions are met, a UA may construct a self-made GRUU. Of course, a UA that does REGISTER, but for whom these conditions are met regardless, may also construct a self-made GRUU. However, usage of GRUUs obtained by the registrar is recommended instead.

A self-made GRUU is one whose domain part equals the IP address or host name of the UA. The user part of the SIP URI is chosen arbitrarily by the UA. Like all other GRUUs, the URI must contain the gr URI parameter, with or without a value, indicating it is a GRUU. If a UA does not register, but is not publicly reachable, it would need to obtain a GRUU through some other means. Typically, the UA would be configured with a GRUU, the GRUU would be configured into the proxy, and the proxy will be configured with a mapping from the GRUU to the IP address (or host name) and port of the UA.

4.3.7.4 Using One’s Own GRUUs
A UA should use a GRUU when populating the Contact header field of dialog-forming and target refresh requests and responses. In other words, a UA compliant to this specification should use one of its GRUUs as its remote target. This includes

◾◾ The INVITE request
◾◾ A 2xx or 18x response to an INVITE that contains a To tag
◾◾ The SUBSCRIBE request
◾◾ A 2xx response to a SUBSCRIBE which contains a To tag
◾◾ The NOTIFY request
◾◾ The REFER request
◾◾ A 2xx response to NOTIFY
◾◾ The UPDATE request
◾◾ A 2xx response to NOTIFY

The only reason not to use a GRUU would be privacy considerations. When using a GRUU obtained through registrations, a UA must have an active registration before using a GRUU, and must use a GRUU learned through that registration. It must not reuse a GRUU learned through a previous registration that has lapsed (in other words, one obtained when registering a contact that has expired). The UA may use either the public or one of its temporary GRUUs provided by its registrar. A UA must not use a temporary GRUU learned in a REGISTER response whose Call-ID differs from the one in the most recent REGISTER request generated by the UA for the same AOR and instance ID (and, if RFC 5626, Section 13.2, is in use, reg-id). When a UA wishes to construct an anonymous request as described in RFC 3323 (see Section 20.2), it should use a temporary GRUU. As per RFC 3261 (see Section 2.8.2), a UA should include a Supported header with the option tag gruu in requests and responses it generates.

4.3.7.4.1 Considerations for Multiple AORs
These considerations are described in Section 3.3.3.

4.3.7.5 Dereferencing a GRUU
A GRUU is identified by the presence of the gr URI parameter, and this URI parameter might or might not have a value. A UA that wishes to send a request to a URI that contains a GRUU knows that the request will be delivered to
a specific UA instance without further action on the part of the requestor. Some UAs implement nonstandard URI-handling mechanisms that compensate for the fact that heretofore many contact URIs have not been globally routable. Since any URI containing the gr URI parameter is known to be globally routable, a UA should not apply such mechanisms when a contact URI contains the gr URI parameter.

Because the instance ID is a callee capabilities parameter, a UA might be tempted to send a request to the AOR of a user, and include an Accept-Contact header field (see Section 2.8.2) that indicates a preference for routing the request to a UA with a specific instance ID. Although this would appear to have the same effect as sending a request to the GRUU, it does not. The caller preferences expressed in the Accept-Contact header field are just preferences. Their efficacy depends on a UA constructing an Accept-Contact header field that interacts with domain-processing logic for an AOR, to cause a request to route to a particular instance. Given the variability in routing logic in a domain (e.g., time-based routing to only selected contacts), this does not work for many domain-routing policies. However, this specification does not forbid a client from attempting such a request, as there can be cases where the desired operation truly is a preferential routing request.

4.3.7.6 Rendering GRUUs on a User Interface
When rendering a GRUU to a user through a user interface, it is recommended that the gr URI parameter be removed. For public GRUUs, this will produce the AOR, as desired. For temporary GRUUs, the resulting URI will be seemingly random. Future work might provide improved mechanisms that would allow an automaton to know that a URI is anonymized, and therefore inappropriate to render.

4.3.8 Registrar Behavior
4.3.8.1 Processing a REGISTER Request
A REGISTER request might contain a Require header field with the gruu option tag; this indicates that the registrar has to understand this extension to process the request. It does not require the registrar to create GRUUs, however. As the registrar is processing the contacts in the REGISTER request according to the procedures (see Section 3.3), the registrar checks whether each Contact header field in the REGISTER message contains a +sip.instance header field parameter. If present with a non-zero expiration, the contact is processed further based on the rules in the remainder of this section. Otherwise, the contact is processed based on normal RFC 3261 (see Section 3.3) rules.

Note that handling of a REGISTER request containing a Contact header field with value “*” and an expiration of zero still retains the meaning defined in RFC 3261 (see Section 3.3)—all contacts, not just those with a specific instance ID, are deleted. As described earlier, this removes the binding of each contact to the AOR and the binding of each contact to its GRUUs. If the contact URI is equivalent (based on URI equivalence in RFC 3261, Section 4.2.1) to the AOR, the registrar must reject the request with a 403 Forbidden, since this would cause a routing loop. If the contact URI is a GRUU for the AOR in the To header field of the REGISTER request, the registrar must reject the request with a 403 Forbidden, for the same reason.

If the contact is not a SIP URI, the REGISTER request must be rejected with a 403 Forbidden. Next, the registrar checks if there is already a valid public GRUU for the AOR (present in the To header field of the REGISTER request) and the instance ID (present as the content of the +sip.instance Contact header field parameter). If there is no valid public GRUU, the registrar should construct a public GRUU at this time according to the procedures described later. The public GRUU must be constructed by adding the gr URI parameter, with a value, to the AOR. If the contact contained a pub-gruu Contact header field parameter, the header field parameter must be ignored by the registrar. A UA cannot suggest or otherwise provide a public GRUU to the registrar.

Next, the registrar checks for any existing contacts registered to the same AOR, instance ID, and if the contact in the REGISTER request is registering a flow (RFC 5626, Section 13.2), reg-id. If there is at least one, the registrar finds the one that was most recently registered, and examines the Call-ID value associated with that registered contact. If it differs from the one in the REGISTER request, the registrar must invalidate all previously generated temporary GRUUs for the AOR and instance ID. A consequence of this invalidation is that requests addressed to those GRUUs will be rejected by the domain with a 404 from this point forward.

Next, the registrar should create a new temporary GRUU for the AOR and instance ID with the characteristics described later. The temporary GRUU construction algorithm must have the following two properties:

◾◾ The likelihood that the temporary GRUU is equal to another GRUU that the registrar has created MUST be vanishingly small.
◾◾ Given a pair of GRUUs, it must be computationally infeasible to determine whether they were issued for the same AOR or instance ID or for different AORs and instance IDs.

If the contact contained a temp-gruu Contact header field parameter, the header field parameter must be ignored by the registrar. A UA cannot suggest or otherwise provide a temporary GRUU to the registrar.
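
The registrar steps described in this section can be sketched in a few lines. The code below is illustrative only and rests on several assumptions: ToyRegistrar and its methods are hypothetical names; the public GRUU simply carries the instance ID as the gr value; and the encryption step of the example algorithm in Section 4.3.3 is omitted, so the 128-bit value E is random rather than an AES-ECB output and the registrar keeps an in-memory map from E to the AOR and instance ID. The user part uses the “tgruu.” prefix seen in the example Contact headers and unpadded base64, which gives 6 + 22 + 14 = 42 characters.

```python
import base64
import hashlib
import hmac
import secrets


def b64(data: bytes) -> str:
    """Base64 without '=' padding (16 bytes -> 22 chars, 10 bytes -> 14 chars)."""
    return base64.b64encode(data).decode("ascii").rstrip("=")


def unb64(text: str) -> bytes:
    """Decode unpadded base64 by restoring the padding first."""
    return base64.b64decode(text + "=" * (-len(text) % 4))


def mac80(ka: bytes, e: bytes) -> bytes:
    """HMAC-SHA256 truncated to its leading 80 bits."""
    return hmac.new(ka, e, hashlib.sha256).digest()[:10]


class ToyRegistrar:
    """Hypothetical in-memory registrar; not the algorithm of any product."""

    def __init__(self, domain: str):
        self.domain = domain
        self.ka = secrets.token_bytes(32)  # authentication key Ka
        self.last_call_id = {}             # (aor, instance) -> latest Call-ID
        self.temp = {}                     # E -> (aor, instance)

    def register(self, aor: str, instance: str, call_id: str):
        key = (aor, instance)
        if self.last_call_id.get(key) not in (None, call_id):
            # Call-ID changed: invalidate all temporary GRUUs for this pair.
            self.temp = {e: k for e, k in self.temp.items() if k != key}
        self.last_call_id[key] = call_id
        e = secrets.token_bytes(16)        # stand-in for AES-ECB-Encrypt(Ke, M)
        self.temp[e] = key
        pub_gruu = f"{aor};gr={instance}"
        user = "tgruu." + b64(e) + b64(mac80(self.ka, e))
        return pub_gruu, f"sip:{user}@{self.domain};gr"

    def resolve(self, temp_gruu: str):
        """Return (aor, instance) for a valid temporary GRUU, else None."""
        user = temp_gruu[len("sip:"):].split("@", 1)[0]
        if not user.startswith("tgruu.") or len(user) != 42:
            return None
        body = user[len("tgruu."):]
        try:
            e, a = unb64(body[:22]), unb64(body[22:])
        except ValueError:
            return None
        if not hmac.compare_digest(a, mac80(self.ka, e)):
            return None                    # MAC mismatch: reject as invalid
        return self.temp.get(e)            # None once invalidated
```

A refresh with the same Call-ID leaves earlier temporary GRUUs resolvable, while a refresh with a different Call-ID makes resolve return None for them, which corresponds to the 404 behavior described above.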
[Figure 4.3 artwork not reproduced. Labels recovered from the figure: panel (a) relates the AOR, the GRUUs, the Contacts, and instances 1 and 2; the call flow includes F1 REGISTER, F2 200 OK, F3 and F4 INVITE, F5 and F6 200 OK, F7 and F8 ACK, and F9 and F10 SUBSCRIBE.]
Figure 4.3 GRUU usage in a SIP network: (a) GRUU, AOR, Contact, and Instances; (b) SIP network; and (c) call flows using GRUU. (Copyright IETF. Reproduced with permission.)
does not mandate a particular mechanism for construction of the GRUU. Example algorithms for public and temporary GRUUs that work well are provided above. However, in addition to the properties described earlier, a GRUU constructed by a registrar must exhibit the following properties:

registrar. This package allows registrars to alter registrations forcefully (e.g., shortening them to force a reregistration). If a registrar is supporting RFC 3680 and GRUU, it must also support RFC 5628.
based on URI comparison (see Section 4.2.1.4), to a currently valid GRUU within the domain, it should be rejected with a 404 Not Found response; this is the same behavior a proxy would exhibit for any other URI within the domain that is not valid. If the Request-URI contains the gr URI parameter and is equivalent, based on URI comparison, to a GRUU that is currently valid within the domain, processing proceeds as it would for any other URI present in the location service, as defined in Section 3.11.5, except that the gr URI parameter is not removed as part of the canonicalization process. This is the case for both out-of-dialog requests targeted to the GRUU and mid-dialog requests targeted to the GRUU (in which case the incoming request would have a Route header field value containing the URI that the proxy used for record routing).

Note that the gr URI parameter is retained just for the purposes of finding the GRUU in the location service; if a match is found, the Request-URI will be rewritten with the registered contacts, replacing the GRUU and its gr URI parameter. The gr URI parameter is not carried forward into the rewritten Request-URI. If there are no registered contacts bound to the GRUU, the server must return a 480 Temporarily Unavailable response. If there are more than one, there are two cases:

◾◾ The client is using RFC 5626 (see Section 13.2) and registering multiple contacts for redundancy. In that case, these contacts contain reg-id Contact header field parameters, and the rules described in Section 13.2 for selecting a single registered contact apply.
◾◾ The client was not using RFC 5626, in which case there would only be multiple contacts with the same instance ID if the client had rebooted, restarted, and reregistered. In this case, these contacts would not contain the reg-id Contact header field parameter. The proxy MUST select the most recently refreshed contact. As with RFC 5626, if a request to this target fails with a 408 Request Timeout or 430 Flow Failed response, the proxy should retry with the next most recently refreshed contact.

Furthermore, if the request fails with any other response, the proxy must not retry on any other contacts for this instance. Any caller preferences in the request as defined in RFC 3841 (see Section 9.9) should be processed against the contacts bound to the GRUU. In essence, to select a registered contact, the GRUU is processed just like it was the AOR, but with only a subset of the contacts bound to the AOR.

Special considerations apply to the processing of any Path headers stored in the registration (see RFC 3327, Section 2.8.2). If the received request has Route header field values beyond the one pointing to the authoritative proxy itself (this will happen when the request is a mid-dialog request), the Path URI must be discarded. This is permitted by RFC 3327 as a matter of local policy; usage of GRUUs will require this policy in order to avoid call spirals and likely call failures.

A proxy may apply other processing to the request, such as execution of called party features, as it might do for requests targeted to an AOR. For requests that are outside of a dialog, it is recommended to apply screening types of functions, both automated (such as blacklist and whitelist screening) and interactive (such as interactive voice response) applications that confer with the user to determine whether to accept a call. In many cases, the new request is related to an existing dialog, and might be an attempt to join it (using the Join header field defined in RFC 3911, Section 2.8.2) or replace it (using the Replaces header field; RFC 3891, see Section 2.8.2). When the new request is related to an existing dialog, the UA will typically make its own authorization decisions; bypassing screening services at the authoritative proxy might make sense, but needs to be carefully considered by network designers, as the ability to do so depends on the specific type of screening service.

However, forwarding services, such as call forwarding, should not be provided for requests sent to a GRUU. The intent of the GRUU is to target a specific UA instance, and this is incompatible with forwarding operations. If the request is a mid-dialog request, a proxy should only apply services that are meaningful for mid-dialog requests, generally speaking. This excludes screening and forwarding functions. In addition, a request sent to a GRUU should not be redirected. In many instances, a GRUU is used by a UA in order to assist in the traversal of NATs and firewalls, and a redirection might prevent such a case from working.

4.3.9.2 Record-Routing
See Sections 9.5 and 9.6.

4.3.10 GRUU Example
We have considered a SIP network, shown in Figure 4.3b and c, that shows a basic registration and call setup, followed by a subscription directed to the GRUU. It then shows a failure of the callee, followed by a reregistration. The conventions of RFC 4475 are used to describe the representation of long message lines. The callee supports the GRUU extension. As such, its REGISTER (message F1) looks as follows:

REGISTER sip:example.com SIP/2.0
Via: SIP/2.0/UDP 192.0.2.1;branch=z9hG4bKnashds7
Max-Forwards: 70
From: Callee <sip:[email protected]>;tag=a73kszlfl
Via: SIP/2.0/UDP host.example.com;branch=z9hG4bK9zz8
From: Caller <sip:[email protected]>;tag=kkaz-
<allOneLine>
To: <sip:[email protected];
gr=urn:uuid:f8
1d4fae-7dec-11d0-a765-00a0c91e6bf6>
</allOneLine>
Call-ID: [email protected]
CSeq: 2 SUBSCRIBE
Supported: gruu
Event: dialog
Allow: INVITE, OPTIONS, CANCEL, BYE, ACK, NOTIFY
Contact: <sip:[email protected];gr=hdg7777ad7aflzig8sf7>
Content-Length: 0

The SUBSCRIBE generates a 200 response (message F11), which is followed by a NOTIFY (messages F13 and F14) and its response (messages F15 and F16). At some point after message F16 is received, the callee’s machine crashes and recovers. It obtains a new IP address, 192.0.2.2. Unaware that it had previously had an active registration, it creates a new one (message F17 below). Notice how the instance ID remains the same, as it persists across reboot cycles:

REGISTER sip:example.com SIP/2.0
Via: SIP/2.0/UDP 192.0.2.2;branch=z9hG4bKnasbba
Max-Forwards: 70
From: Callee <sip:[email protected]>;tag=ha8d777f0
Supported: gruu
To: Callee <sip:[email protected]>
Call-ID: [email protected]
CSeq: 1 REGISTER
<allOneLine>
Contact: <sip:[email protected]>
;+sip.instance="<urn:uuid:
f81d4fae-7dec-11d0-a765-00a0c91e6bf6>"
</allOneLine>
Content-Length: 0

The registrar notices that a different contact, sip:callee@192.0.2.1, is already associated with the same instance ID. It registers the new one too and returns both in the REGISTER response. Both have the same public GRUUs, but the registrar has generated a second temporary GRUU for this AOR and instance ID combination. Both contacts are included in the REGISTER response, and the temporary GRUU for each is the same—the most recently created one for the instance ID and AOR. The registrar then generates the following response:

SIP/2.0 200 OK
Via: SIP/2.0/UDP 192.0.2.2;branch=z9hG4bKnasbba
From: Callee <sip:[email protected]>;tag=ha8d777f0
To: Callee <sip:[email protected]>;tag=99f8f7
Call-ID: [email protected]
CSeq: 1 REGISTER
Supported: gruu
<allOneLine>
Contact: <sip:[email protected]>
;pub-gruu="sip:[email protected];
gr=urn:
uuid:f81d4fae-7dec-11d0-a765-00a0c9
1e6bf6"
;temp-gruu="sip:tgruu.7hatz6cn-098sh
fyq193=
[email protected];gr"
;+sip.instance="<urn:uuid:f81d
4fae-7dec-11d0-a765-00a0c91e6bf6>"
;expires=3600
</allOneLine>
<allOneLine>
Contact: <sip:[email protected]>
;pub-gruu="sip:[email protected];
gr=urn:
uuid:f81d4fae-7dec-11d0-a765-00a0c9
1e6bf6"
;temp-gruu="sip:tgruu.7hatz6cn-098sh
fyq193=
[email protected];gr"
;+sip.instance="<urn:uuid:f81d4
fae-7dec-11d0-a765-00a0c91e6bf6>"
;expires=400
</allOneLine>
Content-Length: 0

There is no need for the UA to remove the stale registered contact; the request targeting rules described earlier will cause the request to be delivered to the most recent one.

4.4 Services URI
SIP signaling schemes or URIs are independent of any specific services and are not standardized for expressing any services. However, there are some nonstandard attempts in the form of informational IETF (Internet Engineering Task Force) RFCs to specify the SIP Request-URIs to have some sort of context-aware knowing which SIP UAs or SIP servers can invoke some specific services as configured during service provisioning. It requires that the SIP servers or application servers need to be configured a priori for invoking
those services. The context-aware request-URIs can only be any arbitrary string to be provisioned, and map the string to
applied in a private SIP network in a single administrative the desired behavior. The private owner of the system may
domain and may even not be suitable for a large administra- choose to provision mnemonic strings, but the application
tive domain. should not require it. In any large installation, the system
owner is likely to have preexisting rules for mnemonic URIs,
and any attempt by an application to define its own rules
4.4.1 Messaging Services may create a conflict.
Real-time multimedia communications services, including In this example, a voice-mail system should allow an
multimedia messaging, are very complex because a huge arbitrary mix of URLs from these schemes, or any other
number of feature-rich multimedia capabilities/services can scheme that renders valid SIP URIs to be provisioned, rather
be provided through manipulating each individual media than enforce one particular scheme. The key limitations of
consisting of a composite multimedia application or a single these service URIs, such as for voice mail, are that they will
media of a given application. Some of those feature-rich mul- only work within a given administrative domain, and SIP
timedia services are already being offered or are emerging, UAs and servers need to be provisioned with these rules and
but many more innovative complex multimedia services that service configurations. Accordingly, some sample private ser-
are yet to imagine are left for the future. However, there are vice URIs have been defined per RFC 4240 in relation to the
some simple real-time or near-real-time multimedia services voice-mail services context shown in Table 4.2.
that have to deal with some immediate information at hand In addition to providing this set of URIs to the subscriber
that seems to be very important for the users for comple- (to use as one sees fit), an integrated service provider could
menting primary call services. add these to the set of contacts in a find-me proxy. The proxy
For example, a user may need to retrieve the voice mail could then route calls to the appropriate URI based on the
from the voice-mail server because the messages have been origin of the request, the subscriber’s preferences, and cur-
left in one’s voice mailbox while the user has not been avail- rent state. This simple example shows how a variety of voice-
able to take the call. In addition, the phone of the called mail services can be created by adding some services context
party needs to be able to store the voice message of the calling even in SIP Request-URI. The example call flows for differ-
party in the called party’s voice mailbox through transferring ent voice-mail services are shown in RFC 4240 in PSTN–IP
of the call. It is interesting to note that this simple voice-mail networking environments. The cause codes for retargeting
service can be offered with a variation of many features con- or history information of retargeting of multiple targets can-
sidering the overall communications such as announcement not be expressed or known in these specific context-aware
services by the voice-mail server, authentication services for request-URIs. A more general approach is needed for inclu-
the calling party for message retrieval, and call forwarding sion of causes for retargeting and information about multiple
services to the mail server with no answer or on busy. retargeting.
This simple voice-mail service shows that service context-
awareness is needed in providing services. The key is that SIP
4.4.1.2 RFC 4458: Generalized
is the protocol for the session establishment and cannot have
any awareness of services. However, if the SIP URI itself is
Context-Aware Approach
created in such a way that SIP devices interpret the URI as if Earlier, we have seen that the specific Request-URIs cre-
it is context-aware, it will offer services based on the context ated for retargeting the services cannot provide the causes
being embedded. to the caller why the services have been retargeted or no
information can be provided to the caller about the mul-
tiple retargeting of services. A more intelligent approach
4.4.1.1 Specific Context-Aware Approach
described in RFC 4458 that is described in this section can
The concept of context-awareness has been captured in a be taken for creation of services like voice mail, interac-
service implementing SIP as defined in RFC 3261, without tive voice recognition, and similar services using SIP URIs
modification, through the standard use of that protocol’s based on redirecting targets from these applications. Two
Request-URI. However, the concept is applicable to any key pieces of information are needed: first, the target appli-
SIP-based service where the initial application state is deter- cation (e.g., mailbox) to use, and second, the indication of
mined from context. This concept is a usage convention of the desired service (e.g., cause for transferring the call to the
standard SIP as defined in RFC 3261 and does not modify target application). The userinfo and hostport parts of the
or extend that protocol in any way. It is important to note, Request-URI will identify the voice-mail service, the target
in practical applications, that an application does not apply mailbox can be put in the target parameter, and the reason
semantic rules to the various URIs. Instead, it should allow can be put in the cause parameter. For example, if the proxy
wished to use Bob's mailbox because his phone was busy, the URI sent to the unified messaging (UM) system could be something like

sip:[email protected];
target=bill%40example.com;cause=408

◾◾ Target: Target is a URI parameter that indicates the address of the retargeting entity; in the context of UM, this can be the mailbox number. For example, in the case of a voice-mail system on the PSTN, the user portion will contain the phone number of the voice-mail system, while the target will contain the phone number of the subscriber's mailbox. The syntax for the target parameter using ABNF grammar is shown below (pvalue defined in RFC 3261; Section 2.4.1.2):

target-param = "target" EQUAL pvalue

◾◾ Cause: Cause is a URI parameter that is used to indicate the service that the UAS receiving the message should perform. The ABNF grammar for the cause parameter is as follows (Status-Code defined in RFC 3261):

cause-param = "cause" EQUAL Status-Code

The values for the cause URI parameter are defined in Table 4.3 (RFC 4458).

Table 4.3 Defined Values for Cause Parameter
Redirecting Reason               Value
Unknown/not available            404
No reply                         408
Unconditional                    302
Deflection during alerting       487
Deflection immediate response    480

Note that the ABNF requires some characters to be escaped if they occur in the value of the target parameter. For example, the "@" character needs to be escaped. These reason codes are chosen partly because of interworking between the IP network that uses SIP and the PSTN network that uses Integrated Services Digital Network User Part signaling. If no appropriate mapping to a cause value defined in this specification exists in a network, either IP or PSTN, it would be mapped to 302 Unconditional. If a new cause
parameter needs to be defined, this specification will have to be updated. The user portion of the URI should be used as the address of the voice-mail system on the PSTN, while the target should be mapped to the original redirecting number on the PSTN side. The redirection counters should be set to one unless additional information is available. Because multiple retargeting changes the caller's original Request-URI, it is not possible to know anything about all those targets from the Request-URI alone. The History-Info header field can be used to accumulate this history as the request progresses; upon reaching the UAS, it can be returned in certain responses.

All the messaging services described in the case of the specific context-aware Request-URIs can also be created using this more generalized approach. In addition, this approach can provide the specific reason for retargeting, including the history of all targets through which the call has been retargeted. The limitation of this generalized context-aware Request-URI is that the service needs to understand whether the messaging system it is targeting supports the syntax defined in this specification. Today, this information is provided to the proxy by configuration. This implies that this approach is also unlikely to work in cases in which the proxy is not configured with information about the messaging system or in which the messaging system is not in the same administrative domain. Thereby, this scheme is not suitable for implementation in a large system because a proxy will not be able to do anything when the same target is being tried multiple times.

4.4.2 Media Services
The multifunction media server needs to provide some basic media services in the SIP network, such as playing announcements, prompting users and collecting information from them, and audio, video, or data bridging/mixing. These media services are a part of the application layer function and are offered with application server protocols and markup languages like voice extensible markup language (VoiceXML) and media server control markup language (MSCML). However, these media services are offered to the users after sessions have been established between the users and the multifunction media server using SIP. The key is how to invoke media services when SIP Request-URIs are sent to the media server for session establishment between the media server and users. RFC 3087 describes how a client or a proxy can communicate context through the use of a distinctive Request-URI in SIP and provides examples of how this mechanism could be used in a voice-mail application, with reference to RFC 2543, which was obsoleted by RFC 3261. Similar to RFC 3087, RFC 4240 defines some generalized media service context-aware SIP Request-URIs for announcements to users by playing media, for prompting and collecting services in which the media server prompts users and collects information from them, and for media bridging/mixing (audio, video, or data) services during multipoint conferencing. The user part of each SIP Request-URI for media services has been created with the following properties:

◾◾ No change is made to core SIP.
◾◾ Only devices that choose to conform to this private standard (IETF Informational RFC 4240) have to implement it. Thereby, interoperability among these implementations can be facilitated for the media services.
◾◾ Media service SIP Request-URIs only apply to multifunction SIP-controlled media servers.
◾◾ Non-multifunction SIP-controlled media servers are not affected.
◾◾ SIP devices other than media servers are not impacted.

4.4.2.1 Announcement Services
The announcement services deliver multimedia content, such as a prompt file or an audio and/or video prompt, from a media server to a terminal or to a group of terminals. In SIP, a simple way of doing this is to set up a call between the media server and the terminal by sending an INVITE message, after which the announcement services are offered by the media server. The key of RFC 4240 is to create the Request-URI of the INVITE message in such a way that the media server is directed to provide the specific announcement service. That is, the announcement applications of the media server offer services just by knowing the session-level information provided in Request-URIs. In this case, the Request-URI fully describes the announcement service through the use of the user part of the address and additional URI parameters. The user portion of the address, annc, specifies the announcement service on the media server. The service has several associated URI parameters that control the content and delivery of the announcement. For example, the form of the SIP Request-URI for announcement services can be as follows:

sip:[email protected]; \
play=http://audio.example.net/allcircuitsbusy.g711
sip:[email protected]; \
play=file://fileserver.example.net//geminii/yourHoroscope.wav

The URI parameters that control the content and delivery of the announcement are described below:

◾◾ Play: Specifies the resource or announcement sequence to be played.
◾◾ Repeat: Specifies how many times the media server should repeat the announcement or sequence named by the play= parameter. The value forever means the repeat should be effectively unbounded. In this case, it is recommended that the media server implement some local policy, such as limiting what forever means, to ensure errant clients do not create a denial of service attack.
◾◾ Delay: Specifies a delay interval between announcement repetitions. The delay is measured in milliseconds.
◾◾ Duration: Specifies the maximum duration of the announcement. The media server will discontinue the announcement and end the call if the maximum duration has been reached. The duration is measured in milliseconds.
◾◾ Locale: Specifies the language and optionally country variant of the announcement sequence named in the play= parameter. RFC 4646 specifies the locale tag. The locale tag is usually a two- or three-letter code per ISO 639-1. The country variant is also often a two-letter code per ISO 3166-1. These elements are concatenated with a single underscore (%x5F) character, such as en_CA. If only the language is specified, such as locale=en, the choice of country variant is an implementation matter. Implementations should provide the best possible match between the requested locale and the available languages in the event the media server cannot honor the locale request precisely. For example, if the request has locale=ca_FR, but the media server only has fr_FR available, the media server should use the fr_FR variant. Implementations should provide a default locale to use if no language variants are available.
◾◾ Param[n]: Provides a mechanism for passing values that are to be substituted into an announcement sequence. Up to nine parameters (param1= through param9=) may be specified. The mechanics of announcement sequences are beyond the scope of this document.
◾◾ Extension: Provides a mechanism for extending the parameter set. If the media server receives an extension it does not understand, it must silently ignore the extension parameter and value.

The play= parameter is mandatory and must be present. All other parameters are optional. Some encodings are not self-describing. Thus, the implementation relies on filename extension conventions for determining the media type. It should be noted that RFC 3261 implies that proxies are supposed to pass parameters through unchanged. However, be aware that nonconforming proxies may strip Request-URI parameters. In this case, the implementation needs to compensate, for example, by ensuring that the proxy inserting the parameters is the last proxy before the media server.

4.4.2.1.1 Formal Syntax
The following syntax specification uses the ABNF as described in RFC 4234:

ANNC-URL = sip-ind annc-ind "@" hostport annc-parameters uri-parameters
sip-ind = "sip:" / "sips:"
annc-ind = "annc"
annc-parameters = ";" play-param [";" content-param]
                  [";" delay-param]
                  [";" duration-param]
                  [";" repeat-param]
                  [";" locale-param]
                  [";" variable-params]
                  [";" extension-params]
play-param = "play=" prompt-url
content-param = "content-type=" MIME-type
delay-param = "delay=" delay-value
delay-value = 1*DIGIT
duration-param = "duration=" duration-value
duration-value = 1*DIGIT
repeat-param = "repeat=" repeat-value
repeat-value = 1*DIGIT / "forever"
locale-param = "locale=" token
               ; per RFC 3066, usually
               ; ISO639-1_ISO3166-1
               ; e.g., en, en_US, en_UK, etc.
variable-params = param-name "=" variable-value
param-name = "param" DIGIT ; e.g., "param1"
variable-value = 1*(ALPHA / DIGIT)
extension-params = extension-param [";" extension-params]
extension-param = token "=" token

uri-parameters is the SIP Request-URI parameter list as described in RFC 3261 (see Section 2.8.2). All parameters of the Request-URI are part of the URI matching algorithm. The MIME-type is the MIME content type for the announcement, such as audio/basic, audio/G729, audio/mpeg, video/mpeg, and so on. A number of MIME registrations that could be used here have parameters, for instance, video/DV. To accommodate this, and to retain compatibility with the SIP URI structure, the MIME-type parameter separator (semicolon, %3b) and value separator (equal, %3d) must be escaped. For example:

sip:[email protected]; \
play=file://fs.example.net//clips/my-intro.dvi; \
content-type=video/mpeg%3bencode%3d314M-25/625-50
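The construction and escaping rules above can be illustrated with a small Python sketch. It is not taken from RFC 4240: the helper names build_annc_uri and parse_annc_uri and the host ms.example.net are hypothetical, only the optional parameters listed in the text are handled (param1 through param9 and extension parameters are omitted), and the parser assumes that no parameter value contains an unescaped semicolon.

```python
from urllib.parse import quote, unquote

# Optional annc parameters named in the text; param1..param9 and
# extension parameters are left out of this sketch.
OPTIONAL_PARAMS = ("content-type", "delay", "duration", "repeat", "locale")


def build_annc_uri(host, play, **options):
    """Assemble sip:annc@host;play=...[;name=value...] (hypothetical helper)."""
    if not play:
        raise ValueError("play= is mandatory")
    parts = [f"sip:annc@{host}", f"play={play}"]
    for name in OPTIONAL_PARAMS:
        key = name.replace("-", "_")  # content-type is passed as content_type
        if key in options:
            # Percent-escape ';' and '=' so a MIME parameter cannot be
            # mistaken for a URI parameter separator.
            parts.append(f"{name}={quote(str(options[key]), safe='/')}")
    return ";".join(parts)


def parse_annc_uri(uri):
    """Return (host, params); assumes no raw ';' inside any parameter value."""
    head, *raw_params = uri.split(";")
    scheme, _, rest = head.partition(":")
    user, _, host = rest.partition("@")
    if scheme not in ("sip", "sips") or user != "annc":
        raise ValueError("not an announcement Request-URI")
    params = {}
    for item in raw_params:
        name, _, value = item.partition("=")
        params[name] = unquote(value)
    if "play" not in params:
        raise ValueError("play= is mandatory")
    return host, params


uri = build_annc_uri(
    "ms.example.net",
    "file://fs.example.net//clips/my-intro.dvi",
    content_type="video/mpeg;encode=314M-25/625-50",
)
print(uri)
print(parse_annc_uri(uri))
```

Python's quote() emits uppercase hex digits (%3B, %3D); percent-encoding is case-insensitive, so this is equivalent to the lowercase form used in the example above.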
The locale-value consists of a tag as specified in RFC 4646. The definition of hostport is as specified by RFC 3261. The syntax of prompt-url consists of a URL scheme as specified by RFC 3986 or a special token indicating a provisioned announcement sequence. For example, the URL scheme may include any of the following: http/https, ftp, file (referencing a local or Network File System (NFS) (RFC 3530) object), and NFS URL (RFC 2224). If a provisioned announcement sequence is to be played, the value of prompt-url will have the following form:

prompt-url = "/provisioned/" announcement-id
announcement-id = 1*(ALPHA / DIGIT)

Note that the scheme /provisioned/ was chosen because of a hesitation to register a provisioned: URI scheme. RFC 4240 is strictly focused on the SIP interface for the announcement service and, as such, does not detail how announcement sequences are provisioned or defined. Note that the media type of the object the prompt-url refers to can be almost anything, including audio file formats, text file formats, or URI lists.

4.4.2.2 Prompting and Collection Services
The prompt and collection services use voice communications and are also known as voice dialogs. A voice dialog establishes an aural dialog with the user. The dialog service follows the model of the announcement service. However, the service indicator is dialog. The dialog service takes a parameter, voicexml=, indicating the URI of the VoiceXML script to execute, and the Request-URI may look as follows:

sip:[email protected]; \
voicexml=http://vxmlserver.example.net/cgi-bin/script.vxml

4.4.2.2.1 Syntax for Prompt and Collect Services
The following syntax specification uses the ABNF as described in RFC 4234.

DIALOG-URL = sip-ind dialog-ind "@" hostport dialog-parameters
sip-ind = "sip:" / "sips:"
dialog-ind = "dialog"
dialog-parameters = ";" dialog-param [vxml-parameters] [uri-parameters]
dialog-param = "voicexml=" vxml-url
vxml-parameters = vxml-param [vxml-parameters]
vxml-param = ";" vxml-keyword "=" vxml-value
vxml-keyword = token
vxml-value = token

The vxml-url is the URI of the VoiceXML script. If present, other parameters are passed to the VoiceXML interpreter session as the assigned vxml-keyword vxml-value pairs. Note that all vxml-keywords must have values. If there is a vxml-keyword without a corresponding vxml-value, the media server must reject the request with a 400 Bad Request response code. In addition, the media server must state Missing VXML Value in the reason phrase. The media server presents the parameters as environment variables in the connection object. Specifically, the parameter appears in the connection.sip tree. If the media server does not support the passing of keyword-value pairs to the VoiceXML interpreter session, it must ignore the parameters. uri-parameters is the SIP Request-URI parameter list as described in RFC 3261. All parameters in the parameter list, whether they come from uri-parameters or from vxml-keywords, are part of the URI matching algorithm.
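The keyword-value rule for the dialog service can be sketched in Python as follows. This is an illustrative check only, not a conforming parser: check_dialog_uri is a hypothetical name, every parameter after the user part is treated as a vxml-keyword (real uri-parameters such as lr legitimately carry no value), and answering a missing voicexml= parameter with a plain 400 is an assumption rather than something the text states.

```python
from urllib.parse import unquote


def check_dialog_uri(uri):
    """Return (status, reason, params) for a dialog Request-URI (hypothetical)."""
    head, *raw_params = uri.split(";")
    if not head.startswith(("sip:dialog@", "sips:dialog@")):
        raise ValueError("not a dialog Request-URI")
    params = {}
    for item in raw_params:
        keyword, sep, value = item.partition("=")
        if not sep or not value:
            # Every vxml-keyword must carry a value.
            return 400, "Missing VXML Value", {}
        params[keyword] = unquote(value)
    if "voicexml" not in params:
        # Assumption: a missing mandatory voicexml= is also answered with 400.
        return 400, "Bad Request", {}
    return 200, "OK", params


base = "sip:dialog@mediaserver.example.net"
script = "voicexml=http://vxmlserver.example.net/cgi-bin/script.vxml"
print(check_dialog_uri(f"{base};{script};account=1234"))
print(check_dialog_uri(f"{base};{script};account"))
```

The first call carries a complete keyword-value pair; the second carries the keyword account with no value and is rejected with the Missing VXML Value reason phrase.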
Request-URI for media mixing of the multimedia conference participants is

sip:conf=uniqueIdentifier@mediaserver.example.net

server is the Conference Focus. Note that the conference application server is a server to the conference participants (i.e., UACs). However, the conference application server is a client for mixing services to the media server.
video, or data applications. A caller may place a call to another user, but the called party may be too busy to accept the call. If the busy state of the called party were known to the caller through some kind of notification, the call would not even be tried. In another situation, if the availability of a called party at certain times is known in advance through a notification, the caller might try to place the call in those time intervals. With SIP being the call control signaling protocol for these highly complex networked multimedia communications, there must be some built-in mechanisms for learning these event states through SIP functional entities such as SIP UAs.

For realizing the actual event delivery functionality, RFC 6665 introduces two new SIP methods, namely SUBSCRIBE and NOTIFY. A key requirement of the SIP event notification framework is the publication of the event states, and the PUBLISH method defined in RFC 3903 is used for this purpose. A subscriber subscribes to SIP resources to learn a particular event's state, and receives that state in the initial notification and in all subsequent notifications related to the subscription. It is envisioned that the so-called SIP event server hosts the state information of the particular event. SIP functional entities route the subscription from SIP UAs to the SIP event server and route the notification back to the respective SIP UAs.

5.2.2 Subscription, Notification, and Publication Event Model
5.2.2.1 SUBSCRIBE and NOTIFY
Figure 5.1 depicts an overview of the general concept for the operation of a SIP event based on subscription and notification. Note that the subscription must be confirmed before any notification can be expected. A notification is only sent when the state changes. The reception of the notification also needs to be acknowledged. Of course, for an initial subscription, a notification is due immediately because no state information has yet been sent to the subscriber.

[Figure 5.1: message sequence between SUBSCRIBER and NOTIFIER; recovered label: F2. 200 OK (acknowledge subscription).]
Figure 5.1 Typical operation of event subscription and notification. (Copyright IETF. Reproduced with permission.)

A more detailed logical implementation example of the SIP event framework over the SIP network is depicted in Figure 5.2. Here we show that a SIP event server, which hosts the state information of events of the resources of SIP functional entities, is placed behind a SIP proxy.

[Figure 5.2: recovered labels: SIP proxy 1; Proxy query; Registration; SIP UA 1; SIP UA 2; SUBSCRIBE and NOTIFY (states of UA 2; from UA 1 to notifier UA 2).]
Figure 5.2 Logical view of SUBSCRIBE/NOTIFY event model implementation. (Copyright IETF. Reproduced with permission.)

This logical SUBSCRIBE/NOTIFY event implementation model considers a separate SIP event server for providing scalability in a large-scale SIP network. The key is that the SIP event server acts on behalf of the SIP UAs in keeping track of any changes in the states of SIP resources. SIP has elaborate standards for creating, refreshing, removing, confirming, receiving and processing, and forking subscriptions and notifications, along with maintaining detailed security. For each resource state of a UA, there may be a huge number of subscribers. In this situation, an individual SIP UA may not be efficient enough to send notifications of the updated states to all its subscribers individually. Therefore, a separate SIP event server will be a natural choice. In addition, there can be many SIP proxies in a given administrative domain. Many separate administrative domains may also be involved where SIP UAs may be located. As a result, many more SIP event servers may be needed across single or multiple administrative domains. The details are beyond the scope of the current discussion.

5.2.2.2 PUBLISH
The PUBLISH method allows a user to create, modify, and remove state in another entity that manages this state on behalf of the user, who may have a single or multiple publication UAs/endpoints. The Request-URI (Request Uniform Resource Identifier) of a PUBLISH request is populated with the address of the resource for which the user wishes to publish the event state. The event state compositor, in turn, generates the composite event state of the resource from the individual states published by each user. In addition to a particular resource, all published event states are associated with a specific event package. The user is able to discover the composite event state of all of the active publications through a subscription to that particular event package as described in the SUBSCRIBE method. Moreover, RFC 3903 specifies that a UAC that publishes an event state is labeled an Event Publication Agent (EPA). For presence, this is the familiar presence user agent (PUA) role as defined in RFC 3856. The entity that processes the PUBLISH request is known as an event state compositor (ESC). For presence, this is the familiar presence agent (PA) role as defined in RFC 3856.

Unlike SUBSCRIBE, a PUBLISH request does not establish a dialog, as specified in RFC 3903. A UA client may include a Route header field in a PUBLISH request based on a preexisting route. The Record-Route header field has no meaning in PUBLISH requests or responses, and must be ignored if present. In particular, the UAC must not create a new route set based on the presence or absence of a Record-Route header field in any response to a PUBLISH request. Similar to the subscription, a SIP EPA can publish its states to a SIP event state compositor, which can also act as a notifier and distribute the EPA's states to all its subscribers. The general operation of state publication, subscription, and notification is shown in Figure 5.3.

[Figure 5.3: message sequence among Event publication agent (EPA), Event state compositor (ESC)/notifier, and Subscriber; recovered labels: F2. 200 OK (acknowledge subscription); F4. 200 OK (acknowledge subscription).]
Figure 5.3 Typical operation of event publication, subscription, and notification. (Copyright IETF. Reproduced with permission.)

Figure 5.4 shows an example of a logical view of the PUBLISH event implementation model. As before, we show a separate SIP event server that acts as an ESC for the PUBLISH method.

[Figure 5.4: recovered labels: SIP event server / event state compositor (ESC) with SUBSCRIBE(), NOTIFY(), PUBLISH(); subscription, notification, and publication state queries/lookup; DNS lookup; SIP proxy 1; Proxy; Update location server query; Registration; SUBSCRIBE; PUBLISH; SIP UA 1 … SIP UA k; SIP UA 1 … SIP UA q.]
Figure 5.4 Logical view of PUBLISH event model implementation. (Copyright IETF. Reproduced with permission.)

The key point of this implementation example of the SIP event framework is that a separate SIP event server is more scalable for implementing subscription, publication, and notification over the SIP network.

5.2.2.3 SIP Extensions
The SIP event framework and packages define the mechanisms by which SIP nodes can request notification from remote nodes indicating that certain events have occurred. The event notification framework needs some additional capabilities, for which the base SIP specified in RFC 3261 has been extended in the following respects:

◾◾ Methods: SUBSCRIBE, NOTIFY, and PUBLISH
◾◾ Header fields: Event, Allow-Events, Subscription-State, and Suppress-If-Match
◾◾ Response codes: 202 Accepted, 204 No Notification, and 489 Bad Event
◾◾ Event header field parameters: max-rate, min-rate, and adaptive-min-rate
◾◾ Subscription-State header field reason codes (event-reason-value): deactivated, probation, rejected, timeout, giveup, noresource, invariant, and badfilter
◾◾ Option tag: eventlist
◾◾ Event notification filtering
◾◾ State aggregation

5.3 Event Package
The SUBSCRIBE, NOTIFY, and PUBLISH methods are used by SIP event packages as a mechanism for publication of the respective event packages. In accordance with the SIP event framework, a notification needs to be sent to the subscriber when there are changes to the states of the resources that have been subscribed to. There can be many changes in the states of the resource(s), and a subscriber may be interested only in a certain kind of state change of the resource(s) rather than in receiving all the state changes. Thus, the event notification service will be efficient if there are defined rules for filtering the state changes of the resources and the subscriber can subscribe to those event changes described by specific filtering rules.

Thereby, the filtering mechanism can be used for controlling the content of event notifications, and RFC 4660 provides a mechanism for filtering whereby a subscriber describes its preference for when notifications are to be sent to it and what they are to contain. RFC 4661 defines a preferred format in the form of an XML document that is used to enable the subscriber to describe the state changes of a resource that cause notifications to be sent to it and what those notifications are to contain. The filter mechanisms and the document format can be used for many SIP event packages. Note that subscriptions expire and must be refreshed by subsequent SUBSCRIBE requests before the expiration time. Detailed discussions on all SIP event packages are beyond the scope of this chapter.

PROBLEMS
1. Why is the event framework needed in SIP? Explain the functionalities of SUBSCRIBE, NOTIFY, and PUBLISH.
2. Describe the detailed header fields of each message of each call flow of Figures 5.1 and 5.3 along with their message bodies with reasonable explanations for a given scenario of your choice.
3. How do the SUBSCRIBE, NOTIFY, and PUBLISH methods differ from other SIP methods?
4. Develop the detailed call flows of Figures 5.2 and 5.4 using all relevant messages including the SIP event server. Explain why a separate SIP event server is scalable for the large-scale SIP network.
5. Extend the call flows of Figures 5.2 and 5.4 considering two separate administrative domains where the interdomain communications are done only via SIP proxies while each domain has a separate SIP event server.
6. What constitutes the SIP event package? Is presence protocol a new protocol? Justify your answer.
Chapter 6
Figure 6.1 Publication, subscription, and notifications of presence/presentity for reachability in real time.
information to interested parties, called watchers. A presence protocol is a protocol for providing a presence service over the Internet or any Internet Protocol network. RFC 3856 extends SIP for providing the presence service, while RFC 3857 defines a SIP event package for watcher information.

6.2.2 SIP Extensions for Presence
RFC 3856 extends SIP by defining new functionalities, especially for the SIP UA. The new functionalities of SIP for providing presence services are defined as follows:

◾◾ Presence: It is defined as the willingness and ability of a user to communicate with other users on the network. Publications, subscriptions, and notifications of presence are supported by defining an event package within the general SIP event notification framework. There can be rules about how and what part of presence information can be accessed. The detailed information includes location, preferred communication mode, current mood, and activity.
◾◾ Presentity: It represents a user or a group of users or programs that are the source of presence information.
◾◾ Presence user agent (PUA): A PUA manipulates presence information for a presentity. This manipulation can be the side effect of some other action (such as sending a SIP REGISTER request to add a new Contact) or can be done explicitly through the publication of presence documents. Multiple PUAs per presentity are explicitly allowed. This means that a user can have many devices, such as a cell phone and PDA, each of which is independently generating a component of the overall presence information for a presentity. PUAs push data into the presence system, but are outside of it, in that they do not receive SUBSCRIBE messages or send NOTIFY messages.
◾◾ Presence agent (PA): A PA is a SIP user agent that is capable of receiving SUBSCRIBE requests, responding to them, and generating notifications of changes in the presence state. A PA must have knowledge of the presence state of a presentity. This means that it must have access to presence data manipulated by PUAs for the presentity. One way to do this is by colocating the PA with the proxy/registrar. Another way is to colocate it with the PUA of the presentity. However, these are not the only ways, and this specification makes no recommendations about
where the PA function should be located. A PA is always addressable with a SIP Uniform Resource Identifier (URI) that uniquely identifies the presentity (i.e., sip:[email protected]). There can be multiple PAs for a particular presentity, each of which handles some subset of the total subscriptions currently active for the presentity. A PA is also a notifier defined in RFC 3265 that supports the presence event package.
◾◾ Presence server: A presence server is a physical entity that can act as either a PA or as a proxy server for SUBSCRIBE requests. When acting as a PA, it is aware of the presence information of the presentity through some protocol means. When acting as a proxy, the SUBSCRIBE requests are proxied to another entity that may act as a PA.
◾◾ Edge presence server: An edge presence server is a PA that is colocated with a PUA. It is aware of the presence information of the presentity because it is colocated with the entity that manipulates this presence information.
◾◾ Presence URI: A presentity is identified in the most general way through a presence URI (RFC 3859), which is of the form pres:user@domain. These URIs are resolved to protocol-specific URIs, such as the SIP or secure SIP (SIPS) URI, through domain-specific mapping policies maintained on a server. It is very possible that a user will have both a SIP (or SIPS) URI and a presence URI to identify both themselves and other users. This leads to questions about how these URIs relate and which are to be used.
◾◾ Watcher: This represents (RFC 3857) the requester of presence information about a presentity. Watcher information refers to the set of users subscribed to a particular resource within a particular event package. Watcher information changes dynamically as users subscribe, unsubscribe, are approved, or are rejected. A user can subscribe to this information, and therefore

designed so that much of it can be derived automatically, for example, from calendar files or user activity. The rate of presence notification, the type of composition policies that need to be supported, the size of the watcher filter sent by the watcher in the SUBSCRIBE message, and the partial notification used to conserve bandwidth by sending only the changes in the presence document to the watchers are among the things that affect the scalability of the presence implementation significantly. The detailed discussions are beyond the scope of this chapter.

6.2.4 Presence Operations
Figure 6.2 depicts a simple example (RFC 3856) of how the presence server can be responsible for sending notifications for a presentity. This flow assumes that the watcher has previously been authorized to subscribe to this resource at the server. In this flow, the PUA informs the server about the updated presence information through some non-SIP means. When the value of the Content-Length header field is “...,” this means that the value should be whatever the computed length of the body is.
Each of the message details is provided below:

F1 SUBSCRIBE watcher -> example.com server

SUBSCRIBE sip:[email protected] SIP/2.0
Via: SIP/2.0/TCP watcherhost.example.com;branch=z9hG4bKnashds7

[Figure 6.2 flow diagram; labels recovered: Watcher, Presence server, Presence user agent (PUA), F1 SUBSCRIBE, F2 200 OK]
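The F1 SUBSCRIBE above survives only in part, so the following Python sketch shows one way such a presence subscription could be assembled. It is a minimal illustration, not the flow from RFC 3856: the function name `build_presence_subscribe`, the URIs, the tag, and the Call-ID are hypothetical, and only the `Event: presence` and `Accept: application/pidf+xml` header values and the branch parameter are taken from the specifications and the text. As the text notes for the "..." placeholder, Content-Length is computed from the body.

```python
def build_presence_subscribe(resource_uri: str, watcher_uri: str, via_host: str,
                             branch: str, call_id: str, from_tag: str,
                             cseq: int = 1, expires: int = 600,
                             body: bytes = b"") -> bytes:
    """Assemble a SIP SUBSCRIBE for the presence event package.

    All argument values are supplied by the caller; nothing here is
    taken from a real deployment.
    """
    lines = [
        f"SUBSCRIBE {resource_uri} SIP/2.0",
        f"Via: SIP/2.0/TCP {via_host};branch={branch}",
        f"To: <{resource_uri}>",
        f"From: <{watcher_uri}>;tag={from_tag}",
        f"Call-ID: {call_id}",
        f"CSeq: {cseq} SUBSCRIBE",
        "Max-Forwards: 70",
        "Event: presence",                # presence event package
        "Accept: application/pidf+xml",   # presence document format
        f"Contact: <{watcher_uri}>",
        f"Expires: {expires}",
        # Content-Length is whatever the computed length of the body is.
        f"Content-Length: {len(body)}",
    ]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("utf-8") + body


if __name__ == "__main__":
    # Hypothetical watcher and resource, for illustration only.
    message = build_presence_subscribe(
        resource_uri="sip:resource@example.com",
        watcher_uri="sip:watcher@example.com",
        via_host="watcherhost.example.com",
        branch="z9hG4bKnashds7",
        call_id="hypothetical-call-id@watcherhost.example.com",
        from_tag="hypothetical-tag",
    )
    print(message.decode("utf-8"))
```

A SUBSCRIBE normally carries no body, so the sketch defaults to an empty one and a Content-Length of 0; a watcher filter, if used, would be passed as `body`.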
F2. (SIP) 200 OK
F3. (SIP) ACK

F2. Bob listens on port 8888, and sends the following response:

Bob->Alice (SIP): 200 OK
F7. Alice->Bob (MSRP):

MSRP dkei38sd 200 OK
To-Path: msrp://bob.example.com:8888/9di4eae923wzd;tcp
From-Path: msrp://alicepc.example.com:7777/iau39soe2843z;tcp
-------dkei38sd$

F8. Alice->Bob (SIP): BYE sip:[email protected]

Alice invalidates local session state.

F9. Bob invalidates local state for the session.

Bob->Alice (SIP): 200 OK

6.3.4 Multiparty Session Mode
The MSRP protocol is being extended (IETF draft-ietf-simple-chat-18, January 2013) for providing multiparty IM conferencing. Participants in a chat room can be identified by a pseudonym, and decide if their real identifier is disclosed to other participants. It uses the SIP conferencing framework (RFC 4353) as a design basis. It also aims to be compatible with the centralized conferencing framework (RFC 5239). Before a chat room can be entered, it must be created. Users wishing to host a chat room themselves can of course do just that; their UA simply morphs from an ordinary UA into a special-purpose one called a focus UA. Another commonly used setup is one where a dedicated node in the network functions as a focus UA. Each chat room has an identifier of its own: a SIP URI that participants use to join the chat room, for example, by sending an INVITE request to it. The conference focus processes the invitations, and as such, maintains SIP dialogs with each participant. In a multiparty chat, or chat room, MSRP is one of the established media streams. Each chat-room participant establishes an MSRP session with the MSRP switch, which is a special-purpose MSRP application. The MSRP sessions can be relayed by one or more MSRP relays, which are specified in RFC 4976 (Figure 6.5).
The MSRP switch is similar to a conference mixer in that it handles media sessions with each of the participants and bridges these streams together. However, unlike a conference mixer, the MSRP switch merely forwards messages between participants but does not actually mix the streams in any way. The system is illustrated in Figure 6.6.
Typically, chat-room participants also subscribe to a conference event package to gather information about the conference roster in the form of conference state notifications. For example, participants can learn about other participants’ identifiers, including their nicknames. All messages in the chat room use the Message/CPIM wrapper content type
Figure 6.5 Multiparty chat overview shown with MSRP relays and a conference focus UA. (Copyright IETF. Reproduced
with permission.)
and the MSRP switch. Notice that these two MSRP sessions can still be multiplexed over the same TCP connection as per regular MSRP procedures. However, each chat room is associated with a unique MSRP session and a unique SIP dialog. A detailed discussion about multiparty chat conferencing is out of scope of this chapter.

[Figure 6.6 diagram; labels recovered: several MSRP clients connected to a central MSRP switch]
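To make the framing of the MSRP response in step F7 concrete, the following Python sketch splits such a response into its status line, headers, and end-line. It is a minimal illustration under stated assumptions: the function name `parse_msrp_response` and the returned dictionary keys are hypothetical, only bodiless responses like the one in F7 are handled, and the end-line is assumed to be seven hyphens, the transaction identifier, and a one-character continuation flag.

```python
def parse_msrp_response(raw: str) -> dict:
    """Parse a bodiless MSRP response such as the 200 OK in step F7."""
    lines = [ln for ln in raw.replace("\r\n", "\n").split("\n") if ln]
    if len(lines) < 2:
        raise ValueError("need at least a status line and an end-line")

    # Status line: "MSRP <transaction-id> <status-code> [comment]"
    parts = lines[0].split(" ", 3)
    if len(parts) < 3 or parts[0] != "MSRP":
        raise ValueError("not an MSRP response status line")
    transaction_id = parts[1]
    status = int(parts[2])
    comment = parts[3] if len(parts) > 3 else ""

    # End-line: seven hyphens, the same transaction id, then a flag.
    end_line = lines[-1]
    prefix = "-------" + transaction_id
    if not end_line.startswith(prefix) or len(end_line) != len(prefix) + 1:
        raise ValueError("end-line does not match the transaction id")
    flag = end_line[-1]
    if flag not in "$+#":
        raise ValueError("unknown continuation flag")

    # Header lines sit between the status line and the end-line.
    headers = {}
    for ln in lines[1:-1]:
        name, sep, value = ln.partition(":")
        if not sep:
            raise ValueError(f"malformed header line: {ln!r}")
        headers[name.strip()] = value.strip()

    return {
        "transaction_id": transaction_id,
        "status": status,
        "comment": comment,
        "headers": headers,
        "continuation_flag": flag,
    }


if __name__ == "__main__":
    response = (
        "MSRP dkei38sd 200 OK\r\n"
        "To-Path: msrp://bob.example.com:8888/9di4eae923wzd;tcp\r\n"
        "From-Path: msrp://alicepc.example.com:7777/iau39soe2843z;tcp\r\n"
        "-------dkei38sd$\r\n"
    )
    print(parse_msrp_response(response))
```

Checking that the end-line repeats the transaction identifier from the status line is what lets a receiver find message boundaries on a shared TCP connection.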
sessions in a multimedia session; the binding of the SSRC identifiers is provided through RTCP. If a participant generates multiple streams in one RTP session, for example, from separate video cameras, each must be identified as a different SSRC.
◾◾ Contributing source (CSRC): A source of a stream of RTP packets that has contributed to the combined stream produced by an RTP mixer. The mixer inserts a list of the SSRC identifiers of the sources that contributed to the generation of a particular packet into the RTP header of that packet. This list is called the CSRC list. An example application is audio conferencing where a mixer indicates all the talkers whose speech was combined to produce the outgoing packet, allowing the receiver to indicate the current talker, even though all the audio packets contain the same SSRC identifier (that of the mixer).
◾◾ End system: An application that generates the content to be sent in RTP packets or consumes the content of received RTP packets. An end system can act as one or more synchronization sources in a particular RTP session, but typically only one.
◾◾ Mixer: An intermediate system that receives RTP packets from one or more sources, possibly changes the data format, combines the packets in some manner, and then forwards a new RTP packet. Since the timing among multiple input sources will not generally be synchronized, the mixer will make timing adjustments among the streams and generate its own timing for the combined stream. Thus, all data packets originating from a mixer will be identified as having the mixer as their synchronization source.
◾◾ Translator: An intermediate system that forwards RTP packets with their synchronization source identifier intact. Examples of translators include devices that convert encodings without mixing, replicators from multicast to unicast, and application-level filters in firewalls.
◾◾ Monitor: An application that receives RTCP packets sent by participants in an RTP session, in particular the reception reports, and estimates the current QoS for distribution monitoring, fault diagnosis, and long-term statistics. The monitor function is likely to be built into the application(s) participating in the session, but may also be a separate application that does not otherwise participate and does not send or receive the RTP data packets (since they are on a separate port). These are called third-party monitors. It is also acceptable for a third-party monitor to receive the RTP data packets but not send RTCP packets or otherwise be counted in the session.
◾◾ Non-RTP means: Protocols and mechanisms that may be needed in addition to RTP to provide a usable service. In particular, for multimedia conferences, a control protocol may distribute multicast addresses and keys for encryption, negotiate the encryption algorithm to be used, and define dynamic mappings between RTP payload type values and the payload formats they represent for formats that do not have a predefined payload type value. Examples of such protocols include SIP (RFC 3261), ITU Recommendation H.323, and applications using SDP (RFC 4566, see Section 7.7), such as RTSP (RFC 2326, to be obsoleted by the Internet Engineering Task Force [IETF] draft draft-ietf-mmusic-rfc2326bis-40, October 2014, if approved). For simple applications, electronic mail or a conference database may also be used. The specification of such protocols and mechanisms is beyond the scope of this document.

7.2.2.2 Byte Order, Alignment, and Time Format
All integer fields are carried in network byte order, that is, the most significant byte (octet) first. This byte order is commonly known as big endian. Unless otherwise noted, numeric constants are in decimal (base 10). All header data is aligned to its natural length, that is, 16-bit fields are aligned on even offsets, 32-bit fields are aligned at offsets divisible by four, and so on. Octets designated as padding have the value zero. Wall clock time (absolute date and time) is represented using the time-stamp format of the Network Time Protocol (NTP), which is in seconds relative to 0h UTC on January 1, 1900 (RFC 1305).

7.2.2.3 RTP Fixed Header Fields
The RTP packet header format is shown in Figure 7.1. The first 12 octets are present in every RTP packet, while the list of CSRC identifiers is present only when inserted by a mixer. The fields have the following meaning:

◾◾ Version (V): 2 bits. This field identifies the version of RTP. The version defined by this specification is two (2). (The value 1 is used by the first draft version of RTP, and the value 0 is used by the protocol initially implemented in the vat audio tool.)
◾◾ Padding (P): 1 bit. If the padding bit is set, the packet contains one or more additional padding octets at the end that are not part of the payload. The last octet of the padding contains a count of how many padding octets should be ignored, including itself. Padding may be needed by some encryption algorithms with fixed block sizes or for carrying several RTP packets in a lower-layer protocol data unit.
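 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X|  CC   |M|     PT      |       sequence number         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          time stamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           synchronization source (SSRC) identifier            |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|            contributing source (CSRC) identifiers             |
|                             ....                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+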
Figure 7.1 RTP packet header. (Copyright IETF. Reproduced with permission.)
◾◾ Extension (X): 1 bit. If the extension bit is set, the fixed header must be followed by exactly one header extension.
◾◾ CSRC count (CC): 4 bits. The CSRC count contains the number of CSRC identifiers that follow the fixed header.
◾◾ Marker (M): 1 bit. The interpretation of the marker is defined by a profile. It is intended to allow significant events such as frame boundaries to be marked in the packet stream. A profile may define additional marker bits or specify that there is no marker bit by changing the number of bits in the payload type field.
◾◾ Payload type (PT): 7 bits. This field identifies the format of the RTP payload and determines its interpretation by the application. A profile may specify a default static mapping of payload type codes to payload formats. Additional payload type codes may be defined dynamically through non-RTP means. A set of default mappings for audio and video is specified in the companion RFC 3551. An RTP source may change the payload type during a session, but this field should not be used for multiplexing separate media streams. A receiver must ignore packets with payload types that it does not understand.
◾◾ Sequence number: 16 bits. The sequence number increments by one for each RTP data packet sent, and may be used by the receiver to detect packet loss and to restore packet sequence. The initial value of the sequence number should be random (unpredictable) to make known-plaintext attacks on encryption more difficult. Techniques for choosing unpredictable numbers are discussed in RFC 4086.
◾◾ Time stamp: 32 bits. The time stamp reflects the sampling instant of the first octet in the RTP data packet. The sampling instant must be derived from a clock that increments monotonically and linearly in time to allow synchronization and jitter calculations. The resolution of the clock must be sufficient for the desired synchronization accuracy and for measuring packet arrival jitter (one tick per video frame is typically not sufficient).

7.2.2.4 Media Payload Type
The media payload types (PT) for different audio (A), video (V), and combined audio-video (AV) encodings are shown in Tables 7.1 and 7.2, as standardized by RFC 3551.

7.2.2.5 Multiplexing RTP Sessions
In RTP, multiplexing is provided by the destination transport address (network address and port number), which is different for each RTP session. For example, in a teleconference composed of audio and video media encoded separately, each medium should be carried in a separate RTP session with its own destination transport address. Separate audio and video streams should not be carried in a single RTP session and demultiplexed based on the payload type or SSRC fields. Using a different SSRC for each medium but sending them in the same RTP session would avoid several problems: interleaved media streams changing encodings, a shared timing and sequence number space, and RTCP sender and receiver reports describing only one timing and sequence number space. Note, however, that an RTP mixer may not be able to combine interleaved streams of incompatible media into one stream, and carrying multiple media in one RTP session will force all streams to follow the same network path.

7.2.2.6 RTP Translators and Mixers
RFC 3550 specifies some aspects of RTP translators and mixers at the RTP level. Although this support adds some complexity to the protocol, the need for these functions has been clearly established by experiments with multicast audio and video applications in the IP network. The details are beyond the scope of this section.
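As a rough illustration of the fixed header fields described in Section 7.2.2.3, the following Python sketch unpacks the first 12 octets of an RTP packet and any CSRC identifiers that follow. It is a sketch only: the names `RtpHeader` and `parse_rtp_header` are hypothetical, and header extensions and padding removal are deliberately left out.

```python
import struct
from dataclasses import dataclass
from typing import List


@dataclass
class RtpHeader:
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrcs: List[int]


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the RTP fixed header and CSRC list (network byte order)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-octet fixed header")

    # Two flag octets, 16-bit sequence number, 32-bit time stamp, 32-bit SSRC.
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])

    version = b0 >> 6                 # V: top 2 bits
    if version != 2:
        raise ValueError(f"unsupported RTP version {version}")
    csrc_count = b0 & 0x0F            # CC: low 4 bits

    end = 12 + 4 * csrc_count
    if len(packet) < end:
        raise ValueError("packet truncated inside the CSRC list")
    csrcs = list(struct.unpack(f"!{csrc_count}I", packet[12:end]))

    return RtpHeader(
        version=version,
        padding=bool(b0 & 0x20),      # P: 1 bit
        extension=bool(b0 & 0x10),    # X: 1 bit
        csrc_count=csrc_count,
        marker=bool(b1 & 0x80),       # M: 1 bit
        payload_type=b1 & 0x7F,       # PT: 7 bits
        sequence_number=seq,
        timestamp=timestamp,
        ssrc=ssrc,
        csrcs=csrcs,
    )


if __name__ == "__main__":
    # Hand-built packet: V=2, CC=1, M=1, PT=0, one CSRC, one payload octet.
    sample = bytes.fromhex("81801234000000a0deadbeef1122334400")
    print(parse_rtp_header(sample))
```

Because all fields are carried in network byte order, the `!` prefix in the `struct` format strings is what keeps the sketch independent of the host's endianness.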
2 Reserved A 1 19 Reserved A
7.2.3 RTCP Specification
The RTP control protocol is based on the periodic transmission of control packets to all participants in the session, using the same distribution mechanism as the data packets. The underlying protocol must provide multiplexing of the data and control packets, for example, using separate port numbers with UDP. RTCP performs four functions: providing feedback on the quality of the data distribution, keeping track of each participant using a canonical name (CNAME), sending RTCP control packets so that each party can independently observe the number of participants, and providing session information including participant identification conveyed with minimal session control information.

7.2.3.1 RTCP Packet Format
This specification defines several RTCP packet types to carry a variety of control information:

◾◾ SR: sender report, for transmission and reception statistics from participants that are active senders
◾◾ RR: receiver report, for reception statistics from participants that are not active senders and in combination with SR for active senders reporting on more than 31 sources
◾◾ SDES: source description items, including CNAME
◾◾ BYE: indicates end of participation
◾◾ APP: application-specific functions

An example of an RTCP compound packet is shown in Figure 7.2. All RTCP packets must be sent in a compound packet of at least two individual packets, with the following format:

◾◾ Encryption prefix: If and only if the compound packet is to be encrypted according to the method in RFC 3550, it must be prefixed by a random 32-bit quantity redrawn for every compound packet transmitted. If padding is required for the encryption, it must be added to the last packet of the compound packet.
◾◾ SR or RR: The first RTCP packet in the compound packet must always be a report packet to facilitate
Table 7.2 Payload Types (PT) for Video and Combined Encodings

PT        Encoding Name   Media Type   Clock Rate (Hz)
24        Unassigned      V
25        CelB            V            90,000
26        JPEG            V            90,000
27        Unassigned      V
28        nv              V            90,000
29        Unassigned      V
30        Unassigned      V
31        H261            V            90,000
32        MPV             V            90,000
33        MP2T            AV           90,000
34        H263            V            90,000
35–71     Unassigned      ?
72–76     Reserved        N/A          N/A
77–95     Unassigned      ?
96–127    Dynamic         ?
Dynamic   H263-1998       V            90,000

Source: Copyright IETF. Reproduced with permission.

header validation as described in RFC 3550. This is true even if no data has been sent or received, in which case an empty RR must be sent, and even if the only other RTCP packet in the compound packet is a BYE.
◾◾ Additional RRs: If the number of sources for which reception statistics are being reported exceeds 31, the number that will fit into one SR or RR packet, then additional RR packets should follow the initial report packet.
◾◾ SDES: An SDES packet containing a CNAME item must be included in each compound RTCP packet; however, there can be some exceptions as well. For example, SDES information might be encrypted while reception reports were sent in the clear to accommodate third-party monitors that are not privy to the encryption key. In those cases, the SDES information must be appended to an RR packet with no reports (and the random number) to satisfy the requirement that all compound RTCP packets begin with an SR or RR packet. The SDES CNAME item is required in either the encrypted or unencrypted packet, but not both. The same SDES information should not be carried in both packets as this may compromise the encryption. Other source description items may optionally be included if required by a particular application, subject to bandwidth constraints specified in RFC 3550.
◾◾ BYE or APP: Other RTCP packet types, including those yet to be defined, may follow in any order, except that BYE should be the last packet sent with a given SSRC/CSRC. Packet types may appear more than once.

7.2.3.2 Additional RTCP Functionalities
RFC 3550 specifies many other functionalities such as the RTCP packet transmission interval, along with maintaining the number of session members and the RTCP packet send-and-receive rules. The packet send-and-receive rules are detailed further, as follows: computing the RTCP transmission interval, initialization, receiving an RTP or non-BYE RTCP packet, receiving an RTCP BYE packet, timing out an SSRC, expiration of the transmission timer, transmitting a BYE packet, updating we-sent, and allocation of source description bandwidth. More important, as RTCP is usually sent as a compound packet, the compound packet starts with a sender report (SR) or receiver report (RR), and additional packets then follow as necessary. Moreover, RTCP reports such as extending the sender and receiver reports and analyzing sender and receiver reports are also specified. The BYE packet is used to leave a session. The APP packet is an application-specific packet that is used in RTCP extensions. With respect to RTP translators and mixers including cascaded
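[Figure 7.2 diagram; labels recovered: one UDP packet carrying a compound packet that consists of an SR packet (sender report plus report blocks for Site 1 and Site 2), an SDES packet (CNAME, PHONE; CNAME, LOC), and a BYE packet (reason), with an SSRC introducing each chunk]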
Figure 7.2 Example of an RTCP compound packet. (Copyright IETF. Reproduced with permission.)
ones, RTCP also has features describing the detailed operation and maintenance aspects of media translation and mixing properties.

7.3 Secure RTP (SRTP)
RFC 3711 specifies the Secure Real-Time Transport Protocol (SRTP), extending RTP to provide confidentiality of the RTP payload while leaving the RTP header in the clear so that link-level header compression algorithms can still operate. Figure 7.3 shows the SRTP packet format that extends the RTP for providing security.
Note that SRTP extends the RTP with only two parameters:

◾◾ Master key identifier (MKI): MKI is an optional parameter and its length is configurable. The MKI is defined, signaled, and used by key management. The MKI identifies the master key from which the session keys were derived that authenticate or encrypt the particular packet. Note that the MKI shall not identify the SRTP cryptographic context. The MKI may be used by key management for the purposes of rekeying, identifying a particular master key within the cryptographic context.
◾◾ Authentication tag: This parameter is recommended and its length is configurable. The authentication tag is used to carry message authentication data. The authenticated portion of an SRTP packet consists of the RTP header followed by the encrypted portion of the SRTP packet. Thus, if both encryption and authentication are applied, encryption shall be applied before authentication on the sender side and conversely on the receiver side. The authentication tag provides authentication of the RTP header and payload, and it indirectly provides replay protection by authenticating the sequence number. Note that the MKI is not integrity protected, as this does not provide any extra protection.

One or more SRTP crypto attributes, known as SDP security descriptions (SDES) and specified in RFC 4568 (see Sections 7.7.3 and 7.7.4), that are later used for media encryption are negotiated at the SIP session level using the SDP offer–answer model. SDES is not an authenticated key-exchange mechanism, because the security parameters and the keys are communicated in clear text. A crypto parameter should not be specified at the session level unless SIP signaling is protected or encrypted by other means. SRTP is based on the Advanced Encryption Standard (AES) and provides stronger security. The secure RTP audio-video profile, designated as RTP/SAVP, offers confidentiality, integrity protection, data origin authentication in case of point-to-point communications, and replay protection to the RTP and RTCP media stream. For the cryptographic contexts associated with an RTP and RTCP stream, SRTP allows the security parameters to be negotiated using a number of key management protocols such as Multimedia Internet KEYing (MIKEY) and others, which provide the end points with a pair of shared secrets known as the master key and master salt. This
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
…………..
Payload …..
Figure 7.3 SRTP packet format. (Copyright IETF. Reproduced with permission.)
pair of shared secrets is then used as input into a key derivation mechanism to generate the session key for the SRTP cryptographic contexts. In addition, SRTP defines a modification of the RTP payload that enables datagrams to carry encrypted media and a message authentication code (MAC) field; these datagrams are then transported between SRTP end points.
RFC 5669 standardizes the use of the SEED (RFC 4269) block cipher algorithm in the SRTP for providing confidentiality for RTP traffic and for the control traffic that is sent by RTCP. SEED is a symmetric encryption algorithm. The input/output block size of SEED is 128-bit, and the key length is also 128-bit. SEED has the 16-round Feistel structure. A 128-bit input is divided into two 64-bit blocks, and the right 64-bit block is an input to the round function with a 64-bit subkey generated from the key scheduling. It is easily implemented in various software and hardware because it is designed to increase the efficiency of memory storage and the simplicity of generating keys without degrading the security of the algorithm. In addition, the Datagram Transport Layer Security (DTLS) extension to establish keys for the SRTP and the Secure RTCP (SRTCP) flows is specified in RFC 5764. DTLS keying happens on the media path, independent of any out-of-band signaling channel present. RFC 6188 specifies the use of AES-192 and AES-256 for SRTP and SRTCP for much stronger security. The protocol architecture of the RTP and the associated RTCP is flexible enough to serve a wide variety of multimedia application scenarios with different security needs, so no single security mechanism is applicable to all of them, as explained in RFC 7202. As a result, it is recommended that no single security mechanism be mandated for RTP/RTCP.

7.4 ZRTP
The complexity of obtaining session keys/certificates by end points in SRTP using a central authority, such as a trusted third party or public key infrastructure, is considerable. Moreover, SRTP is vulnerable to the man-in-the-middle attack. ZRTP, specified in RFC 6189, is designed to perform key exchange without trusted third parties or certificate infrastructure while providing protection against man-in-the-middle attacks. The caller and callee can simply verify that there is no man-in-the-middle attacker by displaying a short authentication string (SAS) for the users to read and verbally compare over the phone. Basically, ZRTP is a protocol for media path Diffie–Hellman (DH) exchange to agree on a session key and parameters for establishing unicast SRTP sessions for real-time multimedia applications. ZRTP is media path keying because it is multiplexed on the same port as RTP and does not require support in the signaling protocol. For the media session, ZRTP provides confidentiality, protection against man-in-the-middle attacks, and, in cases where the signaling protocol provides end-to-end integrity protection, authentication. ZRTP can utilize a SDP crypto attribute specified in RFC 4568 (see Sections 7.7.3 and 7.7.4) to provide discovery and authentication through the signaling channel.
ZRTP has Perfect Forward Secrecy, meaning the keys are destroyed at the end of the call, which precludes retroactively compromising the call by future disclosures of key material. However, even if the users are too lazy to bother with short authentication strings, we still get reasonable authentication against a man-in-the-middle attack, based on a form of key continuity. It does this by caching some key material to use in the next call, to be mixed in with the next call’s DH shared secret, giving it key continuity properties. To provide best-effort SRTP, ZRTP utilizes normal RTP/AVP (audio-video profile) profiles. ZRTP secures media sessions that include a voice media stream and can also secure media sessions that do not include voice by using an optional digital signature. In addition, ZRTP does not rely on SIP signaling for the key management, and in fact, it does not rely on any servers at all.
It performs its key agreements and key management in a purely peer-to-peer manner over the RTP packet stream. ZRTP can be used and discovered without being declared or indicated in the signaling path. This provides a best-effort SRTP capability. Also, this reduces the complexity of implementations and minimizes interdependency between the signaling and media layers. However, when ZRTP is indicated in the signaling via the zrtp-hash SDP attribute, ZRTP has additional useful properties. By sending a hash of the ZRTP Hello message in the signaling, ZRTP provides a useful binding between the signaling and media paths. When this is done through a signaling path that has end-to-end integrity protection, the DH exchange is automatically protected from a man-in-the-middle attack. ZRTP is designed for unicast media sessions in which there is a voice media stream. For multiparty secure conferencing, separate ZRTP sessions may be negotiated between each party and the conference bridge. For sessions lacking a voice media stream, man-in-the-middle protection may be provided by the mechanisms signing the SAS or integrity-protected signaling as described in RFC 6189. The detailed description of all capabilities of ZRTP is beyond the scope of this section.

7.5 Real-Time Streaming Protocol (RTSP)
Near-real-time (near-RT) media streaming like audio and video is usually a one-way distribution from a streaming server to multiple users. The near-RT streaming between the server and the users may persist for a long time until the viewer stops receiving the stream as if there is a connection
between the server and each of these receivers, ◾◾ Data is carried out-of-band by a different protocol.
although it does not have to be so in practice. As a result, (There is an exception to this.)
a near-RT streaming is considered as stateful. Moreover, ◾◾ RTSP is defined to use ISO 10646 (UTF-8) rather
a near-RT streaming protocol may need to enable seeking than ISO 8859-1, consistent with current HTML
random points in the media (e.g., audio or video) file, and internationalization efforts (RFC 2854).
adaptive streaming where multiple encoded files could be ◾◾ The Request-Uniform Resource Identifier (Request-
distributed to the receiver. From the commercial imple- URI) always contains the absolute URI. Because
mentation point of view, the streaming server may need to of backward compatibility with a historical blunder,
stream out the near-RT media flow to the receiver HTTP/1.1 (RFCs 7230–7235) carries only the absolute
on a just-in-time basis. path in the request and puts the host name in a separate
The Real-Time Streaming Protocol (RTSP), an application- header field. This makes virtual hosting easier, where a sin-
level protocol, is being developed for control over the deliv- gle host with one IP address hosts several document trees.
ery of data with real-time properties of the near-RT media
stream. RTSP (RFC 2326) provides an extensible framework The protocol supports the following operations:
to enable controlled, on-demand delivery of real-time data,
such as audio and video. Sources of data can include both ◾◾ Retrieval of media from media server: The client can
live data feeds and stored clips. This protocol is intended to request a presentation description via HTTP or some
control multiple data delivery sessions; provide a means for other method. If the presentation is being multicast,
choosing delivery channels such as UDP, multicast UDP, the presentation description contains the multicast
and TCP; and to provide a means for choosing delivery addresses and ports to be used for the continuous
mechanisms based on RTP (RFC 3550). media. If the presentation is to be sent only to the cli-
RTSP, as specified in RFC 2326, establishes and controls ent via unicast, the client provides the destination for
either a single or several time-synchronized streams of con- security reasons.
tinuous media such as audio and video. It does not typically ◾◾ Invitation of a media server to a conference: A media
deliver the continuous streams itself per se, although inter- server can be invited to join an existing conference,
leaving of the continuous media stream with the control either to play back media into the presentation or to
stream is possible. In other words, RTSP acts as a network record all or a subset of the media in a presentation.
remote control for multimedia servers. The set of streams to This mode is useful for distributed teaching applica-
be controlled is defined by a presentation description. This tions. Several parties in the conference may take turns
memorandum does not define a format for a presentation pushing the remote control buttons.
description. There is no notion of an RTSP connection;
instead, a server maintains a session labeled by an identi- Addition of media to an existing presentation:
fier. An RTSP session is in no way tied to a transport-level
connection such as a TCP connection. During an RTSP ◾◾ Particularly for live presentations, it is useful for
session, an RTSP client may open and close many reliable the media server to indicate for adding of media, if
transport connections to the server to issue RTSP requests. the server can tell the client about additional media
Alternatively, it may use a connectionless transport protocol becoming available.
such as UDP. ◾◾ RTSP requests that establish and control streams for
The streams controlled by RTSP may use RTP (RFC continuous media may be handled by proxies, tunnels,
3550), but the operation of RTSP does not depend on the and caches as in HTTP/1.1 (RFCs 7230–7235).
transport mechanism used to carry continuous media. The
protocol is intentionally similar in syntax and operation to Table 7.3 describes the RSTP methods, their direction,
HTTP/1.1 (RFCs 7230–7235) so that extension mecha- objects on which they operate on, their requirement, and
nisms to HTTP can, in most cases, also be added to RTSP. brief description of methods (RFC 2326).
However, RTSP differs in a number of important aspects In summary, the RTSP provides videocassette recorder
from HTTP: (VCR)-like control operations, allows to choose delivery
channels (e.g., UDP, TCP), supports the description of any
◾◾ RTSP introduces a number of new methods and has a session, and establishes and controls stream of continuous
different protocol identifier. near-RT streaming media (e.g., audio stream, video stream)
◾◾ An RTSP server needs to maintain state by default in like retrieval of the requested media and adding media to
almost all cases, as opposed to the stateless nature of an existing session. More important, the operation of RTSP
HTTP. is very HTTP friendly. The detailed description of RTSP is
◾◾ Both an RTSP server and client can issue requests. beyond the scope of this section.
326 ◾ Handbook on Session Initiation Protocol
Table 7.3 RTSP Methods, Their Direction, and Which Objects (P: Presentation, S: Stream) They Operate On
RTSP Method Direction Object Description Requirement
mandatory to support SIP as the session establishment protocol to ensure interoperability.
MRCP uses SIP and SDP to create the speech client–server dialog and set up the media channels to
the server. It also uses SIP and SDP to establish MRCP control sessions between the client and the
server for each media-processing resource required for that dialog. The MRCP commands are executed
asynchronously, and the exchanges between the client and the media resource are carried on that
control session. MRCP exchanges do not change the state of the SIP dialog, media sessions, or other
parameters of the dialog initiated via SIP. MRCP controls and affects the state of the
media-processing resource associated with the MRCP session(s).
MRCP defines the messages to control the different media-processing resources and the state
machines required to guide their operation. It also describes how these messages are carried over a
transport layer protocol such as the TCP (RFC 0793) or the Transport Layer Security Protocol
(RFC 5246). MRCP facilitates building client library commands that support automatic speech
recognition, text-to-speech, verification, authentication, and recorder functions in an
interoperable way. The details of the MRCP and their implementations for providing value-added
services are beyond the scope of this section.

7.7 Session Description Protocol (SDP)
7.7.1 Overview

The SDP is used for media negotiations in SIP. The original SDP was designed to describe multicast
sessions, where media negotiations are not possible. Over time, SDP capabilities have been modified
and enhanced to support media negotiations for point-to-point and, to some extent, real-time
networked multipoint multimedia conferencing. For example, RFC 3264 (see Section 3.8.4) provides the
detailed description of how SDP can be used for media negotiations using the offer–answer model.
However, SDP needs the help of a transport protocol like SIP, which carries it as part of its
message body between the end points, because SDP lacks the capability to be transported by itself
as other protocols do. In this respect, SDP provides a text-encoded syntax for describing the media
to be used in the session rather than being a protocol in its own right.

7.7.2 SDP Specification

RFC 4566 provides the core SDP specifications enhancing the previous versions, although many other
RFCs have recently extended the capabilities of RFC 4566. We will briefly summarize SDP capabilities
mainly centering on RFC 4566 in describing SDP. An SDP session description is denoted by the media
type application/sdp, and the session description is entirely textual using the ISO 10646 character
set in UTF-8 encoding. SDP field names and attribute names use only the US-ASCII subset of UTF-8,
but textual fields and attribute values may use the full ISO 10646 character set. Field and
attribute values that use the full UTF-8 character set are never directly compared, hence there is
no requirement for UTF-8 normalization. The textual form, as opposed to a binary encoding such as
ASN.1 or XDR, was chosen to enhance portability, to enable a variety of transports to be used, and
to allow flexible, text-based toolkits to be used to generate and process session descriptions.
However, since SDP may be used in environments where the maximum permissible size of a session
description is limited, the encoding is deliberately compact. An SDP session description consists
of a number of lines of text of the form:

<type>=<value>

where <type> must be exactly one case-significant character and <value> is structured text whose
format depends on <type>. In general, <value> is either a number of fields delimited by a single
space character or a free format string, and is case significant unless a specific field defines
otherwise. White space must not be used on either side of the “=” sign.
An SDP session description consists of a session-level section followed by zero or more media-level
sections. The session-level part starts with a v= line and continues to the first media-level
section. Each media-level section starts with an m= line and continues to the next media-level
section or end of the whole session description. In general, session-level values are the default
for all media unless overridden by an equivalent media-level value. Some lines in each description
are required and some are optional, but all must appear in exactly the order given here (the fixed
order greatly enhances error detection and allows for a simple parser). SDP provides the media
description as follows, where optional items are marked with “*”:

◾◾ Session description

v= (protocol version)
o= (originator and session identifier)
s= (session name)
i=* (session information)
u=* (URI of description)
e=* (e-mail address)
p=* (phone number)
c=* (connection information; not required if included in all media)
b=* (zero or more bandwidth information lines)
One or more time descriptions (t= and r= lines; see below in Time Description)
z=* (time zone adjustments)
k=* (encryption key)
a=* (zero or more session attribute lines)
Zero or more media descriptions

◾◾ Time description

t= (time the session is active)
r=* (zero or more repeat times)

◾◾ Media description, if present

m= (media name and transport address)
i=* (media title)
c=* (connection information; optional if included at session level)
b=* (zero or more bandwidth information lines)
k=* (encryption key)
a=* (zero or more media attribute lines)

The set of type letters is deliberately small and not intended to be extensible. However, the
attribute mechanism (a=, described in Section 7.7.3) is the primary means for extending SDP and
tailoring it to particular applications or media. Some attributes that are already standardized and
described here have a defined meaning; however, others may be added on an application-, media-, or
session-specific basis. An SDP session description may contain URIs that reference external content
in the u=, k=, and a= lines. These URIs may be dereferenced in some cases, making the session
description non-self-contained.
The connection (c=) and attribute (a=) information in the session-level section applies to all the
media of that session unless overridden by connection information or an attribute of the same name
in the media description. For instance, in the example below, each media behaves as if it were given
a recvonly attribute. An example SDP description is as follows:

of 0x00 (Nul), 0x0a (ASCII newline), and 0x0d (ASCII carriage return). The sequence CRLF (0x0d0a) is
used to end a record, although parsers should be tolerant and also accept records terminated with a
single newline character. If the a=charset attribute is not present, these octet strings must be
interpreted as containing ISO-10646 characters in UTF-8 encoding (the presence of the a=charset
attribute may force some fields to be interpreted differently).
A session description can contain domain names in the o=, u=, e=, c=, and a= lines. Any domain name
used in SDP must comply with RFCs 1034 and 1035. Internationalized domain names (IDNs) must be
represented using the ASCII Compatible Encoding (ACE) form defined in RFC 3490 and must not be
directly represented in UTF-8 or any other encoding (this requirement is for compatibility with
RFC 2327 and other SDP-related standards, which predate the development of IDNs).

7.7.3 SDP Field Description

7.7.3.1 Protocol Version

v=0

The v= field gives the version of the SDP. This indicates that the SDP protocol version is 0. There
is no minor version number.

7.7.3.2 Origin

The o= field gives the originator of the session (her user name and the address of the user’s host)
plus a session identifier and version number:

o=<username> <sess-id> <sess-version> <nettype> <addrtype> <unicast-address>
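The <type>=<value> line format and the six-subfield o= line described in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not a complete SDP parser: the helper names (split_sdp_lines, parse_origin) and the sample description with its user name and host are made up for the example, and no field-order or character-set validation is attempted.

```python
from typing import List, NamedTuple, Tuple


class Origin(NamedTuple):
    """Subfields of the o= line, in the order given in the text."""
    username: str
    sess_id: str
    sess_version: str
    nettype: str
    addrtype: str
    unicast_address: str


def split_sdp_lines(sdp: str) -> List[Tuple[str, str]]:
    """Split a session description into (type, value) pairs."""
    pairs = []
    # splitlines() accepts CRLF as well as a bare newline, which matches the
    # advice that parsers should tolerate records ending in a single newline.
    for line in sdp.splitlines():
        if not line:
            continue
        # <type> is exactly one character, followed directly by "=".
        if len(line) < 2 or line[1] != "=":
            raise ValueError(f"not a <type>=<value> line: {line!r}")
        pairs.append((line[0], line[2:]))
    return pairs


def parse_origin(value: str) -> Origin:
    """Parse the value of an o= line into its six space-separated subfields."""
    fields = value.split(" ")
    if len(fields) != 6:
        raise ValueError(f"o= line needs 6 subfields, got {len(fields)}")
    return Origin(*fields)


# Hypothetical session description used only for illustration.
EXAMPLE = (
    "v=0\r\n"
    "o=alice 2890844526 2890844527 IN IP4 host.example.com\r\n"
    "s=Demo session\r\n"
)

pairs = split_sdp_lines(EXAMPLE)
origin = parse_origin(dict(pairs)["o"])
print(origin.username, origin.nettype, origin.unicast_address)
```

Using dict(pairs) is acceptable here only because the sample has one line per type; a real description may repeat types such as a=, b=, or m=, so the ordered list of pairs is the safer representation.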
giving the name of the person who may be contacted. This must be enclosed in parentheses if it is
present. For example:

[email protected] (Jane Doe)

The alternative RFC 2822 name quoting convention is also allowed for both e-mail addresses and phone
numbers. For example:

e=Jane Doe <[email protected]>

The free text string should be in the ISO-10646 character set with UTF-8 encoding, or alternatively
in ISO-8859-1 or other encodings if the appropriate session-level a=charset attribute is set.

7.7.3.7 Connection Data

The c= field contains connection data. A session description must contain either at least one c=
field in each media description or a single c= field at the session level. It may contain a single
session-level c= field and additional c= field(s) per media description, in which case the per-media
values override the session-level settings for the respective media.

c=<nettype> <addrtype> <connection-address>

The first subfield (<nettype>) is the network type, which is a text string giving the type of
network. Initially, IN is defined to have the meaning Internet, but other values may be registered
in the future (see Section 2.4). The second subfield (<addrtype>) is the address type. This allows
SDP to be used for sessions that are not IP based. This memo only defines IP4 and IP6, but other
values may be registered in the future.
The third subfield (<connection-address>) is the connection address. Optional subfields may be added
after the connection address depending on the value of the <addrtype> field. When the <addrtype> is
IP4 or IP6, the connection address is defined as follows:

◾◾ If the session is multicast, the connection address will be an IP multicast group address. If the
session is not multicast, then the connection address contains the unicast IP address of the
expected data source or data relay or data sink as determined by additional attribute fields. It is
not expected that unicast addresses will be given in a session description that is communicated by a
multicast announcement, though this is not prohibited.
◾◾ Sessions using an IPv4 multicast connection address must also have a time to live (TTL) value
present in addition to the multicast address. The TTL and the address together define the scope with
which multicast packets sent in this conference will be sent. TTL values must be in the range 0–255.
Although the TTL must be specified, its use to scope multicast traffic is deprecated; applications
should use an administratively scoped address instead.

The TTL for the session is appended to the address using a slash as a separator. An example is

c=IN IP4 224.2.36.42/127

IPv6 multicast does not use TTL scoping, and hence the TTL value must not be present for IPv6
multicast. It is expected that IPv6 scoped addresses will be used to limit the scope of conferences.
Hierarchical or layered encoding schemes are data streams where the encoding from a single media
source is split into a number of layers. The receiver can choose the desired quality (and hence
bandwidth) by only subscribing to a subset of these layers. Such layered encodings are normally
transmitted in multiple multicast groups to allow multicast pruning. This technique keeps unwanted
traffic from sites only requiring certain levels of the hierarchy. For applications requiring
multiple multicast groups, we allow the following notation to be used for the connection address:

<base multicast address>[/<ttl>]/<number of addresses>

If the number of addresses is not given, it is assumed to be one. Multicast addresses so assigned
are contiguously allocated above the base address, so that, for example:

c=IN IP4 224.2.1.1/127/3

would state that addresses 224.2.1.1, 224.2.1.2, and 224.2.1.3 are to be used at a TTL of 127. This
is semantically identical to including multiple c= lines in a media description:

c=IN IP4 224.2.1.1/127
c=IN IP4 224.2.1.2/127
c=IN IP4 224.2.1.3/127

Similarly, an IPv6 example would be

c=IN IP6 FF15::101/3

which is semantically equivalent to

c=IN IP6 FF15::101
c=IN IP6 FF15::102
c=IN IP6 FF15::103

(remembering that the TTL field is not present in IPv6 multicast).
Multiple addresses or c= lines may be specified on a per-media basis only if they provide multicast
addresses for different layers in a hierarchical or layered encoding scheme. They must not be
specified for a session-level c= field. The slash notation for multiple addresses described above
must not be used for IP unicast addresses.
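The slash notation for contiguous multicast groups can be made concrete with a short Python sketch built on the standard ipaddress module. The function name expand_connection_address is hypothetical, and the sketch assumes the <addrtype> and <connection-address> subfields have already been split out of the c= line; it returns only the addresses and ignores the TTL value.

```python
import ipaddress
from typing import List


def expand_connection_address(addrtype: str, conn: str) -> List[str]:
    """Expand <base multicast address>[/<ttl>]/<number of addresses>."""
    parts = conn.split("/")
    base = ipaddress.ip_address(parts[0])
    if addrtype == "IP4":
        # IPv4 multicast carries a TTL, so the count is the third element.
        count = int(parts[2]) if len(parts) > 2 else 1
    elif addrtype == "IP6":
        # IPv6 multicast has no TTL, so the count is the second element.
        count = int(parts[1]) if len(parts) > 1 else 1
    else:
        raise ValueError(f"unsupported address type: {addrtype}")
    if count > 1 and not base.is_multicast:
        raise ValueError("multiple-address notation is for multicast only")
    # Addresses are allocated contiguously above the base address.
    return [str(base + i) for i in range(count)]


print(expand_connection_address("IP4", "224.2.1.1/127/3"))
print(expand_connection_address("IP6", "FF15::101/3"))
```

Note that the ipaddress module prints IPv6 addresses in lowercase compressed form, so FF15::101 is shown as ff15::101.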
This optional field denotes the proposed bandwidth to be used by the session or media. The <bwtype>
is an alphanumeric modifier giving the meaning of the <bandwidth> figure.

b=<bwtype>:<bandwidth>

Two values are defined in this specification, but other values may be registered in the future:

◾◾ Conference total (CT): If the bandwidth of a session or media in a session is different from the
bandwidth implicit from the scope, a “b=CT:...” line should be supplied for the session giving the
proposed upper limit to the bandwidth used (the conference total bandwidth). The primary purpose of
this is to give an approximate idea as to whether two or more sessions can coexist simultaneously.
When using the CT modifier with RTP, if several RTP sessions are part of the conference, the
conference total refers to total bandwidth of all RTP sessions.
◾◾ Application specific (AS): The bandwidth is interpreted to be application specific (it will be
the application’s concept of maximum bandwidth). Normally, this will coincide with what is set on
the application’s maximum bandwidth control if applicable. For RTP-based applications, AS gives the
RTP session bandwidth as defined in RFC 3550 (see Section 7.2).

Note that CT gives a total bandwidth figure for all the media at all sites. AS gives a bandwidth
figure for a single media at a single site, although there may be many sites sending simultaneously.
A prefix X- is defined for <bwtype> names. This is intended for experimental purposes only. For
example:

b=X-YZ:128

Use of the X- prefix is not recommended: instead, new modifiers should be registered with the
Internet Assigned Numbers Authority (IANA) in the standard namespace. SDP parsers must ignore
bandwidth fields with unknown modifiers. Modifiers must be alphanumeric and, although no length
limit is given, it is recommended that they be short. The <bandwidth> is interpreted as kilobits per
second by default. The definition of a new <bwtype> modifier may specify that the bandwidth is to be
interpreted in some alternative unit (the CT and AS modifiers defined in this memo use the default
units).

7.7.3.9 Timing

The t= lines specify the start and stop times for a session. Multiple t= lines may be used if a
session is active at multiple irregularly spaced times; each additional t= line specifies an
additional period of time for which the session will be active. If the session is active at regular
times, an r= line (see the next section) should be used in addition to, and following, a t= line, in
which case the t= line specifies the start and stop times of the repeat sequence.
The first and second subfields give the start and stop times, respectively, for the session. These
values are the decimal representation of NTP time values in seconds since 1900 (RFC 1305). To
convert these values to UNIX time, subtract decimal 2208988800. NTP time stamps are elsewhere
represented by 64-bit values, which wrap sometime in the year 2036. Since SDP uses an arbitrary
length decimal representation, this should not cause an issue (SDP time stamps must continue
counting seconds since 1900; NTP will use the value modulo the 64-bit limit).
If the <stop-time> is set to zero, then the session is not bounded, though it will not become active
until after the <start-time>. If the <start-time> is also zero, the session is regarded as
permanent. User interfaces should strongly discourage the creation of unbounded and permanent
sessions as they give no information about when the session is actually going to terminate, and so
make scheduling difficult. The general assumption may be made, when displaying unbounded sessions
that have not timed out to the user, that an unbounded session will only be active until half an
hour from the current time or the session start time, whichever comes later. If behavior other than
this is required, an end time should be given and modified as appropriate when new information
becomes available about when the session should really end. Permanent sessions may be shown to the
user as never being active unless there are associated repeat times that state precisely when the
session will be active.

7.7.3.10 Repeat Times

r= fields specify repeat times for a session. For example, if a session is active at 10 am on Monday
and 11 am on Tuesday for 1 hour each week for 3 months, then the <start-time> in the corresponding
t= field would be the NTP representation of 10 am on the first Monday, the <repeat interval> would
be 1 week, the <active duration> would be 1 hour, and the offsets would be 0 and 25 hours.

r=<repeat interval> <active duration> <offsets from start-time>

The corresponding t= field stop time would be the NTP representation of the end of the last session
3 months later. By default, all fields are in seconds, so the r= and t= fields might be the
following:

t=3034423619 3042462419
r=604800 3600 0 90000
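The NTP-to-UNIX conversion and the r= subfields can be checked with a small Python sketch. The helper names (ntp_to_datetime, parse_repeat) are assumptions made for this illustration; the sketch reads the t= and r= values from the example above, assumes all fields are plain seconds (no compact d/h/m/s unit suffixes), and does not handle a zero stop time.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Tuple

# Seconds between the NTP epoch (1900) and the UNIX epoch (1970), as given in the text.
NTP_TO_UNIX_OFFSET = 2208988800


def ntp_to_datetime(ntp_seconds: int) -> datetime:
    """Convert an SDP t= time value (NTP seconds since 1900) to a UTC datetime."""
    return datetime.fromtimestamp(ntp_seconds - NTP_TO_UNIX_OFFSET, tz=timezone.utc)


def parse_repeat(value: str) -> Tuple[int, int, List[int]]:
    """Split an r= value into repeat interval, active duration, and offsets."""
    interval, duration, *offsets = (int(field) for field in value.split())
    return interval, duration, offsets


start = ntp_to_datetime(3034423619)
stop = ntp_to_datetime(3042462419)
interval, duration, offsets = parse_repeat("604800 3600 0 90000")

# Start times of the sessions in the first repeat interval.
first_week = [start + timedelta(seconds=offset) for offset in offsets]

print("repeat every", interval // 86400, "days for", duration // 60, "minutes")
for begin in first_week:
    print("session starts", begin.isoformat())
print("sequence ends", stop.isoformat())
```

The offsets 0 and 90000 correspond to the "0 and 25 hours" in the text, since 25 × 3600 = 90000 seconds.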
If transported over a secure and trusted channel, the SDP may be used to convey encryption keys.
No key is included in this SDP description, but the session or media stream referred to by this key
field is encrypted.
The user should be prompted for the key when attempting to join the session, and this user-supplied
key should then be used to decrypt the media streams. The use of user-specified keys is not
recommended, since such keys tend to have weak security properties. The key field must not be used
unless it can be guaranteed that the SDP is conveyed over a secure and trusted channel. An example
of such a channel might be SDP embedded inside an S/MIME message or a TLS-protected HTTP session. It
is important to ensure that the secure channel is with the party that is authorized to join the
session, not an intermediary: if a caching proxy server is used, it is important to ensure that the
proxy is either trusted or unable to access the SDP.

7.7.4 SDP Media

7.7.4.1 Media Descriptions

A session description may contain a number of media descriptions. Each media description starts with
an m= field and is terminated by either the next m= field or by the end of the session description.

m=<media> <port> <proto> <fmt>...

A media field has several subfields:

◾◾ <media> is the media type. Currently defined media are audio, video, text, application, and
message, although this list may be extended in the future.
◾◾ <port> is the transport port to which the media stream is sent. The meaning of the transport port
depends on the network being used as specified in the relevant c= field, and on the transport
protocol defined in the <proto> subfield of the media field. Other ports used by the media
application, such as the RTCP port defined in RFC 3550 (see Section 7.2), may be derived
algorithmically from the base media port or may be specified in a separate attribute (e.g., a=rtcp:
as defined in RFC 3605). If noncontiguous ports are used or if they do not follow the parity rule of
even RTP ports and odd RTCP ports, the a=rtcp: attribute MUST be used. Applications that are
requested to send media to a <port> that is odd and where the a=rtcp: is present must not subtract 1
from the RTP port: that is, they must send the RTP to the port indicated in <port> and send the RTCP
to the port indicated in the a=rtcp attribute.

For applications where hierarchically encoded streams are being sent to a unicast address, it may be
necessary to specify multiple transport ports. This is done using a similar notation to that used
for IP multicast addresses in the c= field:

m=<media> <port>/<number of ports> <proto> <fmt>...

In such a case, the ports used depend on the transport protocol. For RTP, the default is that only
the even-numbered ports are used for data with the corresponding one-higher odd ports used for the
RTCP belonging to the RTP session, and the <number of ports> denoting the number of RTP sessions.

◾◾ <proto> is the transport protocol. The meaning of the transport protocol is dependent on the
address type field in the relevant c= field. Thus, a c= field of IP4 indicates that the transport
protocol runs over IP4. The following transport protocols are defined (RFC 4566), but may be
extended through registration of new protocols with IANA:

– udp: denotes an unspecified protocol running over UDP
– RTP/AVP: denotes RTP (RFC 3550, see Section 7.2) used under the RTP profile for audio and video
conferences with minimal control (RFC 3551, see Section 7.2) running over UDP
– RTP/SAVP: denotes the SRTP (RFC 3711, see Section 7.3) running over UDP

The main reason to specify the transport protocol in addition to the media format is that the same
standard media formats may be carried over different transport protocols even when the network
protocol is the same; a historical example is vat Pulse Code Modulation (PCM) audio and RTP PCM
audio; another might be TCP/RTP PCM audio. In addition, relays and monitoring tools that are
transport protocol specific but format independent are possible.

◾◾ <fmt> is a media format description. The fourth and any subsequent subfields describe the format
of the media. The interpretation of the media format depends on the value of the <proto> subfield.

If the <proto> subfield is RTP/AVP or RTP/SAVP, the <fmt> subfields contain RTP payload type
numbers. When a list of payload type numbers is given, this implies that all of these payload
formats may be used in the session; however, the first of these formats should be used as the
default format for the session. For dynamic payload type assignments, the a=rtpmap: attribute (see
Section 7.7.4.2) should be used to map from an RTP payload type number to a media encoding name that
identifies the payload format. The a=fmtp: attribute may be used to specify format parameters (see
Section 7.7.4.2).
If the <proto> subfield is udp, the <fmt> subfields must reference a media type describing the
format under the audio, video, text, application, or message top-level media types. The media type
registration should define the packet format for use with UDP transport. For media using other
transport protocols, the <fmt> field is protocol specific. Rules for interpretation of the <fmt>
subfield must be defined when registering new protocols.

7.7.4.2 Standardized Media Types

The RTP used for audio and video conferencing is specified in RFC 3550 (see Section 7.2) while
RFC 3711 (see Section 7.3) defines the secure RTP for audio and video conferencing. However, there
can be different types of audio (or video) codecs that may transfer audio (or video) media streams
that have specific payload format/types depending on which kinds of codecs are being used. RFC 3551
defines a specific payload format/type number (i.e., <fmt>) for each audio and video codec for
differentiating the bit streams of all media transferred over the network, and the RTP transport
protocol for audio and video is designated as RTP/AVP (i.e., <proto>). By the same token, the SRTP
transport protocol for audio and video is designated as RTP/SAVP (i.e., <proto>). In general, the
SDP media description for audio or video can be expressed as follows depending on whether RTP or
SRTP transport protocol is used:

standardized for each media in RFC 3551. The payload format/type number (<fmt>) that is not
specified in RFC 3551 can be standardized with IANA in the future.
Let us consider the following example for video:

m=video 49170/2 RTP/AVP 31

would specify that ports 49170 and 49171 form one RTP/RTCP pair and 49172 and 49173 form the second
RTP/RTCP pair. RTP/AVP is the transport protocol and 31 is the format. If noncontiguous ports are
required, they must be signaled using a separate attribute (e.g., a=rtcp: as defined in RFC 3605).
If multiple addresses are specified in the c= field and multiple ports are specified in the m=
field, a one-to-one mapping from port to the corresponding address is implied. For example:

c=IN IP4 224.2.1.1/127/2
m=video 49170/2 RTP/AVP 31

would imply that address 224.2.1.1 is used with ports 49170 and 49171, and address 224.2.1.2 is used
with ports 49172 and 49173. The semantics of multiple m= lines using the same transport address are
undefined. This implies that, unlike limited past practice, there is no implicit grouping defined by
such means and an explicit grouping framework (e.g., RFC 3388) should instead be used to express the
intended semantics. Table 7.4 depicts some additional standardized SDP media types.

m=image 50011 TCP t38               Real-Time Facsimile (T.38) over TCP transport protocol                         RFC 3362
m=audio 51221 TCP t38               Real-Time Facsimile (T.38) over TCP transport protocol                         RFC 4612
m=application 5000 TCP/TLS/BFCP*    Binary Floor Control Protocol (BFCP) application over TCP/TLS transport protocol   RFC 4583
m=message 7394 TCP/MSRP*            Message Session Relay Protocol (MSRP) message over TCP transport protocol      RFC 4975
m=text 11000 RTP/AVP 98             Real-Time Text (T.140) over RTP transport protocol                             RFC 4103
Source: Copyright IETF. Reproduced with permission.
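The <port>/<number of ports> notation and the default even/odd RTP/RTCP pairing discussed in this section can be sketched as follows. The names Media, parse_media, and rtp_rtcp_pairs are hypothetical, and the sketch assumes the default parity rule applies, that is, no a=rtcp: attribute overrides the RTCP port.

```python
from typing import List, NamedTuple, Tuple


class Media(NamedTuple):
    """Subfields of an m= line."""
    media: str
    port: int
    num_ports: int
    proto: str
    fmts: List[str]


def parse_media(value: str) -> Media:
    """Parse the value of an m= line: <media> <port>[/<n>] <proto> <fmt>..."""
    media, port_field, proto, *fmts = value.split()
    port, _, count = port_field.partition("/")
    # When no "/<number of ports>" is given, a single port is assumed.
    return Media(media, int(port), int(count) if count else 1, proto, fmts)


def rtp_rtcp_pairs(m: Media) -> List[Tuple[int, int]]:
    """Default RTP rule: RTP on the base port, RTCP on the next higher port."""
    return [(m.port + 2 * i, m.port + 2 * i + 1) for i in range(m.num_ports)]


video = parse_media("video 49170/2 RTP/AVP 31")
print(video.proto, video.fmts)
print(rtp_rtcp_pairs(video))
```

For the example line this yields the two pairs named in the text, (49170, 49171) and (49172, 49173); when multiple addresses are also given in the c= field, the i-th pair would be matched with the i-th address.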
A media description may have any number of attributes (a= fields) that are media specific. These are
referred to as media-level attributes and add information about the media stream. Attribute fields
can also be added before the first media field; these session-level attributes convey additional
information that applies to the conference as a whole rather than to individual media. Attribute
fields may be of two forms:

◾◾ A property attribute is simply of the form a=<flag>. These are binary attributes, and the
presence of the attribute conveys that the attribute is a property of the session. An example might
be a=recvonly.
◾◾ A value attribute is of the form a=<attribute>:<value>. For example, a whiteboard could have the
value attribute a=orient:landscape.

Attribute interpretation depends on the media tool being invoked. Thus, receivers of session
descriptions should be configurable in their interpretation of session descriptions in general and
of attributes in particular. Attribute names MUST use the US-ASCII subset of ISO-10646/UTF-8.
Attribute values are octet strings, and may use any octet value except 0x00 (Nul), 0x0A (LF), and
0x0D (CR). By default, attribute values are to be interpreted as in ISO-10646 character set with
UTF-8 encoding. Unlike other text fields, attribute values are not normally affected by the charset
attribute, as this would make comparisons against known values problematic. However, when an
attribute is defined, it can be defined to be charset dependent, in which case its value should be
interpreted in the session charset rather than in ISO-10646. Attributes must be registered with
IANA. If an attribute is received that is not understood, it must be ignored by the receiver.

7.7.5.2 Standardized Attribute Values

The following attributes are defined in RFC 4566. Since application writers may add new attributes
as they are required, this list is not exhaustive.

a=cat:<category>

This attribute gives the dot-separated hierarchical category of the session. This is to enable a
receiver to filter unwanted sessions by category. There is no central registry of categories. It is
a session-level attribute, and it is not dependent on charset.

a=keywds:<keywords>

Like the cat attribute, this is to assist identifying wanted sessions at the receiver. This allows a
receiver to select interesting sessions based on keywords describing the purpose of the session;
there is no central registry of keywords. It is a session-level attribute. It is a charset-dependent
attribute, meaning that its value should be interpreted in the charset specified for the session
description if one is specified, or by default in ISO 10646/UTF-8.

a=tool:<name and version of tool>

This gives the name and version number of the tool used to create the session description. It is a
session-level attribute, and it is not dependent on charset.

a=ptime:<packet time>

This gives the length of time in milliseconds represented by the media in a packet. This is probably
only meaningful for audio data, but may be used with other media types if it makes sense. It should
not be necessary to know ptime to decode RTP or vat audio, and it is intended as a recommendation
for the encoding/packetization of audio. It is a media-level attribute, and it is not dependent on
charset.

a=maxptime:<maximum packet time>

This gives the maximum amount of media that can be encapsulated in each packet, expressed as time in
milliseconds. The time shall be calculated as the sum of the time the media present in the packet
represents. For frame-based codecs, the time should be an integer multiple of the frame size. This
attribute is probably only meaningful for audio data, but may be used with other media types if it
makes sense. It is a media-level attribute, and it is not dependent on charset. Note that this
attribute was introduced after RFC 2327, and nonupdated implementations will ignore this attribute.

a=rtpmap:<payload type> <encoding name>/<clock rate> [/<encoding parameters>]

This attribute maps from an RTP payload type number (as used in an m= line) to an encoding name
denoting the payload format to be used. It also provides information on the clock rate and encoding
parameters. It is a media-level attribute that is not dependent on charset. Although an RTP profile
may make static assignments of payload type numbers to payload formats, it is more common for that
assignment to be done dynamically using a=rtpmap: attributes.
As an example of a static payload type, consider u-law PCM-coded single-channel audio sampled at
8 kHz. This is completely defined in the RTP audio/video profile as payload type 0, so there is no
need for an a=rtpmap: attribute, and the media for such a stream sent to UDP port 49232 can be
specified as

m=audio 49232 RTP/AVP 0
An example of a dynamic payload type is 16-bit linear encoded stereo audio sampled at 16 kHz. If we wish to use the dynamic RTP/AVP payload type 98 for this stream, additional information is required to decode it:

charset. If none of the attributes sendonly, recvonly, inactive, and sendrecv is present, sendrecv should be assumed as the default for sessions that are not of the conference type broadcast or H332 (see below).
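The dynamic payload type example above (16-bit linear stereo audio at 16 kHz on payload type 98) can be illustrated with a short Python sketch. The lines it handles, m=audio 49232 RTP/AVP 98 and a=rtpmap:98 L16/16000/2, are an assumption built from the a=rtpmap syntax described earlier, since the original example lines are not reproduced in this text. The names RtpMap, parse_rtpmap, and format_rtpmap are hypothetical helpers, not part of any SDP library.

```python
import re
from typing import NamedTuple, Optional


class RtpMap(NamedTuple):
    """One a=rtpmap entry (hypothetical helper type for illustration)."""
    payload_type: int
    encoding_name: str
    clock_rate: int
    encoding_params: Optional[str]  # for audio, typically the channel count


# a=rtpmap:<payload type> <encoding name>/<clock rate>[/<encoding parameters>]
_RTPMAP_RE = re.compile(r"^a=rtpmap:(\d+) ([^/\s]+)/(\d+)(?:/(\S+))?$")


def parse_rtpmap(line: str) -> RtpMap:
    """Parse a single a=rtpmap line into its four fields."""
    match = _RTPMAP_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not an a=rtpmap line: {line!r}")
    pt, name, rate, params = match.groups()
    return RtpMap(int(pt), name, int(rate), params)


def format_rtpmap(entry: RtpMap) -> str:
    """Render an RtpMap back into an a=rtpmap line."""
    text = f"a=rtpmap:{entry.payload_type} {entry.encoding_name}/{entry.clock_rate}"
    if entry.encoding_params is not None:
        text += f"/{entry.encoding_params}"
    return text


# Assumed lines for the dynamic payload type 98 example
# (L16 encoding, 16 kHz clock rate, 2 channels).
media_line = "m=audio 49232 RTP/AVP 98"
rtpmap_line = "a=rtpmap:98 L16/16000/2"
entry = parse_rtpmap(rtpmap_line)
```

A static payload type such as 0 needs no such line, which is why the m=audio 49232 RTP/AVP 0 example carries no a=rtpmap attribute.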
This specifies the character set to be used to display the session name and information data. By default, the ISO-10646 character set in UTF-8 encoding is used. If a more compact representation is required, other character sets may be used. For example, ISO 8859-1 is specified with the following SDP attribute:

a=charset:ISO-8859-1

This is a session-level attribute and is not dependent on charset. The charset specified must be one of those registered with IANA, such as ISO-8859-1. The character set identifier is a US-ASCII string and must be compared against the IANA identifiers using a case-insensitive comparison. If the identifier is not recognized or not supported, all strings that are affected by it should be regarded as octet strings.
Note that a character set specified must still prohibit the use of bytes 0x00 (Nul), 0x0A (LF), and 0x0D (CR). Character sets requiring the use of these characters must define a quoting mechanism that prevents these bytes from appearing within text fields.

a=sdplang:<language tag>

This can be a session-level attribute or a media-level attribute. As a session-level attribute, it specifies the language for the session description. As a media-level attribute, it specifies the language for any media-level SDP information field associated with that media. Multiple sdplang attributes can be provided either at session or media level if the session description or media use multiple languages, in which case the order of the attributes indicates the order of importance of the various languages in the session or media from most important to least important.
In general, sending session descriptions consisting of multiple languages is discouraged. Instead, multiple descriptions should be sent describing the session, one in each language. However, this is not possible with all transport mechanisms, and so multiple sdplang attributes are allowed although not recommended. The sdplang attribute value must be a single RFC 3066 language tag in US-ASCII. It is not dependent on the charset attribute. An sdplang attribute should be specified when a session is of sufficient scope to cross geographic boundaries where the language of recipients cannot be assumed, or where the session is in a different language from the locally assumed norm.

a=lang:<language tag>

This can be a session-level attribute or a media-level attribute. As a session-level attribute, it specifies the default language for the session being described. As a media-level attribute, it specifies the language for that media, overriding any session-level language specified. Multiple lang attributes can be provided either at session or media level if the session description or media use multiple languages, in which case the order of the attributes indicates the order of importance of the various languages in the session or media from most important to least important.
The lang attribute value must be a single RFC 3066 language tag in US-ASCII. It is not dependent on the charset attribute. A lang attribute should be specified when a session is of sufficient scope to cross geographic boundaries where the language of recipients cannot be assumed, or where the session is in a different language from the locally assumed norm.

a=framerate:<frame rate>

This gives the maximum video frame rate in frames per second. It is intended as a recommendation for the encoding of video data. Decimal representations of fractional values using the notation <integer>.<fraction> are allowed. It is a media-level attribute, defined only for video media, and it is not dependent on charset.

a=quality:<quality>

This gives a suggestion for the quality of the encoding as an integer value. The intention of the quality attribute for video is to specify a nondefault trade-off between frame rate and still-image quality. For video, the value is in the range 0 to 10, with the following suggested meaning:

10 - the best still-image quality the compression scheme can give
5 - the default behavior given no quality suggestion
0 - the worst still-image quality the codec designer thinks is still usable

It is a media-level attribute, and it is not dependent on charset.

a=fmtp:<format> <format specific parameters>

This attribute allows parameters that are specific to a particular format to be conveyed in a way that SDP does not have to understand them. The format must be one of the formats specified for the media. Format-specific parameters may be any set of parameters required to be conveyed by SDP and given unchanged to the media tool that will use this format. At most, one instance of this attribute is allowed for each format. It is a media-level attribute, and it is not dependent on charset. In addition, Table 7.5 shows more SDP attributes that are defined in different RFCs.
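As a sketch of the two a= line forms described in this section (property attributes of the form a=<flag> and value attributes of the form a=<attribute>:<value>), the following Python fragment splits an attribute line into its name and value and rejects the octets that attribute values may not contain. The function name parse_attribute and the choice to return None for property attributes are assumptions made for illustration; the sample lines reuse attributes that appear in this chapter.

```python
from typing import Optional, Tuple

# Nul, LF, and CR may not appear inside an attribute value.
FORBIDDEN_OCTETS = ("\x00", "\n", "\r")


def parse_attribute(line: str) -> Tuple[str, Optional[str]]:
    """Split an a= line into (name, value).

    The value is None for a property attribute such as a=recvonly.
    """
    text = line.rstrip("\r\n")  # drop only the trailing line terminator
    if not text.startswith("a="):
        raise ValueError(f"not an attribute line: {line!r}")
    # Only the first colon separates name and value; later colons belong to the value.
    name, sep, value = text[2:].partition(":")
    if not name or not name.isascii():
        raise ValueError("attribute names must be non-empty US-ASCII")
    if any(ch in value for ch in FORBIDDEN_OCTETS):
        raise ValueError("attribute value contains Nul, LF, or CR")
    return (name, value if sep else None)


samples = [
    "a=recvonly",
    "a=orient:landscape",
    "a=fmtp:97 octet-align;",
]
parsed = [parse_attribute(s) for s in samples]
```

Interpreting the value (for example, the format-specific parameters of a=fmtp) is left to the media tool, in line with the rule that unknown attributes are ignored rather than rejected.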
7.7.6.4 Further Mechanisms That Change the Bandwidth Utilization

There exist a number of other mechanisms that also may change the overhead at layers below media transport. We will briefly cover a few of these here.

7.7.6.4.1 IPsec

IP security (IPsec) (RFC 4301) can be used between end points to provide confidentiality through the application of the IP Encapsulating Security Payload (ESP) (RFC 4303) or integrity protection using the IP Authentication Header (AH) (RFC 4302) of the media stream. The addition of the ESP and AH headers increases each packet’s size. To provide virtual private networks, complete IP packets may be encapsulated between an end node and the private network’s security gateway, thus providing a secure tunnel that ensures confidentiality, integrity, and authentication of the packet stream. In this case, the extra IP and ESP header will significantly increase the packet size.

7.7.6.4.2 Header Compression

Another mechanism that alters the actual overhead over links is header compression. Header compression uses the fact that most network protocol headers have either static or predictable values in their fields within a packet stream. Compression is normally only done on a per-hop basis, that is, on a single link. The normal reason for doing header compression is that the link has fairly limited bandwidth and significant gain in throughput is achieved. There exist several different header compression standards. For compressing IP headers only, there is RFC 2507. For compressing packets with IP/UDP/RTP headers, RFC 2508 was created at the same time. More recently, the Robust Header Compression (ROHC) working group has been developing a framework and profiles (RFC 3095) for compressing certain combinations of protocols, like IP/UDP and IP/UDP/RTP.

7.7.6.5 Bandwidth Signaling Problems
When an application wants to use SDP to signal the bandwidth required for this application, some problems become evident due to the inclusion of the lower layers in the bandwidth values.

7.7.6.5.1 IP Version

If one signals the bandwidth in SDP, for example, using b=AS: as an RTP-based application, one cannot know if the overhead is calculated for IPv4 or IPv6. An indication of which protocol has been used when calculating the bandwidth values is given by the c= connection address line. This line contains either a multicast group address or a unicast address of the data source or sink. The c= line’s address type may be assumed to be of the same type as the one used in the bandwidth calculation, although no document specifying this point seems to exist. In cases of SDP transported by RTSP, this is even less clear. The normal usage for a unicast on-demand streaming session is to set the connection data address to a null address. This null address does have an address type, which could be used as an indication. However, this is also not clarified anywhere. Figure 7.4 illustrates a connection scenario between a streaming server A and a client B over a translator. When B receives the SDP from A over RTSP, it will be very difficult for B to know what the bandwidth values in the SDP represent. The following possibilities exist:

1. The SDP is unchanged and the c= null address is of type IPv4. The bandwidth value represents the bandwidth needed in an IPv4 network.
2. The SDP has been changed by an Application Level Gateway (ALG). The c= address is changed to an IPv6 type. The bandwidth value is unchanged.
3. The SDP is changed and both the c= address type and the bandwidth value are converted. Unfortunately, this can seldom be done.

In case 1, the client can understand that the server is located in an IPv4 network and that it uses IPv4 overhead when calculating the bandwidth value. The client can almost never convert the bandwidth value, as described below. In case 2, the client does not know that the server is in an IPv4 network and that the bandwidth value is not calculated with IPv6 overhead. In cases where a client uses this value to determine if its end of the network has sufficient resources, the client will underestimate the required bit rate, potentially resulting in bad application performance. Case 3 will be very rare. If one tries to convert the bandwidth value without further information about the packet rate, significant errors may be introduced into the value.

7.7.6.5.2 Taking Other Mechanisms into Account
We have described earlier that there will be a number of reasons, like header compression and tunnels, that would change lower-layer header sizes. For these mechanisms, there exist different possibilities to take them into account. Using IPsec directly between end points should definitely be known to the application, thus enabling it to take the extra headers into account. However, the same problem also exists with the current SDP bandwidth modifiers where a receiver is not able to convert these values taking the IPsec headers into account. It is less likely that an application would be
aware of the existence of a virtual private network. Thus, the generality of the mechanism to tunnel all traffic may prevent the application from even considering whether it would be possible to convert the values.
When using header compression, the actual overhead will be less deterministic; however, in most cases, an average overhead can be determined for a certain application. If a network node knows that some type of header compression is employed, this can be taken into consideration. For the Resource Reservation Protocol (RSVP) (RFC 2205), there exists an extension (RFC 3006) that allows the data sender to inform network nodes about the compressibility of the data flow. To be able to do this with any accuracy, the compression factor and packet rate or size is needed, as RFC 3006 provides.

7.7.6.5.3 Converting Bandwidth Values
If one would like to convert a bandwidth value calculated using IPv4 overhead to IPv6 overhead, the packet rate is required. The new bandwidth value for IPv6 is normally IPv4 bandwidth + packet rate * 20 bytes, where 20 bytes is the usual difference between IPv6 and IPv4 headers. The overhead difference may be some other value in cases when IPv4 options (RFC 791) or IPv6 extension headers (RFC 2460) are used. As converting requires the packet rate of the stream, this is not possible in the general case. Many codecs have either multiple possible packet/frame rates or can perform payload format aggregation, resulting in many possible rates. Therefore, some extra information in the SDP will be required. The a=ptime: parameter may be a possible candidate. However, this parameter is normally only used for audio codecs. Its definition as described earlier (RFC 4566) is that it is only a recommendation, which the sender may disregard. A better parameter is needed.

7.7.6.5.4 RTCP Problems
When RTCP is used between hosts in IPv4 and IPv6 networks over a translator, similar problems exist. The RTCP traffic going from the IPv4 domain will result in a higher RTCP bit rate than intended in the IPv6 domain due to the larger headers. This may result in up to a 25% increase in required bandwidth for the RTCP traffic. The largest increase will be for small RTCP packets when the number of IPv4 hosts is much larger than the number of IPv6 hosts. Fortunately, as RTCP has a limited bandwidth compared with RTP, it will only result in a maximum of 1.75% increase of the total session bandwidth when RTCP bandwidth is 5% of RTP bandwidth. The RTCP randomization may easily result in short-term effects of the same magnitude, so this increase may be considered tolerable. The increase in bandwidth will, in most cases, be less. At the same time, this results in unfairness in the reporting between an IPv4 and IPv6 node. In the worst-case scenario, the IPv6 node may report with 25% longer intervals. These problems have been considered insignificant enough to not be worth any complex solutions. Therefore, only a simple algorithm for deriving RTCP bandwidth is defined in this specification.

7.7.6.5.5 Problem Conclusion
A shortcoming of the current SDP bandwidth modifiers is that they also include the bandwidth needed for lower layers. It is, in many cases, difficult to determine which lower layers and their versions were included in the calculation, especially in the presence of translation or proxying between different domains. This prevents a receiver from determining whether given bandwidth values need to be converted based on the actual lower layers being used. Second, an attribute to give the receiver an explicit determination of the maximum packet rate that will be used does not exist. This value is necessary for accurate conversion of any bandwidth values if the difference in overhead is known.

7.7.6.6 Problem Scope
The problems described earlier are common and affect application-level signaling using SDP, other signaling protocols, and also resource reservation protocols. However, this document targets the specific problem of signaling the bit rate in SDP. As SDP information is normally transported end-to-end by an application protocol, nodes between the end points will not have access to the bit-rate information. It will normally only be the end points that are able to take this information into account. An interior node will need to receive the information through a means other than SDP, and that is outside the scope of this specification. Nevertheless, the bit-rate information provided in this specification is sufficient for cases such as first-hop resource reservation and admission control. It also provides information about the maximum codec rate, which is independent of lower-level protocols. This specification does not try to solve the problem of detecting network address translators or other middle boxes.

7.7.6.7 Requirements
A solution to the problems outlined in the preceding sections, within the above applicability, should meet the following requirement:

◾◾ The bandwidth value shall be given in a way such that it can be calculated for all possible combinations of transport overhead.
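The conversion rule from Section 7.7.6.5.3 (IPv6 bandwidth = IPv4 bandwidth + packet rate * 20 bytes) can be worked through with a small Python sketch. The stream used here (64 kbit/s of payload at 50 packets per second over IP/UDP/RTP with 20-, 8-, and 12-byte headers) is an assumed example, not taken from the text, and the function name ipv4_to_ipv6_bandwidth is hypothetical. The last lines show why the packet rate must be known: a receiver that guesses the wrong rate arrives at a different value.

```python
IPV4_HEADER_BYTES = 20  # assumed: no IPv4 options
IPV6_HEADER_BYTES = 40  # assumed: no IPv6 extension headers


def ipv4_to_ipv6_bandwidth(ipv4_bps, packet_rate,
                           extra_bytes=IPV6_HEADER_BYTES - IPV4_HEADER_BYTES):
    """Return the IPv6 bandwidth in bit/s for a value computed with IPv4 overhead.

    IPv6 bandwidth = IPv4 bandwidth + packet rate * header difference (in bits).
    """
    return ipv4_bps + packet_rate * extra_bytes * 8


# Assumed stream: 64 kbit/s payload, 50 packets/s, IPv4 (20) + UDP (8) + RTP (12) bytes.
payload_bps = 64_000
packet_rate = 50
ipv4_bps = payload_bps + packet_rate * (20 + 8 + 12) * 8    # 80,000 bit/s
ipv6_bps = ipv4_to_ipv6_bandwidth(ipv4_bps, packet_rate)    # 88,000 bit/s

# A receiver that wrongly assumes 25 packets/s underestimates the IPv6 value.
wrong_guess_bps = ipv4_to_ipv6_bandwidth(ipv4_bps, 25)      # 84,000 bit/s
```

The 4,000 bit/s gap between the two results comes entirely from the packet-rate guess, which is the motivation for signaling a packet rate alongside the bandwidth value.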
RS modifiers. A specification of how to derive the RTCP bit rate when using TIAS is presented later.

7.7.6.8.4 Bandwidth Modifier Usage Rules

TIAS is primarily intended to be used at the SDP media level. The TIAS bandwidth attribute MAY be present at the session level in SDP, if all media streams use the same transport. In cases where the sum of the media level values for all media streams is larger than the actual maximum bandwidth needed for all streams, it should be included at session level. However, if present at the session level, it should be present also at the media level. TIAS shall not be present at the session level unless the same transport protocols are used for all media streams. The same transport is used as long as the same combination of protocols is used, like IPv6/UDP/RTP. To allow for backwards compatibility with applications of SDP that do not implement TIAS, it is recommended to also include the AS modifier when using TIAS. The presence of a value including lower-layer overhead, even with its problems, is better than none. However, an SDP application implementing TIAS should ignore the AS value and use TIAS instead when both are present. When using TIAS for an RTP-transported stream, the maxprate attribute defined next shall be included at the corresponding SDP level, if it is possible to calculate.

7.7.6.8.5 Packet Rate Parameter

To be able to calculate the bandwidth value including the lower layers actually used, a packet rate attribute is also defined. The SDP session and media level maximum packet rate attribute is defined as

a=maxprate:<packet-rate>

The ABNF definition is provided later. The <packet-rate> is a floating-point value for the stream’s maximum packet rate in packets per second. If the number of packets is variable, the given value shall be the maximum the application can produce in case of a live stream, or for stored on-demand streams, has produced. The packet rate is calculated by adding the number of packets sent within a 1-second window. The maxprate is the largest value produced when the window slides over the entire media stream. In cases when this cannot be calculated, that is, a live stream, an estimated value of the maximum packet rate the codec can produce for the given configuration and content shall be used. Note that the sliding window calculation will always yield an integer number. However, the attribute field is a floating-point value because the estimated or known maximum packet rate per second may be fractional.
At the SDP session level, the maxprate value is the maximum packet rate calculated over all the declared media streams. If this cannot be measured (stored media) or estimated (live), the sum of all media level values provides a ceiling value. Note that the value at session level can be less than the sum of the individual media streams due to temporal distribution of the media streams’ maximums. The maxprate attribute must not be present at the session level if the media streams use different transport. The attribute may be present if the media streams use the same transport. If the attribute is present at the session level, it should also be present at the media level for all media streams. The maxprate shall be included for all transports where a packet rate can be derived and TIAS is included. For example, if you use TIAS and a transport like IP/UDP/RTP, for which the max packet rate (actual or estimated) can be derived, then maxprate shall be included. However, if either (a) the packet rate for the transport cannot be derived, or (b) TIAS is not included, then maxprate is not required to be included.

7.7.6.8.6 Converting to Transport-Dependent Values
When converting the transport-independent bandwidth value (bw-value) into a transport-dependent value, including the lower layers, the following steps must be carried out:

1. Determine which lower layers will be used and calculate the sum of the sizes of the headers in bits (h-size). In cases of variable header sizes, the average size shall be used. For RTP-transported media, the lower layers shall include the RTP header with header extensions, if used, the CSRC list, and any profile-specific extensions.
2. Retrieve the maximum packet rate from the SDP (prate = maxprate).
3. Calculate the transport overhead by multiplying the header sizes by the packet rate (t-over = h-size * prate).
4. Round the transport overhead up to the nearest integer in bits (t-over = CEIL(t-over)).
5. Add the transport overhead to the transport-independent bandwidth value (total bit-rate = bw-value + t-over).

When the above calculation is performed using the maxprate, the bit-rate value will be the absolute maximum the media stream may use over the transport assumed in the calculations.

7.7.6.8.7 Deriving RTCP Bandwidth

This chapter does not solve the fairness and possible bit-rate change introduced by IPv4 to IPv6 translation. These differences are considered small enough, and known solutions introduce code changes to the RTP/RTCP implementation. This section provides a consistent way of calculating the bit rate to assign to RTCP, if not explicitly given. First,
the transport-dependent RTP session bit rate is calculated, as described in the preceding section, using the actual transport layers used at the end point where the calculation is done. The RTCP bit rate is then derived as usual based on the RTP session bandwidth, that is, normally equal to 5% of the calculated value.
Giving the exact same RTCP bit-rate value to both the IPv4 and IPv6 hosts will result in the IPv4 host having a higher RTCP sending rate. The sending rate represents the number of RTCP packets sent during a given time interval. The sending of RTCP is limited according to rules defined in the RTP specification (RFC 3550, see Section 7.2). For a 100-byte RTCP packet (including UDP/IPv4), the IPv4 sender has an approximately 20% higher sending rate. This rate falls with larger RTCP packets. For example, 300-byte packets will only give the IPv4 host a 7% higher sending rate.
The above rule for deriving RTCP bandwidth gives the same behavior as fixed assignment when the RTP session has traffic parameters giving a large TIAS/maxprate ratio. The two hosts will be fair when the TIAS/maxprate ratio is approximately 40 bytes/packet, given 100-byte RTCP packets. For a TIAS/maxprate ratio of 5 bytes/packet, the IPv6 host will be allowed to send approximately 15–20% more RTCP packets. The larger the RTCP packets become, the more it will favor the IPv6 host in its sending rate. The conclusion is that, within the normal useful combination of transport-independent bit rates and packet rates, the difference in fairness between hosts on different IP versions with different overhead is acceptable. For the 20-byte difference in overhead between IPv4 and IPv6 headers, the RTCP bandwidth actually used in a unicast connection case will not be larger than approximately 1% of the total session bandwidth.

7.7.6.8.8 Augmented Backus–Naur Form (ABNF) Syntax

The ABNF syntax for the bandwidth modifier and the packet rate attribute is provided as follows:

s=Example of TIAS and maxprate in use
c=IN IP4 0.0.0.0
b=AS:60
b=TIAS:50780
t=0 0
a=control:rtsp://server.example.com/media.3gp
a=range:npt=0-150.0
a=maxprate:28.0
m=audio 0 RTP/AVP 97
b=AS:12
b=TIAS:8480
a=maxprate:10.0
a=rtpmap:97 AMR/8000
a=fmtp:97 octet-align;
a=control:rtsp://server.example.com/media.3gp/trackID=1
m=video 0 RTP/AVP 99
b=AS:48
b=TIAS:42300
a=maxprate:18.0
a=rtpmap:99 MP4V-ES/90000
a=fmtp:99 profile-level-id=8; config=000001B008000001B509000001010000012000884006682C2090A21F
a=control:rtsp://server.example.com/media.3gp/trackID=3

In this example of a streaming session’s SDP, there are two media streams, one audio stream encoded with AMR and one video stream encoded with the MPEG-4 video encoder. AMR is used here to produce a constant rate media stream and uses a packetization resulting in 10 packets per second. This results in a TIAS bandwidth rate of 8480 bits per second, and the claimed 10 packets per second. The video stream is more variable. However, it has a measured maximum payload rate of 42,300 bits per second. The video stream also has a variable packet rate, despite the fact that the video is 15 frames per second, where at least one instance in a second-long window contains 18 packets.
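As an illustration of the five conversion steps in Section 7.7.6.8.6 and the 5% RTCP rule in Section 7.7.6.8.7, the following Python sketch applies them to the TIAS and maxprate values of the streaming example (8480 bit/s at 10 packets/s for audio, 42,300 bit/s at 18 packets/s for video, 50,780 bit/s at 28 packets/s for the session). The header sizes are assumptions: 20-byte IPv4 or 40-byte IPv6 headers without options or extension headers, an 8-byte UDP header, and a 12-byte RTP header with no CSRC list or header extension. The function names are hypothetical.

```python
import math

# Assumed header sizes in bytes (no IP options or extension headers,
# no RTP header extension, empty CSRC list).
HEADER_BYTES = {"IPv4": 20, "IPv6": 40, "UDP": 8, "RTP": 12}


def transport_dependent_bitrate(bw_value, maxprate, ip_version="IPv4"):
    """Convert a TIAS value (bit/s) into a bit rate that includes IP/UDP/RTP overhead."""
    # Step 1: sum of the lower-layer header sizes in bits (h-size).
    h_size = (HEADER_BYTES[ip_version] + HEADER_BYTES["UDP"] + HEADER_BYTES["RTP"]) * 8
    # Step 2: the packet rate comes from the a=maxprate attribute.
    prate = maxprate
    # Steps 3 and 4: transport overhead, rounded up to whole bits.
    t_over = math.ceil(h_size * prate)
    # Step 5: add the overhead to the transport-independent value.
    return bw_value + t_over


def rtcp_bitrate(session_bps, percent=5):
    """RTCP bit rate as a percentage (5% by default) of the RTP session bit rate."""
    return session_bps * percent / 100


audio_ipv4 = transport_dependent_bitrate(8480, 10.0, "IPv4")     # 11680
audio_ipv6 = transport_dependent_bitrate(8480, 10.0, "IPv6")     # 13280
video_ipv4 = transport_dependent_bitrate(42300, 18.0, "IPv4")    # 48060
session_ipv4 = transport_dependent_bitrate(50780, 28.0, "IPv4")  # 59740
audio_rtcp_ipv4 = rtcp_bitrate(audio_ipv4)                       # 584.0
```

Under these assumptions the IPv4 results (about 11.7, 48.1, and 59.7 kbit/s) are close to the b=AS values of 12, 48, and 60 in the example, while the IPv6 audio figure is 1,600 bit/s higher because each of the 10 packets per second carries 20 more header bytes.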
7.7.6.9.2 SIP

The usage of TIAS together with maxprate should not be different from the handling of the AS modifier currently in use. The needed transport parameters will be available in the transport field in the m= line. The address class can be determined from the c= field and the client’s connectivity.

7.7.6.9.3 SAP
In the case of SAP, all available information to calculate the transport-dependent bit rate should be present in the SDP. The c= information gives the address family used for the multicast. The transport layer, for example, RTP/UDP, for each media is evident in the media line (m=) and its transport field.

7.7.7 SDP Format for BFCP Streams

7.7.7.1 Overview

RFC 4583 specifies how to describe BFCP streams in SDP descriptions. User agents using the offer–answer model to establish BFCP streams use this format in their offers and answers. A given BFCP client needs a set of data in order to establish a BFCP connection to a floor control server. These data include the transport address of the server, the conference identifier, and the user identifier. One way for clients to obtain this information is to use an offer–answer (RFC 3264, see Section 3.8.4) exchange. This document specifies how to encode this information in the SDP session descriptions that are part of such an offer–answer exchange. User agents typically use the offer–answer model to establish a number of media streams of different types. Following this model, a BFCP connection is described as any other media stream by using an SDP m line, possibly followed by a number of attributes encoded in “a” lines.

7.7.7.2 Fields in the m Line
This section describes how to generate an m line for a BFCP stream. According to the SDP specification described earlier, the m line format is as follows:

m=<media> <port> <transport> <fmt>...

The media field must have a value of application. The port field is set following the rules in RFC 4145. Depending on the value of the setup attribute discussed later, the port field contains the port to which the remote end point will initiate its TCP connection or is irrelevant (i.e., the end point will initiate the connection toward the remote end point) and should be set to a value of 9, which is the discard port. Since BFCP only runs on top of TCP, the port is always a TCP port. A port field value of zero has the standard SDP meaning (i.e., rejection of the media stream). We define two new values for the transport field: TCP/BFCP and TCP/TLS/BFCP. The former is used when BFCP runs directly on top of TCP, and the latter is used when BFCP runs on top of TLS, which in turn runs on top of TCP. The fmt (format) list is ignored for BFCP. The fmt list of BFCP m lines should contain a single “*” character. The following is an example of an m line for a BFCP connection:

m=application 50000 TCP/TLS/BFCP *

7.7.7.3 Floor Control Server Determination

When two end points establish a BFCP stream, they need to determine which of them acts as a floor control server. In the most common scenario, a client establishes a BFCP stream with a conference server that acts as the floor control server. Floor control server determination is straightforward because one end point can only act as a client and the other can only act as a floor control server. However, there are scenarios where both end points could act as a floor control server. For example, in a two-party session that involves an audio stream and a shared whiteboard, the end points need to decide which party will be acting as the floor control server. Furthermore, there are situations where both the offerer and the answerer act as both clients and floor control servers in the same session. For example, in a two-party session that involves an audio stream and a shared whiteboard, one party acts as the floor control server for the audio stream and the other acts as the floor control server for the shared whiteboard. We define the floorctrl SDP media-level attribute to perform floor control determination. Its ABNF syntax is

floor-control-attribute = "a=floorctrl:" role *(SP role)
role = "c-only" / "s-only" / "c-s"

The offerer includes this attribute to state all the roles it would be willing to perform:

c-only: The offerer would be willing to act as a floor control client only.
s-only: The offerer would be willing to act as a floor control server only.
c-s: The offerer would be willing to act both as a floor control client and as a floor control server.

If an m line in an offer contains a floorctrl attribute, the answerer must include one in the corresponding m line in the answer. The answerer includes this attribute to state which role the answerer will perform. That is, the answerer chooses one of the roles the offerer is willing to perform
and generates an answer with the corresponding role for the answerer. The following table shows the corresponding roles for an answerer, depending on the offerer’s role.

Offerer    Answerer
c-only     s-only
s-only     c-only
c-s        c-s

BFCP connections MUST support the confid and the userid attributes. A floor control server acting as an offerer or as an answerer should include these attributes in its session descriptions.

7.7.7.5 Association between Streams and Floors
We define the floorid SDP media-level attribute. Its ABNF syntax is
authenticate each other using some mechanism. Once this mutual authentication takes place, all the offerer and the answerer need to ensure is that the entity they are receiving the BFCP messages from is the same as the one that generated the previous offer or answer. When SIP is used to perform an offer–answer exchange, the initial mutual authentication takes place at the SIP level. Additionally, SIP uses S/MIME (see Section 19.6) to provide an integrity-protected channel with optional confidentiality for the offer–answer exchange. BFCP takes advantage of this integrity-protected offer–answer exchange to perform authentication. Within the offer–answer exchange, the offerer and answerer exchange the fingerprints of their self-signed certificates. These self-signed certificates are then used to establish the TLS connection that will carry BFCP traffic between the offerer and the answerer.

BFCP clients and floor control servers follow the rules in RFC 4572 regarding certificate choice and presentation. This implies that unless a fingerprint attribute is included in the session description, the certificate provided at the TLS level must either be directly signed by one of the other party’s trust anchors or be validated using a certification path that terminates at one of the other party’s trust anchors (RFC 3280). End points that use the offer–answer model to establish BFCP connections must support the fingerprint attribute and should include it in their session descriptions. When TLS is used, once the underlying TCP connection is established, the answerer acts as the TLS server regardless of its role (passive or active) in the TCP establishment procedure.

7.7.7.8 Examples

For the purpose of brevity, the main portion of the session description is omitted in the examples, which only show m lines and their attributes. The following is an example of an offer sent by a conference server to a client.

m=application 50000 TCP/TLS/BFCP *
a=setup:passive
a=connection:new
a=fingerprint:SHA-1 \
     4A:AD:B9:B1:3F:82:18:3B:54:02:12:DF:3E:5D:49:6B:19:E5:7C:AB
a=floorctrl:s-only
a=confid:4321
a=userid:1234
a=floorid:1 m-stream:10
a=floorid:2 m-stream:11
m=audio 50002 RTP/AVP 0
a=label:10
m=video 50004 RTP/AVP 31
a=label:11

Note that due to RFC formatting conventions, this document splits SDP across lines whose content would exceed 72 characters. A backslash character marks where this line folding has taken place. This backslash and its trailing CRLF and white space would not appear in actual SDP content. The following is the answer returned by the client.

m=application 9 TCP/TLS/BFCP *
a=setup:active
a=connection:new
a=fingerprint:SHA-1 \
     3D:B4:7B:E3:CC:FC:0D:1B:5D:31:33:9E:48:9B:67:FE:68:40:E8:21
a=floorctrl:c-only
m=audio 55000 RTP/AVP 0
m=video 55002 RTP/AVP 31

7.7.8 SDP Content Attribute

7.7.8.1 Overview

There are situations where one application receives several similar media streams, which are described in an SDP session description. The media streams can be similar in the sense that their content cannot be distinguished just by examining their media description lines (e.g., two video streams). The content attribute is needed so that the receiving application can treat each media stream appropriately based on its content. RFC 4796 defines a new SDP media-level attribute, content. The content attribute defines the content of the media stream to a more detailed level than the media description line. The SDP content media-level attribute provides more information about the media stream than the m line in an SDP session description. The sender of an SDP session description can attach the content attribute to one or more media streams. The receiving application can then treat each media stream differently (e.g., show it on a big or small screen) based on its content. The main purpose of this specification is to allow applications to take automated actions based on the content attributes. However, this specification does not define those actions. Consequently, two implementations can behave completely differently when receiving the same content attribute.

7.7.8.2 Related Techniques

The label attribute defined in RFC 4574 enables a sender to attach a pointer to a particular media stream. The namespace of the label attribute itself is unrestricted; so, in principle, it could also be used to convey information about the content of a media stream. However, in practice, this is not possible because of the need for backward compatibility. Existing implementations of the label attribute already use values from that unrestricted namespace in an application-specific way. Thus, it is not possible to reserve portions of the label attribute’s namespace without possible
conflict with already used application-specific labels. It is possible to assign semantics to a media stream with an external document that uses the label attribute as a pointer. The downside of this approach is that it requires an external document. Therefore, this kind of mechanism is only applicable to special-use cases where such external documents are used (e.g., centralized conferencing). Yet another way to attach semantics to a media stream is to use the “i” SDP attribute, defined in RFC 4566 described earlier. However, values of the “i” attribute are intended for human users and not for automata.

7.7.8.3 Motivation for the New Content Attribute

Currently, SDP does not provide any means for describing the content of a media stream (e.g., speaker’s image, slides, sign language) in a form that the application can understand. Of course, the end user can see the content of the media stream and read its title, but the application cannot understand what the media stream contains. The application that is receiving multiple similar (e.g., same type and format) media streams needs, in some cases, to know what the contents of those streams are. This kind of situation occurs, for example, in cases where presentation slides, the speaker’s image, and sign language are transported as separate media streams. It would be desirable that the receiving application could distinguish them in a way that it could handle them automatically in an appropriate manner. Figure 7.6 shows a screen of a typical communication application. The content attribute makes it possible for the application to decide where to show each media stream. From an end user’s perspective, it is desirable that the user does not need to arrange each media stream every time a new media session starts.

The content attribute could also be used in more complex situations. An example of such a situation is an application controlling equipment in an auditorium. An auditorium can have many different output channels for video (e.g., main screen and two smaller screens) and audio (e.g., main speakers and headsets for the participants). In this kind of environment, a lot of interaction from the end user who operates the application would be required in the absence of cues from a controlling application. The content attribute would make it possible, for example, for an end user to specify, only once, which output each media stream of a given session should use. The application could automatically apply the same media layout for subsequent sessions. Therefore, the content attribute can help reduce the amount of required end-user interaction considerably.

7.7.8.4 Content Attribute

This specification defines a new media-level value attribute, content. Its formatting in SDP is described by the following ABNF:

content-attribute = "a=content:" mediacnt-tag
mediacnt-tag      = mediacnt *("," mediacnt)
mediacnt          = "slides" / "speaker" / "sl" / "main" / "alt" / mediacnt-ext
mediacnt-ext      = token

The content attribute contains one or more tokens, which may be attached to a media stream by a sending application. An application may attach a content attribute to any media stream it describes. This document provides a set of predefined values for the content attribute. Other values can be defined in the future. The predefined values are as follows:

◾◾ Slides: the media stream includes presentation slides. The media type can be, for example, a video stream or a number of instant messages with pictures. Typical use cases for this are online seminars and courses. This is similar to the presentation role in H.239 [1].
◾◾ Speaker: the media stream contains the image of the speaker. The media can be, for example, a video stream or a still image. Typical use cases for this are online seminars and courses.
◾◾ Sl: the media stream contains sign language. A typical use case for this is an audio stream that is translated into sign language, which is sent over a video stream.
◾◾ Main: the media stream is taken from the main source. A typical use case for this is a concert where the camera is shooting the performer.
◾◾ Alt: the media stream is taken from the alternative source. A typical use case for this is an event where the ambient sound is separated from the main sound. The alternative audio stream could be, for example, the sound of a jungle. Another example is the video of a conference room, while the main stream carries the video of the speaker. This is similar to the live role in H.239.

All these values can be used with any media type. We chose not to restrict each value to a particular set of media

[Figure: application screen divided into three regions labeled Speaker’s image, Presentation slides, and Sign language.]
Figure 7.6 Application’s screen. (Copyright IETF. Reproduced with permission.)
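As an illustration of the ABNF in Section 7.7.8.4 and of the screen in Figure 7.6, the following Python sketch parses an a=content line into its tokens and picks a screen region for the stream. It is only a sketch under stated assumptions: the function names, the LAYOUT table, and the region names are hypothetical, and the token check is a simplified rendering of the SDP token definition in RFC 4566. As noted in the overview, the specification itself does not define what an application does with these values.

```python
import re

# Predefined values from the ABNF; any other token is an extension value.
PREDEFINED = {"slides", "speaker", "sl", "main", "alt"}

# Simplified SDP "token" check (assumed from the RFC 4566 token-char set).
TOKEN = re.compile(r"^[!#$%&'*+\-.0-9A-Z^_`a-z{|}~]+$")


def parse_content_attribute(line):
    """Return the list of tokens carried by an 'a=content:' line."""
    prefix = "a=content:"
    if not line.startswith(prefix):
        raise ValueError("not a content attribute")
    tokens = line[len(prefix):].split(",")
    for tok in tokens:
        if not TOKEN.match(tok):
            raise ValueError(f"invalid token: {tok!r}")
    return tokens


# Hypothetical layout policy: content value -> region of the screen.
LAYOUT = {
    "slides": "main area",
    "speaker": "side panel",
    "sl": "corner inset",
}


def place_stream(line, default="unassigned"):
    """Pick a region from the first token that has a layout rule."""
    for tok in parse_content_attribute(line):
        if tok in LAYOUT:
            return LAYOUT[tok]
    return default
```

With this policy, a stream described by a=content:slides,main would be shown in the main area, while a stream carrying only an extension value falls back to the default; a different application could apply an entirely different policy to the same attribute.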
Media Transport Protocol and Media Negotiation ◾ 349
its own merit due to complexities of media negotiations in SIP and related protocols. Finally, the security aspects of both real-time transport protocols and SDP have been specifically articulated, although the entire security and privacy of both SIP signaling and media are specified in Sections 3.8.4 and 19.2.3.

PROBLEMS

1. Why is there a need to develop RTP specifically for transporting over the IP network for real-time multimedia applications like audio and video?
2. What are the specific functional characteristics of RTP that differ fundamentally with respect to those of the non-real-time application protocols?
3. Why does RTP need to be transferred over the connectionless transport protocols like UDP for real-time audio and video multimedia conferencing?
4. What are the problems for transmission through multiplexing of multiple interleaving media streams (say audio and video) over the single RTP session? How can these problems be avoided?
5. How does the RTCP report work? Describe all features and functionalities of RTCP reports and packets. How does the RTCP facilitate monitoring and operations of ongoing multimedia sessions?
6. How do the RTP and RTCP complement each other for the ongoing multimedia sessions in real time? How do the operations and maintenance of multimedia sessions by RTCP differ from those of the non-real-time management applications such as Simple Network Management Protocol (SNMP)?
7. How do the RTP translators and mixers work? How is the RTCP report generated specific to the operations and maintenance of RTP translators and mixers?
8. Explain the salient functional and performance characteristics of SRTP/SRTCP, ZRTP, RTSP, and MRCP including security as applicable briefly. Compare their protocol functional capabilities in the context of SIP.
9. Explain the SDP briefly. Explain the characteristics of SDP content-agnostic and content-aware attributes.
10. How does the SDP transport–independent bandwidth modifier work for the following: conference total, application-specific maximum, RTCP report bandwidth, IPv6 and IPv4, and IPsec and header compression?
11. What are the problems of bandwidth signaling in SDP for the following: IP version, taking other mechanisms into account, converting bandwidth values, and future development? How do you scope all of those problems? Discuss the solution of these problems in SDP using examples.
12. Describe all SDP attributes for the BFCP streams.
13. Explain how the SDP offer–answer model is used for media negotiations mitigating security and privacy.

References

1. ITU-T, “Infrastructure of audiovisual services, Systems aspects; Role management and additional media channels for H.300-series terminals,” Series H H.239, July 2003.
2. Michel, T. and Ayars, J., “Synchronized Multimedia Integration Language (SMIL 2.0) [Second Edition],” World Wide Web Consortium Recommendation REC-SMIL2-20050107, January 2005. Available at http://www.w3.org/TR/2005/REC-SMIL2-20050107.
Chapter 8
DNS and ENUM in SIP

Abstract

The use of the Domain Name System (DNS) for resolving domain host names (e.g., [email protected]) to the Internet Protocol (IP) addresses and other attributes related to the Session Initiation Protocol (SIP) functional entities is essential for the client–server SIP protocol. Request for Comment 3263, which is also described here, specifies the DNS procedures for locating/discovering the SIP entities. Before we describe the DNS usage in SIP, we describe the DNS architecture itself first because DNS has become almost a general-purpose distributed database for controlling accesses to resources for client–server applications over the IP network. In the beginning, the public switched telephone network (PSTN) was used for telecommunications administered by the International Telecommunication Union Telecommunication Standardization Sector (ITU-T). ITU-T Recommendation E.164 is used as the number system of the PSTN functional entities, including telephones throughout the world, and the assignment of number prefixes to each country code is also administered by the ITU-T. It implies that interoperability between the IP addresses and E.164 numbers is needed. In this context, the E.164 Number Mapping (ENUM) technical standard facilitates the mapping using a DNS name parent of e164.arpa. This chapter exclusively discusses both DNS and ENUM address resolution mechanisms, as well as the mapping between the IP addresses and the E.164 telephone numbers in detail.

8.1 Introduction

The Domain Name System (DNS) resolves the human-readable host names (e.g., [email protected]) into Internet Protocol (IP) addresses. The DNS is a hierarchical, domain-based naming scheme together with a distributed database system for implementing that naming scheme. Because of the distributed nature of the DNS architecture, the use of caching is highly scalable over a large network, as large as the public Internet, including any private networks. However, it can also be used for many other purposes, such as controlling access to resources, traffic management and load balancing, and network planning. As a result, it appears that DNS has become a general-purpose distributed database for controlling accesses to resources for virtually all applications, including Session Initiation Protocol/Voice over IP (SIP/VoIP), e-mail, World Wide Web (WWW), instant messaging, File Transfer Protocol, Lightweight Directory Access Protocol, Network Time Protocol, Post Office Protocol (POP), Simple Mail Transfer Protocol (SMTP) mail, and peer-to-peer (P2P) applications.

Historically, the circuit-switched public switched telephone network (PSTN) was used for telecommunications in the beginning. The PSTN uses the numbering plan as administered by the ITU-T, and the plan, Recommendation E.164, involves the assignment of number prefixes to each country code administrator. This E.164 numbering plan has evolved today into a global numbering plan where every device connected to the telephone network is assigned a unique numerical address.

If an Internet telephone, which is also an IP device with an IP address, needs to interoperate seamlessly with the telephone network, it becomes urgent to extend this E.164 numbering plan into the realm of the Internet. The seamless interworking between the IP and PSTN telephones requires
a mapping between the IP address and the E.164 address. The E.164 Number Mapping (ENUM) technical standard facilitates this mapping using a DNS name parent of e164.arpa. It also implies that every Internet device that supports telephone operation needs to also have an alias in the form of a unique telephone address. In this way, the Internet telephony world is able to interface to the telephone network by allowing Internet-connected telephone devices to make and receive calls to any other telephone device, whether the other device is connected to the Internet, connected to the telephone network, or connected to any other network that seamlessly interoperates with the telephone network.

8.2 Domain Name System

8.2.1 Namespace

The hierarchical naming system with a distributed database for hosts of DNS makes it possible to assign domain names to groups of Internet resources and users in a meaningful way, independent of each entity’s physical location. The DNS uses an application program known as the resolver, which is passed the name as a parameter, to map a name onto the IP address. Figure 8.1 shows the hierarchical naming system with domain names, with each domain having a name server. All of these domains are represented by a tree. The leaves of the tree represent domains that have no subdomains, and a leaf domain may contain a single host or thousands of hosts. Each domain is named by the path upward from it to the unnamed root, and domain names are case insensitive. The components are separated by periods (e.g., eng.llc.com). However, DNS names avoid specifics such as IP addresses and port numbers.

[Figure: domain name tree descending from the unnamed root (“.”). Node labels recoverable from the original include opt, finance, deleg, Engineering, sugar, wheat, jack, and jill.]
Figure 8.1 DNS naming system. (Copyright IETF. Reproduced with permission.)

The decentralization of DNS administration is obtained through delegation of domain names: a given domain can be divided into subdomains, and each of the subdomains can be delegated to other organizations. It implies that the delegated organization becomes responsible for maintaining all the data belonging to that subdomain. For example, the engineering.llc.com domain (Figure 8.1) is delegated to the folks of engineering. The efficient caching scheme that is applied
DNS and ENUM in SIP ◾ 353
in DNS allows most DNS queries to traverse only one or two DNS servers. As a result, DNS updates require significant time to propagate in the Internet. Mobility services that require IP addresses to be changed rapidly cannot use DNS services. However, a cached record may sometimes be out of date, and authoritative records need to be used to avoid this problem. An authoritative record is one that comes from the authority that manages the record, and is thus always correct.

8.2.2 Resource Records

Every domain can have a set of resource records (RRs) associated with it. The most common item of the RR is the IP address of the host; however, many other RRs can also exist in DNS and can be used for many other applications. Thus, when a resolver sends a query passing a name (i.e., domain name) to DNS, the DNS simply maps the domain name onto the RRs associated with it and returns them to the resolver. An RR is a tuple of fields, as shown in Table 8.1 (Request for Comment [RFC] 1035): NAME, TTL, CLASS, TYPE, RDLENGTH, and RDATA. Table 8.2 depicts the different types of RRs, the value for each RR type, and the description of each RR type.

◾◾ NAME is the fully qualified domain name (FQDN) of the node in the tree. The domain name may be compressed where ends of domain names in the packet can be substituted for the end of the current domain name.
◾◾ TTL is the time to live, expressed in seconds, for which the RR stays valid. The maximum is 2^31 − 1 (~68 years).
◾◾ TYPE is the record type and indicates the format of the data. It provides a hint of its intended use. Table 8.2 shows different RR types and the value of each type as assigned by the Internet Assigned Numbers Authority (IANA) for DNS.
◾◾ RDATA is data of type-specific relevance, such as the IP address for address records, or the priority and host name for mail exchange (MX) records. Well-known record types may use label compression in the RDATA field, but unknown record types must not (RFC 3597).
◾◾ The CLASS of a record is set to IN (for Internet) for common DNS records involving Internet host names, servers, or IP addresses. In addition, the classes Chaos (CH) and Hesiod (HS) exist. Each class is an independent namespace with potentially different delegations of DNS zones.
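The field list above can be made concrete with a small Python sketch. It is a minimal illustration, not an implementation of any DNS library: the class name ResourceRecord and its attribute names are hypothetical, RDLENGTH is simply derived from the length of RDATA, and the example address is the one that appears in the resolution example later in this chapter.

```python
from dataclasses import dataclass
import ipaddress

# Largest TTL value in seconds (2^31 - 1), roughly 68 years.
MAX_TTL = 2**31 - 1


@dataclass(frozen=True)
class ResourceRecord:
    """Hypothetical container mirroring the RR fields listed above."""
    name: str       # NAME: fully qualified domain name
    ttl: int        # TTL: validity in seconds
    rr_class: str   # CLASS: "IN", "CH", or "HS"
    rr_type: str    # TYPE: e.g., "AAAA", "MX", "SRV"
    rdata: bytes    # RDATA: type-specific data

    def __post_init__(self):
        # Reject TTL values outside the permitted range.
        if not 0 <= self.ttl <= MAX_TTL:
            raise ValueError("TTL out of range")

    @property
    def rdlength(self):
        # RDLENGTH is the length of RDATA in octets.
        return len(self.rdata)


# An AAAA record carries a 16-octet IPv6 address as its RDATA.
rr = ResourceRecord(
    name="www.llc.com.",
    ttl=3600,
    rr_class="IN",
    rr_type="AAAA",
    rdata=ipaddress.IPv6Address("2001:660:3003:2::4:20").packed,
)
```

Deriving RDLENGTH from RDATA rather than storing it separately is a design choice of this sketch; it keeps the two fields from disagreeing.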
Table 8.2 (Continued) DNS Resource Record Type, Value, and Description

RR Type  Value  RFC            Description
AAAA     28     RFC 3596       IPv6 address record for a host (forward-mapped zones recommended by IETF).
A6       38     RFC 2874       Forward mapping of IPv6 addresses record for a host within the zone (experimental).
AFSDB    18     RFC 1183       Location record of AFS servers for special apps only (experimental).
DNAME    39     RFC 2672       Delegation record of reverse addresses primarily for IPv6 (experimental).
HINFO    13     RFC 1035       Host information/description record about a host (optional text data).
ISDN     20     RFC 1183       ISDN address record for special applications only (experimental).
KEY      25     RFC 2535       Public key record associated with a DNS name.
LOC      29     RFC 1876       GPS data record related to location information; widely used (experimental).
MX       15     RFC 1035       Mail exchanger record and RFC 974-defined valid names: a preference value and the host name for a mail server/exchanger that will service this zone.
NAPTR    35     RFC 3403       Naming Authority Pointer Record. General-purpose definition of rule set to be used by applications (e.g., SIP/VoIP).
NS       2      RFC 1035       Name Server record that defines the authoritative name server(s) for the domain (defined by the SOA record) or the subdomain.
NSEC     47     RFC 4034       Next Secure record used to provide proof of nonexistence of a name.
NXT      30     RFC 3755/2535  DNS Security (DNSSEC) Next Domain record type, obsoleted by NSEC.
OPT      41     RFC 2671       Known as pseudo-RR because it pertains to a particular transport level message and not to any actual DNS data. OPT RRs shall never be cached, forwarded, or stored in or loaded from master files.
PTR      12     RFC 1035       Alias records for the IP address (IPv4 or IPv6) for the host used in reverse maps.
RP       17     RFC 1183       Information record about responsible person for special applications only (experimental).
RT       21     RFC 1183       Through-route binding record for special applications only (experimental).
SIG      24     RFC 2931/2535  Signature record for DNS Security (DNSSEC), obsoleted by the RR signature (RRSIG) record; SIG(0) is used as a special meta RR in Dynamic DNS (DDNS) and zone transfer security.
SOA      6      RFC 1035       Start of Authority record that defines the zone name, an e-mail contact, and various time and refresh values applicable to the zone.
SPF      99     RFC 4408       Sender Policy Framework (SPF) (v1) record that defines the servers that are authorized to send mail for a domain, and its primary function is to prevent identity theft by spammers.
SRV      33     RFC 2782       Service record that defines services available in the zone (e.g., SIP, IM, XMPP, LDAP, HTTP, SMTP).
TXT      16     RFC 1035       Text information record associated with a name. Note: The SPF record should be defined using TXT record and may be defined using an SPF RR. Domain Keys Identified Mail (DKIM) (RFC 4871) also makes use of the TXT RR for authenticating e-mail.
WKS      11     RFC 1035       Well-Known Services record deprecated in favor of SRV record.
X25      19     RFC 1183       X.25 addresses record for special applications only (experimental).

Source: IANA DNS Parameters: http://www.iana.org/assignments/dns-parameters. Copyright IETF. Reproduced with permission.
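To show where the type values of Table 8.2 end up on the wire, the following Python sketch builds a DNS query message for a given name and RR type. It is offered as an assumption-laden illustration only: the message layout (a 12-octet header followed by a question of length-prefixed labels, a type, and a class) and the value 1 for class IN come from RFC 1035 rather than from this section, and the helper names encode_name and build_query, the query identifier, and the choice of setting only the recursion-desired flag are hypothetical.

```python
import struct

# A few type values taken from Table 8.2.
RR_TYPE = {"NS": 2, "SOA": 6, "PTR": 12, "MX": 15,
           "TXT": 16, "AAAA": 28, "SRV": 33, "NAPTR": 35}
CLASS_IN = 1  # class IN as assigned in RFC 1035


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a zero octet."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError("label must be 1 to 63 octets")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, rr_type, query_id=0x1234):
    """Build a one-question query; 0x0100 sets only the recursion-desired flag."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", RR_TYPE[rr_type], CLASS_IN)
    return header + question
```

For example, build_query("www.llc.com", "AAAA") yields a message whose question section ends with the type value 28 followed by the class value 1.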
[Figure: DNS resolution of an AAAA query. A user program asks its resolver for “www.llc.com” AAAA. The resolver queries the root (“.”) name server and is referred to the com name server (com NS), queries the com name server and is referred to the llc.com name server (llc.com NS), and then queries the llc.com name server, which responds that www.llc.com has the IPv6 address 2001:660:3003:2::4:20. The referrals and the answer are added to the resolver’s cache. The tree shows com and org under the root and llc, bpc, and rnl under com.]
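The referral chain in the figure can be imitated with a toy resolver. The Python sketch below is a simplification under stated assumptions: the SERVERS dictionary, the server labels root, com-ns, and llc-ns, and the function names are hypothetical stand-ins for real name servers, no network traffic is generated, and cache entries never expire (a real resolver would discard them when their TTL runs out).

```python
# Toy name-server data: each server either answers or refers to another server.
SERVERS = {
    "root": {"referrals": {"com": "com-ns"}},
    "com-ns": {"referrals": {"llc.com": "llc-ns"}},
    "llc-ns": {"answers": {("www.llc.com", "AAAA"): "2001:660:3003:2::4:20"}},
}


def query(server, name, rr_type):
    """Return ("answer", value), ("referral", next_server), or ("nxdomain", None)."""
    data = SERVERS[server]
    name = name.lower()  # domain names are case insensitive
    if (name, rr_type) in data.get("answers", {}):
        return "answer", data["answers"][(name, rr_type)]
    best = None
    for zone, next_server in data.get("referrals", {}).items():
        if name == zone or name.endswith("." + zone):
            # Prefer the referral for the most specific (longest) zone.
            if best is None or len(zone) > len(best[0]):
                best = (zone, next_server)
    if best is not None:
        return "referral", best[1]
    return "nxdomain", None


def resolve(name, rr_type, cache, start="root", max_hops=10):
    """Follow referrals from the root; return (value, servers contacted)."""
    key = (name.lower(), rr_type)
    if key in cache:
        return cache[key], []  # served from cache, no server contacted
    path, server = [], start
    for _ in range(max_hops):
        path.append(server)
        kind, value = query(server, name, rr_type)
        if kind == "answer":
            cache[key] = value
            return value, path
        if kind != "referral":
            raise LookupError(f"no such name: {name}")
        server = value
    raise LookupError("too many referrals")
```

A first call to resolve("www.llc.com", "AAAA", cache) contacts the three servers in turn, while a repeated call with the same cache contacts none, which is the effect of the cache additions shown in the figure.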
Another scenario can be that the domain is remote and the local name server does not have access to that domain directly. The local name server sends the query to the top domain server for the domain requested. For example, a resolver on finance.mem.net (Figure 8.1) wants to resolve the IP address of the host jack.engineering.llc.com. At first, the resolver will send the query to the local name server, mem.net. Because the name server for mem.net (or net) does not know that domain space, the query will then be sent to the name server for com, com.server.net. The com name server will then forward the query to its child, the llc.com name server, as it does not know the domain space either. In turn, that server will send the query to the engineering.llc.com name server, which must have the authoritative RRs. The query has formed a path from the client to the server, and the response from the engineering.llc.com name server will follow the same path and will come back to the originator. It is important to note that once these RRs get back to the mem.net name server, the server will keep these records in a cache in case they are used later. However, these cached RRs should not live long, as bounded by their TTL field, because cached data, as explained earlier, is not authoritative.

It should be noted that the Dynamic Host Configuration Protocol (DHCP) can be used to discover the IP addresses of local caching resolvers. The DHCP server manages a pool of IP addresses and information about client configuration parameters, such as domain names and name servers in local environments. The allocated IP addresses can be assigned to the local host dynamically (i.e., as if leasing IP addresses for a limited time), automatically (i.e., permanent assignment of IP addresses), and statically (i.e., allocation of IP addresses manually) depending on implementation. DHCP is useful because it automates the network parameter assignment to network devices from one or more DHCP servers. DHCP makes it easy to add new machines in the network locally and can complement DNS, which resolves the IP addresses across the global Internet. The DNS and DHCP mechanisms described here are also equally applicable for the enterprise and private IP network.

8.2.4 Locating/Discovering SIP Entities

8.2.4.1 Overview

SIP, being a client–server protocol, uses DNS procedures to allow a client to resolve a SIP Uniform Resource Identifier (URI) into the IP address, port, and transport protocol of the next hop to contact. It also uses DNS to allow a server to send a response to a backup client if the primary client has failed. RFC 3263, which is described here, specifies those DNS procedures in detail. A typical SIP configuration, referred to as the SIP trapezoid with two administrative domains, is shown in Figure 8.3. In this diagram, a caller in domain A (UA 1) wishes to call Joe in domain B (joe@B). To do so, it communicates with proxy 1 in its domain (domain A). Proxy 1 forwards the request to the proxy for the domain of the called party (domain B), which is proxy 2. Proxy 2 forwards the call to the called party, UA 2. As part of this call flow, proxy 1 needs to determine a SIP server for domain B. To do this, proxy 1 makes use of DNS procedures, using both Service (SRV) (RFC 2782) and Naming Authority Pointer (NAPTR) (RFCs 3401–3404) records. We describe the specific problems that SIP uses DNS to help solve and provide a solution. However, the use of DNS by the stateful client (stateful user agent client or stateful proxy) and the stateless proxy differs slightly.

[Figure: Bob (party A, SIP UA 1, bob@A) in domain A and Joe (party B, SIP UA 2, joe@B) in domain B communicate through SIP proxy 1 (example.com) and SIP proxy 2 (example.net), which are connected across the IP network/SIP network.]
Figure 8.3 SIP trapezoid with two administrative domains. (Copyright IETF. Reproduced with permission.)

8.2.4.2 Problems DNS Needs to Solve

DNS is needed to help solve two aspects of the general call flow described in Section 8.2.4.1. The first is for proxy 1 to discover the SIP server in domain B, in order to forward the call for joe@B. The second is for proxy 2 to identify a backup for proxy 1 in the event that it fails after forwarding the request. For the first aspect, proxy 1 specifically needs to determine the IP address, port, and transport protocol for the server in domain B. The choice of transport protocol is particularly noteworthy. Unlike many other protocols, SIP can run over a variety of transport protocols, including Transmission Control Protocol (TCP), User Datagram Protocol (UDP), and Stream Control Transmission Protocol (SCTP). SIP can also use TLS. Currently, use of TLS is defined for TCP only. Thus, clients need to be able to automatically determine which transport protocols are available. The proxy sending the request has a particular set of transport protocols it supports and a preference for using those transport protocols. Proxy 2 has its own set of transport protocols it supports, and relative preferences for those transport protocols. All proxies must implement both UDP and TCP, along with TLS over TCP, so that there is always an intersection of capabilities. Some form of DNS procedures is needed for proxy 1 to discover the available transport protocols for SIP services at domain B, and the relative preferences of those transport protocols. Proxy 1 intersects its list of supported transport protocols with those of proxy 2, and then chooses the protocol preferred by proxy 2.

It is important to note that DNS lookups can be used multiple times throughout the processing of a call. In general, an element that wishes to send a request (called a client) may need to perform DNS processing to determine the IP address, port, and transport protocol of a next-hop element, called a server (it can be a proxy or a user agent). Such processing could, in principle, occur at every hop between elements. Since SIP is used for the establishment of interactive communications services, the time it takes to complete a transaction between a caller and the called party is important. Typically, the time from when the caller initiates a call until the time the called party is alerted should be no more than a few seconds. Given that there can be multiple hops, each of which is doing DNS lookups in addition to other potentially time-intensive operations, the amount of time available for DNS lookups at each hop is limited.

Scalability and high availability are important in SIP. SIP services scale up through clustering techniques. Typically, in a realistic version of the network in Figure 8.3, proxy 2 would be a cluster of homogeneously configured proxies. DNS needs to provide the ability for domain B to configure a set of servers, along with prioritization and weights, in order to provide a crude level of capacity-based load balancing. SIP assures high availability by having upstream elements detect failures. For example, assume that proxy 2 is implemented as a cluster of two proxies, proxy 2.1 and proxy 2.2. If proxy 1 sends a request to proxy 2.1 and the request fails, it retries the request by sending it to proxy 2.2. In many cases, proxy 1 will not know which domains it will ultimately communicate with. That information would be known when a user actually makes a call to another user in that domain. Proxy 1 may never communicate with that domain again after the call completes. Proxy 1 may communicate with thousands of different domains within a few minutes, and proxy 2 could receive requests from thousands of different domains within a few minutes. Because of this many-to-many relationship, and the possibly long intervals in communications between a pair of domains, it is not generally possible for an element to maintain a dynamic availability state for the proxies it will communicate with.

When a proxy gets its first call with a particular domain, it will try the servers in that domain in some order until it finds one that is available. The identity of the available server would ideally be cached for some amount of time in order to reduce call setup delays of subsequent calls. The client cannot query a failed server continuously to determine when it becomes available again, since this does not scale. Furthermore, the availability state must eventually be flushed in order to redistribute load to recovered elements when they come back online. It is possible for elements to fail in the middle of a transaction. For example, after proxy 2 forwards the request to UA 2, proxy 1 fails. UA 2 sends its response to proxy 2, which tries to forward it to proxy 1, which is no longer available. The second aspect of the flow in the overview for which DNS is needed is for proxy 2 to identify a backup for proxy 1 that it can send the response to. This problem is more realistic in SIP than it is in other transactional protocols. The reason is that some SIP responses can take a long time to be generated, because a human user frequently needs to be consulted in order to generate that response. As such, it is not uncommon for tens of seconds to elapse between a call request and its acceptance.

8.2.4.3 Client Usage

Usage of DNS differs for clients and for servers. This section discusses client usage. We assume that the client is stateful (either a UAC [user agent client] or a stateful proxy). Stateless proxies are discussed later in this chapter. The procedures here are invoked when a client needs to send a request to a resource identified by a SIP or SIPS URI. This URI can identify the desired resource to which the request is targeted
358 ◾ Handbook on Session Initiation Protocol
(in which case, the URI is found in the Request-URI), or it can identify an intermediate hop toward that resource (in which case, the URI is found in the Route header). The procedures defined here in no way affect this URI (i.e., the URI is not rewritten with the result of the DNS lookup); they only result in an IP address, port, and transport protocol where the request can be sent. RFC 3261 provides guidelines on determining which URI needs to be resolved in DNS to determine the host that the request needs to be sent to. In some cases, also documented in RFC 3261, the request can be sent to a specific intermediate proxy not identified by a SIP URI, but rather by a host name or numeric IP address. In that case, a temporary URI, used for purposes of this specification, is constructed. That URI is of the form sip:<proxy>, where <proxy> is the FQDN or numeric IP address of the next-hop proxy. As a result, in all cases, the problem boils down to resolution of a SIP or SIPS URI in DNS to determine the IP address, port, and transport of the host to which the request is to be sent.
The procedures here must be done exactly once per transaction, where transaction is as defined in RFC 3261. That is, once a SIP server has successfully been contacted (success is defined below), all retransmissions of the SIP request and the ACK for non-2xx SIP responses to INVITE must be sent to the same host. Furthermore, a CANCEL for a particular SIP request must be sent to the same SIP server that the SIP request was delivered to. Because the ACK request for 2xx responses to INVITE constitutes a different transaction, there is no requirement that it be delivered to the same server that received the original request (indeed, if that server did not record-route, it will not get the ACK). We define TARGET as the value of the maddr parameter of the URI, if present; otherwise, as the host value of the hostport component of the URI. It identifies the domain to be contacted. A description of the SIP and SIPS URIs and a definition of these parameters can be found in RFC 3261. We determine the transport protocol, port, and IP address of a suitable instance of TARGET as described below.
8.2.4.3.1 Selecting a Transport Protocol
First, the client selects a transport protocol. If the URI specifies a transport protocol in the transport parameter, that transport protocol should be used. Otherwise, if no transport protocol is specified, but the TARGET is a numeric IP address, the client should use UDP for a SIP URI, and TCP for a SIPS URI. Similarly, if no transport protocol is specified, and the TARGET is not numeric, but an explicit port is provided, the client should use UDP for a SIP URI and TCP for a SIPS URI. This is because UDP is the only mandatory transport in RFC 2543 (superseded by RFC 3261), and thus the only one guaranteed to be interoperable for a SIP URI. It was also specified as the default transport in RFC 2543 when no transport was present in the SIP URI. However, another transport, such as TCP, may be used if the guidelines of SIP mandate it for this particular request. That is the case, for example, for requests that exceed the path maximum transmission unit (MTU).
Otherwise, if no transport protocol or port is specified, and the target is not a numeric IP address, the client should perform a NAPTR query for the domain in the URI. The services relevant for the task of transport protocol selection are those with NAPTR service fields with values SIP+D2X and SIPS+D2X, where X is a letter that corresponds to a transport protocol supported by the domain. This specification defines D2U for UDP, D2T for TCP, and D2S for SCTP. We also establish an IANA registry for NAPTR service name to transport protocol mappings. These NAPTR records provide a mapping from a domain to the SRV record for contacting a server with the specific transport protocol in the NAPTR services field. The RR will contain an empty regular expression and a replacement value, which is the SRV record for that particular transport protocol. If the server supports multiple transport protocols, there will be multiple NAPTR records, each with a different service value. As per RFCs 3401–3404, the client discards any records whose service fields are not applicable. For the purposes of this specification, several rules are defined.
First, a client resolving a SIPS URI must discard any services that do not contain SIPS as the protocol in the service field. The converse is not true, however. A client resolving a SIP URI should retain records with SIPS as the protocol, if the client supports TLS. Second, a client must discard any service fields that identify a resolution service whose value is not D2X, for values of X that indicate transport protocols supported by the client. The NAPTR processing as described in RFCs 3401–3404 will result in the discovery of the most preferred transport protocol of the server that is supported by the client, as well as an SRV record for the server. It will also allow the client to discover if TLS is available and its preference for its usage.
As an example, consider a client that wishes to resolve sip:user@example.com. The client performs a NAPTR query for that domain, and the following NAPTR records are returned:
;          order pref flags service     regexp  replacement
   IN NAPTR 50   50   "s"   "SIPS+D2T"  ""      _sips._tcp.example.com.
   IN NAPTR 90   50   "s"   "SIP+D2T"   ""      _sip._tcp.example.com.
   IN NAPTR 100  50   "s"   "SIP+D2U"   ""      _sip._udp.example.com.
This indicates that the server supports TLS over TCP, TCP, and UDP, in that order of preference. Since the client supports TCP and UDP, TCP will be used, targeted to
DNS and ENUM in SIP ◾ 359
a host determined by an SRV lookup of _sip._tcp.example.com. That lookup would return:
;;        Priority Weight Port  Target
   IN SRV 0        1      5060  server1.example.com
   IN SRV 0        2      5060  server2.example.com
If a SIP proxy, redirect server, or registrar is to be contacted through the lookup of NAPTR records, there must be at least three records: one with a SIP+D2T service field, one with a SIP+D2U service field, and one with a SIPS+D2T service field. The records with SIPS as the protocol in the service field should be preferred (i.e., have a lower value of the order field) above records with SIP as the protocol in the service field. A record with a SIPS+D2U service field should not be placed into the DNS, since it is not possible to use TLS over UDP. It is not necessary for the domain suffixes in the NAPTR replacement field to match the domain of the original query (i.e., example.com above). However, for backwards compatibility with RFC 2543, a domain must maintain SRV records for the domain of the original query, even if the NAPTR record is in a different domain. As an example, even though the SRV record for TCP is _sip._tcp.school.edu, there must also be an SRV record at _sip._tcp.example.com. An RFC 2543 client will look up the SRV records for the domain directly. If these do not exist because the NAPTR replacement points to a different domain, the client will fail.
For NAPTR records with SIPS protocol fields (if the server is using a site certificate), the domain name in the query and the domain name in the replacement field must both be valid based on the site certificate handed out by the server in the TLS exchange. Similarly, the domain name in the SRV query and the domain name in the target in the SRV record must both be valid based on the same site certificate. Otherwise, an attacker could modify the DNS records to contain replacement values in a different domain, and the client could not validate that this was the desired behavior or the result of an attack. If no NAPTR records are found, the client constructs SRV queries for those transport protocols it supports, and does a query for each. Queries are done using the service identifier _sip for SIP URIs and _sips for SIPS URIs. A particular transport is supported if the query is successful. The client may use any transport protocol it desires that is supported by the server. This is a change from RFC 2543 (obsoleted by RFC 3261). RFC 2543 specified that a client would look up SRV records for all transports it supported, and merge the priority values across those records. Then, it would choose the most preferred record. If no SRV records are found, the client should use TCP for a SIPS URI, and UDP for a SIP URI. However, another transport protocol, such as TCP, may be used if the guidelines of SIP mandate it for this particular request. That is the case, for example, for requests that exceed the path MTU.
8.2.4.3.2 Determining Port and IP Address
Once the transport protocol has been determined, the next step is to determine the IP address and port. If the TARGET is a numeric IP address, the client uses that address. If the URI also contains a port, it uses that port. If no port is specified, it uses the default port for the particular transport protocol. If the TARGET is not a numeric IP address, but a port is present in the URI, the client performs an A or AAAA record lookup of the domain name. The result will be a list of IP addresses, each of which can be contacted at the specific port from the URI and transport protocol determined previously. The client should try the first record. If an attempt should fail, based on the definition of failure described in the next section, the next should be tried, and if that should fail, the next should be tried, and so on. This is a change from RFC 2543 (obsoleted by RFC 3261). Previously, if the port was explicit, but with a value of 5060, SRV records were used. Now, A or AAAA records will be used. If the TARGET is not a numeric IP address, and no port is present in the URI, the client performs an SRV query on the record returned from the NAPTR processing described in the earlier section, if such processing was performed.
If it was not, because a transport was specified explicitly, the client performs an SRV query for that specific transport, using the service identifier _sips for SIPS URIs. For a SIP URI, if the client wishes to use TLS, it also uses the service identifier _sips for that specific transport; otherwise, it uses _sip. If the NAPTR processing was not done because no NAPTR records were found, but an SRV query for a supported transport protocol was successful, those SRV records are selected. Regardless of how the SRV records were determined, the procedures of RFC 2782, as described in the section titled “Usage Rules,” are followed, augmented by the additional procedures described in the next section. If no SRV records were found, the client performs an A or AAAA record lookup of the domain name. The result will be a list of IP addresses, each of which can be contacted using the transport protocol determined previously, at the default port for that transport. Processing then proceeds as described above for an explicit port once the A or AAAA records have been looked up.
8.2.4.3.3 Details of RFC 2782 Process
RFC 2782 spells out the details of how a set of SRV records are sorted and then tried. However, it only states that the client should “try to connect to the (protocol, address, service)” without giving any details on what happens in the event of failure. Those details are described here for SIP. For SIP requests, failure occurs if the transaction layer reports a 503 Service Unavailable error response or a transport failure of some sort (generally, due to fatal Internet Control Message
Protocol errors in UDP or connection failures in TCP). Failure also occurs if the transaction layer times out without ever having received any response, provisional or final (i.e., timer B or timer F in RFC 3261 fires; see Section 3.12). If a failure occurs, the client should create a new request that is identical to the previous one but has a different value of the Via branch ID (and therefore constitutes a new SIP transaction). That request is sent to the next element in the list as specified by RFC 2782.
8.2.4.3.4 Consideration for Stateless Proxies
The process of the previous sections is highly stateful. When a server is contacted successfully, all retransmissions of the request for the transaction, as well as ACK for a non-2xx final response, and CANCEL requests for that transaction, must go to the same server. The identity of the successfully contacted server is a form of transaction state. This presents a challenge for stateless proxies, which still need to meet the requirement for sending all requests in the transaction to the same server. The problem is similar to, but different from, the problem of HTTP transactions within a cookie session getting routed to different servers based on DNS randomization. There, such distribution is not a problem. Farms of servers generally have common back-end data stores, where the session data is stored. Whenever a server in the farm receives an HTTP request, it takes the session identifier, if present, and extracts the needed state to process the request. A request without a session identifier creates a new one. The problem with stateless proxies is at a lower layer; it is retransmitted requests within a transaction that are being potentially spread across servers.
Since none of these retransmissions carries a session identifier (a complete dialog identifier in SIP terms), a new dialog would be created identically at each server. This could, for example, result in multiple phone calls being made to the same phone. Therefore, it is critical to prevent such a thing from happening in the first place. The requirement is not difficult to meet in the simple case where there were no failures when attempting to contact a server. Whenever the stateless proxy receives the request, it performs the appropriate DNS queries as described above. However, the procedures of RFC 2782 are not guaranteed to be deterministic. This is because records that contain the same priority have no specified order. The stateless proxy must define a deterministic order to the records in that case, using any algorithm at its disposal. One suggestion is to alphabetize them, or, more generally, sort them by ASCII-compatible encoding. To make processing easier for stateless proxies, it is recommended that domain administrators make the weights of SRV records with equal priority different (e.g., using weights of 1000 and 1001 if two servers are equivalent, rather than assigning both a weight of 1000), and similarly for NAPTR records.
If the first server is contacted successfully, the proxy can remain stateless. However, if the first server is not contacted successfully, and a subsequent server is, the proxy cannot remain stateless for this transaction. If it were stateless, a retransmission could very well go to a different server if the failed one recovers between retransmissions. As such, whenever a proxy does not successfully contact the first server, it should act as a stateful proxy. Unfortunately, it is still possible for a stateless proxy to deliver retransmissions to different servers, even if it follows the recommendations above. This can happen if the DNS TTLs expire in the middle of a transaction, and the entries have changed. This is unavoidable. Network implementers should be aware of this limitation and not use stateless proxies that access DNS if this error is deemed critical.
8.2.4.3.5 Server Usage
RFC 3261 (see Chapters 2 and 3) defines procedures for sending responses from a server back to the client. Typically, for unicast UDP requests, the response is sent back to the source IP address where the request came from, using the port contained in the Via header. For reliable transport protocols, the response is sent over the connection the request arrived on. However, it is important to provide failover support when the client element fails between sending the request and receiving the response. A server, according to RFC 3261, will send a response on the connection it arrived on (in the case of reliable transport protocols), and for unreliable transport protocols, to the source address of the request, and the port in the Via header field. The procedures here are invoked when a server attempts to send to that location and that response fails. “Fails” is defined as any closure of the transport connection the request came in on before the response can be sent, or communication of a fatal error from the transport layer. In these cases, the server examines the value of the sent-by construction in the topmost Via header.
If it contains a numeric IP address, the server attempts to send the response to that address, using the transport protocol from the Via header, and the port from sent-by, if present, else the default for that transport protocol. The transport protocol in the Via header can indicate TLS, which refers to TLS over TCP. When this value is present, the server must use TLS over TCP to send the response. If, however, the sent-by field contains a domain name and a port number, the server queries for A or AAAA records with that name. It tries to send the response to each element on the resulting list of IP addresses, using the port from the Via, and the transport protocol from the Via (again, a value of TLS refers to TLS over TCP). As in the client processing, the next entry in the list is tried if the one before it results in a failure. If, however, the sent-by field contains a domain name and no port, the server queries for SRV records at that domain name using
the service identifier _sips if the Via transport is TLS, _sip otherwise, and the transport from the topmost Via header (TLS implies that the transport protocol in the SRV query is TCP). The resulting list is sorted as described in RFC 2782, and the response is sent to the topmost element on the new list described there. If that results in a failure, the next entry on the list is tried.
8.2.4.4 Constructing SIP URIs
In many cases, an element needs to construct a SIP URI for inclusion in a Contact header in a REGISTER, or in a Record-Route header in an INVITE. According to RFC 3261 (see Section 4.2), these URIs have to have the property that they resolve to the specific element that inserted them. However, if they are constructed with just an IP address, for example
sip:1.2.3.4
then, should the element fail, there is no way to route the request or response through a backup. SRV provides a way to fix this. Instead of using an IP address, a domain name that resolves to an SRV record can be used:
sip:server23.provider.com
The SRV records for a particular target can be set up so that there is a single record with a low value for the priority field (indicating the preferred choice), and this record points to the specific element that constructed the URI. However, there are additional records with higher values of the priority field that point to backup elements that would be used in the event of failure. This allows the constraint of RFC 3261 to be met while allowing for robust operation.
8.2.4.5 Selecting the Transport Protocol
The SIP URI of a SIP entity that needs to be resolved may have the transport parameters specified. If the transport parameters are specified, those transport parameters should be used by the stateful client for communication with DNS. Otherwise, if no transport protocol is specified, but the target is a numeric IP address, the client should use UDP for a SIP URI and TCP for a SIPS URI. Similarly, if no transport protocol is specified, and the target is not numeric, but an explicit port is provided, the client should use UDP for a SIP URI and TCP for a SIPS URI.
If no transport protocol or port is specified, and the target is not a numeric IP address, the client should perform a NAPTR query for the domain in the URI. The services relevant for the task of transport protocol selection are those with NAPTR service fields with values SIP+D2X and SIPS+D2X, where X is a letter that corresponds to a transport protocol supported by the domain. This specification defines D2U for UDP, D2T for TCP, and D2S for SCTP.
8.2.4.6 Transport Determination Application
The Dynamic Delegation Discovery System (DDDS) represents the evolution of the NAPTR RR. DDDS defines applications that can make use of the NAPTR record for specific resolution services. This application is called the Transport Determination Application, and its goal is to map an incoming SIP or SIPS URI to a set of SRV records for the various servers that can handle the URI. The following are the details that the DDDS requests an application to provide:
◾◾ Application Unique String (AUS): The AUS is the input to the resolution service. For this application, it is the URI that is to be resolved.
◾◾ First Well-Known Rule: The First Well-Known Rule extracts a key from the AUS. For this application, the First Well-Known Rule extracts the host portion of the SIP or SIPS URI.
◾◾ Valid Databases: The key resulting from the First Well-Known Rule is looked up in a single database, the DNS.
◾◾ Expected Output: The result of the application is an SRV record for the server to contact.
8.3 ENUM
ENUM is a protocol that has the capability to map ITU-T’s E.164 numbers into URIs as described in RFCs 6116 and 6117 using DNS in the Internet and then to the IP addresses. It first transforms E.164 numbers into ENUM domain names and then uses the DNS-based architecture to access records from which the URIs are derived. As a result, ENUM allows the existing telephone numbering plan and its administration to be kept without modification, bridging networks overseen by different standards bodies: ITU-T and IETF. ENUM is also used for many other services in addition to address translation. In fact, ENUM can be defined as a protocol that maps a telephone number to a domain name, maps the domain name to a group of service-specific URIs, and then looks up what services (e.g., e-mail address, web site, VoIP service address, or others) are available for a particular telephone number including the translation to the IP address. ENUM uses the DNS NAPTR RR type to store its DDDS rules into DNS domains. ENUM relies on DNS services and, therefore, it is also important for an ENUM implementation to carry out a thorough analysis of all of the existing DNS standard documents to understand what services are provided to ENUM and what load ENUM provisioning
and queries will place on the DNS. The ENUM implementation employs a DNS-based tiered architecture as shown in Figure 8.4.
[Figure 8.4: ENUM DNS-based tiered architecture (figure not reproduced; surviving labels: Tier 3, application service provider).]
The tiers of ENUM’s DNS-based architecture are described as follows:
◾◾ Tier 0: It corresponds to the ENUM root level. At this level, the ENUM architecture contains only one domain (the ENUM root). The ITU-Telecommunication Standardization Bureau (TSB) is the ENUM Tier 0 Registrar for that domain. The ENUM Tier 0 Registry should be designated by the ENUM Tier 0 Manager. The Tier 0 name servers contain records that point to ENUM Tier 1 name servers.
◾◾ Tier 1: It corresponds to the E.164 country code (CC), or a portion of an integrated numbering plan that is assigned to an individual country. Delegations of the subdomains are made by the ITU-TSB to the entities designated by each country as administratively responsible for the domain corresponding to their country code. The ENUM Tier 1 Manager for a domain corresponding to a country code is the entity responsible for the management of the numbering plan in this country. The Registry of the domain may be chosen by this entity. The name servers of the domain contain records that indicate the authoritative name servers for individual E.164 numbers or blocks of numbers in the country code or portion thereof.
◾◾ Tier 2: It corresponds to the E.164 number. Which entity will act as the ENUM Tier 2 Manager for domains at the Tier 2 level is a national matter. The name servers will contain domain names corresponding to E.164 numbers and NAPTR RRs with information for specific communication services. Some entity must interact with E.164 number subscribers (i.e., the ENUM Registrant) to have records for their numbers provisioned into the ENUM DNS-based architecture. This entity, the ENUM Registrar, might in some implementations be the same as the ENUM Tier 2 Name Server Provider, which maintains the subscriber’s NAPTR RRs for the corresponding E.164 number.
◾◾ Tier 3: The functions described here have not been officially defined as Tier 3. The ENUM Registrar (and potentially other entities) may also have to interact with other parties such as Application Service Providers (ASPs) that provide services like telephony/VoIP/SIP, e-mail, web, fax, and others. The ASPs will have the knowledge of number assignments, including telephone service providers and, in some cases, number portability administrators of central reference databases.
[Figure 8.5: Hierarchical structure and functional architecture of ENUM for North America (figure not reproduced; surviving labels: international domain “.arpa”, Tier 0 e164.arpa, national domain, authentication and validation, Tier 3 ASPs (users)).]
It should be noted that the Internet Architecture Board (IAB) is responsible for the architectural and standards oversight of the Internet and DNS, while the ITU-T handles decisions about delegation requests. The regional Internet registries manage and register public Internet number resources within their respective regions. Figure 8.5 depicts the hierarchical structure and functional architecture for North America with functional entities in different tiers employed within the framework as shown in Figure 8.4. ENUM relies on the DDDS for its operation, as it is an application of DDDS. Because the DDDS is designed to be flexible, this property of the DDDS opens the possibility of different interpretations. It requires that the international domain consist of the DNS root and Tier 0 for ENUM. Tier 0 has the domain name e164.arpa and contains the delegations for country codes. In this implementation, Tier 1 has further been divided into two parts: Tier 1A and Tier 1B. Tier 1A is for country code 1 with the domain name 1.e164.arpa and contains delegations for number planning areas (area codes) from North American countries. The individual E.164 zone (or zones) consists of the Tier 1B registry, registrars, and registrants. This layer
enables each numbering plan area to be managed as a DNS ◾◾ First Well-Known Rule: This is a Rewrite Rule that is
zone. It also includes authentication and verification entities defined by the application and not actually in the Rule
and points to Tier 2 for an E.164 number. Below these are Database. It is used to produce the first valid key.
Tier 2 providers, which maintain the NAPTR records, and ◾◾ Terminal R ule: A Rewrite Rule that, when used,
application service providers. However, in Tier 3, the ASPs yields a string that is the final result of the DDDS pro-
will assign numbers for different services including number cess, rather than another database key.
portability. ◾◾ Application: A set of protocols and specifications that
Because the DDDS is designed to be flexible, this prop- specify actual values for the various generalized parts
erty of the DDDS opens the possibility of different interpre- of the DDDS algorithm. An Application must define
tations. It implies that ENUM relies on the DDDS for its the syntax and semantics of the AUS, the First Well-
operation. The ENUM-specific interpretation of text within Known Rule, and one or more Databases that are valid
the DDDS specifications should be done carefully. The goal for the Application. In ENUM, it is the client applica-
should be to ensure interoperability between ENUM clients tion that uses ENUM services for conversion of E.164
and provisioning systems used to populate domains with numbers into URIs.
E2U NAPTRs. As part of ongoing development works on ◾◾ Services: A common rule database may be used to asso-
the ENUM specifications, RFC 5483 provides an analysis ciate different services with a given AUS, for example,
of the way in which ENUM client and provisioning system different protocol functions, different operational char-
implementations behave and the interoperability issues that acteristics, geographic segregation, backwards compati-
have arisen. ENUM has two constrains. First, the input is bility, etc. Possible service differences might be message
represented by a single telephone number in the form accord- receiving services for e-mail/fax/voice mail, load balanc-
ing to ITU-T E.164, although there are services that require ing over web servers, selection of a nearby mirror server,
processing of nondigit inputs such as password. Second, a cost versus performance trade-offs, etc. These Services
single input can be processed, although there are services are included as part of a Rule to allow the Application
such as abbreviated dialing that require additional input to make branching decisions based on the applicability
parameters for converting the abbreviated number into the called party identifier. A DDDS
application represents an abstract algorithm operating on a database with rewrite rules used by
the application for string conversion. In order to design an alternative DDDS application to
ENUM, several parameters need to be defined: algorithm, database, and application-specific
parameters (e.g., Unique Inputs of E.164 numbers, First Well-Known Rule, Database Selection, and
Outputs).
In addition, DDDS uses the following terminology in defining the algorithm:
◾◾ Application Unique String (AUS): A string that is the initial input to a DDDS application. The
lexical structure of this string must imply a unique delegation path, which is analyzed and
traced by the repeated selection and application of Rewrite Rules. In ENUM, the AUS is a fully
qualified E.164 number minus any nondigit characters except for the “+” character that appears
at the beginning of the number. The + is kept to provide a well-understood anchor for the AUS in
order to distinguish it from other telephone numbers that are not part of the E.164 namespace.
For example, the E.164 number could start out as +44-1164-960348. All nondigits except + are
removed, ensuring that no syntactic sugar is allowed into the AUS.
◾◾ Rewrite Rule: It is also simply known as Rule. A rule that is applied to an AUS to produce
either a new key to select a new rewrite rule from the rule database, or a final result string
that is returned to the calling application.
of one branch or the other from a Service standpoint. Service Parameters for this Application
take the form of a string of characters that follow this Augmented Backus–Naur Form (ABNF):
    service_field = [[protocol] *("+" rs)]
    protocol = ALPHA *31ALPHANUM
    rs = ALPHA *31ALPHANUM
    ; The protocol and rs fields are limited to 32
    ; characters and must start with an alphabetic.
In other words, an optional protocol specification followed by 0 or more resolution services.
Each resolution service is indicated by an initial + character. In ENUM, it is a
<character-string> that specifies the service parameters applicable to this delegation path. It
is up to the application specification to specify the values found in this field. Service
parameters for this Application take the following ABNF (specified in RFC 5234) and are found in
the Services field of the NAPTR record that holds a terminal Rule. Where the NAPTR holds a
nonterminal rule, the Services field should be empty, and clients should ignore its content. The
services fields are defined as follows:
    service-field = "E2U" 1*(servicespec)
    servicespec = "+" enumservice
    enumservice = type 0*(subtypespec)
    subtypespec = ":" subtype
    type = 1*32(ALPHA/DIGIT/"-")
    subtype = 1*32(ALPHA/DIGIT/"-")
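As a rough illustration of the ENUM service-field grammar, the following Python sketch splits a Services field such as E2U+voice:tel+sms:tel into its ENUM services. The function name parse_enum_services, the returned tuple layout, and the lowercasing of types and subtypes are assumptions made for this example; they are not taken from any specification.

```python
import re

# Mirrors: type / subtype = 1*32(ALPHA / DIGIT / "-")
_TOKEN = r"[A-Za-z0-9-]{1,32}"
# Mirrors: enumservice = type 0*(subtypespec), subtypespec = ":" subtype
_ENUMSERVICE = re.compile(rf"{_TOKEN}(?::{_TOKEN})*")


def parse_enum_services(service_field: str) -> list[tuple[str, tuple[str, ...]]]:
    """Return [(type, (subtype, ...)), ...] for an "E2U+..." Services field.

    Hypothetical helper: raises ValueError when the field does not match
    service-field = "E2U" 1*(servicespec).
    """
    head, *specs = service_field.split("+")
    # ABNF string literals are case-insensitive, so "e2u" is accepted too.
    if head.upper() != "E2U" or not specs:
        raise ValueError(f"not an ENUM service field: {service_field!r}")
    services = []
    for spec in specs:
        if not _ENUMSERVICE.fullmatch(spec):
            raise ValueError(f"malformed enumservice: {spec!r}")
        type_, *subtypes = spec.split(":")
        # Lowercasing is a choice of this sketch, to ease comparison.
        services.append((type_.lower(), tuple(s.lower() for s in subtypes)))
    return services
```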
DNS and ENUM in SIP ◾ 365
In other words, a nonoptional E2U (used to denote ENUM-only Rewrite Rules in order to mitigate
record collisions) is followed by one or more ENUM services that indicate the class of
functionality a given end point offers. Each ENUM service is indicated by an initial + character.
◾◾ Flags: Most Applications will require a way for a Rule to signal to the Application that some
Rules provide particular outcomes that others do not, for example, different output formats,
extensibility mechanisms, terminal rule signaling, etc. Most Databases will define a Flags field
that an Application can use to encode various values that express these signals. In ENUM, it is a
<character-string> containing flags to control aspects of the rewriting and interpretation of
the fields in the record. Flags are single characters from the set A–Z and 0–9.
◾◾ Rule: A Rule is made of four functional components: Priority, Set of Flags, Description of
Services, and Substitution Expression.
– Priority: A priority is simply a number used to show which of two otherwise equal rules may
have precedence. This allows the database to express rules that may offer roughly the same
results, but one delegation path may be faster, better, and cheaper than the other.
– Set of Flags: Flags are used to specify attributes of the rule that determine if this rule is
the last one to be applied. The last rule is called the terminal rule, and its output should be
the intended result for the application. Flags are unique across Applications. An Application
may specify that it is using a flag defined by yet another Application but it must use that other
Application’s definition. One Application cannot redefine a Flag used by another Application.
This may mean that a registry of Flags will be needed in the future but at this time it is not a
requirement.
– Description of Services: Services are used to specify semantic attributes of a particular
delegation branch. There are many cases where two delegation branches are identical except that
one delegates down to a result that provides one set of features while another provides some
other set. Features may include operational issues such as load balancing, geographically based
traffic segregation, degraded but backwardly compatible functions for older clients, etc. For
example, two rules may equally apply to a specific delegation decision for a string. One rule can
lead to a terminal rule that produces information for use in high-availability environments,
while another may lead to an archival service that may be slower but is more stable over long
periods of time.
– Substitution Expression: This is the actual string modification part of the rule. It is a
combination of a POSIX Extended Regular Expression and a replacement string similar to Unix
sed-style substitution expression. The syntax of the Substitution Expression part of the rule is
a sed-style substitution expression. True sed-style substitution expressions are not appropriate
for use in this application for a variety of reasons; therefore, the contents of the regexp field
must follow this grammar:
    subst-expr = delim-char ere delim-char repl delim-char *flags
    delim-char = "/" / "!" / <Any octet not in "POS-DIGIT" or "flags">
                 ; All occurrences of a delim_char in a subst_expr
                 ; must be the same character.>
    ere = <POSIX Extended Regular Expression>
    repl = *(string / backref)
    string = *(anychar / escapeddelim)
    anychar = <any character other than delim-char>
    escapeddelim = "\" delim-char
    backref = "\" POS-DIGIT
    flags = "i"
    POS-DIGIT = "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9"
The result of applying the substitution expression to the String must result in a key that obeys
the rules of the Database (unless of course it is a Terminal Rule in which case the output
follows the rules of the application). Since it is possible for the regular expression to be
improperly specified, such that a nonconforming key can be constructed, client software SHOULD
verify that the result is a legal database key before using it.
Backref expressions in the repl portion of the substitution expression are replaced by the
(possibly empty) string of characters enclosed by “(” and “)” in the ERE portion of the
substitution expression. N is a single digit from 1 through 9, inclusive. It specifies the Nth
backref expression, the one that begins with the Nth “(” and continues to the matching “)”. For
example, the ERE (A(B(C)DE)(F)G) has backref expressions:
    \1 = ABCDEFG
    \2 = BCDE
    \3 = C
    \4 = F
    \5..\9 = error (no matching subexpression)
The “i” flag indicates that the ERE matching shall be performed in a case-insensitive fashion.
Furthermore,
366 ◾ Handbook on Session Initiation Protocol
any backref replacements may be normalized to lower case when the “i” flag is given. This flag
has meaning only when both the Application and Database define a character set where case
insensitivity is valid.
The first character in the substitution expression shall be used as the character that delimits
the components of the substitution expression. There must be exactly three nonescaped
occurrences of the delimiter character in a substitution expression. Since escaped occurrences
of the delimiter character will be interpreted as occurrences of that character, digits MUST NOT
be used as delimiters. Backrefs would be confused with literal digits were this allowed.
Similarly, if flags are specified in the substitution expression, the delimiter character must
not also be a flag character.
◾◾ Rule Database: Any store of Rules such that a unique key can identify a set of Rules that
specify the delegation step used when that particular Key is used. In ENUM, the database contains
rewrite rules for string conversion. The DDDS specification does not imply any specific
database; however, a DNS-based hierarchical system has been proposed in RFC 3403. Properties of
the well-known DNS and its scalability make it desirable storage for application rewrite rules.
The rules are stored in the format of NAPTR RRs.
8.3.1 DDDS Algorithm
The general DDDS algorithm is specified in RFC 3402. The service is provided as string
processing defined by the algorithm depicted diagrammatically in Figure 8.6. In brief, the input
is initially converted into a database search key used later to query the database. The key is
then matched to database records in order to retrieve rewrite rules for input string conversion.
In case a rule is not final, it is applied on the initial input and the search is repeated. Once
the terminal rule is reached, it is applied on the input string to produce the output string for
further call processing.
8.3.2 DDDS Algorithm Application to ENUM
First, the AUS is the initial input to a DDDS application (Figure 8.6). AUS in the context of
ENUM is explained earlier with an example. In a similar example, to address the E.164 number
+44-3069-990038, a user might dial 03069990038 or 00443069990038 or 011443069990038. These
dialed digit strings differ from one another, but none of them start with the + character.
Finally, if these techniques are used for dialing plans or other digit strings, implementers and
operators of systems using these techniques for such purpose must not describe these schemes as
ENUM. The initial E in ENUM stands for E.164, and the term ENUM is used exclusively to describe
application of these techniques to E.164 numbers according to this specification.
Second, the First Well-Known Rule for any ENUM query creates a key (an FQDN within the e164.arpa
domain apex) from an E.164 number. This FQDN is queried for NAPTR records, and returned records
are processed and interpreted according to this specification. The DDDS database used by the
application is found in RFC 3403, which is the document that defines the NAPTR DNS RR type. The
NAPTR RR packet format, which has DNS type code 35, contains the fields Order, Preference, Flags,
Services, Regexp, and Replacement. Flags and Services are explained earlier; Order, Preference,
Regexp, and Replacement are described below:
◾◾ Order: It is a 16-bit unsigned integer specifying the order in which the NAPTR records must
be processed in order to accurately represent the ordered list of rules. The ordering is from
lowest to highest. If two records have the same order value, then they are considered to be the
same rule and should be selected on the basis of the combination of the Preference values and
Services offered.
◾◾ Preference: It is a 16-bit unsigned integer that specifies the order in which NAPTR records
with equal Order values should be processed, low numbers being processed before high numbers.
Although it is called preference in deference to DNS terminology, this field is equivalent to
the Priority value in the DDDS algorithm.
◾◾ Regexp: It stands for regular expression and is a <character-string> containing a
substitution expression that is applied to the original string held by the client in order to
construct the next domain name to look up. The only place where NAPTR field content is case
sensitive is in any static text in the Repl subfield of the Regexp field (see RFCs 3402 and 6116
for Regexp field definitions). In that subfield, case must be preserved when generating the
record output. Elsewhere, case sensitivity is not used.
◾◾ Replacement: It is a <domain-name> that is the next domain-name to query for depending on the
potential values found in the Flags field. This field is used when the regular expression is a
simple replacement operation. Any value in this field MUST be a fully qualified domain-name.
Name compression is not to be used for this field. Note that this field and the Regexp field
together make up the substitution expression in the DDDS algorithm.
In fact, ENUM uses the NAPTR to provide a URI and look up what services are available for a
particular telephone number, mapping the ITU-T E.164 telephone number onto
[Figure 8.6: flow chart not reproduced. Recoverable box text: 3a. Apply the substitution
expression for each rule in the list, in order, to the AUS until a nonempty string is produced.
3b. Note the rule and its position in the list that produces the nonempty result. 4b. Continue
through the already retrieved list of rules, starting from the next new rule in order. Decision
branches are labeled Yes.]
Figure 8.6 DDDS algorithm flow chart. (Copyright IETF. Reproduced with permission.)
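The loop in Figure 8.6 can be approximated by the following Python sketch. It is illustrative only: the names apply_subst_expr and ddds_lookup, the dictionary layout of a rule, and the max_steps guard are assumptions of this example, and an in-memory dictionary stands in for the DNS. Python's re syntax is used in place of POSIX ERE, which is an approximation, and the backtracking step (4b) is omitted. Rules carrying the U flag are treated as terminal, rules with any other flag are ignored, and a nonterminal result becomes the next key.

```python
import re


def apply_subst_expr(subst_expr: str, aus: str) -> str:
    """Apply a delimiter-separated substitution expression to the AUS.

    Returns "" when the regular expression does not match.
    """
    delim = subst_expr[0]
    # Split on delimiters that are not escaped with a backslash.
    parts = re.split(r"(?<!\\)" + re.escape(delim), subst_expr)
    if len(parts) != 4 or parts[0]:
        raise ValueError("expected exactly three unescaped delimiters")
    _, ere, repl, flags = parts
    match = re.search(ere, aus, re.IGNORECASE if "i" in flags else 0)
    if match is None:
        return ""

    def expand(token: re.Match) -> str:
        char = token.group(1)
        if char in "123456789":
            index = int(char)
            if index > match.re.groups:
                raise ValueError(f"\\{char}: no matching subexpression")
            return match.group(index) or ""
        # An escaped delimiter becomes a literal delimiter.
        return char if char == delim else token.group(0)

    replaced = re.sub(r"\\(.)", expand, repl)
    # sed-style: only the matched part of the AUS is replaced.
    return aus[: match.start()] + replaced + aus[match.end():]


def ddds_lookup(aus: str, first_key: str, database: dict, max_steps: int = 8):
    """Follow rewrite rules from first_key until a terminal ("u") rule fires.

    Returns (output, services) or None. `database` maps a key to a list of
    rule dicts with the keys order, preference, flags, services, regexp,
    and replacement (a hypothetical layout chosen for this sketch).
    """
    key = first_key
    for _ in range(max_steps):  # guard against delegation loops
        rules = sorted(database.get(key, ()),
                       key=lambda r: (r["order"], r["preference"]))
        for rule in rules:
            flags = set(rule["flags"].lower())
            if flags - {"u"}:
                continue  # unknown flag: ignore this rule
            if rule["regexp"]:
                result = apply_subst_expr(rule["regexp"], aus)
            else:
                result = rule["replacement"]
            if not result:
                continue  # rule produced nothing, try the next one
            if "u" in flags:
                return result, rule["services"]  # terminal rule
            key = result  # nonterminal rule: the result is the new key
            break
        else:
            return None  # no rule at this key produced a result
    return None
```

Every rule is applied to the original AUS rather than to the output of the previous rule, matching the description of nonterminal rules in Section 8.3.1.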
the DNS. Once a phone number is mapped into a domain name, the ENUM protocol can query the DNS
and provide a corresponding URI location or the locations of multiple URIs and their order of
processing, and service preferences information it finds in the NAPTR record. ENUM then maps the
telephone number to a group of service-specific URIs, making it possible to manage multiple
services. To convert the AUS to a unique key in this database, the string is converted into a
domain name according to this algorithm using the four-step process for mapping a telephone
number onto DNS:
◾◾ Remove all characters with the exception of the digits. For example, given the E.164 number
+44-20-7946-0148 (which would then have been converted into an AUS of +442079460148), this step
would simply remove the leading +, producing 442079460148.
◾◾ Reverse the order of the digits. Example: 841064970244.
◾◾ Put dots (“.”) between each digit. Example: 8.4.1.0.6.4.9.7.0.2.4.4.
◾◾ Append the string .e164.arpa. to the end and interpret as a domain name. Example:
8.4.1.0.6.4.9.7.0.2.4.4.e164.arpa.
The e164.arpa domain provides the DNS infrastructure for storing qualified E.164 telephone
numbers. The .arpa top-level domain’s (TLD’s) ability to reverse-map IP addresses to domain
names is foundational, since ENUM looks up services by one-to-one reverse-mapping digits in the
E.164
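The four-step mapping of a telephone number onto a DNS name can be sketched in Python as follows. The function name e164_to_enum_domain and the apex parameter are assumptions made for illustration only.

```python
def e164_to_enum_domain(number: str, apex: str = "e164.arpa.") -> str:
    """Map an E.164 number such as "+44-20-7946-0148" to its ENUM domain name."""
    # Step 1: keep only the digits (this also drops the leading "+").
    digits = [ch for ch in number if ch in "0123456789"]
    if not digits:
        raise ValueError(f"no digits in {number!r}")
    # Step 2: reverse the order of the digits.
    digits.reverse()
    # Steps 3 and 4: put dots between the digits and append the apex.
    return ".".join(digits) + "." + apex


print(e164_to_enum_domain("+44-20-7946-0148"))
# prints 8.4.1.0.6.4.9.7.0.2.4.4.e164.arpa.
```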
0–9, and the Application defines the Flags specified in the DNS database. The case of the
alphabetic characters is not significant. The field can be empty. It is up to the application
(e.g., ENUM) specifying how it is using this database to define the Flags in this field. It must
define which ones are terminal and which ones are not. In fact, the database’s Flags field
signals when the DDDS algorithm has finished. At this time, only one flag, U, is defined. This
means that this rule is the last one and that the output of the rule is a URI (RFC 3986). If a
client encounters an RR with an unknown flag, it must ignore it and move to the next rule. This
test takes precedence over any ordering since flags can control the interpretation placed on
fields.
A novel flag might change the interpretation of the Regexp or Replacement fields such that it is
impossible to determine if an RR matched a given target. If the U flag is not present, then this
rule is nonterminal. If a rule is nonterminal, then the result produced by this rewrite rule
must be an FQDN. Clients must use this result as the new key in the DDDS loop (i.e., the client
will query for NAPTR RRs at this FQDN). Lastly, the process notifies the Application that the
database search has been finished, and provides the Application with the Flags and Services part
of the Rule along with the output of the last Substitution Expression. The output of the last
DDDS loop is a Uniform Resource Identifier in its absolute form according to the <absolute-URI>
production in the Collected ABNF found in RFC 3986.
8.3.3 ENUM with Compound NAPTRs
It is possible to have more than one ENUM service associated with a single NAPTR. These ENUM
services share the same Regexp field and so generate the same URI. Such a compound NAPTR could
well be used to indicate a mobile phone that supports both voice:tel and sms:tel ENUM services.
The Services field in that case would be E2U+voice:tel+sms:tel. A compound NAPTR can be treated
as a set of NAPTRs that each holds a single ENUM service. These reconstructed NAPTRs share the
same Order and Preference/Priority field values but should be treated as if each had a logically
different priority. A left-to-right priority is assumed.
8.3.4 ENUM Operations
This section describes, with a high-level example, how the ENUM client interacts with the ENUM
DNS server to obtain the NAPTR record that, as explained earlier, contains six resource fields:
Order, Preference, Flags, Services, Regexp, and Replacement. An example NAPTR record is as
follows (Figure 8.7): 7.2.8.6.9.5.3.2.1.2.1.e164.arpa. IN NAPTR 110 10 “u” “E2U+sip”
“!^.*$!sip:2123596827@rrr.rnl.com!”. In this record, Order = 110, Preference = 10, Flags = u,
Services = E2U+sip, and Regular Expression = !^.*$!sip:2123596827@rrr.rnl.com!.
[Figure 8.7: diagram not reproduced. Recoverable labels: the telephone number 1-212-359-6827,
the ENUM DNS server, the query name 7.2.8.6.9.5.3.2.1.2.1.e164.arpa, and the returned URI
sip:2123596827@rrr.rnl.com.]
Figure 8.7 High-level example of ENUM operations.
As described, ENUM specifies a method for storing information in the DNS server to URIs (e.g.,
SIP phone, SIP servers, cell phone,
and other entities) for associated services (e.g., SIP audio/video conferencing services, XMPP
chat services, Fax services, e-mail services, and others). Each URI is stored in a DNS NAPTR
record, which is in an E.164 domain. Figure 8.7 shows that an ENUM client is trying to resolve an
E.164 telephone number into the corresponding URI of SIP telephony services.
The main steps of operations can be described as follows:
◾◾ An ENUM client needs to resolve the E.164 telephone number of the called party (e.g.,
1-212-359-6827) to set up the call over the IP network.
◾◾ The ENUM client converts the E.164 number into a domain name (e.g.,
7.2.8.6.9.5.3.2.1.2.1.e164.arpa) as described earlier.
◾◾ The client queries a DNS server with the domain name using a resolver as described earlier.
◾◾ The DNS server returns the NAPTR records that contain services and URIs (e.g.,
sip:2123596827@rrr.rnl.com) in the domain to the client after resolving this, running the DDDS
algorithm specific to ENUM services as explained earlier.
◾◾ If multiple NAPTR records are returned, the ENUM client picks one to use based on the Order,
Preference, and Services field values in the records.
◾◾ The ENUM client will do a second non-ENUM DNS query to determine the called party’s IP
address, if the URI in the selected NAPTR record contains the called party’s name (e.g.,
rrr.rnl.com).
8.3.5 ENUM Service Registration for SIP Addresses of Record (AORs)
RFC 3764, which is described here, registers an ENUM service focusing on provisioning SIP AORs,
pursuant to the guidelines in RFC 3761. ENUM, as explained earlier, is a system that uses DNS to
translate telephone numbers, like +12025332600, into URIs, like sip:[email protected]. ENUM
exists primarily to facilitate the interconnection of systems that rely on telephone numbers
with those that use URIs to route transactions. RFC 3764 uses the text-based ENUM application
protocol (RFC 6116) that allows end points on the Internet to discover one another in order to
exchange context information about a session they would like to share. Common forms of
communication that are set up by SIP include Internet telephony, instant messaging, video,
Internet gaming, and other forms of real-time communications. SIP is a multiservice protocol
capable of initiating sessions involving different forms of real-time communications
simultaneously. SIP is a protocol that finds the best way for parties to communicate.
8.3.5.1 ENUM Service Registration
As defined in RFC 3761 (obsoleted by RFCs 6116 and 6117), the following template covers the
information needed for the registration of the ENUM service:
    Enumservice Name: "E2U+SIP"
    Type(s): "SIP"
    Subtype(s): N/A
    URI Scheme(s): "sip:", "sips:"
8.3.5.2 AOR in SIP
RFC 3764 specifies an ENUM service field that is appropriate for SIP AOR URIs. Various other
types of URIs can be present in SIP requests. A URI that is associated with a particular SIP user
agent (e.g., a SIP phone) is commonly known as a SIP contact address. The difference between a
contact address and an AOR is like the difference between a device and its user. While there is
no formal distinction in the syntax of these two forms of addresses, contact addresses are
associated with a particular device and may have a very device-specific form (like sip:10.0.0.1
or sip:[email protected]). An AOR, however, represents an identity of the user, generally a
long-term identity, and it does not have a dependency on any device; users can move between
devices or even be associated with multiple devices at one time while retaining the same AOR. A
simple URI, generally of the form sip:[email protected], is used for an AOR. When a SIP request
is created by a user agent, it populates the AOR of its target in its To header field and
(generally) Request-URI. The AOR of the user that is sending the request populates the From
header field of the message; the contact address of the device from which the request is sent is
listed in the Contact header field.
By sending a registration to a registrar on behalf of its user, a SIP device (i.e., a user agent)
can temporarily associate its own contact address with the user’s AOR. In so doing, the device
becomes eligible to receive requests that are sent to the AOR. Upon receiving the registration
request, the registrar modifies the provisioning data in a SIP location service to create a
mapping between the AOR for the user and the device where the user can currently be reached. When
future requests arrive at the administrative domain of this location service for the user in
question, proxy servers ask the location service where to find the user, and will in turn
discover the registered contact address(es). A SIP-based follow-me telephony service, for
example, would rely on this real-time availability data in order to find the best place to reach
the end user without having to cycle through numerous devices from which the user is not
currently registered. Note that AORs can be registered with other AORs; for example, while at
home, a user might elect to register the AOR they use as their personal identity under their
work AOR in order to
direct requests for their work identity to whatever devices they might have associated with their
home AOR.
When a SIP entity (be it a user agent or proxy server) needs to make a forwarding decision for a
Request-URI containing an AOR, it uses the mechanisms described in the SIP specification (RFC
3263, see Section 8.2.4) to locate the proper resource in the network. Ordinarily, this entails
resolving the domain portion of the URI (example.com in the example above) in order to route the
call to a proxy server that is responsible for that domain. SIP user agents have specific
communications capabilities (such as the ability to initiate voice communications with
particular codecs, or support for particular SIP protocol extensions). Because an AOR does not
represent any particular device or set of devices, an AOR does not have capabilities as such.
When a SIP user agent sends a request to an AOR, it begins a phase of capability negotiation that
will eventually discover the best way for the originator to communicate with the target. The
originating user agent first expresses capabilities of its own in the request it sends (and
preferences for the type of session it would like to initiate). The expression of these
capabilities may entail the usage of SDP (see Section 7.7) to list acceptable types of media
supported and favored by the client, the inclusion of Require/Supported headers to negotiate
compatibility of extensions, and possibly the usage of optional SIP extensions, for example using
callee capabilities (see Section 3.4) to communicate request handling dispositions. Proxy servers
or end points subsequently return responses that allow a rich bidirectional capability
negotiation process.
The process by which SIP end points negotiate capabilities can overlap with the primary service
provided by NAPTR records: permitting the originating client to select a particular URI for
communications based on an ordered list of ENUM services. However, ENUM’s capability management
mechanism is decidedly one way: the administrator of the telephone number expresses capabilities
(in the form of protocol names) and preferences that the client must evaluate without
negotiation. Moreover, listing available protocols is not comparable to agreement on session
media (down to the codec/interval level) and protocol extension support; it would be difficult
to express, in the level of detail necessary to arrange a desired session, the capabilities of a
SIP device within a NAPTR service field. Provisioning contact addresses in ENUM rather than AORs
would compromise the SIP capability negotiation and discovery process. Much of the benefit of
using a URI comes from the fact that it represents a logical service associated with a user,
rather than a device; indeed, if ENUM wished to target particular devices, E2IPv4 would be a more
appropriate resolution service to define than E2U. SIP AORs may use the SIP URI scheme or the
SIPS URI scheme. The SIPS URI scheme, when used in an AOR, indicates that the user it represents
can only be reached over a secure connection (using TLS).
8.3.5.3 E2U+SIP ENUM Service
Traditionally, the services field of a NAPTR record (as defined in RFC 3403) contains a string
that is composed of two subfields: a protocol subfield and a resolution service subfield. ENUM
in particular defines an E2U (E.164 to URI) resolution service. This document defines an E2U+SIP
ENUM service for SIP. The scheme of the URI that will appear in the regexp field of a NAPTR
record using the E2U+SIP ENUM service may either be SIP or SIPS. This ENUM service is best
suited to SIP AORs. When a SIP AOR appears in the regexp field of a NAPTR record, there is no
need to further qualify the ENUM services field with any capability data, since AORs do not have
capabilities. There is also generally no need to have more than one NAPTR record under a single
telephone number that points to a SIP AOR. Note that the user portion of a SIP URI may contain a
telephone number (e.g., sip:+1442079460148@example.com). Clients should be careful to avoid
infinite loops when recursively performing ENUM queries on URIs that result from an ENUM lookup.
8.3.5.4 Example of E2U+SIP ENUM Service
The following is an example of the use of the ENUM service registered by this document in a
NAPTR RR.
    $ORIGIN 8.4.1.0.6.4.9.7.0.2.4.4.e164.arpa.
    IN NAPTR 10 100 "u" "E2U+sip"
        "!^.*$!sip:[email protected]!".
8.3.6 ENUM Services Registration in XML Chunk
RFC 6117 has obsoleted the IANA registration section of RFC 3761. Since the IANA ENUM service
registry contains various ENUM services registered under the regime of RFC 3761, those
registrations do not conform to the new guidelines as specified in RFC 6117. To ensure
consistency among all ENUM service registrations at IANA, this document adds the (nowadays)
missing elements to those legacy registrations. Furthermore, all legacy ENUM service
registrations are converted to the new XML-chunk format, and, where deemed necessary, minor
editorial corrections are applied. However, this document only adds the missing elements to the
XML chunks as specified in the IANA Considerations section of RFC 6117, but it does not complete
the (nowadays) missing sections of the corresponding ENUM service Specifications. To conform to
the new registration regime as
specified in RFC 6117, those ENUM service specifications still have to be revised. Legacy
Enumservice Registrations have been converted to XML Chunks for the following ENUM services whose
details can be seen in RFC 6117:
    email:mailto, ems:mailto, ems:tel, fax:tel, ft:ftp, h323, ical-access:http,
    ical-access:https, ical-sched:mailto, ifax:mailto, im, mms:mailto, mms:tel, pres, pstn:sip,
    pstn:tel, sip, sms:mailto, sms:tel, unifmsg:http, unifmsg:https, unifmsg:sip, unifmsg:sips,
    vcard, videomsg:http, videomsg:https, videomsg:sip, videomsg:sips, voice:tel, voicemsg:http,
    voicemsg:https, voicemsg:sip, voicemsg:sips, voicemsg:tel, vpim:ldap, vpim:mailto, web:http,
    web:https, and xmpp.
8.3.7 Using E.164 Numbers with SIP
There are a number of contexts in which telephone numbers are employed by SIP applications, many
of which can be addressed by ENUM. Although SIP was one of the primary applications for which
ENUM was created, there is nevertheless a need to define procedures for integrating ENUM with
SIP implementations. RFC 3824, which is described here, illustrates how the two protocols might
work in concert, and clarifies the authoring and processing of ENUM records for SIP
applications. It also provides guidelines for instances in which ENUM, for whatever reason,
cannot be used to resolve a telephone number. SIP is a text-based application protocol that
allows two end points in the Internet to discover one another in order to exchange context
information about a session they would like to share. Common applications for SIP include
Internet telephony, instant messaging, video, Internet gaming, and other forms of real-time
communications. SIP is a multiservice protocol capable of initiating sessions involving different
forms of real-time communications simultaneously.
The most widespread application for SIP today is Voice over IP (VoIP). As such, there are a
number of cases in which SIP applications are forced to contend with telephone numbers.
Unfortunately, telephone numbers cannot be routed in accordance with the traditional DNS
resolution procedures standardized for SIP (see Section 8.2.4), which rely on SIP URIs. ENUM
provides a method for translating E.164 numbers into URIs, including potentially SIP URIs. This
document therefore provides an account of how SIP can handle telephone numbers by making use of
ENUM. Guidelines are proposed for the authoring of the DNS records used by ENUM, and for
client-side processing once these DNS records have been received.
The guidelines in this document are oriented toward authoring and processing ENUM records
specifically for SIP applications. These guidelines assume that the reader is familiar with NAPTR
records (RFC 3403) and ENUM (RFC 6117). Only those aspects of NAPTR record authoring and
processing that have special bearing on SIP, or that require general clarification, are covered
in this document; these procedures do not update or override the NAPTR or ENUM core documents.
Note that the ENUM specification has undergone a revision shortly before the publication of this
document, driven by the update of the NAPTR system described in RFC 2915 (that is obsoleted by
RFCs 3401–3404) to the DDDS family of specifications (including RFC 3403). This document
therefore provides some guidance for handling records designed for the original RFC 2916
(obsoleted by RFCs 6116 and 6117).
8.3.7.1 Handling Telephone Numbers in SIP
There are a number of reasons why a user might want to initiate a SIP request that targets an
E.164 number. One common reason is that the user is calling from the PSTN through a PSTN–SIP
gateway; such gateways usually map routing information from the PSTN directly onto SIP
signaling. Or a native SIP user might intentionally initiate a session addressed to an E.164
number, perhaps because the target user is canonically known by that number, or the originator’s
SIP user agent only supports a traditional numeric telephone keypad. A request initially
targeting a conventional SIP URI might also be redirected to an E.164 number. In most cases,
these are requests for a telephony session (voice communication), though numerous other services
are also reached through telephone numbers (including instant messaging services). Unlike a URI,
a telephone number does not contain a host name, or any hints as to where one might deliver a
request targeting a telephone number on the Internet. While SIP user agents or proxy servers
could be statically provisioned with a mapping of destinations corresponding to particular
telephone numbers or telephone number ranges, considering the size and complexity of a complete
mapping, it would be preferable for SIP user agents to be able to query as needed for a
destination appropriate for a particular telephone number.
In such cases, a user agent might use ENUM to discover a URI associated with the E.164 number,
including a SIP URI. URIs discovered through ENUM can then be used normally to route SIP requests
to their destination. Note that support for the NAPTR DNS RR format is specified for ordinary
SIP URI processing in RFC 3263 (see Section 8.2.4), and thus support for ENUM is not a
significant departure from baseline SIP DNS routing. Most of the remainder of this document
provides procedures for the use of ENUM, but a few guidelines are given in the remainder of this
section for cases in which ENUM is not used, for whatever reason. If a user agent is unable to
translate an E.164 number
with ENUM, it can create a type of SIP Request-URI that contains a telephone number. Since one of the most common applications of SIP is telephony, a great deal of attention has already been devoted to the representation of telephone numbers in SIP. In particular, the tel URL of RFC 3966 (see Section 4.2.2) has been identified as a way of carrying telephone routing information within SIP. A tel URL usually consists of the number in E.164 format preceded by a plus sign, for example, tel:+12025332600. This format is so useful that it has been incorporated into the baseline SIP specification; the user portion of a SIP URI can contain a tel URL (without the scheme string, like sip:+12025332600@carrier.com;user=phone). A SIP proxy server might therefore receive a request from a user agent with a tel URL in the Request-URI; one way in which the proxy server could handle this sort of request is by launching an ENUM query itself and proxying the SIP request in accordance with the returned ENUM records.
In the absence of support for ENUM, or if ENUM requests return no records corresponding to a telephone number, local policy can be used to determine how to forward SIP requests with an E.164 number in the Request-URI. Frequently, such calls are routed to gateways that interconnect SIP networks with the PSTN. These proxy server policies might be provisioned dynamically with routing information for telephone numbers by TRIP (RFC 3219). As a matter of precedence, SIP user agents should attempt to translate telephone numbers to URIs with ENUM, if implemented, before creating a tel URL and deferring the routing of this request to a SIP proxy server.
identity than the URI of any device with which a user is temporarily associated. If ENUM were intended to map to specific devices, it would be better to translate telephone numbers to IPv4 addresses than to URIs (which express something richer).
SIP URIs in ENUM do not convey capability information. SIP has its own methods for negotiating capability information between user agents (see Sections 3.4 and 7.7); providing more limited capability information within ENUM is at best redundant and at worst potentially misleading to SIP’s negotiation system. Also, AORs do not have capabilities (only devices registered under an AOR have actual capabilities), and putting contact addresses in ENUM is not recommended.
Only one SIP URI, ideally, appears in an ENUM record set for a telephone number. While it may initially seem attractive to provide multiple SIP URIs that reach the same user within ENUM, if there are multiple addresses at which a user can be contacted, considerably greater flexibility is afforded if multiple URIs are managed by a SIP location service that is identified by a single record in ENUM. Behavior for parallel and sequential forking in SIP, for example, is better managed in SIP than in a set of ENUM records.
User agents, rather than proxy servers, should process ENUM records. The assumptions underlying the processing of NAPTR records dictate that the ENUM client knows the set of ENUM services supported by the entity that is attempting to communicate. A SIP proxy server is unlikely to know the ENUM services supported by the originator of a SIP request.
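As an illustration of the number forms discussed in this section, the following minimal Python sketch derives a tel URL, a SIP URI that carries the number with user=phone, and the e164.arpa domain name that an ENUM query would use. The function names are hypothetical and chosen only for this example; carrier.com is the placeholder domain used in the text, and the domain construction assumes the usual ENUM convention of reversing the digits and separating them with dots.

```python
def e164_digits(number: str) -> str:
    """Return the digits of an E.164 number written as '+<digits>'.

    Visual separators such as '-' or spaces are dropped.
    """
    if not number.startswith("+"):
        raise ValueError("expected an E.164 number with a leading '+'")
    digits = "".join(ch for ch in number if ch.isascii() and ch.isdigit())
    if not 1 <= len(digits) <= 15:  # E.164 allows at most 15 digits
        raise ValueError("an E.164 number has between 1 and 15 digits")
    return digits


def tel_url(number: str) -> str:
    """Build the tel URL form, e.g. tel:+12025332600."""
    return "tel:+" + e164_digits(number)


def sip_uri_with_phone_user(number: str, domain: str) -> str:
    """Carry the telephone number in the user part of a SIP URI."""
    return f"sip:+{e164_digits(number)}@{domain};user=phone"


def enum_domain(number: str, apex: str = "e164.arpa") -> str:
    """Reverse the digits and append the ENUM apex to get the query name."""
    return ".".join(reversed(e164_digits(number))) + "." + apex + "."


if __name__ == "__main__":
    number = "+1-202-533-2600"
    print(tel_url(number))
    print(sip_uri_with_phone_user(number, "carrier.com"))
    print(enum_domain(number))
```

A user agent following the precedence described above would query NAPTR records at the name returned by enum_domain first, and fall back to the tel URL form only if that query yields nothing usable.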
8.3.7.3.1 Service Field
The Service field of a NAPTR record (per RFC 3403) contains a string token that designates the protocol or service associated with a particular record (and which imparts some inkling of the sort of URI that will result from the use of the record). ENUM requires the IANA registration of service fields known as ENUM services. An ENUM Service for SIP has been described earlier, which uses the format E2U+sip to designate that a SIP AOR appears in the URI field of a NAPTR record. It is strongly recommended that authors of NAPTR records use the E2U+sip service field whenever the regexp contains a SIP AOR URI.
8.3.7.3.2 Creating the Regular Expression: Matching
The authorship of the regular expression (henceforth regexp) in a NAPTR record intended for use by ENUM is vastly simplified by the absence of an antecedent in the substitution (i.e., the section between the first two delimiters). It is recommended that implementations use an exclamation point as a delimiter, since this is the only delimiter used throughout the ENUM core specification. When a NAPTR record is processed, the expression in the antecedent is matched against the starting string (for ENUM, the telephone number) to assist in locating the proper record in a set; however, in ENUM applications, since the desired record set is located through a reverse resolution in the e164.arpa domain that is based on the starting string, further analysis of the starting string on the client side will usually be unnecessary. In such cases, the antecedent of the regular expression is commonly greedy: it uses the regexp ^.*$, which matches any starting string. Some authors of ENUM record sets may want to use the full power of regexps and create nongreedy antecedents; the DDDS standard requires that ENUM resolvers support these regexps when they are present. For providing a trivial mapping from a telephone number to a SIP URI, the use of a greedy regexp usually suffices.
Example: !^.*$!sip:[email protected]!
other sorts of URIs that might be considered appropriate for SIP applications: tel URIs, im or pres URIs, or others that describe specific services that might be invoked through SIP are all potentially candidates. While the use of these URIs might seem reasonable under some circumstances, including these in NAPTR records rather than SIP URIs could weaken the proper composition of services and negotiation of capabilities in SIP. It is recommended that authors of ENUM records should always use the SIP or SIPS URI scheme when the service field is E2U+sip, and the URIs in question MUST be AORs, not contact addresses. Users of SIP can register one or more contact addresses with a SIP registrar that will be consulted by the proxy infrastructure of an administrative domain to contact the end user when requests are received for their AOR. Much of the benefit of using a URI comes from the fact that it represents a logical service associated with a user rather than a device; indeed, if ENUM needs to target specific devices rather than URIs, then a hypothetical E2IPv4+sip ENUM service would be more appropriate.
8.3.7.3.4 Setting Order and Preference among Records
For maximal compatibility, authors of ENUM records for SIP SHOULD always use the same order value for all NAPTR records in an ENUM record set. If relative preference among NAPTR records is desirable, it should be expressed solely with the preference field.
8.3.7.3.5 Example
The following example shows a well-formed ENUM NAPTR Record Set for SIP:
$ORIGIN 0.0.6.2.3.3.5.2.0.2.1.e164.arpa.
IN NAPTR 100 10 "u" "E2U+sip" "!^.*$!sip:[email protected]!" .
IN NAPTR 100 20 "u" "E2U+mailto" "!^.*$!mailto:[email protected]!" .
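A record set of this shape can be processed along the following lines. This is a minimal sketch rather than a complete DDDS implementation: the Naptr class and the function names are hypothetical, the example.com URIs are placeholders assumed for illustration, Python's re module stands in for POSIX extended regular expressions, and the delimiter is assumed never to appear inside a subfield.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Naptr:
    """The NAPTR fields that matter for ENUM processing (hypothetical type)."""
    order: int
    preference: int
    flags: str
    service: str
    regexp: str


def apply_regexp(regexp_field: str, starting_string: str) -> Optional[str]:
    """Apply '<delim>antecedent<delim>replacement<delim>flags' to the starting string."""
    if not regexp_field:
        raise ValueError("empty regexp field")
    delim = regexp_field[0]
    parts = regexp_field.split(delim)
    # A well-formed field splits into: '', antecedent, replacement, flags.
    if len(parts) != 4 or parts[0] != "":
        raise ValueError("malformed regexp field")
    _, antecedent, replacement, flags = parts
    match = re.search(antecedent, starting_string,
                      re.IGNORECASE if "i" in flags else 0)
    if match is None:
        return None
    # expand() substitutes backreferences such as \1 in the replacement.
    return match.expand(replacement)


def resolve_sip_uri(records: List[Naptr], starting_string: str) -> Optional[str]:
    """Pick the most preferred terminal E2U+sip record and return its URI."""
    candidates = [r for r in records
                  if r.service.lower() == "e2u+sip" and r.flags.lower() == "u"]
    for record in sorted(candidates, key=lambda r: (r.order, r.preference)):
        uri = apply_regexp(record.regexp, starting_string)
        if uri is not None:
            return uri
    return None


if __name__ == "__main__":
    record_set = [
        Naptr(100, 20, "u", "E2U+mailto", "!^.*$!mailto:info@example.com!"),
        Naptr(100, 10, "u", "E2U+sip", "!^.*$!sip:user@example.com!"),
    ]
    print(resolve_sip_uri(record_set, "+12025332600"))
```

Because the antecedent ^.*$ is greedy, any starting string matches and the replacement is returned unchanged; a nongreedy antecedent with a capture group would instead copy part of the telephone number into the URI.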
mean that we recommend any of the potentially troublesome authoring practices that make this generosity necessary.
8.3.7.4.1 Contending with Multiple SIP Records
If an ENUM query returns multiple NAPTR records that have a service field of E2U+sip, or other service field that may be used by SIP (such as E2U+pres; see RFC 3953), the ENUM client must first determine whether or not it should attempt to make use of multiple records or select a single one. The pitfalls of intentionally authoring ENUM record sets with multiple NAPTR records for SIP are detailed above. If the ENUM client is a user agent, then at some point a single NAPTR record must be selected to serve as the Request-URI of the desired SIP request. If the given NAPTR records have different preferences, the most preferred record should be used. If two or more records share the most preferred status, the ENUM client should randomly determine which record will be used, though it may defer to a local policy that employs some other means to select a record. If the ENUM client is a SIP intermediary that can act as a redirect server, then it should return a 3xx response with more than one Contact header field corresponding to the multiple selected NAPTR records in an ENUM record set. If the NAPTR records have different preferences, then q-values may be used in the Contact header fields to correspond to these preferences. Alternatively, the redirect server may select a single record in accordance with the NAPTR preference fields (or randomly when no preference is specified) and send this resulting URI in a Contact header field in a 3xx response. Otherwise, if the ENUM client is a SIP intermediary that can act as a proxy server, then it MAY fork the request when it receives multiple appropriate NAPTR records in an ENUM record set. Depending on the relative precedence values of the NAPTR records, the proxy may wish to fork sequentially or in parallel. However, the proxy must build a route set from these NAPTR records that consists exclusively of SIP or SIPS URIs, not other URI schemes. Alternatively, the proxy server may select a single record in accordance with the NAPTR preference fields (or randomly when no preference is specified, or in accordance with local policy) and proxy the request with a Request-URI corresponding to the URI field of this NAPTR record, though again, it must select a record that contains a SIP or SIPS URI. Note that there are significant limitations that arise if a proxy server processes ENUM record sets instead of a user agent, and that therefore it is recommended that SIP network elements act as redirect servers rather than proxy servers after performing an ENUM query.
8.3.7.4.2 Processing the Selected NAPTR Record
Obviously, when an appropriate NAPTR record has been selected, the URI should be extracted from the regexp field. The URI is between the second and third exclamation points in the string. Once a URI has been extracted from the NAPTR record, it should be used as the Request-URI of the SIP request for which the ENUM query was launched. SIP clients should perform some sanity checks on the URI, primarily to ensure that they support the scheme of the URI, but also to verify that the URI is well formed. Clients must at least verify that the Request-URI does not target themselves. Once an AOR has been extracted from the selected NAPTR record, clients follow the standard SIP mechanisms (RFC 3263, see Section 8.2.4) for determining how to forward the request. This may involve launching subsequent NAPTR or SRV queries in order to determine how best to route to the domain identified by an AOR; clients, however, must not make the same ENUM query recursively (if the URI returned by ENUM is a tel URL; see Section 4.2.2). Note that SIP requests based on the use of NAPTR records may fail for any number of reasons. If there are multiple NAPTR records relevant to SIP present in an ENUM record set, then after a failure has occurred on an initial attempt with one NAPTR record, SIP user agents may try their request again with a different NAPTR record from the ENUM record set.
8.3.7.5 Compatibility with RFC 2916
Note that both RFCs 2916 and 3761 are obsoleted by RFCs 6116 and 6117, and these newer RFCs were created after the publication of RFC 3824 that has been described above. As a result, we are also discussing the older RFCs 2916 and 3761 in this section as RFC 3824 has done. Note that RFC 6117, as described earlier, converts ENUM services into XML chunks only. RFC 3761 is based on the DDDS (RFC 3401) revision to the NAPTR RR specified in RFC 2915 (obsoleted by RFCs 3401–3404). For the most part, DDDS is an organizational revision that makes the algorithmic aspects of record processing separable from any underlying database format (such as the NAPTR DNS RR). The most important revision in RFC 3761 is the concept of ENUM services. The original ENUM specification, RFC 2916, specified a number of service values that could be used for ENUM, including the sip+E2U service field. RFC 3761 introduces an IANA registration system with new guidelines for the registration of ENUM services, which are no longer necessarily divided into discrete service and protocol fields, and which admit more complex structures. In order to differentiate ENUM services in RFC 3761 from those in RFC 2916, the string E2U is the leading element in an ENUM service field, whereas in RFC 2916 it was the trailing element.
An ENUM service for SIP AORs is described in RFC 3764. This ENUM service uses the ENUM service field E2U+sip. RFC 3761-compliant authors of ENUM records for SIP MUST therefore use the E2U+sip ENUM service field instead of the sip+E2U field. For backwards compatibility
with existing legacy records, however, the sip+E2U field should be supported by an ENUM client that supports SIP. Also note that the terminology of DDDS differs in a number of respects from the initial NAPTR terminology in RFC 2916. DDDS introduces the concept of an Application, an Application-Specific String, a First Well-Known Rule, and so on. The terminology used in this document is a little looser (it refers to a starting string, e.g., where Application-Specific String would be used for DDDS). The new terminology is reflected in RFC 3761.
8.3.8 ENUM for SIP Services
The use of ENUM in the SIP network is straightforward. For simplicity of illustration, we treat DNS and ENUM as a single server, although the DNS and ENUM servers are usually separate in most networks. We are also assuming that the called party remains in the calling party’s home network. Also, although not shown, the following is done: the S-CSCF sends a SIP INVITE message to the I-CSCF, which queries the Home Subscriber Server (not shown) in the called party’s home network to determine the S-CSCF currently serving Client B.
The CSCFs handle all the SIP session signaling; however, they neither take part in transferring user media nor are they on the path of the application data. The IMS proxies are hierarchically divided into two categories: the proxy-CSCFs (P-CSCFs) are the IMS contact points for the SIP user agents (SIP UAs), and the serving-CSCF (S-CSCF) is the proxy server controlling the session. In some topologies, there is a third type of CSCF, the interrogating-CSCF (I-CSCF). The I-CSCF is an element used mainly for topology-hiding purposes between different operators and, in the case of having several S-CSCFs in the domain, to assist in selecting the appropriate one. In addition, there can be a Breakout Gateway Control Function (BGCF) proxy that controls resource allocation to IP sessions, such as which IP-TDM network gateway needs to be used to reach the destination efficiently. We have taken two point-to-point high-level call-flow examples as follows: IMS subscriber over the IP network to the PSTN subscriber over the TDM network (Figure 8.8), and IMS subscriber to IMS subscriber over the IP network (Figure 8.9).
It should be noted that the TDM network uses the ISDN User Part (ISUP)/Signaling System No. 7 (SS7) for voice calls. Interestingly,
[Figure 8.8 diagram: signaling flow (steps 1 to 11) from the IP access network through the P-CSCF, S-CSCF, application server(s), DNS/ENUM server, and BGCF in the calling party’s home network to the MGC, SG, and MG, and on to the SS7/PSTN network.]
Figure 8.8 Point-to-point call flows from IP subscriber to the PSTN subscriber when both subscribers have the same
home network. (Copyright IETF. Reproduced with permission.)
[Figure 8.9 diagram: signaling flow (steps 1 to 14) between SIP client A (calling party) and SIP client B (called party), each attached through an IP access network.]
Figure 8.9 Point-to-point call flows between IP subscribers when both subscribers have separate home networks.
(Copyright IETF. Reproduced with permission.)
in Figure 8.8, we have shown the media gateway controller (MGC), signaling gateway (SG), and media gateway (MG) that are used to provide interfaces between the IP and TDM networks. The MGC and SG enable interworking between SIP and ISUP/SS7 signaling schemes using technical standards as described in Section 14.4. Similarly, the protocol between the MGC and MG can use signaling protocols like the H.248 family of technical standards, including SIP. To show the use of ENUM in SIP, we omit these interworking protocols from the call flow example for the sake of simplicity. Figure 8.8 shows the following steps that are carried out to set up and route the call through the IP network to the PSTN network:
1. The SIP client A of the calling party calls the E.164 number (e.g., 1-732-490-2533) in order to reach SIP client B of the called party.
2. The P-CSCF proxy in the calling party’s network that is the contact point for the SIP client A intercepts the call and sends it to the S-CSCF proxy of this domain.
3. The S-CSCF proxy communicates with the application server to provide subscriber services.
4. The application server checks the service features of the SIP client A for providing services and returns the call to the S-CSCF proxy.
5. The S-CSCF queries the DNS/ENUM server with the E.164 number (e.g., 1-732-490-2533).
6. The DNS/ENUM server does not return a SIP URI, since the server does not contain a NAPTR record for this PSTN number.
7. The S-CSCF proxy sends the call to the BGCF proxy to route the call to the PSTN network.
8. The BGCF proxy finds the appropriate PSTN gateway (i.e., MGC/SG/MG) that will route the call over the PSTN network to the called party cost-effectively while meeting performance requirements. (Note: In parallel, the MGC sends the call signaling messages to the MG as shown in step 11 for transferring media between the called and calling party as soon as the call setup is successfully completed.)
9. The PSTN gateway (i.e., MGC/SG) routes the call to the PSTN network.
10. The PSTN network then routes the call to the PSTN client B (e.g., 1-732-490-2533) of the called party. A session is established between the two end points, and bearer traffic between them is routed through the IP access, IP backbone, IP/PSTN GW (i.e., MG), and PSTN networks.
If both subscribers remain in the IP network, the main steps to set up and route the call are depicted in Figure 8.9.
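Both call flows turn on the same decision at the S-CSCF: whether the ENUM query for the called number returns a SIP URI. The following minimal sketch illustrates that decision only. The lookup tables, the function names, the user part of the SIP URI, and the 192.0.2.10 address are hypothetical stand-ins for the DNS/ENUM server of the figures; ims.rrr.com is the example domain used in this section.

```python
from typing import Optional, Tuple

# Hypothetical data standing in for the combined DNS/ENUM server.
ENUM_TABLE = {"+17324902533": "sip:clientb@ims.rrr.com"}
HOST_TABLE = {"ims.rrr.com": "192.0.2.10"}  # documentation-range address


def enum_lookup(number: str) -> Optional[str]:
    """Return the SIP URI for an E.164 number, or None if no NAPTR record exists."""
    return ENUM_TABLE.get(number)


def resolve_host(host: str) -> str:
    """Return the IP address for a host name (here, the I-CSCF of the called domain)."""
    return HOST_TABLE[host]


def next_hop(called_number: str) -> Tuple[str, str]:
    """Decide where the S-CSCF sends the call after the ENUM query."""
    sip_uri = enum_lookup(called_number)
    if sip_uri is None:
        # No NAPTR record: the number is treated as a PSTN number (BGCF path).
        return ("BGCF", "tel:" + called_number)
    # ENUM hit: take the host part of the SIP URI, dropping parameters and port.
    host = sip_uri.split("@", 1)[-1].split(";", 1)[0].split(":", 1)[0]
    return ("I-CSCF", resolve_host(host))


if __name__ == "__main__":
    print(next_hop("+17324902533"))  # ENUM hit: forwarded toward the I-CSCF
    print(next_hop("+12125550100"))  # ENUM miss: handed to the BGCF
```

In a real deployment the second lookup would follow the RFC 3263 procedures (NAPTR, SRV, then address records) rather than a single table access.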
1. The SIP client A of the calling party calls the E.164 number (e.g., 1-732-490-2533) in order to reach SIP client B of the called party.
2. The P-CSCF proxy in the calling party’s network that is the contact point for the SIP client A intercepts the call and sends it to the S-CSCF proxy of this domain.
3. The S-CSCF proxy communicates with the application server to provide subscriber services.
4. The application server checks the service features of the SIP client A for providing services and returns the call to the S-CSCF proxy.
5. The S-CSCF queries the DNS/ENUM server with the E.164 number (e.g., 1-732-490-2533).
6. The DNS/ENUM server returns the SIP URI of sip:[email protected].
7. The S-CSCF proxy queries the DNS/ENUM server again to get the host IP address for ims.rrr.com.
8. The DNS/ENUM server returns the IP address for ims.rrr.com to the S-CSCF proxy. This IP address is the address of the I-CSCF proxy of the called party’s home network.
9. The S-CSCF proxy of the calling party’s home network then sends the call to the I-CSCF proxy of the called party’s home network.
10. The I-CSCF proxy sends the call to the S-CSCF of the called party’s home network. (Although not shown, the I-CSCF proxy queries the home subscriber server in the called party’s home network to determine which S-CSCF proxy is currently serving client B.)
11. The S-CSCF proxy then communicates with the application server of the called party’s home network to provide subscriber services.
12. The application server of the called party’s home network replies back to the S-CSCF proxy invoking services of the called party.
13. The S-CSCF proxy of the called party’s network then routes the call to the P-CSCF proxy through which the SIP client B of the called party is connected.
14. The P-CSCF proxy then sends the call to the SIP client B of the called party. A session is established between the two end points, and bearer traffic between them is routed through the IP access and backbone networks.
8.3.9 ENUM Implementation Issues
ENUM implementations have some issues related to Character Sets (Non-ASCII Character, Case Sensitivity, Regexp Field Delimiter, and Regexp Meta-Character), Unsupported NAPTRs, ENUM NAPTR Processing, and Nonterminal NAPTR Processing. Some highlights of those issues are provided below, although more detail can be found in RFC 5483.
8.3.9.1 Character Sets
◾◾ Non-ASCII Character: If the Regexp field contains non-ASCII characters, so that there are multibyte characters within an ENUM NAPTR, incorrect processing may well result in UTF-8-unaware systems.
◾◾ Case Sensitivity: The case-sensitivity flag that can reside in any static text in the Repl subfield of the Regexp field is inappropriate for ENUM, and should not be provisioned into E2U NAPTRs.
◾◾ Regexp Field Delimiter: It is not possible to select a delimiter character of the Regexp field that cannot appear in one of the subfields. A client may attempt to process this as a standard delimiter and interpret the Regexp field contents differently from the system that provisioned it.
◾◾ Regexp Meta-Character: In ENUM, the ERE subfield of the Regexp field may include an application-specific meta-character that needs to be escaped. Not escaping the meta-character produces an invalid ERE.
8.3.9.2 Unsupported NAPTRs
An ENUM client may discard a NAPTR received in response to an ENUM query for any of the following reasons: the NAPTR is syntactically or semantically incorrect, the NAPTR has a different (nonempty) DDDS Application identifier from the E2U used in ENUM, the NAPTR’s ERE does not match the AUS for this ENUM query, the ENUM client does not recognize any ENUM service held in this NAPTR, or this NAPTR (only) contains an ENUM service that is unsupported. These conditions should not cause the whole ENUM query to terminate, and processing should continue with the next NAPTR in the returned RR Set (RRSet).
8.3.9.3 ENUM NAPTR Processing
ENUM is a DDDS application, and the way in which NAPTRs in an RRSet are processed reflects this. The processing sequence must be determined from the combination of the ORDER and PREFERENCE/PRIORITY field values. Once the NAPTRs are sorted into sequence, further processing is done to determine if each of the NAPTRs is appropriate for this ENUM evaluation. These steps must be followed strictly to avoid any processing errors.
8.3.9.4 Nonterminal NAPTR Processing
An ENUM RRSet that contains a nonterminal NAPTR record may hold as its target another domain that has a set of NAPTRs. In effect, this is similar to the nonterminal
NAPTR being replaced by the NAPTRs contained in the domain to which it points. It may create a set of problems:
◾◾ Nonterminal NAPTR domains may not be under the control of the ENUM management system.
◾◾ Cascaded domains may create loops.
◾◾ The nonterminal NAPTR may have a different ORDER value from that in the referring nonterminal NAPTR.
◾◾ The set of specifications defining the complex and multilayered DDDS and its applications may contain a set of fields with nonterminal NAPTRs such as Flags, Services, Regular Expression, and Replacement. The systematic interpretation and appropriate use of these fields and their contents may be prone to errors.
8.3.9.5 Backwards Compatibility
The change in syntax of the Services field of the NAPTR, which reflects a refinement of the concept of ENUM processing, may create backward compatibility problems unless implemented very carefully.
8.3.9.6 Collected Implications for ENUM Provisioning
ENUM NAPTRs should be provisioned complying with all the recommendations described earlier, such as Character Sets (Non-ASCII Character, Case Sensitivity, Regexp Field Delimiter, and Regexp Meta-Character), Unsupported NAPTRs, ENUM NAPTR Processing, and Nonterminal NAPTR Processing.
8.3.9.7 Collected Implications for ENUM Clients
ENUM clients should not discard NAPTRs in which they detect characters outside the US-ASCII printable range.
8.4 DNS and ENUM Security
The security of ENUM services depends on the DNS that provides mapping between E.164 and the IP addresses storing the appropriate data. DNS uses a hierarchy of distributed databases rooted at the e164.arpa international domain name. The public data stored in DNS registries of Tiers 1, 2, and 3 (Figure 8.4) is changed by the appropriate authorities. The popularity of DNS has made it almost a general-purpose server for access control to resources, traffic management, load balancing, active planning of topology of communications, and many others. As a result, DNS servers have become attractive places for attackers. Like all DNS services, the data of ENUM services becomes the focus of security threats such as cache poisoning for denial of service (DoS) and masquerading, client flooding, vulnerabilities in dynamic updates, information leakage, and compromising authoritative data, to name a few.
8.4.1 Cache Poisoning
In cache poisoning, an attacker takes advantage of the capability of a DNS server to forward a query to another server. If the server passes the query on to another DNS server that has incorrect information, whether placed there intentionally or unintentionally, then cache poisoning can occur. This cache poisoning is also known as DNS spoofing. For example, in earlier versions of the BIND software, which implements the DNS protocol, a DNS server responding to a query, but not necessarily with an answer, filled in the additional records section of the DNS response message with information that did not necessarily relate to the answer. This partial hint response has been highly susceptible to cache poisoning by a malicious user. A DNS server accepting this response did not perform any necessary checks to assure that the additional information was correct or even related in some way to the answer, indicating that the responding server had appropriate authority over those records. The naive DNS server accepts this information and adds to the cache corruption problem. Another problem has been that there was not a mechanism in place to assure that the answer received was related to the original question. The DNS server receiving the response caches the answer, again contributing to the cache corruption problem.
Rogue DNS servers pose a threat because they facilitate attack techniques such as host name spoofing and DNS spoofing. DNS pointer (PTR) records are used in host name spoofing. It differs slightly from most DNS spoofing techniques in that all the transactions that transpire are legitimate according to the DNS protocol, while this is not necessarily the case for other types of DNS spoofing. With host name spoofing, the DNS server legitimately attempts to resolve a PTR query using a legitimate DNS server for the zone belonging to that PTR. It is the PTR record in the zone’s data file on the primary server that is purposely configured to point somewhere else, typically a trusted host for another site. Host name spoofing can have a TTL of 0, resulting in no caching of the misleading information, even though the host name is being spoofed.
A DoS attack is one of the attacker’s objectives in DNS cache poisoning. This objective is achieved by sending back a negative response for a DNS name that could otherwise be resolved. This can result in DoS for the client wishing to communicate in some manner with the DNS name in the query. DoS can be accomplished
With the advent of new contact methods like home number, office number, mobile number, fax number, office e-mail address, home e-mail address, instant messaging address, and others, ENUM can enable a single registered contact number to map to other methods of contact. For example, an ENUM-compliant e-mail application can query the DNS using an E.164 number, and what is returned is an associated e-mail address. Once the e-mail address is returned, the e-mail application can then send an e-mail message to the end user. Similarly, this same E.164 number can be entered in a web browser to retrieve an associated web page. All this can be accomplished through the use of only the end-user phone number, an E.164 number. A variety of service possibilities can be created using ENUM. Intelligent applications can be created to allow potentially multiple methods of contacting someone in the event the primary method is unavailable. ENUM can even be used to place a call to an office based on the time of day using the E.164 number. During out-of-office hours, if users cannot be reached using the primary number, an application like e-mail or video mail can be invoked automatically instead. Order and preference values that are assigned in an ENUM record can be used to reach the user at different times of a day and on different days of a week. In short, many possible intelligent multimedia services can be offered using ENUM.
PROBLEMS
1. What are the DNS namespace, RRs, and name servers? Explain each of these DNS functionalities using examples.
2. How are the SIP entities located/discovered using DNS in the context of the following: client usage, selecting transport protocol, determining port and IP address, detail of RFC 2782, stateful and stateless proxy, server usage, constructing SIP URIs, and selecting transport protocol?
3. What is the DDDS algorithm? What is ENUM? How is the DDDS applied to ENUM, and what is the expected output?
4. Explain the use of ENUM with compound NAPTRs. Explain ENUM operations with examples.
5. How is ENUM used for providing SIP services? Explain in detail using examples.
6. What are the implementation issues in ENUM in the context of the following: character sets, unsupported NAPTRs, ENUM NAPTR processing, backwards compatibility, ENUM provisioning, and ENUM clients?
7. Discuss in detail the DNS and ENUM security related to the following: cache poisoning, client flooding, and compromising authoritative data.
Chapter 9
Routing in SIP
[Figure 9.1 diagram: user agents U1 and U2, registrars R1 and R2, and proxies P1, P3, and P2 in the SIP network, with message flows M1 to M8.]
Figure 9.1 SIP network, functional entities, and message flows. (Copyright IETF. Reproduced with permission.)
SIP Registrar server (R2) located in its domain (office.com). The user’s contact information is updated by the user as one changes one’s location. A user can have multiple contacts registered at the same time, indicating that the user needs to be tried at multiple locations by forking the request, since the user may be available at any one of these registered addresses. It should be noted that the SIP proxy server or location server may collect the contact information of SIP users from SIP registrars and create location databases for routing purposes. However, the protocols that are used for communications between registrars, proxies, or location servers are beyond the scope of SIP.

9.3 SIP Proxy

The SIP proxy server is usually the functional entity that is responsible for intradomain and interdomain routing of SIP messages (see Section 3.11). The proxy must inspect the Request-URI of the request along with header fields such as Route and Record-Route. The Route header field is used to force routing for a request through the listed set of proxies, for example, Route: <sip:bigbox3.site3.atlanta.com;lr>, <sip:server10.biloxi.com;lr>. The Record-Route header field is inserted by proxies in a request to force future requests in the dialog to be routed through the proxy, for example, Record-Route: <sip:server10.biloxi.com;lr>, <sip:bigbox3.site3.atlanta.com;lr>. In the absence of local policy to the contrary, the processing a proxy performs on a request containing a Route header field can be summarized in the following steps (RFC 3261).

1. The proxy will inspect the Request-URI. If it indicates a resource owned by this proxy, the proxy will replace it with the results of running a location service. Otherwise, the proxy will not change the Request-URI.
2. The proxy will inspect the URI in the topmost Route header field value. If it indicates this proxy, the proxy removes it from the Route header field (this route node has been reached).
3. The proxy will forward the request to the resource indicated by the URI in the topmost Route header field value or in the Request-URI if no Route header field is present. The proxy determines the address, port, and transport to use when forwarding the request by applying the procedures in RFC 3263 to that URI. If no strict-routing elements are encountered on the path of the request, the Request-URI will always indicate the target of the request.
4. If the Request-URI contains a maddr parameter, the proxy must check to see if its value is in the set of addresses or domains the proxy is configured to be responsible for. If the Request-URI has a maddr parameter with a value the proxy is responsible for, and the request was received using the port and transport indicated (explicitly or by default) in the Request-URI, the proxy must strip the maddr and any nondefault port or transport parameter, and continue processing as if those values had not been present in the request.
5. A request may arrive with a maddr matching the proxy, but on a port or transport different from that indicated in the URI. Such a request needs to be forwarded to the proxy using the indicated port and transport. If the first value in the Route header field indicates this proxy, the proxy must remove that value from the request.

This scenario is the basic SIP trapezoid, U1 -> P1 -> P3 -> P2 -> U2, with all three proxies record-routing (Figure 9.1). Here is the flow. U1 sends (M5):

    INVITE sip:[email protected] SIP/2.0
    Contact: sip:[email protected]

to P1. P1 is an outbound proxy. P1 is not responsible for office.com and does not have any direct relationship with this domain, so it looks it up in the Domain Name System (DNS) and sends the request to the proxy (P3) of the network domain. It also adds a Record-Route header field value (M6):

    INVITE sip:[email protected] SIP/2.0
    Contact: sip:[email protected]
    Record-Route: <sip:p1.example.com;lr>

P3 gets this, queries the DNS, and sends the request to the proxy (P2) that is responsible for the domain (office.com), adding a Record-Route header field value to keep itself in the routing path (M7):

    INVITE sip:[email protected] SIP/2.0
    Contact: sip:[email protected]
    Record-Route: <sip:p3.network.com;lr>
    Record-Route: <sip:p1.example.com;lr>

P2 gets this. It is responsible for office.com, so it runs a location service and rewrites the Request-URI. It also adds a Record-Route header field value. There is no Route header field, so it resolves the new Request-URI to determine where to send the request (M8):

    INVITE sip:[email protected] SIP/2.0
    Contact: sip:[email protected]
    Record-Route: <sip:p2.office.com;lr>
    Record-Route: <sip:p3.network.com;lr>
    Record-Route: <sip:p1.example.com;lr>

The callee at u2.office.com gets this and responds with a 200 OK (M9):

    SIP/2.0 200 OK
    Contact: sip:[email protected]
    Record-Route: <sip:p2.office.com;lr>
    Record-Route: <sip:p3.network.com;lr>
    Record-Route: <sip:p1.example.com;lr>
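The first three processing steps and the record-routing seen in messages M5 through M8 can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not a proxy implementation: the user names callee and caller, the toy location-service table, and the helper names host_of, Request, and Proxy are hypothetical (the user parts of the URIs are not recoverable from the text), and URI parsing is reduced to simple string splitting. Steps 4 and 5 (maddr handling) are not modeled.

```python
from dataclasses import dataclass, field


def host_of(uri):
    """Return the host part of a simple sip: URI (no port or IPv6 handling)."""
    rest = uri.split(":", 1)[1]   # drop the "sip:" scheme
    rest = rest.split("@")[-1]    # drop the user part, if any
    return rest.split(";")[0]     # drop URI parameters such as ;lr


@dataclass
class Request:
    request_uri: str
    route: list = field(default_factory=list)         # topmost value first
    record_route: list = field(default_factory=list)  # topmost value first


@dataclass
class Proxy:
    uri: str        # URI this proxy record-routes
    domains: set    # domains this proxy is responsible for
    location: dict = field(default_factory=dict)  # hypothetical location service

    def process(self, req):
        # Step 1: rewrite the Request-URI only if this proxy owns the resource.
        if host_of(req.request_uri) in self.domains:
            req.request_uri = self.location.get(req.request_uri, req.request_uri)
        # Step 2: remove the topmost Route value if it indicates this proxy.
        if req.route and host_of(req.route[0]) == host_of(self.uri):
            req.route.pop(0)
        # Record-route: place our URI on top so in-dialog requests come back here.
        req.record_route.insert(0, self.uri)
        # Step 3: next hop is the topmost Route value, else the Request-URI.
        return req.route[0] if req.route else req.request_uri


p1 = Proxy("sip:p1.example.com;lr", {"example.com"})
p3 = Proxy("sip:p3.network.com;lr", {"network.com"})
p2 = Proxy("sip:p2.office.com;lr", {"office.com"},
           {"sip:callee@office.com": "sip:callee@u2.office.com"})

invite = Request("sip:callee@office.com")
hops = [proxy.process(invite) for proxy in (p1, p3, p2)]

# The callee keeps the Record-Route order as received; the caller reverses it.
callee_route_set = list(invite.record_route)
caller_route_set = list(reversed(invite.record_route))

print(hops)
print(callee_route_set)
print(caller_route_set)
```

Only P2 rewrites the Request-URI, because it is the only proxy responsible for office.com, and the Record-Route values accumulate with the most recent proxy on top, as in M8.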
The callee at u2 also sets its dialog state’s remote target URI to sip:[email protected] and its route set to

    (<sip:p2.office.com;lr>,<sip:p3.network.com;lr>,<sip:p1.example.com;lr>)

[...] third proxy implements the strict-routing procedures specified in RFC 2543 and many works in progress (not relevant to Figure 9.1):

    U1 -> P1 -> P2 -> P3 -> P4 -> U2

[...]

Since P1 is not responsible for u1.example.com and there is no Route header field, P1 will forward the request to u1.example.com based on the Request-URI.

9.5 Rewriting Record-Route Header Field Values

In this scenario, U1 and U2 are in different private namespaces and they enter a dialog through a proxy P1, which acts as a gateway between the namespaces (not relevant to Figure 9.1).

    U1 -> P1 -> U2

U1 sends

    INVITE sip:[email protected] SIP/2.0
    Contact: <sip:[email protected]>

P1 uses its location service and sends the following to U2:

    INVITE sip:[email protected] SIP/2.0
    Contact: <sip:[email protected]>
    Record-Route: <sip:gateway.rightprivatespace.com;lr>

U2 sends this 200 (OK) back to P1:

    SIP/2.0 200 OK
    Contact: <sip:[email protected]>
    Record-Route: <sip:gateway.rightprivatespace.com;lr>

P1 rewrites its Record-Route header parameter to provide a value that U1 will find useful, and sends the following to U1:

    SIP/2.0 200 OK
    Contact: <sip:[email protected]>
    Record-Route: <sip:gateway.leftprivatespace.com;lr>

Later, U1 sends the following BYE request to P1:

    BYE sip:[email protected] SIP/2.0
    Route: <sip:gateway.leftprivatespace.com;lr>

which P1 forwards to U2 as

    BYE sip:[email protected] SIP/2.0

9.5.1 Problems and Recommendation

Record-Route rewriting in responses described earlier is not the optimal way of handling multihomed and transport protocol-switching situations. Furthermore, the consequence of rewriting is that the route set seen by the caller is different from the route set seen by the callee, and this has at least two negative implications (RFC 5658):

◾◾ The route set gets edited by the proxy in the response and, as a result, the callee cannot sign the route set. This implies that end-to-end protection of the route set cannot be supported by the protocol, violating the Internet’s principles of openness and breaking end-to-end connectivity.
◾◾ A proxy must implement special multihoming logic in view of multiple interfaces. When a proxy forwards a request, it usually performs an output interface calculation and writes information resolving to the output interface into the URI of the Record-Route header. When handling responses, the proxy must inspect the Record-Route header(s), look for an input interface, and selectively edit them to reference the correct output interface. This implies more processing, as it must be done for all responses forwarded by the proxy.

The double Record-Route approach described next is recommended to avoid rewriting the Record-Route (RFC 5658). This recommendation applies to all uses of Record-Route rewriting by proxies, including transport protocol switching and multihomed proxies.

9.6 Record-Routing with Globally Routable UA URI

The Globally Routable UA URI (GRUU) defined in RFC 5627 has been described earlier. The GRUU is a URI that routes to a specific UA instance. However, there are two distinct requirements for record-routing (R-R) for the GRUU:

◾◾ R-R in the originating domain
◾◾ R-R in the terminating domain

These requirements avoid unnecessary, and possibly problematic, spirals of requests.

If (i) an originating authoritative proxy receives a dialog-forming request, (ii) and the Contact header field contains a GRUU in the domain of the proxy, (iii) and that GRUU is a valid one in the domain of the proxy, (iv) and that GRUU is associated with the AOR matching the authenticated identity of the requestor (assuming such authentication has been performed), (v) and the request contains Record-Route header fields, then the authoritative proxy must record-route. If all of these conditions are true, except that the GRUU is
associated with an AOR that did not match the authenticated identity of the requestor, it is recommended that the proxy reject the request with a 403 (Forbidden) response.

If (i) a terminating authoritative proxy receives a dialog-forming request, (ii) and the Request-URI contains a URI in the location service (either a GRUU or an AOR), (iii) and the contact selected for sending the request has an instance ID and is bound to a GRUU, (iv) and the registration contains a Path URI, then the authoritative proxy must record-route.

If a proxy is in either the originating or terminating domain, but is not an authoritative proxy, the proxy may record-route. If a proxy in the terminating domain requires mid-dialog requests to pass through it for whatever reason (firewall traversal, accounting, etc.), the proxy must still record-route and must not assume that a UA will utilize its GRUU in the Contact header field of its response (which would cause mid-dialog requests to pass through the proxy without record-routing).

Implementers should note that, if a UA uses a GRUU in its contact, and a proxy inserted itself into the Path header field of a registration, that proxy will be receiving mid-dialog requests regardless of whether it record-routes or not. The only distinction is what URI the proxy will see in the topmost Route header field of mid-dialog requests. If the proxy record-routes, it will see that URI. If it does not, it will see the Path URI it inserted.

9.7 Double Record-Route

The Record-Route rewriting described earlier creates problems when a proxy has to change some of the Record-Route URI parameters between its incoming and outgoing interfaces (multihomed proxies, transport protocol switching, IPv4 to IPv6 scenarios, and others). The question then arises of what should be put in the Record-Route header(s). It is not possible to make one header have the characteristics of both interfaces at the same time. Record-Route rewriting in responses is not the optimal way of handling multihomed and transport protocol-switching situations. Additionally, the consequence of rewriting is that the route set seen by the caller is different from the route set seen by the callee, and this has at least two negative implications:

1. The callee cannot sign the route set because it gets edited by the proxy in the response. Consequently, end-to-end protection of the route set cannot be supported by the protocol. This means the Internet’s principles of openness and end-to-end connectivity are broken.
2. A proxy must implement special multihomed logic. During the request-forwarding phase, it performs an output interface calculation and writes information resolving to the output interface into the URI of the Record-Route header. When handling responses, the proxy must inspect the Record-Route header(s), look for an input interface, and selectively edit them to reference the correct output interface. Since this lookup has to be done for all responses forwarded by the proxy, this technique adds CPU overhead.

The serious drawbacks of the rewriting technique are removed by the double Record-Routing scheme (RFC 5658). This technique is also backward compatible with the rewriting of the Record-Route as described above, and can also solve the spiraling request problem. When the double Record-Routing scheme is used, the proxy will have to handle the subsequent in-dialog request(s) as a spiral, and consequently devote resources to maintain the transactions required to handle the spiral.

To avoid a spiral, the proxy can be smart and scan an extra Route header ahead to determine whether the request will spiral through it. If it does, it can optimize the second spiral through itself. Even though this is an implementation decision, it is much more efficient to avoid spiraling. Therefore, implementers may have a proxy remove two Route headers instead of one when using double Record-Routing. The following example shows a basic call flow (Figure 9.2) using double Record-Routing in a multihomed IPv4 to IPv6 proxy (RFC 5658), and annotates the dialog state on each SIP UA.

SIP Proxy P1 is dual-homed in the IPv4 and IPv6 networks, acting as the proxy serving UAs U1 (howell.domain.com) and U2 (atlanta.domain.com), respectively. The address configurations of dual-stack proxy P1 are 192.0.2.254:5060 on the IPv4 interface and 2001:db8::1 on the IPv6 interface. The call flow messages with double Record-Route are explained in Figure 9.2, omitting some mandatory SIP headers for simplicity. The caller (U1) sends INVITE to the callee (U2) via its outgoing proxy (P1):

F1 INVITE U1 -> P1 (192.0.2.254:5060)

    INVITE sip:[email protected] SIP/2.0
    Route: <sip:192.0.2.254:5060;lr>
    From: Joe <[email protected]>;tag=1234
    To: Ken <sip:[email protected]>
    Contact: <sip:[email protected]>

The proxy (P1) receives the message (F1) from Joe (U1) and adds a Record-Route header field for each of its IPv4 and IPv6 interfaces, and then it sends the INVITE message (F2) to Ken (U2):

F2 INVITE P1 (2001:db8::1) -> U2

    INVITE sip:[email protected] SIP/2.0
    Record-Route: <sip:[2001:db8::1];lr>
[Figure 9.2 panels: (a) Proxy (P1); (b) U1 (IPv4), P1 (IPv4/IPv6), U2 (IPv6): INVITE (F1), INVITE (F2), 100 Trying, 200 OK (F3), 200 OK (F4); (c) U1, TCP, P1, UDP, U2: INVITE (M1), INVITE (M2), 100 Trying, 200 OK (M3), 200 OK (M4)]
Figure 9.2 Example of double Record-Route: (a) SIP and IP network configuration, (b) call flows with IPv4–IPv6 multihomed proxy, and (c) call flows with TCP/UDP transport protocol switching. (Copyright IETF. Reproduced with permission.)
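The option of removing two Route headers at once, so that an in-dialog request does not spiral through a double record-routing proxy, can be sketched as follows. This is a minimal illustration under stated assumptions rather than a prescribed algorithm: the helper name strip_own_routes is hypothetical, Route values are compared as exact strings (a real proxy would compare parsed URIs), and the host p9.example.net is an invented placeholder for some other proxy.

```python
def strip_own_routes(route, own_uris, max_strip=2):
    """Remove up to max_strip leading Route values that indicate this proxy.

    route    -- list of Route URIs, topmost first
    own_uris -- set of URIs this proxy record-routed (one per interface)
    Returns (remaining_route, number_removed).
    """
    remaining = list(route)
    removed = 0
    while remaining and removed < max_strip and remaining[0] in own_uris:
        remaining.pop(0)
        removed += 1
    return remaining, removed


# The two interfaces of the dual-homed proxy P1 from the example.
P1_URIS = {"sip:192.0.2.254:5060;lr", "sip:[2001:db8::1];lr"}

# Route set carried by an in-dialog request from the caller: both values are P1.
ack_route = ["sip:192.0.2.254:5060;lr", "sip:[2001:db8::1];lr"]
remaining, removed = strip_own_routes(ack_route, P1_URIS)
print(remaining, removed)

# If the second value belongs to another proxy, only one value is removed.
other_route = ["sip:192.0.2.254:5060;lr", "sip:p9.example.net;lr"]
print(strip_own_routes(other_route, P1_URIS))
```

Looking one Route value ahead lets the proxy forward the request directly from its outgoing interface instead of handling it a second time.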
[...]

    Contact: <sip:ken@[2001:db8::33]>

Joe’s UA (U1) receives the 200 OK message (F4) from the proxy (P1), and the dialog state at Joe’s UA (U1) can be described as follows:

    Local URI = sip:[email protected]
    Remote URI = sip:[email protected]
    Remote target = sip:ken@[2001:db8::33]
    Route Set = sip:192.0.2.254:5060;lr
                sip:[2001:db8::1];lr

Now Joe’s UA (U1) sends the ACK message (F5) via its outgoing proxy (P1), adding both routes in the Route header field:

F5 ACK U1 -> P1 (192.0.2.254:5060)

    ACK sip:ken@[2001:db8::33] SIP/2.0
    Route: <sip:192.0.2.254:5060;lr>
    Route: <sip:[2001:db8::1];lr>
    From: Joe <sip:[email protected]>;tag=1234
    To: Ken <sip:[email protected]>;tag=4567

The proxy (P1) receives the ACK message (F5) from Joe’s UA (U1) and forwards the ACK message (F6) to Ken’s UA (U2) after removing both Route header fields:

F6 ACK P1 (2001:db8::1) -> U2

    ACK sip:ken@[2001:db8::33] SIP/2.0
    From: Joe <sip:[email protected]>;tag=1234
    To: Ken <sip:[email protected]>;tag=4567

The session is now established between Joe and Ken. At the end of the session, Ken’s UA (U2) sends the BYE message (F7) via the proxy (P1), adding both routes in the Route header fields:

F7 BYE U2 -> P1 (2001:db8::1)

9.8 Transport Parameter Usage Problems and Remedies

We consider the scenario shown in Figure 9.2c to illustrate the transport protocol-switching problems (RFC 5658), where a SIP proxy switches the transport protocol from the Transmission Control Protocol (TCP) to the User Datagram Protocol (UDP). In this example, the proxy (P1), responsible for the domain howell.domain.com, receives a request from Joe’s UA (U1), which uses the TCP transport protocol. The proxy (P1) sends this request to Ken’s UA (U2), which registers with a Contact specifying UDP as the transport protocol.

9.8.1 UA Implementation

We assume that the proxy (P1) receives an initial request from Joe over TCP and forwards it to Ken over UDP (Figure 9.2c). For subsequent requests, it is expected that TCP could continue to be used between Joe and P1, and UDP between P1 and Ken. However, this cannot happen if a numeric IP address is used and no transport parameter is set on the Record-Route URI. This happens because of the procedures described in RFC 3263. The call flows with messages M1–M6 are given below, omitting some mandatory parameters of SIP for simplicity:

M1 INVITE Joe’s UA (U1) -> Proxy (P1) (192.0.2.1/tcp)

    INVITE sip:[email protected] SIP/2.0
    Route: <sip:192.0.2.1;lr;transport=tcp>
    From: Joe <sip:[email protected]>;tag=1234
    To: Ken <sip:[email protected]>
    Contact: <sip:[email protected];transport=tcp>

M2 INVITE Proxy (P1) -> Ken’s UA (U2) (u2.atlanta.network.com/udp)
M3 200 OK Ken’s UA (U2) -> Proxy (P1) (192.0.2.1/udp)

    SIP/2.0 200 OK
    Record-Route: <sip:192.0.2.1;lr>
    From: Joe <sip:[email protected]>;tag=1234
    To: Ken <sip:[email protected]>;tag=4567
    Contact: <sip:[email protected]>

M4 200 OK Proxy (P1) -> Joe’s UA (U1) (u1.howell.network.com/tcp)

    SIP/2.0 200 OK
    Record-Route: <sip:192.0.2.1;lr>
    From: Joe <sip:[email protected]>;tag=1234
    To: Ken <sip:[email protected]>;tag=4567
    Contact: <sip:[email protected]>

Dialog State at Joe’s UA (U1):

    Local URI = sip:[email protected]
    Remote URI = sip:[email protected]
    Remote target = sip:[email protected]
    Route Set = sip:192.0.2.1;lr

M5 ACK Joe’s UA (U1) -> Proxy (P1) (192.0.2.1/udp)

    ACK sip:[email protected] SIP/2.0
    Route: <sip:192.0.2.1;lr>
    From: Joe <sip:[email protected]>;tag=1234
    To: Ken <sip:[email protected]>;tag=4567

M6 BYE Ken’s UA (U2) -> Proxy (P1) (192.0.2.1/udp)

[...] parameter is present here), and no Naming Authority Pointer (NAPTR) request will be performed since this is a numeric IP address. In general, interoperability problems arise when a UA (e.g., U1) tries to send the ACK: it is not ready to change its transport protocol for a mid-dialog request and simply fails to do so, requiring the proxy implementer to insert the transport protocol in the Record-Route URI.

A more important generic question is this: What happens if the proxy had Record-Routed its logical name (howell.network.com)? The protocol-switching problem can be avoided only if the resulting transport protocol per the procedures described in RFC 3263 (see Section 8.2.4) is UDP, since Ken is to be contacted over UDP in this example. For any other resulting transport protocol, the transport protocol-switching problems will occur as described above (RFC 5658). Also, if one of the UAs sends an initial request using a different transport than the one retrieved from the DNS, this scenario would be problematic.

In practice, there are multiple situations where UA implementations do not use logical names and NAPTR records when sending an initial request to a proxy. This happens, for instance, when (RFC 5658)

1. UAs offer the ability to choose the transport to be used for initial requests, even if they support RFC 3263 (see Section 8.2.4). This is a frequent UA functionality that is justified by the following use cases:
   – When it is not possible to change the DNS server configuration and the implementation does not support all the transport protocols that could be configured by default in DNS (e.g., TLS)
   – When the end user wants to choose his transport protocol for whatever reason, for example, needing to force TCP, avoiding UDP/congestion, retransmitting, fragmenting, or other problems
2. [...] subsequent request, it should rely on DNS records for that; thus, it should avoid statically configuring the outbound proxy with a numeric IP address. A logical name, with no transport parameter, should be used instead.
3. UAs do not support RFC 3263 (see Section 8.2.4) at all, or do not have any DNS server available. In that case, as illustrated previously, forcing Joe’s UA (U1) to switch from TCP to UDP between the initial request and subsequent request(s) is clearly not the desired default behavior, and it typically leads to interoperability problems. UA implementations should then be ready to change the transport protocol between initial and subsequent requests. In theory, any UA or proxy using UDP must also be prepared to use TCP for requests that exceed the size limit of the path maximum transmission unit (MTU), as described in RFC 3261 (see Section 3.13.1.1).

9.8.2 Proxy Implementation

To prevent UA implementation problems, and to maintain a reasonable level of interoperability, the situation can be improved on the proxy side. Thus, if the transport protocol changed between its incoming and outgoing sides, the proxy should use the double Record-Route technique and should add a transport parameter to each of the Record-Route URIs it inserts. When TLS is used as the transport on either side of the proxy, the URI placed in the Record-Route header field must encode a next hop that will be reached using TLS. There are two ways for this to work. The first way is for the URI placed in the Record-Route to be a SIPS URI. The second is for the URI placed in the Record-Route to be constructed such that application of the resolution procedures of RFC 3263 to that URI results in TLS being selected. Proxies compliant with this specification must not use a transport=tls parameter on the URI placed in the Record-Route because the transport=tls usage was deprecated in RFC 3261.

Record-Route rewriting may also be used. However, the recommendation to put a transport protocol parameter on the Record-Route URI does not apply when the proxy has changed the transport protocol due to the size of UDP requests as per RFC 3261 (see Section 3.13.1.1). Applied to the previous example, this means that one of the following processing options (RFC 5658) will be performed:

◾◾ Record-Route rewriting on responses: In the INVITE request sent in M2, the proxy puts the outgoing transport protocol in the transport parameter of the Record-Route URI. By doing so, Ken’s UA (U2) will correctly send its BYE request in M6 using the same transport protocol as previous messages of the same dialog. The proxy rewrites the Record-Route when processing the 200 OK response, changing the transport parameter on the fly to transport=tcp, so that the route set will appear to be <sip:192.0.2.1;lr;transport=tcp> for U1 and <sip:192.0.2.1;lr;transport=udp> for U2.

It is a common practice in proxy implementations to support double Record-Route and to insert the transport parameter in the Record-Route URI. This practice is acceptable as long as all SIP elements that may be in the path of subsequent requests support that transport. This restriction needs an explanation. Let us imagine we have two proxies, P1 at p1.howell.network.com and P2, on the path of an initial request. P1 is Record-Routing and changes the transport from UDP to Stream Control Transmission Protocol (SCTP) because the P2 URI resolves to SCTP applying RFC 3263 (see Section 8.2.4). Consequently, the proxy P1 inserts two Record-Route headers:

    Record-Route: <sip:p1.howell.network.com;transport=udp> and
    Record-Route: <sip:p1.howell.network.com;transport=sctp>.

The problem arises if P2 is not Record-Routing because the SIP element downstream of P2 will be asked to reach P1 using SCTP for any subsequent, in-dialog request from the callee, and this downstream SIP element may not support that transport. To handle this situation, RFC 5658 recommends that a proxy should apply the double Record-Routing technique as soon as it changes the transport protocol between its incoming and outgoing sides. If proxy P2 in the example above followed this recommendation, it would perform double Record-Routing and the downstream element would not be forced to send requests over a transport it does not support. By extension, a proxy should also insert a Record-Route header for any multihomed situation (such as the ones described here: scheme changes, sigcomp, IPv4/IPv6, transport changes) that may affect the processing of proxies on the path of subsequent requests.
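The proxy-side recommendation can be illustrated with a short sketch that builds the Record-Route values for a proxy whose incoming and outgoing transports differ. It is a simplified model under stated assumptions, not a reference implementation: the function names rr_uri and record_route_values are hypothetical, the outgoing-side value is placed on top (an ordering assumed here from the double Record-Route call flow in Section 9.7), TLS is expressed with a SIPS URI (the first of the two ways described above), and the exception for transport changes caused by the UDP size limit is not modeled.

```python
def rr_uri(host, transport=None):
    """Build one Record-Route value for the given host and transport."""
    if transport == "tls":
        # transport=tls is deprecated, so a SIPS URI is used instead.
        return f"<sips:{host};lr>"
    if transport is None:
        return f"<sip:{host};lr>"
    return f"<sip:{host};lr;transport={transport}>"


def record_route_values(host, in_transport, out_transport):
    """Return the Record-Route values (topmost first) a proxy would insert."""
    if in_transport == out_transport:
        # No transport switch: a single Record-Route value is enough.
        return [rr_uri(host, "tls" if in_transport == "tls" else None)]
    # Transport switch: double Record-Route, one value per side,
    # each carrying the transport used on that side.
    return [rr_uri(host, out_transport), rr_uri(host, in_transport)]


# Proxy P1 from the example: TCP toward Joe (U1), UDP toward Ken (U2).
values = record_route_values("192.0.2.1", "tcp", "udp")
callee_first_hop = values[0]                   # the callee uses the received order
caller_first_hop = list(reversed(values))[0]   # the caller reverses the order

print(values)
print(callee_first_hop)
print(caller_first_hop)
```

With these values Ken’s UA reaches the proxy over UDP and Joe’s UA over TCP, matching the route sets described for U2 and U1, but without the proxy having to edit the Record-Route in the response.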
This results in a hybrid way of computing the destination of the response. Half of the information (specifically, the IP address) is taken from the IP packet headers, and the other half (specifically, the port) from the SIP message headers. SIP operates in this manner so that a server can listen for all messages, both requests and responses, on a single IP address and port. This helps improve scalability. However, this behavior is not desirable in many cases, most notably when the client is behind a network address translator (NAT). In that case, the response will not properly traverse the NAT, since it will not match the binding established with the request.

Furthermore, there is currently no way for a client to examine a response and determine the source port that the server saw in the corresponding request. In RFC 3261, SIP provides the client with the source IP address that the server saw in the request, but not the port. The source IP address is conveyed in the received parameter in the topmost Via header field value of the response. This information has proved useful for basic NAT traversal, debugging purposes, and support of multihomed hosts. However, it is incomplete without the port information. RFC 3581 defines a new parameter for the Via header field, called rport, that allows a client to request that the server send the response back to the source IP address and port where the request came from. The rport parameter is analogous to the received parameter, except that rport contains a port number, not the IP address.

9.8.3.1 Client Behavior

The client behavior specified here affects the transport processing defined in SIP (RFC 3261, see Section 3.1). A client compliant with this specification (clients include UACs and proxies) may include an rport parameter in the top Via header field value of requests it generates. This parameter must have no value; it serves as a flag to indicate to the server that this extension is supported and requested for the transaction. When the client sends the request, if the request is sent using UDP, the client must be prepared to receive the response on the same IP address and port it used to populate the source IP address and source port of the request. For backwards compatibility, the client must still be prepared to receive a response on the port indicated in the sent-by field of the topmost Via header field value, as specified in SIP (RFC 3261, see Section 3.1.2.1).

When there is a NAT between the client and server, the request will create (or refresh) a binding in the NAT. This binding must remain in existence for the duration of the transaction in order for the client to receive the response. Most UDP NAT bindings appear to have a timeout of about 1 minute. This exceeds the duration of non-INVITE transactions. Therefore, responses to a non-INVITE request will be received while the binding is still in existence. INVITE transactions can take an arbitrarily long amount of time to complete. As a result, the binding may expire before a final response is received. To keep the binding fresh, the client should retransmit its INVITE every 20 seconds or so. These retransmissions will need to take place even after receiving a provisional response. A UA may execute the binding lifetime discovery algorithm defined in RFC 5389 (see Section 14.3) to determine the actual binding lifetime in the NAT. If it is longer than 1 minute, the client should increase the interval for request retransmissions up to half of the discovered lifetime. If it is shorter than 1 minute, it should decrease the interval for request retransmissions to half of the discovered lifetime. Note that discovery of binding lifetimes can be unreliable (RFC 5389, see Section 14.3).

9.8.3.2 Server Behavior

The server behavior specified here affects the transport processing defined in SIP (RFC 3261, see Section 3.13.2). When a server compliant with this specification (which can be a proxy or UA server [UAS]) receives a request, it examines the topmost Via header field value. If this Via header field value contains an rport parameter with no value, it must set the value of the parameter to the source port of the request. This is analogous to the way in which a server will insert the received parameter into the topmost Via header field value. In fact, the server must insert a received parameter containing the source IP address that the request came from, even if it is identical to the value of the sent-by component. Note that this processing takes place independent of the transport protocol. When a server attempts to send a response, it examines the topmost Via header field value of that response. If the sent-protocol component indicates an unreliable unicast transport protocol, such as UDP, and there is no maddr parameter, but there is both a received parameter and an rport parameter, the response must be sent to the IP address listed in the received parameter, and the port in the rport parameter. The response must be sent from the same address and port that the corresponding request was received on.

This effectively adds a new processing step between bullets two and three in SIP (RFC 3261, see Section 3.13.2.2). The response must be sent from the same address and port that the request was received on in order to traverse symmetric NATs. When a server is listening for requests on multiple ports or interfaces, it will need to remember the one on which the request was received. For a stateful proxy, storing this information for the duration of the transaction is not an issue. However, a stateless proxy does not store state between a request and its response, and therefore cannot remember the address and port on which a request was received. To properly implement this specification, a stateless proxy can
encode the destination address and port of a request into the Via header field value that it inserts. When the response arrives, it can extract this information and use it to forward the response.

9.8.3.3 Example

A client sends an INVITE to a proxy server that looks like, in part

    INVITE sip:[email protected] SIP/2.0
    Via: SIP/2.0/UDP 10.1.1.1:4540;rport;branch=z9hG4bKkjshdyff

This INVITE is sent with a source port of 4540 and a source IP address of 10.1.1.1. The proxy is at 192.0.2.2 (proxy.example.com), listening on both port 5060 and 5070. The client sends the request to port 5060. The request passes through a NAT on the way to the proxy, so that the source IP address appears as 192.0.2.1 and the source port as 9988. The proxy forwards the request, but not before appending a value to the rport parameter in the proxied request:

    INVITE sip:[email protected] SIP/2.0
    Via: SIP/2.0/UDP proxy.example.com;branch=z9hG4bKkjsh77
    Via: SIP/2.0/UDP 10.1.1.1:4540;received=192.0.2.1;rport=9988;branch=z9hG4bKkjshdyff

This request generates a response that arrives at the proxy:

    SIP/2.0 200 OK
    Via: SIP/2.0/UDP proxy.example.com;branch=z9hG4bKkjsh77
    Via: SIP/2.0/UDP 10.1.1.1:4540;received=192.0.2.1;rport=9988;branch=z9hG4bKkjshdyff

The proxy strips its top Via header field value, and then examines the next one. It contains both a received parameter and an rport parameter. The server follows the rules specified in Section 9.8.3.2 and sends the response to IP address 192.0.2.1, port 9988, and sends it from port 5060 on 192.0.2.2:

    SIP/2.0 200 OK
    Via: SIP/2.0/UDP 10.1.1.1:4540;received=192.0.2.1;rport=9988;branch=z9hG4bKkjshdyff

This packet matches the binding created by the initial request. Therefore, the NAT rewrites the destination address of this packet back to 10.1.1.1, and the destination port back to 4540. It forwards this response to the client, which is listening for the response on that address and port. The client properly receives the response.

9.9 Caller Preferences-Based Routing

9.9.1 Overview

When a SIP server defined in RFC 3261 (see Section 2.4.4.3) receives a request, there are a number of decisions it can make regarding the processing of the request. These include

◾◾ Whether to proxy or redirect the request
◾◾ Which URIs to proxy or redirect to
◾◾ Whether to fork or not
◾◾ Whether to search recursively or not
◾◾ Whether to search in parallel or sequentially

The server can base these decisions on any local policy. This policy can be statically configured, or can be based on execution of a program or database access. However, the administrator of the server is not the only entity with an interest in request processing. There are at least three parties that have an interest:

◾◾ The administrator of the server
◾◾ The user who sent the request
◾◾ The user to whom the request is directed

The directives of the administrator are embedded in the policy of the server. The preferences of the user to whom the request is directed (referred to as the callee, even though the request method may not be INVITE) can be expressed most easily through a script written in some type of scripting language, such as the Call Processing Language (CPL) specified in RFC 2824. However, no mechanism exists to incorporate the preferences of the user that sent the request (also referred to as the caller, even though the request method may not be INVITE). For example, the caller might want to speak to a specific user, but wants to reach them only at work, because the call is a business call. As another example, the caller might want to reach a user, but not their voice mail, since it is important that the caller talk to the called party. In both of these examples, the caller’s preference amounts to having a proxy make a particular routing choice on the basis of the preferences of the caller. This extension allows the caller to have these preferences met. It does so by specifying mechanisms by which a caller can provide preferences on processing of a request. There are two types of preferences. The first, called request handling preferences, is encapsulated in the Request-Disposition header field and provides specific request handling directives for a server. The other, called feature preferences, is present in the Accept-Contact and Reject-Contact header fields. These allow the caller to provide a feature set defined in RFC 2533 (see Section 3.4.3) that expresses its preferences on the characteristics of the UA that is to be reached.
These are matched with a feature set provided by a UA to its registrar specified in RFC 3840 (see Section 3.4). The extension is of a very general purpose, and not tied to a particular service. Rather, it is a tool that can be used in the development of many services. One example of a service enabled by caller preferences is a one number service. A user can have a single identity (their SIP URI) for all of their devices: their cell phone, personal digital assistant (PDA), work phone, home phone, and so on. If the caller wants to reach the user at their business phone, they simply select business phone from a pull-down menu of options when calling that URI. Users would no longer need to maintain and distribute separate identities for each device. RFC 3841, which is described in this section, specifies a set of extensions to SIP that allow a caller to express preferences about request handling in servers. These preferences include the ability to select which URI a request gets routed to, and to specify certain request handling directives in proxies and redirect servers. It does so by defining three new request header fields (Accept-Contact, Reject-Contact, and Request-Disposition) that specify the caller's preferences. Note that we have defined the following items in Section 2.2: caller, feature preferences, request handling preferences, and explicit preference.

9.9.2 Operation
When a caller sends a request, it can optionally include new header fields that request certain handling at a server. These preferences fall into two categories. The first category, called request handling preferences, is carried in the Request-Disposition header field. It describes specific behavior that is desired at a server. Request handling preferences include whether the caller wishes the server to proxy or redirect, and whether sequential or parallel search is desired. These preferences can be applied at every proxy or redirect server on the call signaling path. The second category of preferences, called feature preferences, is carried in the Accept-Contact and Reject-Contact header fields. These header fields contain feature sets, represented by the same feature parameters that are used to indicate capabilities defined in RFC 3840 (see Section 3.4). Here, the feature parameters represent the caller's preferences. The Accept-Contact header field contains feature sets that describe UAs that the caller would like to reach. The Reject-Contact header field contains feature sets that, if matched by a UA, imply that the request should not be routed to that UA.
Proxies use the information in the Accept-Contact and Reject-Contact header fields to select among contacts in their target set. When neither of those header fields is present, the proxy computes implicit preferences from the request. These are caller preferences that are not explicitly placed into the request but can be inferred from the presence of other message components. As an example, if the request method is INVITE, this is an implicit preference to route the call to a UA that supports the INVITE method. Both request handling and feature preferences can appear in any request, not just INVITE. However, they are only useful in requests where proxies need to determine a request target. If the domain in the request URI is not owned by any proxies along the request path, those proxies will never access a location service, and therefore, never have the opportunity to apply the caller preferences. This makes sense because typically, the request URI will identify a UAS for mid-dialog requests. In those cases, the routing decisions were already made on the initial request, and it makes no sense to redo them for subsequent requests in the dialog.

9.9.3 UAC Behavior
A caller wishing to express preferences for a request includes Accept-Contact, Reject-Contact, or Request-Disposition header fields in the request, depending on their particular preferences. No additional behavior is required after the request is sent. The Accept-Contact, Reject-Contact, and Request-Disposition header fields in an ACK for a non-2xx final response, or in a CANCEL request, must be equal to the values in the original request being acknowledged or cancelled. This is to ensure proper operation through stateless proxies. If the UAC wants to determine whether servers along the path understand the header fields described in this specification, it includes a Proxy-Require header field with a value of pref specified in RFC 3840 (see Section 3.4) in its request. If the request should fail with a 420 response code, the UAC knows that the extension is not supported. In that case, it should retry, and may decide whether or not to use caller preferences. A UA should only use Proxy-Require if knowledge about support is essential for handling of the request. Note that, in any case, caller preferences can only be considered preferences; there is no guarantee that the requested service will be executed. As such, inclusion of a Proxy-Require header field does not mean that the preferences will be executed, just that the caller preferences extension is understood by the proxies.

9.9.3.1 Request Handling Preferences
The Request-Disposition header field specifies caller preferences for how a server should process a request. Its value is a list of tokens, each of which specifies a particular processing directive. The syntax of the header field can be found in Section 2.4.1.2.

9.9.3.2 Feature Set Preferences
A UAC can indicate caller preferences for the capabilities of a UA that should be reached or not reached as a result of sending a SIP request. To do that, it adds one or more
Accept-Contact and Reject-Contact header field values. Each header field value contains a set of feature parameters that define a feature set. The syntax of the header field can be found in Section 2.4.1.2. Each feature set is constructed as described in RFC 3840 (see Section 3.4). The feature sets placed into these header fields may overlap; that is, a UA may indicate preferences for feature sets that match according to the matching algorithm of RFC 2533 (see Section 3.4.3). A UAC can express explicit preferences for the methods and event packages supported by a UA. It is recommended that a UA include a term in an Accept-Contact feature set with the sip.methods feature tag (note, however, that even though the name of this feature tag is sip.methods, it would be encoded into the Accept-Contact header field as just methods), whose value includes the method of the request. When a UA sends a SUBSCRIBE request, it is recommended that a UA include a term in an Accept-Contact feature set with the sip.events feature tag, whose value includes the event package of the request. Whether these terms are placed into a new feature set, or whether they are included in each feature set, is at the discretion of the implementer. In most cases, the right effect is achieved by including a term in each feature set. As an example, the following Accept-Contact header field expresses a desire to route a call to a mobile device, using feature parameters taken from RFC 3840 (see Section 3.4):

Accept-Contact: *;mobility="mobile";methods="INVITE"

The Reject-Contact header field allows the UAC to specify that a UA should not be contacted if it matches any of the values of the header field. Each value of the Reject-Contact header field contains a "*", purely to align the syntax with guidelines for SIP extensions, and is parameterized by a set of feature parameters. Any UA whose capabilities match the feature set described by the feature parameters matches the value. The Accept-Contact header field allows the UAC to specify that a UA should be contacted if it matches some or all of the values of the header field. Each value of the Accept-Contact header field contains a "*", and is parameterized by a set of feature parameters. Any UA whose capabilities match the feature set described by the feature parameters matches the value. The precise behavior depends heavily on whether the require and explicit parameters are present.
When both of them are present, a proxy will only forward the request to contacts that have explicitly indicated that they support the desired feature set. Any others are discarded. As such, a UAC should only use require and explicit together when it wishes the call to fail unless a contact definitively matches. It is possible that a UA supports a desired feature but did not indicate it in its registration. When a UAC uses both explicit and require, such a contact would not be reached. As a result, this combination is often not the one a UAC will want.
When only require is present, it means that a contact will not be used if it does not match. If it does match, or if it is not known whether it is a complete match, the contact is still used. A UAC would use require alone when a nonmatching contact is useless. This is common for services where the request simply cannot be serviced without the necessary features. An example is support for specific methods or event packages. When only require is present, the proxy will also preferentially route the request to the UA that represents the best match. Here, best means that the UA has explicitly indicated that it supports more of the desired features than any other. Note, however, that this preferential routing will never override an ordering provided by the called party. The preferential routing will only choose among contacts of equal q-value.
When only explicit is present, it means that all contacts provided by the callee will be used. However, if the contact is not an explicit match, it is tried last among all other contacts with the same q-value. The principal difference, therefore, between this configuration and the usage of both require and explicit is the fallback behavior for contacts that do not match explicitly. Here, they are tried as a last resort. If require is also present, they are never tried.
Finally, if neither require nor explicit is present, it means that all contacts provided by the callee will be used. However, if the contact does not match, it is tried last among all other contacts with the same q-value. If it does match, the request is routed preferentially to the best match. This is a common configuration for preferences that, if not honored, will still allow for a successful call, and the greater the match, the better.

9.9.4 UAS Behavior
When a UAS compliant with this specification receives a request whose request-URI corresponds to one of its registered contacts, it should apply the behavior described later as if it were a proxy for the domain in the request-URI. The UAS acts as if its location database contains a single request target for the request-URI. That target is associated with a feature set. The feature set is the same as the one placed in the registration of the URI in the request-URI. If a UA had registered against multiple separate AORs, and the contacts registered for each had different capabilities, it will have used a different URI in each registration, so it can determine which feature set to use.
This processing occurs after the client authenticates and authorizes the request, but before the remainder of the general UAS processing described in RFC 3261 (see Section 3.1.3.1). If, after performing this processing, there are no URIs left in the target set, the UA should reject the request with a 480 response. If there is a URI remaining (there was only one to begin with), the UA proceeds with request processing as per RFC 3261 (see Section 3.1). Having a UAS perform the matching operations as if it were a proxy allows
certain caller preferences to be honored, even if the proxy does not support the extension. A UAS should process any queue directive present in a Request-Disposition header field in the request. All other directives must be ignored.

9.9.5 Proxy Behavior
Proxy behavior consists of two orthogonal sets of rules: one for processing the Request-Disposition header field and one for processing the URI and feature set preferences in the Accept-Contact and Reject-Contact header fields. In addition to processing these headers, a proxy may add one if not present, or add a value to an existing header field, as if it were a UAC. This is useful for a proxy to request processing in downstream proxies in the implementation of a feature. However, a proxy must not modify or remove an existing header field value. This is particularly important when S/MIME is used. The message signature could include the caller preferences header fields, allowing the UAS to verify that, even though proxies may have added header fields, the original caller preferences were still present.

9.9.5.1 Request-Disposition Processing
If the request contains a Request-Disposition header field and the server is the owner of the domain in the Request URI, the server should execute the directives described later, unless it has local policy configured to direct it otherwise.

9.9.5.2 Preference and Capability Matching
A proxy compliant with this specification must not apply the preferences matching operation described here to a request unless it is the owner of the domain in the request URI, and accessing a location service that has capabilities associated with request targets. However, if it is the owner of the domain, and accessing a location service that has capabilities associated with request targets, it should apply the processing described in this section. Typically, this is a proxy that is using a registration database to determine the request targets. However, if a proxy knows about capabilities through some other means, it should apply the processing defined here as well. If it does perform the processing, it must do so as described below. The processing is described through a conversion from the syntax described in this specification to RFC 2533 (see Section 3.4.3) syntax, followed by a matching operation and a sorting of resulting contact values. The usage of RFC 2533 syntax as an intermediate step is not required; it only serves as a useful tool to describe the behavior required of the proxy. A proxy can use any steps it likes, so long as the results are identical to the ones that would be achieved with the processing described here.

9.9.5.2.1 Extracting Explicit Preferences
The first step in proxy processing is to extract explicit preferences. To do that, it looks for the Accept-Contact and Reject-Contact header fields. For each value of those header fields, it extracts the feature parameters. These are the header field parameters whose name is audio, automata, class, duplex, data, control, mobility, description, events, priority, methods, extensions, schemes, application, video, language, type, isfocus, actor, or text, or whose name begins with a plus (+) (RFC 3840, see Section 3.4). The proxy converts all of those parameters to RFC 2533 syntax (see Section 3.4.3).
The result will be a set of feature set predicates in conjunctive normal form, each of which is associated with one of the two preference header fields. If there was a req-param associated with a header field value in the Accept-Contact header field, the feature set predicate derived from that header field value is said to have its require flag set. Similarly, if there was an explicit-param associated with a header field value in the Accept-Contact header field, the feature set predicate derived from that header field value is said to have its explicit flag set.

9.9.5.2.2 Extracting Implicit Preferences
If, and only if, the proxy did not find any explicit preferences in the request (because there was no Accept-Contact or Reject-Contact header field), the proxy extracts implicit preferences. These preferences are ones implied by the presence of other information in the request. First, the proxy creates a conjunction with no terms. This conjunction represents a feature set that will be associated with the Accept-Contact header field, as if it were included there. Note that there is no modification of the message implied, only an association for the purposes of processing. Furthermore, this feature set has its require flag set, but not its explicit flag. The proxy then adds terms to the conjunction for the two implicit preference types below.

◾◾ Methods
One implicit preference is the method. When a UAC sends a request with a specific method, it is an implicit preference to have the request routed only to UAs that support that method. To support this implicit preference, the proxy adds a term to the conjunction of the following form:

(sip.methods=[method of request])

◾◾ Event packages
For requests that establish a subscription per RFC 6665 (see Section 5.2), the Event header field is another expression of an implicit preference. It expresses a desire to have the request routed only to a server that supports the given event package. To support this implicit preference, the proxy adds a term to the conjunction of the following form:
(sip.events=[value of the Event header field])

9.9.5.2.3 Constructing Contact Predicates
The proxy then takes each URI in the target set (the set of URIs it is going to proxy or redirect to), and obtains its capabilities as an RFC 2533-formatted feature set predicate. This is called a contact predicate. If the target URI was obtained through a registration, the proxy computes the contact predicate by extracting the feature parameters from the Contact header field (see Section 2.8) and then converting them to a feature predicate. To extract the feature parameters, the proxy follows these steps:

◾◾ Create an initial, empty list of feature parameters.
◾◾ If the Contact URI parameters included the audio, automata, class, duplex, data, control, mobility, description, events, priority, methods, schemes, application, video, actor, language, isfocus, type, extensions, or text parameters, those are copied into the list.
◾◾ If any Contact URI parameter name begins with a "+", it is copied into the list if the list does not already contain that name with the plus removed. In other words, if the video feature parameter is in the list, the +video parameter would not be placed into the list. This conflict should never arise if the client were compliant with RFC 3840 (see Section 3.4), since it is illegal to use the + form for encoding of a feature tag in the base set.

If the URI in the target set had no feature parameters, it is said to be immune to caller preference processing. This means that the URI is removed from the target set temporarily, the caller preferences processing described below is executed, and then the URI is added back in.
Assuming the URI has feature parameters, they are converted to RFC 2533 syntax using the rules of RFC 2533 (see Section 3.4.3). The resulting predicate is associated with a q-value. If the contact predicate was learned through a REGISTER request, the q-value is equal to the q-value in the Contact header field parameter, else 1.0 if not specified. As an example, consider the following registered Contact header field:

Contact: <sip:[email protected]>;audio;video;mobility="fixed";
+sip.message="TRUE";other-param=66372;
methods="INVITE,OPTIONS,BYE,CANCEL,ACK";schemes="sip,http"

It would be converted to the following predicate:

(& (sip.audio=TRUE)
(sip.video=TRUE)
(sip.mobility=fixed)
(sip.message=TRUE)
(| (sip.methods=INVITE) (sip.methods=OPTIONS) (sip.methods=BYE)
(sip.methods=CANCEL) (sip.methods=ACK))
(| (sip.schemes=sip) (sip.schemes=http)))

Note that other-param was not considered a feature parameter, since it is neither a base tag nor did it begin with a leading +.

9.9.5.2.4 Matching
It is important to note that the proxy does not have to know anything about the meaning of the feature tags that it is comparing in order to perform the matching operation. The rules for performing the comparison depend on syntactic hints present in the values of each feature tag. For example, a predicate such as (foo>=4) implies that the feature tag foo is a numeric value. The matching rules in RFC 2533 (see Section 3.4.3) only require an implementation to know whether the feature tag is a numeric, token, or quoted string (Booleans can be treated as tokens). Quoted strings are always matched using a case-sensitive matching operation. Tokens are matched using case-insensitive matching. These two cases are differentiated by the presence of angle brackets around the feature tag value. When these brackets are present (i.e., ;+sip.foo="<value>"), it implies case-sensitive string comparison. When they are not present (i.e., ;+sip.bar="val"), it implies case insensitivity. Numerics are matched using normal mathematical comparisons.
First, the proxy applies the predicates associated with the Reject-Contact header field. For each contact predicate, each Reject-Contact predicate (i.e., each predicate associated with the Reject-Contact header field) is examined. If that Reject-Contact predicate contains a filter for a feature tag, and that feature tag is not present anywhere in the contact predicate, that Reject-Contact predicate is discarded for the processing of that contact predicate. If the Reject-Contact predicate is not discarded, it is matched with the contact predicate using the matching operation of RFC 2533 (see Section 3.4.3). If the result is a match, the URI corresponding to that contact predicate is discarded from the target set. The result is that Reject-Contact will only discard URIs where the UA has explicitly indicated support for the features that are not wanted.
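The Reject-Contact step can be illustrated with a minimal Python sketch. It rests on several assumptions that are not part of the specification: predicates are modeled as dictionaries mapping a feature tag to a set of lower-cased token values, the full RFC 2533 matching operation is replaced by a simple set intersection (no numerics, negation, or quoted strings), and the names `applies`, `matches`, `apply_reject`, and the labels u1 through u4 are hypothetical.

```python
def applies(reject, contact):
    """A Reject-Contact predicate is kept for a contact only if every
    feature tag it filters on is present in the contact predicate."""
    return all(tag in contact for tag in reject)


def matches(pred, contact):
    """Simplified stand-in for the RFC 2533 match: each shared tag must
    have at least one value in common (tokens compared in lower case)."""
    return all(values & contact[tag] for tag, values in pred.items() if tag in contact)


def apply_reject(target_set, reject_preds):
    """Return the contacts that survive all Reject-Contact predicates."""
    kept = {}
    for uri, contact in target_set.items():
        rejected = any(applies(r, contact) and matches(r, contact) for r in reject_preds)
        if not rejected:
            kept[uri] = contact
    return kept


# Hypothetical contact predicates, already converted from Contact parameters.
CONTACTS = {
    "u1": {"sip.audio": {"true"}, "sip.video": {"true"}, "sip.methods": {"invite", "bye"}},
    "u2": {"sip.audio": {"false"}, "sip.methods": {"invite"}, "sip.actor": {"msg-taker"}},
    "u3": {"sip.audio": {"true"}, "sip.actor": {"msg-taker"},
           "sip.methods": {"invite"}, "sip.video": {"true"}},
    "u4": {"sip.audio": {"true"}, "sip.methods": {"invite", "options"}},
}

# Reject-Contact: *;actor="msg-taker";video
REJECT = [{"sip.actor": {"msg-taker"}, "sip.video": {"true"}}]

REMAINING = apply_reject(CONTACTS, REJECT)
print(sorted(REMAINING))  # -> ['u1', 'u2', 'u4']
```

Only u3 advertises both the actor and video tags, so it is the only contact for which the Reject-Contact predicate is retained and matched. The other contacts are kept because the predicate is discarded for them, which reflects the rule that rejection requires explicitly indicated support.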
Next, the proxy applies the predicates associated with the Accept-Contact header field. For each contact that remains in the target set, the proxy constructs a matching set, Ms. Initially, this set contains all of the Accept-Contact predicates. Each of those predicates is examined. It is matched with the contact predicate using the matching operation of RFC 2533 (see Section 3.4.3). If the result is not a match, and the Accept-Contact predicate had its require flag set, the URI corresponding to that contact predicate is discarded from the target set. If the result is not a match but the Accept-Contact predicate did not have its require flag set, that Contact URI is not discarded from the target set; however, the Accept-Contact predicate is removed from the matching set for that contact.
For each contact that remains in the target set, the proxy computes a score for that contact against each predicate in the contact's matching set. Let the number of terms in the Accept-Contact predicate conjunction be equal to N. Each term in that predicate contains a single feature tag. If the contact predicate has a term containing that same feature tag, the score is incremented by 1/N. If the feature tag was not present in the contact predicate, the score remains unchanged. On the basis of these rules, the score can range between 0 and 1.
The require and explicit tags are then applied, resulting in potential modification of the score and the target set. This process is summarized in Figure 9.3. If the score for the contact predicate against that Accept-Contact predicate was less than 1 and the Accept-Contact predicate had an explicit tag, then one of two things happens: if the predicate also had a require tag, the Contact URI corresponding to that contact predicate is dropped; if the predicate did not have a require tag, the score is set to 0. If there was no explicit tag, the score is unchanged. The next step is to combine the scores and the q-values associated with the predicates in the matching set, to arrive at an overall caller preference, Qa. For those URIs in the target set that remain, there will be a score that indicates its match against each Accept-Contact predicate in the matching set. If there are M Accept-Contact predicates in the matching set, there will be M scores S1 through SM, for each contact. The overall caller preference, Qa, is the arithmetic average of S1 through SM.
At this point, any URIs that were removed from the target set because they were immune from caller preferences are added back in, and the Qa for that URI is set to 1.0. The purpose of the caller preference Qa is to provide an ordering for any contacts remaining in the target set, if the callee has not provided an ordering. To do this, the contacts remaining in the target set are sorted by the q-value provided by the callee. Once sorted, they are grouped into equivalence classes, such that all contacts with the same q-value are in the same equivalence class. Within each equivalence class, the contacts are then ordered on the basis of their Qa values. The result is an ordered list of contacts that is used by the proxy. If there were no URIs in the target set after the application of the processing in this section, and the caller preferences were based on implicit preferences, the processing in this section is discarded, and the original target set, ordered by their original q-values, is used. This handles the case where implicit preferences for the method or event packages resulted in the elimination of all potential targets. By going back to the original target set, those URIs will be tried, and result in the generation of a 405 or 489 response. The UAC can then use this information to try again, or report the error to the user. Without reverting to the original target set, the UAC would see a 480 response, and have no knowledge of why their request failed. Of course, the target set can also be empty after the application of explicit preferences.
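[Figure 9.3: flowchart. Compute the score; if the score equals 1, or no explicit tag is present, the score is unchanged; if the score is below 1 and an explicit tag is present, the contact is dropped when a require tag is set (T), otherwise the score is set to 0 (F).]
Figure 9.3 Applying the score. (Copyright IETF. Reproduced with permission.)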
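The Accept-Contact scoring and ordering procedure can be sketched in Python as follows. This is a sketch under stated assumptions, not a reference implementation: predicates are modeled as dictionaries from feature tag to a set of lower-cased tokens, the RFC 2533 match is approximated by set intersection on shared tags, and an empty matching set is given a Qa of 0.0 (a case the text above does not cover). The function names and the data layout are hypothetical; the sample data follows the worked example in Section 9.9.5.2.5, taken after the Reject-Contact step.

```python
from statistics import mean


def matches(terms, contact):
    """Simplified stand-in for the RFC 2533 match: every tag the contact
    also advertises must share at least one value with the predicate."""
    return all(values & contact[tag] for tag, values in terms.items() if tag in contact)


def score(terms, contact):
    """Add 1/N for each of the N predicate terms whose tag the contact advertises."""
    return sum(1 for tag in terms if tag in contact) / len(terms)


def caller_preference(contact, accept_preds):
    """Return Qa for a contact, or None if the contact is dropped."""
    scores = []
    for pred in accept_preds:
        if not matches(pred["terms"], contact):
            if pred["require"]:
                return None      # non-match with require: drop the contact
            continue             # otherwise the predicate leaves the matching set
        s = score(pred["terms"], contact)
        if s < 1 and pred["explicit"]:
            if pred["require"]:
                return None      # explicit and require: drop the contact
            s = 0.0              # explicit only: score is forced to 0
        scores.append(s)
    # Assumption: an empty matching set yields 0.0 (not specified in the text).
    return mean(scores) if scores else 0.0


def order_contacts(contacts, immune, accept_preds):
    """contacts and immune map URI -> (q, predicate); returns URIs in routing order."""
    ranked = [(uri, q, 1.0) for uri, (q, _) in immune.items()]  # immune: Qa = 1.0
    for uri, (q, pred) in contacts.items():
        qa = caller_preference(pred, accept_preds)
        if qa is not None:
            ranked.append((uri, q, qa))
    ranked.sort(key=lambda r: (-r[1], -r[2]))  # by q-value, then by Qa
    return [uri for uri, _, _ in ranked]


ACCEPT = [
    {"terms": {"sip.audio": {"true"}}, "require": True, "explicit": False},
    {"terms": {"sip.video": {"true"}}, "require": False, "explicit": True},
    {"terms": {"sip.methods": {"bye"}, "sip.class": {"business"}},
     "require": False, "explicit": False},
]
CONTACTS = {
    "u1": (0.2, {"sip.audio": {"true"}, "sip.video": {"true"},
                 "sip.methods": {"invite", "bye"}}),
    "u2": (0.2, {"sip.audio": {"false"}, "sip.methods": {"invite"},
                 "sip.actor": {"msg-taker"}}),
    "u4": (0.2, {"sip.audio": {"true"}, "sip.methods": {"invite", "options"}}),
}
IMMUNE = {"u5": (0.5, {})}

ORDER = order_contacts(CONTACTS, IMMUNE, ACCEPT)
print(ORDER)  # -> ['u5', 'u1', 'u4']
```

Sorting on the pair (q-value, Qa) is one compact way to express the equivalence classes: contacts with equal q-values fall into the same class, and Qa only breaks ties within a class, so the callee's ordering is never overridden.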
This will result in the generation of a 480 by the proxy. This behavior is acceptable, and indeed, desirable in the case of explicit preferences. When the caller makes an explicit preference, it is agreeing that its request might fail because of a preference mismatch. One might try to return an error indicating the capabilities of the callee, so that the caller could perhaps try again. However, doing so results in the leaking of potentially sensitive information to the caller without authorization from the callee, and therefore this specification does not provide a means for it. If a proxy server is recursing, it adds the Contact header fields returned in the redirect responses to the target set, and reapplies the caller preferences algorithm. If the server is redirecting, it returns all entries in the target set. It assigns q-values to those entries so that the ordering is identical to the ordering determined by the processing above. However, it must not include the feature parameters for the entries in the target set. If it did, the upstream proxy server would apply the same caller preferences once more, resulting in a double application of those preferences. If the redirect server does wish to include the feature parameters in the Contact header field, it must redirect using the original target set and original q-values before the application of caller preferences.

9.9.5.2.5 Example
Consider the following example, which is contrived but illustrative of the various components of the matching process. There are five registered Contacts for sip:[email protected], referred to below as u1 through u5 in the order listed. They are

Contact: sip:[email protected];audio;video;methods="INVITE,BYE";q=0.2
Contact: sip:[email protected];audio="FALSE";methods="INVITE";actor="msg-taker";q=0.2
Contact: sip:[email protected].com;audio;actor="msg-taker";methods="INVITE";video;q=0.3
Contact: sip:[email protected];audio;methods="INVITE,OPTIONS";q=0.2
Contact: sip:[email protected];q=0.5

An INVITE sent to sip:[email protected] contained the following caller preferences header fields:

Reject-Contact: *;actor="msg-taker";video
Accept-Contact: *;audio;require
Accept-Contact: *;video;explicit
Accept-Contact: *;methods="BYE";class="business";q=1.0

There are no implicit preferences in this example because explicit preferences are provided. The proxy first removes u5 from the target set, since it is immune from caller preferences processing. Next, the proxy processes the Reject-Contact header field. It is a match for all four remaining contacts, but only an explicit match for u3. That is because u3 is the only one that explicitly indicated support for video, and explicitly indicated it is a message taker. Thus, u3 gets discarded and the others remain.
Next, each of the remaining three contacts is compared against each of the three Accept-Contact predicates. u1 is a match to all three, earning a score of 1.0 for the first two predicates, and 0.5 for the third (the methods feature tag was present in the contact predicate, but the class tag was not). u2 does not match the first predicate. Because that predicate has a require tag, u2 is discarded. u4 matches the first predicate, earning a score of 1.0. u4 matches the second predicate; however, since the match is not explicit (the score is 0.0, in fact), the score is set to zero (it was already zero, so nothing changes). u4 does not match the third predicate. At this point, u1 and u4 remain. u1 matched all three Accept-Contact predicates, so its matching set contains all three, with scores of 1, 1, and 0.5. u4 matches the first two predicates, with scores of 1.0 and 0.0. Qa for u1 is (1 + 1 + 0.5)/3 = 0.83 and Qa for u4 is (1.0 + 0.0)/2 = 0.5. u5 is added back in with a Qa of 1.0.
Next, the remaining contacts in the target set are sorted by q-value. u5 has a q-value of 0.5, u1 has a q-value of 0.2 and so does u4. There are two equivalence classes. The first has a q-value of 0.5, and consists of just u5. Since there is only one member of the class, sorting within the class has no impact. The second equivalence class has a q-value of 0.2. Within that class, the two contacts, u1 and u4, are ordered on the basis of their values of Qa. u1 has a Qa of 0.83, and u4, a Qa of 0.5. Thus, u1 comes first, followed by u4. The resulting overall ordered set of contacts in the target set is u5, u1, and then u4.

9.9.6 Mapping Feature Parameters to a Predicate
Mapping between feature parameters and a feature set predicate, formatted according to the syntax of RFC 2533 (see Section 3.4.3), is trivial. It is just the opposite of the process described in RFC 3840 (see Section 3.4). Starting from a set of feature-param, the procedure is as follows. Construct a conjunction. Each term in the conjunction derives from one feature-param. If the feature-param has no value, it is treated, in the processing that follows, as if it had a value of TRUE. If the feature-param value is a tag-value-list, the element of the conjunction is a disjunction. There is one term in the disjunction for each tag-value in the tag-value-list. Consider now the construction of a filter from a tag-value. If the tag-value starts with an exclamation mark (!), the filter is of the form

(! <filter from remainder>)

where <filter from remainder> refers to the filter that would be constructed from the tag-value if the exclamation mark had not been present.
If the tag-value starts with an octothorpe (#), the filter is a numeric comparison. The comparator is either =, >=, <=, or a range based on the next characters in the phrase. If the next characters are =, >=, or <=, the filter is of the form

    (name comparator compare-value)

where name is the name of the feature parameter after it has been decoded (see the decoding process below), and the comparator is either =, >=, or <= depending on the initial characters in the phrase. If the remainder of the text in the tag-value after the equal sign contains a decimal point (implying a rational number), the decimal point is shifted right N times until it is an integer, I. The compare-value above is then set to I/10**N, where 10**N is the result of computing the number 10 to the Nth power.

If the value after the octothorpe is a number, the filter is a range. The format of the filter is

    (name=<remainder>)

where name is the feature tag after it has been decoded (see below), and <remainder> is the remainder of the text in the tag-value after the #, with any decimal numbers converted to a rational form, and the colon replaced by a double dot (..). If the tag-value does not begin with an octothorpe (it is a token-nobang or boolean), the filter is of the form

    (name=tag-value)

where name is the feature tag after it has been decoded (see below). If the feature-param contains a string-value (based on the fact that it begins with a left angle bracket [<] and ends with a right angle bracket [>]), the filter is of the form

    (name="qdtext")

Note the explicit usage of quotes around the qdtext, which indicate that the value is a string. In RFC 2533 (see Section 3.4.3), strings are compared using case-sensitive rules and tokens are compared using case-insensitive rules.

Feature tags, as specified in RFC 2506 (see Section 2.11), cannot be directly represented as header field parameters in the Contact, Accept-Contact, and Reject-Contact header fields. This is due to an inconsistency in the grammars and in the need to differentiate feature parameters from parameters used by other extensions. As such, feature tag values are encoded from RFC 2506 format to yield an enc-feature-tag, and then are decoded into RFC 2506 format. The decoding process is simple. If there is a leading plus (+) sign, it is removed. Any exclamation point (!) is converted to a colon (:) and any single quote (') is converted to a forward slash (/). If there was no leading plus sign, and the remainder of the encoded name was audio, automata, class, duplex, data, control, mobility, description, events, priority, methods, schemes, application, video, actor, isfocus, extensions, or text, the prefix sip. is added to the remainder of the encoded name to compute the feature tag name. As an example, the Accept-Contact header field

    Accept-Contact:*;mobility="fixed"
      ;events="!presence,message-summary"
      ;language="en,de";description="<PC>";+sip.newparam
      ;+rangeparam="#-4:+5.125"

would be converted to the following feature predicate:

    (& (sip.mobility=fixed)
       (| (! (sip.events=presence)) (sip.events=message-summary))
       (| (language=en) (language=de))
       (sip.description="PC")
       (sip.newparam=TRUE)
       (rangeparam=-4..5125/1000))

9.9.7 Header Field Definitions

RFC 3841 defines three new header fields (Accept-Contact, Reject-Contact, and Request-Disposition); the descriptions of these fields are provided in Section 2.8.2.

9.9.7.1 Request Disposition

The Request-Disposition header field specifies caller preferences for how a server should process a request. Its value is a list of tokens, each of which specifies a particular directive. Note that a compact form, using the letter d, has been defined. The directives are grouped into types. There can be only one directive of each type per request (e.g., you cannot have both proxy and redirect in the same Request-Disposition header field). When the caller specifies a directive, the server should honor that directive. The following types of directives are defined:

◾◾ Proxy-directive: This type of directive indicates whether the caller would like each server to proxy (proxy) or redirect (redirect).
◾◾ Cancel-directive: This type of directive indicates whether the caller would like each proxy server to send a CANCEL request downstream (cancel) in response to a 200 OK from the downstream server (which is the normal mode of operation, making it redundant), or whether this function should be left to the caller (no-cancel). If a proxy receives a request with this parameter set to no-cancel, it should not CANCEL any outstanding branches upon receipt of a 2xx. However, it would still send CANCEL on any outstanding branches upon receipt of a 6xx.
◾◾ Fork-directive: This type of directive indicates whether a proxy should fork a request (fork), or
The set of request disposition directives is not extensible on purpose. This is to avoid a proliferation of new extensions to SIP that are tunneled through this header field.

Despite the ABNF, there must not be more than one req-param or explicit-param in an ac-params. Furthermore, there can only be one instance of any feature tag in feature-param.
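The mapping from feature parameters to a feature set predicate described in Section 9.9.6 can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: the function names (decode_name, rational, filter_from_tag_value, term, predicate) and the input shape (a list of already-parsed name/value pairs with the surrounding quotes removed) are assumptions made for the example, and no ABNF validation is attempted.

```python
# Sketch of the Section 9.9.6 mapping; all helper names are hypothetical.
BASE_TAGS = {
    "audio", "automata", "class", "duplex", "data", "control", "mobility",
    "description", "events", "priority", "methods", "schemes", "application",
    "video", "actor", "isfocus", "extensions", "text",
}

def decode_name(enc):
    """Decode an enc-feature-tag back into an RFC 2506 feature tag name."""
    if enc.startswith("+"):
        name = enc[1:]                 # a leading plus sign is removed
    elif enc in BASE_TAGS:
        name = "sip." + enc            # base tags get the sip. prefix
    else:
        name = enc
    return name.replace("!", ":").replace("'", "/")

def rational(number):
    """Write a decimal number as I/10**N; integers are returned unchanged."""
    number = number.lstrip("+")
    if "." not in number:
        return number
    whole, frac = number.split(".")
    sign = "-" if whole.startswith("-") else ""
    digits = (whole.lstrip("-") + frac).lstrip("0") or "0"
    return f"{sign}{digits}/{10 ** len(frac)}"

def filter_from_tag_value(name, tag_value):
    """Build the filter for one tag-value of a tag-value-list."""
    if tag_value.startswith("!"):
        return f"(! {filter_from_tag_value(name, tag_value[1:])})"
    if tag_value.startswith("#"):
        rest = tag_value[1:]
        for op in (">=", "<=", "="):
            if rest.startswith(op):
                return f"({name}{op}{rational(rest[len(op):])})"
        # Otherwise a range: the colon becomes a double dot.
        return f"({name}={'..'.join(rational(p) for p in rest.split(':'))})"
    return f"({name}={tag_value})"

def term(enc_name, value):
    """Build one term of the conjunction from one feature-param."""
    name = decode_name(enc_name)
    if value is None:                  # no value is treated as TRUE
        return f"({name}=TRUE)"
    if value.startswith("<") and value.endswith(">"):
        return f'({name}="{value[1:-1]}")'   # string-value keeps its quotes
    filters = [filter_from_tag_value(name, tv) for tv in value.split(",")]
    return filters[0] if len(filters) == 1 else "(| " + " ".join(filters) + ")"

def predicate(params):
    """params: list of (encoded name, value or None) pairs."""
    return "(& " + " ".join(term(n, v) for n, v in params) + ")"

example = [
    ("mobility", "fixed"),
    ("events", "!presence,message-summary"),
    ("language", "en,de"),
    ("description", "<PC>"),
    ("+sip.newparam", None),
    ("+rangeparam", "#-4:+5.125"),
]
print(predicate(example))
```

Applied to the Accept-Contact example of Section 9.9.6, the sketch yields the same terms as the predicate shown there, written on a single line.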
on the information provided in the location object. The Geolocation-Error header field is used to convey location-specific errors within a response.

The Geolocation and Geolocation-Routing header fields can be included in the following SIP requests (see Section 15.3): INVITE, REGISTER, OPTIONS, BYE, UPDATE, INFO, MESSAGE, REFER, SUBSCRIBE, NOTIFY, and PUBLISH. In addition, the 424 Bad Location Information response code is a rejection of the request due to its location contents, indicating location information that was malformed or not satisfactory for the recipient’s purpose or could not be dereferenced. A more detailed description of these header fields and response codes is provided in the SIP message header section. The Location Object may appear in a MIME body attached to the SIP request, or it may be a remote resource in the network. The location information embedded in SIP message header fields can be conveyed end-to-end using SIP request messages, and the SIP functional entities can make routing decisions based on the location of the Location Target.

A Target is an entity whose location is being conveyed per RFC 3693. Thus, a Target could be a SIP UA, some other IP device (a router or a personal computer) that does not have a SIP stack, a non-IP device (a person or a black phone), or even a noncommunications device (a building or store front). In no way does this document assume that the SIP UA client that sends a request containing a location object is necessarily the Target. The location of a Target conveyed within SIP typically corresponds to that of a device controlled by the Target, for example, a mobile phone; however, such devices can be separated from their owners, and moreover, in some cases, the UA may not know its own location. In the SIP context, a location recipient will most likely be a SIP UA, but owing to the mediated nature of SIP architectures, location information conveyed by a single SIP request may have multiple recipients, as any SIP proxy server in the signaling path that inspects the location of the Target must also be considered a Location Recipient. In presence-like architectures, an intermediary that receives publications of location information and distributes them to watchers acts as a Location Server per RFC 3693. This location conveyance mechanism can also be used to deliver URIs pointing to such Location Servers where prospective Location Recipients can request Location Objects.

9.10.2 Basic SIP Location Conveyance Operations

We provide some examples that convey the basic idea of how SIP location conveyance operates. Figure 9.4a shows a SIP network consisting of a caller with SIP UA (U1), a callee with SIP UA (U2), a SIP intermediary, and a location server (LS). Each entity is aware of the SIP location capabilities as described earlier. Alice (U1) is the Target, and Bob (U2) is a location recipient (LR). A SIP intermediary may be a SIP proxy or even an entity with SIP B2BUA capability that may need to inspect and modify the SIP message body for location information. Any SIP entity that receives and inspects location information is an LR; therefore, a SIP intermediary that receives a SIP request is potentially an LR. This does not mean that such an intermediary necessarily has to route the SIP request based on the location information. If a SIP UA performs the location function itself, we say that it has LS capability. However, in some use cases, location information passes through the LS, a dedicated server that keeps the location information for the entire SIP network. The LS may even be managed by a third party.

9.10.2.1 Location Conveyed by Value

First, we take the example of location conveyance where Alice calls Bob directly and no other functional entities are involved. In Figure 9.4b, the call flows show that Alice is both the Target and the LS that is conveying her location directly to Bob, who acts as an LR. This conveyance is point-to-point: it does not pass through any SIP-layer intermediary. A Location Object appears by-value in the initial SIP request as a MIME body, and Bob responds to that SIP request as appropriate. There is a Bad Location Information response code introduced within this document to specifically inform Alice if she conveys bad location to Bob, for example, Bob “cannot parse the location provided,” or “there is not enough location information to determine where Alice is.”

9.10.2.2 Location Conveyed as a Location URI

Now we consider the slightly more complicated call flow shown in Figure 9.4c, a diagram of indirect location conveyance from Alice to Bob, where Bob’s entity has to retrieve the location object from a third-party server. Here, the location information is conveyed indirectly, via a location URI carried in the SIP request (more of those details in the following sections). If Alice sends Bob this location URI, Bob will need to dereference the URI, analogous to Content Indirection (RFC 4483, see Section 16.6), in order to request the location information. In general, the LS provides the location value to Bob, instead of Alice conveying it to Bob directly. From a user interface perspective, Bob the user would not know that this information was gathered from the LS indirectly rather than culled from the SIP request; practically, this does not affect the operation of location-based applications.

The example given in this section is only illustrative, not normative. In particular, applications can choose to dereference a location URI at any time, possibly several times, or potentially not at all. Applications receiving a location URI in a SIP transaction need to be mindful of timers used by different transactions. In particular, if dereferencing the location URI could take longer than the SIP transaction timeout (Timer C for INVITE transactions; see Section 3.12), the application must rely on mechanisms other than the transaction’s response code to convey location errors, when returning such errors is necessary.

Figure 9.4 Location conveyance in SIP: (a) SIP network and functional entities, (b) location conveyed by value, (c) location conveyed by URI, (d) location conveyed through a SIP intermediary, and (e) SIP intermediary replacing bad location. (Copyright IETF. Reproduced with permission.)

9.10.2.3 Location Conveyed through a SIP Intermediary

In Figure 9.4d, we introduce the idea of a SIP intermediary into the example to illustrate the role of proxying in the location architecture. This intermediary can be a SIP proxy or it can be a B2BUA. In this message flow, the SIP intermediary could act as an LR, in addition to Bob. The primary use for intermediaries consuming location information is location-based routing. In this case, the intermediary chooses a next hop for the SIP request by consulting a specialized location service that selects forwarding destinations based on the geographical location information contained in the SIP request.

However, the most common case will be one in which the SIP intermediary receives a request with location information (conveyed either by-value or by-reference) and does not know or care about Alice’s location, or support this extension, and merely passes it on to Bob. In this case, the intermediary does not act as a Location Recipient. When the intermediary is not an LR, this use case is the same as the one described earlier for location conveyed by value.

Note that an intermediary does not have to perform location-based routing in order to be a location recipient. It could be the case that a SIP intermediary that does not perform location-based routing does care when Alice includes
her location; for example, it could care that the location information is complete or that it correctly identifies where Alice is. The best example of this is intermediaries that verify location information for emergency calling, but it could also be for any location-based routing, for example, contacting your favorite local pizza delivery service, making sure that organization has Alice’s proper location in the initial SIP request. There is another scenario in which the SIP intermediary cares about location and is not an LR, one in which the intermediary inserts another location of the Target, Alice in this case, into the request, and forwards it. This secondary insertion is generally not advisable because downstream SIP entities will not be given any guidance about which location to believe is better, more reliable, less prone to error, more granular, worse than the other location, or just plain wrong. This example takes a “you break it, you buy it” approach to dealing with second locations placed into a SIP request by an intermediary entity. That entity becomes completely responsible for all location within that SIP request.

9.10.2.4 SIP Intermediary Replacing Bad Location

If the SIP intermediary rejects the message due to unsuitable location information, the SIP response will indicate there was Bad Location Information in the SIP request and provide a location-specific error code indicating what Alice needs to do to send an acceptable request as shown in Figure 9.4e. In this last use case, the SIP intermediary wishes to include an LO indicating where it understands Alice to be. Thus, it needs to inform her UA of what location it will include in any subsequent SIP request that contains her location. In this case, the intermediary can reject Alice’s request and, through the SIP response, convey to her the best way to repair the request in order for the intermediary to accept it. Overriding location information provided by the user requires a deployment where an intermediary necessarily knows better than an end user; after all, it could be that Alice has an on-board GPS, and the SIP intermediary only knows her nearest cell tower. Which is more accurate location information? Currently, there is no way to tell which entity is more accurate or which is wrong, for that matter. This shows the limitation of the location service that does not specify how to indicate which location is more accurate than another.

As an aside, it is not envisioned that any SIP-based emergency services request (i.e., IP-911 or 112 type of call attempt) will receive a corrective Bad Location Information response from an intermediary. Most likely, in that scenario, the SIP intermediary would act as a B2BUA and insert into the request by-value any appropriate location information for the benefit of Public Safety Answering Point (PSAP) call centers to expedite call reception by the emergency services personnel, thereby minimizing any delay in call establishment time. The implementation of these specialized deployments is, however, outside the scope of this document.

9.10.2.5 Location URIs in Message Bodies

In the case where an LR sends a 424 Bad Location Information response and wishes to communicate suitable location-by-reference rather than location-by-value, the 424 response must include a content-indirection body per RFC 4483.

9.10.2.6 Location Profile Negotiation

Figure 9.4c introduces the concept of sending location indirectly. If a location URI is included in a SIP request, the sending UA must also include a Supported header field indicating which location profiles it supports. Two option tags for location profiles are defined by this document: geolocation-sip and geolocation-http. Future specifications may define further location profiles per the IANA policy.

The geolocation-sip option tag signals support for acquiring location information via the presence event package of SIP defined in RFC 3856. A location recipient who supports this option can send a SUBSCRIBE request and parse a resulting NOTIFY containing a PIDF-LO object. The URI schemes supported by this option include sip, sips, and pres.

The geolocation-http option tag signals support for acquiring location information via HTTP defined in RFCs 7230–7235. A location recipient who supports this option can request location with an HTTP GET and parse a resulting 200 response containing a Presence Information Data Format-Location Object (PIDF-LO) defined in RFC 4119 (see Section 2.8). The URI schemes supported by this option include http and https. A failure to parse the 200 response, for whatever reason, will return a Dereference Failure indication to the original location sending UA to inform it that location was not delivered as intended. If the location URI receiver does not understand the URI scheme sent to it, it will return an Unsupported header value of the option tag from the SIP request, and include the option tag of the preferred URI scheme in the response’s Supported header field.

9.10.3 Geolocation Examples

9.10.3.1 Location-by-Value (in Coordinate Format)

This example shows an INVITE message with a coordinate location. In this example, the SIP request uses a sips-URI defined in RFC 3261 (see Section 4.2), meaning that this message is protected using TLS on a hop-by-hop basis.
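Returning to the location profile negotiation of Section 9.10.2.6, the choice a location recipient faces when handed a location URI can be sketched as a small dispatch on the URI scheme. The table below simply restates the schemes listed in that section; the function names and the returned tuples are hypothetical, chosen only for illustration, and the sketch does not perform any SIP or HTTP signaling.

```python
# Option tags and URI schemes as listed in Section 9.10.2.6.
PROFILE_SCHEMES = {
    "geolocation-sip": {"sip", "sips", "pres"},
    "geolocation-http": {"http", "https"},
}

def profile_for(location_uri):
    """Return the option tag whose schemes cover this URI, or None."""
    scheme = location_uri.split(":", 1)[0].lower()
    for tag, schemes in PROFILE_SCHEMES.items():
        if scheme in schemes:
            return tag
    return None

def plan_dereference(location_uri, supported_profiles):
    """Decide how a recipient would fetch the location (hypothetical helper).

    Returns (action, option_tag). The action is "SUBSCRIBE" for the presence
    event package, "HTTP GET" for the HTTP profile, and "unsupported" when the
    recipient does not understand the scheme or lacks the profile.
    """
    tag = profile_for(location_uri)
    if tag is None or tag not in supported_profiles:
        return ("unsupported", tag)
    action = "SUBSCRIBE" if tag == "geolocation-sip" else "HTTP GET"
    return (action, tag)

# A recipient that implements only the HTTP profile:
print(plan_dereference("https://ls.example.com/loc/alice", {"geolocation-http"}))
print(plan_dereference("pres:alice@example.com", {"geolocation-http"}))
```

In the "unsupported" case, the recipient would respond as the section describes, naming the unsupported option tag and advertising the profile it prefers.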
forwarded by this element before. The request has either looped or is legitimately spiraling through the element. To determine if the request has looped, the element may perform the branch parameter calculation described in RFC 3261 (see Section 3.11.6) on this message and compare it to the parameter received in that Via header field. If the parameters match, the request has looped. If they differ, the request is spiraling, and processing continues. If a loop is detected, the element may return a 482 Loop Detected response. However, this loop-detection procedure is highly insufficient to detect loops of SIP messages. RFC 5393 shows how this vulnerability enables an attack against SIP networks where a small number of legitimate, even authorized, SIP requests can stimulate massive amounts of proxy-to-proxy traffic.

RFC 5393, described here in the context of loop detection, shows that, by taking advantage of the forking capability, requests will continue to propagate down this tree until Max-Forwards reaches zero, creating a storm (see Section 19.9) of 408 Request Timeout responses and/or a storm of CANCEL requests propagating through the tree along with the INVITE requests. RFC 5393 strengthens loop-detection requirements on SIP proxies when they fork requests (i.e., forward a request to more than one destination), and corrects and clarifies the description of the loop-detection algorithm such proxies are required to implement. Additionally, it defines a Max-Breadth header field for limiting the number of concurrent branches pursued for any given request. If there is insufficient value set in the Max-Breadth header field to carry out a desired parallel forking, a proxy sends the 440 Max-Breadth Exceeded response code. The Max-Breadth mechanism only limits concurrency. It does not limit the total number of branches a request can traverse over its lifetime. However, an attacker with access to a sufficient number of distinct resources will still be able to stimulate a very large number of messages. The number of concurrent messages will be limited by the Max-Breadth mechanism, so the entire set will be spread out over a long period of time, giving operators better opportunity to detect the attack and take corrective measures outside the scope of RFC 5393 recommendations. More work is needed to prevent this form of attack.

9.11.1 Enhancements in Loop-Detection Algorithm

9.11.1.1 Treatment of Via Header Field

The proxy must insert a Via header field value into the copy before the existing Via header field values using the procedure described in RFC 3261 (see Section 2.8). The Via header field indicates the transport used for the transaction and identifies the location where the response is to be sent. A Via header field value is added only after the transport that will be used to reach the next hop has been selected, which may involve the usage of the procedures in RFC 3263 (see Section 8.2.4). When the UAC creates a request, it must insert a Via into that request. The protocol name and protocol version in the header field must be SIP and 2.0, respectively. The Via header field value must contain a branch parameter. This parameter is used to identify the transaction created by that request. This parameter is used by both the client and the server. The branch parameter value must be unique across space and time for all requests sent by the UA. The exceptions to this rule are CANCEL and ACK for non-2xx responses. As discussed below, a CANCEL request will have the same value of the branch parameter as the request it cancels. As discussed in RFC 3261 (see Section 3.12.1.1.3), an ACK for a non-2xx response will also have the same branch ID as the INVITE whose response it acknowledges.

The uniqueness property of the branch ID parameter, to facilitate its use as a transaction ID, was not part of RFC 2543 obsoleted by RFC 3261. The branch ID inserted by an element compliant with this specification must always begin with the characters z9hG4bK. These seven characters are used as a magic cookie (seven is deemed sufficient to ensure that an older RFC 2543 implementation would not pick such a value), so that servers receiving the request can determine that the branch ID was constructed in the fashion described by this specification (i.e., globally unique). Beyond this requirement, the precise format of the branch token is implementation defined. The Via header maddr, ttl, and sent-by components will be set when the request is processed by the transport layer (RFC 3261, see Section 3.13).

Via processing for proxies is described in RFC 3261 (see Sections 3.11.6 and 3.11.7). This implies that the proxy will compute its own branch parameter, which will be globally unique for that branch, and will contain the requisite magic cookie. Note that following only the guidelines in RFC 3261 (see Section 3.1.2.1.7) will result in a branch parameter that will be different for different instances of a spiraled or looped request through a proxy.

However, proxies required to perform loop detection by RFC 5393 have an additional constraint on the value they place in the Via header field. Such proxies should create a branch value separable into two parts in any implementation-dependent way. The existence of these two parts is a requirement of the loop-detection procedure. If a proxy chooses to employ some other mechanism, it is the implementer’s responsibility to verify that the detection properties defined by the requirements placed on these two parts are achieved. The first part of the branch value must satisfy the constraints of RFC 3261 (see Section 3.1.2.1.7). The second part is used to perform loop detection and distinguish loops from spirals. This second part must vary with
any field used by the location service logic in determining where to retarget or forward this request. This is necessary to distinguish looped requests from spirals by allowing the proxy to recognize if none of the values affecting the processing of the request have changed. Hence, the second part must depend at least on the received Request-URI and any Route header field values used when processing the received request. Implementers need to take care to include all fields used by the location service logic in that particular implementation. This second part must not vary with the request method. CANCEL and non-200 ACK requests must have the same branch parameter value as the corresponding request they cancel or acknowledge. This branch parameter value is used in correlating those requests at the server handling them as described in RFC 3261 (see Sections 3.12.2.3 and 3.2).

9.11.1.2 Updates in Loop-Detection Algorithm

RFC 5393 replaces all of item 4 in Section 16.3 of RFC 3261 (see Section 3.11.3.4) and mandates that proxies required to perform loop detection must perform the following loop-detection test before forwarding a request. Each Via header field value in the request whose sent-by value matches a value placed into previous requests by this proxy must be inspected for the second part as defined earlier (Section 4.2.1 of RFC 5393). This second part will not be present if the message was not forked when that Via header field value was added. If the second part is present, the proxy must perform the second-part calculation described earlier (Section 4.2.1 of RFC 5393) on this request and compare the result to the value from the Via header field. If these values are equal, the request has looped and the proxy must reject the request with a 482 Loop Detected response. If the values differ, the request is spiraling and processing continues to the next step.

9.11.1.2.1 Impact of Loop Detection on Overall Network Performance

These requirements and the recommendation to use the loop-detection mechanisms in this document make the favorable trade of exponential message growth for work that is, at worst, order n^2 as a message crosses n proxies. Specifically, this work is order m*n, where m is the number of proxies in the path that fork the request to more than one location. In practice, m is expected to be small. The loop-detection algorithm expressed here per RFC 5393 requires a proxy to inspect each Via element in a received request. In the worst case, where a message crosses N proxies, each of which performs loop detection, proxy k does k inspections, and the overall number of inspections spread across the proxies handling this request is the sum of k from k = 1 to k = N, which is N(N + 1)/2.

9.11.1.3 Note to Implementers

A common way to create the second part of the branch parameter value when forking a request is to compute a hash over the concatenation of the Request-URI, any Route header field values used during processing the request, and any other values used by the location service logic while processing this request. The hash should be chosen so that there is a low probability that two distinct sets of these parameters will collide. Because the maximum number of inputs that need to be compared is 70, the chance of a collision is low even with a relatively small hash value, such as 32 bits. CRC-32c as specified in RFC 4960 is a specific acceptable function, as is MD5 of RFC 1321. Note that MD5 is being chosen purely for noncryptographic properties. An attacker who can control the inputs in order to produce a hash collision can attack the connection in a variety of other ways. When forming the second part using a hash, implementations should include at least one field in the input to the hash that varies between different transactions attempting to reach the same destination to avoid repeated failure should the hash collide.

The Call-ID and CSeq fields would be good inputs for this purpose. A common point of failure to interoperate has been due to parsers objecting to the contents of another element’s Via header field values when inspecting the Via stack for loops. Implementers need to take care to avoid making assumptions about the format of another element’s Via header field value beyond the basic constraints placed on that format by RFC 3261 (see Section 2.8). In particular, parsing a header field value with unknown parameter names, parameters with no values, or parameter values with or without quoted strings must not cause an implementation to fail. Removing, obfuscating, or in any other way modifying the branch parameter values in Via header fields in a received request before forwarding it removes the ability for the node that placed that branch parameter into the message to perform loop detection. If two elements in a loop modify branch parameters this way, a loop can never be detected.

9.11.2 Max-Breadth Header Field

The Max-Breadth mechanism defined here limits the total number of concurrent branches caused by a forked SIP request. With this mechanism, all proxyable requests are assigned a positive integral Max-Breadth value, which denotes the maximum number of concurrent branches this request may spawn through parallel forking as it is forwarded from its current point. When a proxy forwards a request, its Max-Breadth value is divided among the outgoing requests. In turn, each of the forwarded requests has a limit on how many concurrent branches it may spawn. As branches complete, their portion of the Max-Breadth value becomes available for subsequent branches, if needed. If there is insufficient
Max-Breadth to carry out a desired parallel fork, a proxy can return the 440 Max-Breadth Exceeded response defined in this document. This mechanism operates independently from Max-Forwards. Max-Forwards limits the depth of the tree a request may traverse as it is forwarded from its origination point to each destination it is forked to. As discussed earlier, the number of branches in a tree of even limited depth can be made large (exponential with depth) by leveraging forking.

Each such branch has a pair of SIP transaction state machines associated with it. The Max-Breadth mechanism limits the number of branches that are active (those that have running transaction state machines) at any given point in time. Max-Breadth does not prevent forking. It only limits the number of concurrent parallel forked branches. In particular, a Max-Breadth of 1 restricts a request to pure serial forking rather than restricting it from being forked at all. A client receiving a 440 Max-Breadth Exceeded response can infer that its request did not reach all possible destinations. Recovery options are similar to those when receiving a 483 Too Many Hops response, and include affecting the routing decisions through whatever mechanisms are appropriate to result in a less broad search, or refining the request itself before submission to make the search space smaller. Figure 9.5 adopted from RFC 5393 depicts an example of how the combination of Max-Breadth and Max-Forwards is working in view of different scenarios of forking: Parallel, Sequential, and None.

the Incoming Max-Breadth in a given response context. If a SIP proxy determines a response context has insufficient Incoming Max-Breadth to carry out a desired parallel fork, and the proxy is unwilling/unable to compensate by forking serially or sending a redirect, that proxy must return a 440 Max-Breadth Exceeded response.

Notice that these requirements mean a proxy receiving a request with a Max-Breadth of 1 can only fork serially, but it is not required to fork at all; it can return a 440 instead. Thus, this mechanism is not a tool a UA can use to force all proxies in the path of a request to fork serially. A SIP proxy may distribute Max-Breadth in an arbitrary fashion between active branches. A proxy should not use a smaller amount of Max-Breadth than was present in the original request unless the Incoming Max-Breadth exceeded the proxy’s maximum acceptable value. A proxy must not decrement Max-Breadth for each hop or otherwise use it to restrict the depth of a request’s propagation.

9.11.2.3 Reusing Max-Breadth

Because forwarded requests that have received a final response do not count toward the Outgoing Max-Breadth, whenever a final response arrives, the Max-Breadth that was used on that branch becomes available for reuse. Proxies should be prepared to reuse this Max-Breadth in cases where there may be elements left in the target set.
[Figure 9.5 graphic not reproduced. Recoverable labels: panel (c) shows UAC, Proxy A, Proxy B, and Proxy C, with an INVITE carrying MB: 60, MF: 70 forwarded as MB: 60, MF: 69; panel (d) shows a tree of proxies (P) that starts at MB: 4, MF: 5, splits into two branches with MB: 2, MF: 4, then into four branches with MB: 1, MF: 3, and continues on each branch with MB: 1 at MF: 2 and MF: 1.]
Figure 9.5 Max-Breadth and Max-Forwards example: (a) parallel forking, (b) sequential forking, (c) no forking, and
(d) combined working of Max-Breadth and Max-Forwards. (Copyright IETF. Reproduced with permission.)
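The Max-Breadth accounting described above (active branches count against the Incoming Max-Breadth, and a final response frees that share for reuse) can be illustrated with a minimal Python sketch. It is not taken from RFC 5393: the names `ResponseContext`, `fork`, and `MaxBreadthExceeded` are hypothetical, every branch is simply given a Max-Breadth of 1, and a real proxy would also handle timers, CANCEL, and response aggregation.

```python
from dataclasses import dataclass, field


class MaxBreadthExceeded(Exception):
    """Hypothetical stand-in for answering with 440 Max-Breadth Exceeded."""


@dataclass
class ResponseContext:
    incoming_max_breadth: int
    # branch id -> Max-Breadth placed in the request forwarded on that branch
    active: dict = field(default_factory=dict)

    @property
    def outgoing_max_breadth(self) -> int:
        # Only branches still waiting for a final response are counted.
        return sum(self.active.values())

    def forward(self, branch: str, breadth: int = 1) -> None:
        if breadth < 1:
            raise ValueError("a forwarded request needs a Max-Breadth of at least 1")
        if self.outgoing_max_breadth + breadth > self.incoming_max_breadth:
            raise MaxBreadthExceeded(branch)
        self.active[branch] = breadth

    def final_response(self, branch: str) -> int:
        # pop() with a default keeps this idempotent: a second final response
        # on the same branch releases nothing, so breadth is not subtracted twice.
        return self.active.pop(branch, 0)


def fork(ctx: ResponseContext, targets: list) -> list:
    """Forward to as many targets as the budget allows; return those left over.

    Leaving targets pending models the proxy compensating by forking serially
    instead of returning a 440.
    """
    pending = list(targets)
    while pending and ctx.outgoing_max_breadth < ctx.incoming_max_breadth:
        ctx.forward(pending.pop(0))
    return pending
```

With an Incoming Max-Breadth of 4 and eight targets, `fork` forwards to four targets and leaves four pending; after `final_response` on one branch, calling `fork` again with the pending list forwards to one more target. With an Incoming Max-Breadth of 1, the same loop degenerates into pure serial forking.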
may receive multiple 2xx responses for a single forwarded INVITE request. Also, implementations following RFC 2543 (obsoleted by RFC 3261) may send back a 6xx followed by a 2xx on the same branch. Implementations that subtract from the Outgoing Max-Breadth when they receive a 2xx response to an INVITE request must be careful to avoid bugs caused by subtracting multiple times for a single branch.
9.11.2.5.3 Max-Breadth and Automaton UAs
Designers of automaton UAs, including B2BUAs, gateways, exploders, and any other element that programmatically sends requests as a result of incoming SIP traffic, should consider whether Max-Breadth limitations should be placed on outgoing requests. For example, it is reasonable to design B2BUAs to carry the Max-Breadth value from incoming
requests into requests that are sent as a result. Also, it is reasonable to place Max-Breadth constraints on sets of requests sent by exploders when they may be leveraged in an amplification attack.
9.11.2.6 Parallel and Sequential Forking
Inherent in the definition of this mechanism is the ability of a proxy to reclaim apportioned Max-Breadth while forking sequentially. The limitation on Outgoing Max-Breadth is applied to concurrent branches only. For example, if a proxy receives a request with a Max-Breadth of 4 and has eight targets to forward it to, that proxy may parallel fork to four of these targets initially (each with a Max-Breadth of 1, totaling an Outgoing Max-Breadth of 4). If one of these transactions completes with a failure response, the Outgoing Max-Breadth drops to 3, allowing the proxy to forward to one of the four remaining targets (again, with a Max-Breadth of 1).
9.11.2.7 Max-Breadth Split Weight Selection
There are a variety of mechanisms for controlling the weight of each fork branch. Fork branches that are given more Max-Breadth are more likely to complete quickly (because it is less likely that a proxy down the line will be forced to fork sequentially). By the same token, if it is known that a given branch will not fork later on, a Max-Breadth of 1 may be assigned with no ill effect. This would be appropriate, for example, if a proxy knows the branch is using the UA-initiated connection management over the SIP network defined in RFC 5626 (see Section 15.2).
9.12 Summary
We have described how routing is done in the SIP network, including the use of the Route, Record-Route, and double Record-Route header fields. The registration and routing schemes used by different SIP functional entities over the SIP network are explained. However, some interoperability issues arise when Record-Route rewriting and transport switching are performed. The recommendations for avoiding these interoperability issues, as discussed earlier, are summarized as follows:
◾◾ Record-Route rewriting is presented as a technique that may be used; however, its drawbacks need to be noted as explained.
◾◾ The double Record-Routing technique described earlier should be used.
◾◾ The Record-Route header interoperability problems on transport protocol switching outlined here should be avoided using the recommended UA and proxy implementations described here. Proxies should use double Record-Routing for any multihomed situation that may affect the further processing, and they should put transport protocol parameters on Record-Route URIs in some circumstances. UAs should not offer options to overwrite the transport for initial requests. UAs should rely on DNS to express their desired transport and should avoid IP addresses with transport parameters in this case. Lastly, UAs should be ready to switch transports between the initial request and further in-dialog messages.
PROBLEMS
1. How does a SIP registrar act as the routing database? Explain in detail with examples.
2. How does a SIP proxy route incoming SIP messages? Explain in detail with examples.
3. How does a strict-routing SIP proxy handle incoming SIP messages? Explain in detail with examples.
4. What are the procedures for rewriting SIP Record-Route header field values? What are the problems in doing so for routing? What are the recommended solutions for these problems?
5. Explain the Record-Routing with GRUU using detailed examples. Explain the double Record-Route with examples.
6. What are the problems for switching the transport protocol, for example, switching from TCP to UDP? How are these problems solved by the SIP entities: UA and proxy?
7. What is caller preference-based routing in SIP? Explain with examples. How does a UAC handle this routing scheme for the following preferences: request handling and feature set? Explain the behavior of UAS with examples.
8. How does a SIP proxy behave for the caller preference-based routing scheme in processing the Request-Disposition? Explain with examples.
9. How does a SIP proxy behave for the caller preference-based routing scheme in matching preference and capability of the following: extracting explicit and implicit preferences and constructing contact predicates? How are the feature parameters mapped to a predicate? Explain all of these items with examples.
10. What is the location-based routing in SIP? How are the basic SIP location conveyance operations performed for the following: location conveyed by value, location URI, through a SIP intermediary, and location URIs in message bodies? How does a SIP intermediary replace a bad location?
11. How is a SIP location profile negotiated? Explain the SIP location-based routing using geolocation examples
as follows: location-by-value (in coordinate format) and two locations composed in the same location object.
12. How is the geopriv privacy maintained in SIP location-based routing? Describe the overall security issues and their solutions in the SIP location-based routing.
13. How does the looping occur in routing SIP messages? Explain the loop-detection algorithm that is used in SIP. What is the impact in overall performance of the loop-detection over the SIP network?
14. How is the SIP Max-Breadth header field used in avoiding the routing loop over the SIP network? How is the Max-Breadth header field reused? Explain the proxy and UA behavior in using the Max-Breadth header field in avoiding the routing loop over the SIP network.
15. Explain the parallel and sequential forking for avoiding the route loop in using the Max-Breadth header field over the SIP network. Explain the same in using the Max-Breadth header field with split weight selection.
16. Explain in detail the Max-Breadth header field’s effect on the forking-based amplification attacks. Recommend remedies against these attacks.
Chapter 10
User and Network-Asserted Identity in SIP
10.1 Introduction
The user identity in Session Initiation Protocol (SIP) usually describes a unique public identity of a user that is shared with others so that they can contact the user. The user can be a single user, a group of users, a service, or a device. In some cases, the user identity in SIP can be a private identity that is only shared within a certain closed group without making it public, such as the user’s identity for access control, billing, and charging by a service provider. As explained earlier in the context of the user’s address registration, the user identity in SIP is sometimes called address of record (AOR), which usually looks like sip:username@domain (e.g., sip:[email protected]). The SIP user agent (UA) registers the user’s SIP identity with the SIP registrar server. The SIP server binds the user’s SIP identity to the network address of the user’s SIP UA (its contact address). This lets others who know the user’s AOR call without knowing the SIP UA’s network address.
10.2 Multiple User Identities
A user may have multiple devices, especially if the user is mobile, where the user can be reached. For example, a service provider, as Third Generation Partnership Project’s (3GPP’s) Internet Protocol (IP) multimedia subsystem (IMS) specification allows, may permit a user to have several SIP user identities (SUIs). Different SUIs may serve as aliases for a particular user. These SUIs are totally interchangeable because they are associated with the same set of services and are transparently associated with the same devices. The user may use them to differentiate between different groups of contacts for different purposes, such as relatives versus friends versus strangers. A user may typically have a telephone Uniform Resource Identifier (tel URI) as an alias to a SIP URI. Different indoor units (IDUs) may also permit the user to endorse several personas (e.g., member
has finally been established can be signed by the authentication service of the retargeted caller’s domain, although the URI of the To header field of the SIP request message is different because the callee is located in the retargeted domain. This type of call establishment with signed Identity header for the retargeted callee is defined as the connected identity.
10.4.4 Recommended Use of Asserted Identity with SIP Messages
SIP has a mechanism for conveying the identity of the originator of a request by means of the P-Asserted-Identity and P-Preferred-Identity header fields. These header fields are specified for use in requests using a number of SIP methods, in particular the INVITE method. However, RFC 3325 (see Sections 10.4.1 and 10.4.2) does not specify the insertion of the P-Asserted-Identity header field by a trusted UA client (UAC); does not specify the use of P-Asserted-Identity and P-Preferred-Identity header fields with certain SIP methods such as UPDATE, REGISTER, MESSAGE, and PUBLISH; and does not specify how to handle an unexpected number of URIs or unexpected URI schemes in these header fields. RFC 5876, described here, extends RFC 3325 (see Section 10.4) to cover these situations by allowing inclusion of the P-Asserted-Identity header field by a UAC in the same Trust Domain as the first proxy, and allowing use of P-Asserted-Identity and P-Preferred-Identity header fields in any request except ACK and CANCEL. The reason for these two exceptions is that ACK and CANCEL requests cannot be challenged for digest authentication. RFC 3325 (see Sections 10.4.1 and 10.4.2) allows the P-Asserted-Identity and P-Preferred-Identity header fields each to contain at most two URIs, where one is a SIP or SIPS URI (RFC 3261, see Section 4.2.1) and the other is a tel URI (RFC 3966, see Section 4.2.2).
This may be unduly restrictive in the future, for example, if there is a need to allow other URI schemes, if there is a need to allow both a SIP and a SIPS URI, or if there is a need to allow more than one URI with the same scheme (e.g., a SIP URI based on a telephone number and a SIP URI that is not based on a telephone number). This specification (RFC 5876) therefore provides forwards compatibility by mandating tolerance to the receipt of unexpected URIs. RFC 3325 (see Sections 10.4.1 and 10.4.2) is unclear on the use of P-Asserted-Identity in responses. In contrast to requests, there is no means in SIP to challenge a UA to provide SIP digest authentication in a response. As a result, there is currently no standardized mechanism whereby a proxy can authenticate a UA server (UAS). Since authenticating the source of a message is a prerequisite for asserting an identity, this specification (RFC 5876) does not specify the use of the P-Asserted-Identity header field in responses. This may be the subject of a future update to RFC 3325 (see Sections 10.4.1 and 10.4.2). Also, RFC 5876 does not specify the use of the P-Preferred-Identity header field in responses, as this would serve no purpose in the absence of the ability for a proxy to insert the P-Asserted-Identity header field. The security issues are described in Section 19.2.2.
10.4.4.1 Inclusion of P-Asserted-Identity by a UAC
RFC 3325 (see Sections 10.4.1 and 10.4.2) does not include procedures for a UAC to include the P-Asserted-Identity header field in a request. This can be meaningful if the UAC is in the same Trust Domain as the first downstream SIP entity. Examples of types of UACs that are often suitable for inclusion in a Trust Domain are as follows: public switched telephone network (PSTN) gateways, media servers, application servers (or B2BUAs) that act as URI list servers (RFC 5363, see Section 19.6), and application servers (or B2BUAs) that perform third-party call control. In the particular case of a PSTN gateway, the PSTN gateway might be able to assert an identity received from the PSTN, the proxy itself having no means to authenticate such an identity. Likewise, in the case of certain application server or B2BUA arrangements, the application server or B2BUA may be in a position to assert an identity of a user on the other side of that application server or B2BUA. In accordance with RFC 3325 (see Sections 10.4.1 and 10.4.2), nodes within a Trust Domain (see Section 10.5) must behave in accordance with a Spec(T), and this principle needs to be applied between a UAC and its proxy as part of the condition to consider the UAC to be within the same Trust Domain. The normal proxy procedures of RFC 3325 ensure that the header field is removed or replaced if the first proxy considers the UAC to be outside the Trust Domain. This update to RFC 3325 clarifies that a UAC may include a P-Asserted-Identity header field in a request in certain circumstances.
10.4.4.2 Inclusion of P-Asserted-Identity in Any Request
There are several use cases that would benefit from the use of the P-Asserted-Identity header field in an UPDATE request. These use cases apply within a Trust Domain where the use of asserted identity is appropriate (see RFC 3325). In one example, an established call passes through a gateway to the PSTN. The gateway becomes aware that the remote party in the PSTN has changed, for example, due to call transfer. By including the P-Asserted-Identity header field in an UPDATE request, the gateway can convey the identity of the new remote party to the peer SIP UA. Note that the (re-)INVITE method could be used in this situation. However, this forces an offer–answer exchange, which typically is not required in this situation. Also, it involves three messages
rather than two. In another example, a B2BUA that provides 3PCC (RFC 3725, see Section 18.3) wishes to join two calls together, one of which is still waiting to be answered and potentially is forked to different UAs. At this point in time, it is not possible to trigger the normal offer–answer exchange between the two joined parties, because of the mismatch between a single dialog on the one side and potentially multiple early dialogs on the other side, so this action must wait until one of the called UAs answers.
However, it would be useful to give an early indication to each user concerned of the identity of the user to which they will become connected when the call is answered. In other words, it would provide the new calling UA with the identity of the new called user and provide the new called UA(s) with the identity of the new calling user. This can be achieved by the B2BUA sending an UPDATE request with a P-Asserted-Identity header field on the dialogs concerned. Within a Trust Domain, a P-Asserted-Identity header field could advantageously be used in a REGISTER request between an edge proxy that has authenticated the source of the request and the registrar. Within a Trust Domain, a P-Asserted-Identity header field could advantageously be used in a MESSAGE request to assert the source of a page-mode instant message. This would complement its use in an INVITE request to assert the source of an instant-message session or any other form of session. Similarly, between a UAC and first proxy that are not within the same Trust Domain, a P-Preferred-Identity header field could be used in a MESSAGE request to express a preference when the user has several identities. Within a Trust Domain, a P-Asserted-Identity header field could advantageously be used in a PUBLISH request to assert the source of published state information. This would complement its use in SUBSCRIBE and NOTIFY requests. Similarly, between a UAC and first proxy that are not within the same Trust Domain, a P-Preferred-Identity header field could be used in a PUBLISH request to express a preference when the user has several identities.
Thus, there are several examples where P-Asserted-Identity could be used in requests with methods for which there is no provision in RFC 3325 (see Sections 10.4.1 and 10.4.2). This leaves a few methods for which use cases are less obvious, but the inclusion of P-Asserted-Identity would not cause any harm. In any request, the header field would simply assert the source of that request, whether or not this is of any use to the UAS. Inclusion of P-Asserted-Identity in a request requires that the original asserter of an identity be able to authenticate the source of the request. This implies the ability to challenge a request for SIP digest authentication, which is not possible with ACK and CANCEL requests. Therefore, ACK and CANCEL requests need to be excluded. Similarly, there are examples where P-Preferred-Identity could be used in requests with methods for which there is no provision in RFC 3325 or any other RFC (with the exception of ACK and CANCEL). This update to RFC 3325 allows a P-Asserted-Identity or P-Preferred-Identity header field to be included in any request except ACK and CANCEL.
10.4.4.3 Dialog Implications
A P-Asserted-Identity header field in a received request asserts the identity of the source of that request and says nothing about the source of subsequent received requests claiming to relate to the same dialog. The recipient can make its own deductions about the source of subsequent requests not containing a P-Asserted-Identity header field. RFC 5876 does not change RFC 3325 in this respect.
10.4.4.4 SIP Entity Behavior
RFC 5876 updates RFC 3325 (see Sections 10.4.1 and 10.4.2) by allowing a P-Asserted-Identity header field to be included by a UAC within the same Trust Domain and by allowing a P-Asserted-Identity or P-Preferred-Identity header field to appear in any request except ACK or CANCEL.
10.4.4.4.1 UAC Behavior
A UAC may include a P-Asserted-Identity header field in any request except ACK and CANCEL to report the identity of the user on behalf of which the UAC is acting and whose identity the UAC is in a position to assert. A UAC should do so only in cases where it believes it is in the same Trust Domain as the SIP entity to which it sends the request and where it is connected to that SIP entity in accordance with the security requirements of RFC 3325. A UAC should not do so in other circumstances and might instead use the P-Preferred-Identity header field. A UAC must not include both header fields. A UAC may include a P-Preferred-Identity header field in any request except ACK or CANCEL. Inclusion of a P-Asserted-Identity or P-Preferred-Identity header field in a request is not limited to the methods allowed in RFC 3325.
10.4.4.4.2 Proxy Behavior
If a proxy receives a request containing a P-Asserted-Identity header field from a UAC within the Trust Domain, it must behave as it would for a request from any other node within the Trust Domain, in accordance with the rules of RFC 3325 for a proxy. Note that this implies that the proxy must have authenticated the sender of the request in accordance with the Spec(T) (see Section 10.5) in force for the Trust Domain, and determined that the sender is indeed part of the Trust Domain. If a proxy receives a request (other than ACK or CANCEL) containing a P-Asserted-Identity or P-Preferred-Identity header field, it must behave in accordance with the
rules of RFC 3325 for a proxy, even if the method is not one for which RFC 3325 specifies the use of that header field.
10.4.4.4.3 Registrar Behavior
If a registrar receives a REGISTER request containing a P-Asserted-Identity header field, it must disregard the asserted identity unless it is received from a node within the Trust Domain. If the node is within the Trust Domain (the node having been authenticated by some means), the registrar may use this as evidence that the registering UA has been authenticated and is represented by the identity asserted in the header field.
10.4.4.4.4 UAS Behavior
If a UAS receives any request (other than ACK or CANCEL) containing a P-Asserted-Identity header field, it must behave in accordance with the rules of RFC 3325 for a UAS, even if the method is not one for which RFC 3325 specifies the use of that header field.
10.4.4.4.5 General Handling
An entity receiving a P-Asserted-Identity or P-Preferred-Identity header field can expect the number of URIs and the combination of URI schemes in the header field to be in accordance with RFC 3325, any updates to RFC 3325, or any Spec(T) (see Section 10.5) that states otherwise. If an entity receives a request containing a P-Asserted-Identity or P-Preferred-Identity header field containing an unexpected number of URIs or unexpected URI schemes, it must act as follows: ignore any URI with an unexpected URI scheme; ignore any URI for which the expected maximum number of URIs with the same scheme occurred earlier in the header field; and ignore any URI whose scheme is not expected to occur in combination with a scheme that occurred earlier in the header field. In the absence of a Spec(T) determining otherwise, RFC 5876 does not change the RFC 3325 requirement that allows each of these header fields to contain at most two URIs, where one is a SIP or SIPS URI and the other is a tel URI; however, future updates to RFC 5876 may relax that requirement. In the absence of such a relaxation or a Spec(T) determining otherwise, the RFC 3325 requirement means that an entity receiving a request containing a P-Asserted-Identity or P-Preferred-Identity header field must act as follows:
◾◾ Ignore any URI with a scheme other than SIP, SIPS, or tel.
◾◾ Ignore a second or subsequent SIP URI, a second or subsequent SIPS URI, or a second or subsequent tel URI.
◾◾ Ignore a SIP URI if a SIPS URI occurred earlier in the header field and vice versa.
A proxy must not forward a URI when forwarding a request, if that URI is to be ignored in accordance with the requirement above. When a UAC or a proxy sends a request containing a P-Asserted-Identity header field to another node in the Trust Domain, if that other node complies with RFC 3325 but not with RFC 5876, and if the method is not one for which RFC 3325 specifies the use of the P-Asserted-Identity header field, and if the request also contains a Privacy header field with value id, as specified in RFC 3325, the other node might not handle the Privacy header field correctly. To prevent incorrect handling of the Privacy header field with value id, the Spec(T) in force for the Trust Domain should require all nodes to comply with RFC 5876. If this is not the case, a UAC or a proxy should not include a P-Asserted-Identity header field in a request if the method is not one for which RFC 3325 specifies use of the P-Asserted-Identity header field and if the request also contains a Privacy header field with value id.
10.5 Network-Asserted Identity
10.5.1 Overview
A NAI, defined in RFC 3324, is an identity initially derived by a SIP network intermediary as a result of an authentication process. This section describes the need for exchanging the NAI within networks of securely interconnected trusted nodes and with UAs securely connected to such networks. SIP allows users to assert their identity in a number of ways, for example, using the From header. However, there is no requirement for these identities to be anything other than the user’s desired alias. An authenticated identity of a user can be obtained using SIP Digest Authentication (or by other means). However, UAs do not always have the necessary key information to authenticate another UA.
A NAI is an identity initially derived by a SIP network intermediary as a result of an authentication process. This may or may not be based on SIP Digest Authentication. RFC 3324 describes short-term requirements for the exchange of NAIs within networks of securely interconnected trusted nodes and also to UAs with secure connections to such networks. Such a network is described in RFC 3324 as a Trust Domain, and a strict definition of trust and Trust Domain is presented here for that purpose. These short-term requirements provide only for the exchange of NAI within a Trust Domain and to an entity directly connected to the Trust Domain. General requirements for transport of NAIs on the Internet are out of scope of RFC 3324. Note that we have described more appropriate use of asserted identities updating RFC 3325, including NAI and maintaining privacy, in Section 20.2.8.4.
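As a hedged illustration of the default general handling rules in Section 10.4.4.4.5 (ignore unexpected schemes, ignore repeated schemes, and ignore a SIP URI after a SIPS URI or vice versa), the following Python sketch filters an ordered list of identity URIs. The function names are hypothetical, and the sketch assumes the header field has already been parsed into bare URI strings; display names, header parsing, and any relaxation allowed by a Spec(T) are outside its scope.

```python
ALLOWED_SCHEMES = ("sip", "sips", "tel")


def scheme_of(uri: str) -> str:
    """Return the lowercased scheme of a URI such as 'sip:user@example.com'."""
    return uri.split(":", 1)[0].strip().lower()


def filter_identity_uris(uris):
    """Return the URIs a receiver would act on, in their original order.

    Assumes the RFC 3325 default of at most one SIP-or-SIPS URI plus at most
    one tel URI, with no Spec(T) stating otherwise.
    """
    kept = []
    seen = set()  # schemes of the URIs kept so far
    for uri in uris:
        scheme = scheme_of(uri)
        if scheme not in ALLOWED_SCHEMES:
            continue  # scheme other than SIP, SIPS, or tel
        if scheme in seen:
            continue  # second or subsequent URI with the same scheme
        if scheme == "sip" and "sips" in seen:
            continue  # SIP URI after an earlier SIPS URI
        if scheme == "sips" and "sip" in seen:
            continue  # SIPS URI after an earlier SIP URI
        seen.add(scheme)
        kept.append(uri)
    return kept
```

A proxy following the rule that an ignored URI must not be forwarded could apply the same filter before copying the header field into the forwarded request.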
10.5.2 Trust Domain Identities, NAI, and Trust Domain Specification
10.5.2.1 Trust Domain Identities
An identity, for the purposes of the Trust Domain as specified here per RFC 3324, can be a SIP, SIPS or tel URI, and optionally a Display Name. The URI must be meaningful to the domain identified in the URI (in the case of SIP or SIPS URIs) or the owner of the E.164 number (in the case of tel URIs), in the sense that when used as a SIP Request-URI in a request sent to that domain/number range owner, it would cause the request to be routed to the user/line that is associated with the identity, or to be processed by service logic running on that user’s behalf. If the URI is a SIP or SIPS URI, then depending on the local policy of the domain identified in the URI, the URI may identify some specific entity, such as a person. If the URI is a tel URI, then depending on the local policy of the owner of the number range within which the telephone number lies, the number may identify some specific entity, such as a telephone line. However, it should be noted that identifying the owner of the number range is a less straightforward process than identifying the domain that owns a SIP or SIPS URI.
10.5.2.2 Network-Asserted Identity
A NAI is an identity derived by a SIP network entity as a result of an authentication process, which identifies the authenticated entity in the sense above. In the case of a SIP or SIPS URI, the domain included in the URI must be within the Trust Domain. In the case of a tel URI, the owner of the E.164 number in the URI must be within the Trust Domain. The authentication process used, or at least its reliability/strength, is a known feature of the Trust Domain using the NAI mechanism, that is, in the language described below, it is defined in Spec(T).
10.5.2.3 Trust Domains
A Trust Domain for the purposes of NAI is a set of SIP nodes (UAC, UAS, proxies, or other network intermediaries) that are trusted to exchange NAI information in the sense described here. A node can be a member of a Trust Domain, T, only if the node is known to be compliant to a certain set of specifications, Spec(T), which characterize the handling of NAI within the Trust Domain, T. Trust Domains are constructed by human beings who know the properties of the equipment they are using/deploying. In the simplest case, a Trust Domain is a set of devices with a single owner/operator who can accurately know the behavior of those devices. Such simple Trust Domains may be joined into larger Trust Domains by bilateral agreements between the owners/operators of the devices. We say a node is trusted (with respect to a given Trust Domain) if and only if it is a member of that domain. We say that a node, A, in the domain is trusted by a node, B (or “B trusts A”), if and only if
◾◾ There is a secure connection between the nodes.
◾◾ B has configuration information indicating that A is a member of the Trust Domain.
Note that B may or may not be a member of the Trust Domain. For example, B may be a UA that trusts a given network intermediary, A (e.g., its home proxy). A secure connection in this context means that messages cannot be read by third parties, cannot be modified by third parties without detection, and that B can be sure that the message really did come from A. The level of security required is a feature of the Trust Domain, that is, it is defined in Spec(T). Within this context, SIP signaling information received by one node from a node that it trusts is known to have been generated and passed through the network according to the procedures of the particular specification set Spec(T), and therefore can be known to be valid, or at least as valid as specified in the specifications Spec(T). Equally, a node can be sure that signaling information passed to a node that it trusts will be handled according to the procedures of Spec(T). For these capabilities to be useful, Spec(T) must contain requirements as to how the NAI is generated, how its privacy is protected, and how its integrity is maintained as it is passed around the network. A reader of Spec(T) can then make an informed judgment about the authenticity and reliability of network asserted information received from the Trust Domain T.
The term trusted (with respect to a given Trust Domain) can be applied to a given node in an absolute sense: it is just equivalent to saying the node is a member of the Trust Domain. However, the node itself does not know whether another arbitrary node is trusted, even within the Trust Domain. It does know about certain nodes with which it has secure connections as described above. With the definition above, statements such as “A trusted node shall...” are just shorthand for “A node compliant to this specification shall....” Statements such as “When a node receives information from a trusted node...” are not valid, because one node does not have complete knowledge about all the other nodes in the Trust Domain. Statements such as “When a node receives information from another node that it trusts...” are valid, and should be interpreted according to the two criteria described above. The above relationships are illustrated in Figure 10.1.
◾◾ A, B, and C are part of the same Trust Domain.
◾◾ A trusts C, but A does not trust B.
◾◾ Since E knows that B is inside of the Trust Domain, E trusts B, but B does not trust E.
◾◾ B does not trust F, and F does not trust B.
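The two criteria for "B trusts A" and the relationships listed above can be checked with a small Python sketch. Everything in it is an illustrative assumption rather than part of RFC 3324: the `Node` class, the `trusts` function, and in particular the secure links and per-node configuration in `figure_10_1_example`, which are one hypothetical arrangement chosen to reproduce the listed relationships, not a reconstruction of the figure itself.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    # Names this node has been configured to treat as Trust Domain members.
    configured_members: set = field(default_factory=set)


def trusts(b, a, domain, secure_links):
    """Return True if node b trusts node a under the two criteria in the text."""
    has_secure_link = frozenset((a.name, b.name)) in secure_links
    knows_membership = a.name in b.configured_members
    # a must really be a member; b itself may be inside or outside the domain.
    return a.name in domain and has_secure_link and knows_membership


def figure_10_1_example():
    """Build one hypothetical setup consistent with the listed relationships."""
    nodes = {n: Node(n) for n in "ABCEF"}
    domain = {"A", "B", "C"}
    secure_links = {frozenset(("A", "C")), frozenset(("B", "E"))}
    nodes["A"].configured_members.add("C")
    nodes["C"].configured_members.add("A")
    nodes["E"].configured_members.add("B")
    return nodes, domain, secure_links
```

In this setup, `trusts(nodes["E"], nodes["B"], domain, secure_links)` holds even though E is outside the Trust Domain, while the reverse does not, because E is not a member. A and B are both members, yet neither trusts the other, since no secure link or configuration connects them.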
422 ◾ Handbook on Session Initiation Protocol
[Figure 10.1: Trust Domain diagram showing nodes A, B, C, E, and F.]

10.5.4 Transport of NAI

10.5.4.1 Sending of NAI within a Trust Domain

It shall be possible for one node within a Trust Domain to securely send a NAI to another node that it trusts.

10.5.4.2 Receiving of NAI within a Trust Domain

It shall be possible for one node within a Trust Domain to receive a NAI from another node that it trusts.
and tel URIs, all of which identify the user as described earlier. It is not required to transport both a SIP and SIPS URI. It shall be possible for the capability to transport additional types of identity associated with a single party to be introduced in the future.

10.6 Summary

We have described the SIP user identity used over SIP networks that can be both public and private. In addition, we have described the NAI that is used over the private SIP network and is exchanged among the secure trusted SIP network intermediaries created by a trusted domain in accordance with the Spec(T). The three categories of private user identities defined are as follows: P-Asserted-Identity, P-Preferred-Identity, and Connected Identity. For NAI, we have defined the trusted domain, Spec(T), NAI generation, and different types of NAIs. Finally, we have explained the security implications for using both user identity and NAI over the SIP networks.

PROBLEMS

1. Why does a SIP user like to use multiple identities? Explain in detail using examples.
2. What is public user identity and how is it being used by a SIP user over the public Internet? Explain the public user identity using examples.
3. What are the definitions of the three categories of private user identities: P-Asserted-Identity, P-Preferred-Identity, and Connected Identity? How are these private user identities used by a SIP user over the private IP network? Explain these private user identities using examples.
4. What is the NAI? What is a Trust Domain? What is Spec(T)? How can the NAI be useful in creating a private Trust Domain over the private SIP network? Explain the NAI, Trust Domain, and Spec(T) using detailed examples.
5. How is a NAI generated? How is the NAI transported over the private SIP network considering the following: sending and receiving of NAI over a trusted domain, sending of NAI to entities outside a Trust Domain, and receiving of NAI by a node outside the Trust Domain? Explain using detailed examples.
6. How do parties with NAIs operate within the private SIP network? What are the different types of NAIs? What are the security implications for using NAIs?
Chapter 11
from early-media operations. SIP uses the offer–answer model (RFC 3264, see Section 3.8.4) to negotiate session parameters. One of the UAs, the offerer, prepares a session description that is called the offer. The other UA, the answerer, responds with another session description called the answer. This two-way handshake allows both UAs to agree on the session parameters to be used to exchange media. The offer–answer model decouples the offer–answer exchange from the messages used to transport the session descriptions. For example, the offer can be sent in an INVITE request and the answer can arrive in the 200 OK response for that INVITE, or, alternatively, the offer can be sent in the 200 OK for an empty INVITE and the answer can be sent in the ACK. When reliable provisional responses (RFC 3262; see Sections 2.5, 2.8.2, and 2.10) and UPDATE requests (RFC 3311, see Section 3.8.3) are used, there are many more possible ways to exchange offers and answers.

Media clipping occurs when the user (or the machine generating media) believes that the media session is already established but the establishment process has not finished yet. The user starts speaking (i.e., generating media), and the first few syllables or even the first few words are lost. When the offer–answer exchange takes place in the 200 OK response and in the ACK, media clipping is unavoidable. The called user starts speaking at the same time the 200 OK is sent, but the UAS cannot send any media until the answer from the UAC arrives in the ACK. On the other hand, media clipping does not appear in the most common offer–answer exchange, for example, an INVITE with an offer and a 200 OK with an answer. UACs are ready to play incoming media packets as soon as they send an offer because they cannot count on the reception of the 200 OK to start playing out media for the caller; SIP signaling and media packets typically traverse different paths, and thus, media packets may arrive before the 200 OK response.

Another form of media clipping (not related to early media either) occurs in the caller-to-callee direction. When the callee picks up and starts speaking, the UAS sends a 200 OK response with an answer, in parallel with the first media packets. If the first media packets arrive at the UAC before the answer and the caller starts speaking, the UAC cannot send media until the 200 OK response from the UAS arrives. If the media starts flowing before the call is established and bandwidth has not been reserved for it, there may not be enough bandwidth on the path between source and destination to support the early media, especially when multiple early-media streams flow because of forking. An early-media-specific risk is that attackers may try to exploit the different charging policies some operators apply to early and regular media. When UAs are allowed to exchange early media for free but are required to pay for regular media sessions, rogue UAs may try to establish a bidirectional early-media session and never send a 200 OK response for the INVITE.

11.3 Early-Media Solution Models

The peculiarity of early media is that it cannot be declined, modified, or identified by the receiving parties, and its media type and codec may not match what the receiving side supports, because no negotiation is allowed. Over the years, a good number of solutions for these problems have been proposed. However, two solutions, namely early media with the early-session disposition type and early media with the P-Early-Media header field, described in RFC 3960 (see Section 11.4.7) and RFC 5009 (see Section 11.5), respectively, have emerged as good ones for solving the early-media problems in some closed networking environments, although they are not applicable to the Internet in general. We describe these two solution models in the subsequent sections:

◾◾ Early-Media Solution Model with Disposition-Type: Early-Session
◾◾ Early-Media Solution Model with P-Early-Media Header

11.4 Early-Media Solution Model with Disposition-Type: Early-Session

11.4.1 Overview

Early media refers to media (e.g., audio and video) that is exchanged before a particular session is accepted by the called user. Within a dialog, early media occurs from the moment the initial INVITE is sent until the UAS generates a final response. It may be unidirectional or bidirectional, and can be generated by the caller, the callee, or both. Typical examples of early media generated by the callee are ringing tone and announcements (e.g., queuing status). Early media generated by the caller typically consists of voice commands or dual-tone multifrequency (DTMF) tones to drive IVR systems. RFC 3959, described here, defines a new disposition type (early-session) for the Content-Disposition header field in SIP for addressing the early-media solution. The treatment of early-session bodies is similar to the treatment of session bodies. That is, they follow the offer–answer model.

Their only difference is that session descriptions whose disposition type is early-session are used to establish early-media sessions within early dialogs, as opposed to regular sessions within regular dialogs. Although we are explaining the Disposition-Type: Early-Session extension here, we will be cross-referencing RFC 3960 (see Section 11.4.7) that
Early Media in SIP ◾ 427
provides the early-media solution models using this extension. The basic SIP specification described in RFC 3261 only supports very simple early-media mechanisms. These simple mechanisms have a number of problems related to forking and security, and do not satisfy the requirements of most applications. RFC 3960 (see Section 11.4.7), which uses the Disposition-Type: Early-Session extension defined in RFC 3959, goes beyond the mechanisms defined in RFC 3261 and describes two models of early media using SIP:

◾◾ Gateway model
◾◾ Application server model

Although both early-media models described in RFC 3960 are superior to the one specified in RFC 3261, the gateway model still presents a set of issues. In particular, the gateway model does not work well with forking. Nevertheless, the gateway model is needed because some SIP entities (in particular, some gateways) cannot implement the application server model. The application server model addresses some of the issues present in the gateway model. This model uses the early-session disposition type specified in this document.

11.4.2 Issues Related to Early-Media Session Establishment

Traditionally, early-media sessions have been established in the same way as regular sessions, that is, using an offer–answer exchange where the disposition type of the session descriptions is session. Application servers perform an offer–answer exchange with the UAC to exchange early media exclusively, while UASs use the same offer–answer exchange, first to exchange early media, and once the regular dialog is established, to exchange regular media. This way of establishing early-media sessions is known as the gateway model specified in RFC 3960 (see Section 11.4.7), which presents some issues related to forking and security. These issues exist when this model is used by either an application server or by a UAS.

Application servers may not be able to generate an answer for an offer received in the INVITE. The UAC created the offer for the UAS, and thus, it may have applied end-to-end encryption or have included information (e.g., related to key management) that the application server is not supposed to use. Therefore, application servers need a means to perform an offer–answer exchange with the UAC that is independent from the offer–answer exchange between both UAs.

UASs using the offer–answer exchange that will carry regular media for sending and receiving early media can cause the media clipping described in RFC 3960 (see Section 11.4.7). Some UACs cannot receive early media from different UASs at the same time. Thus, when an INVITE forks and several UASs start sending early media, the UAC mutes all the UASs but one (which is usually chosen at random). If the UAS that accepts the INVITE (i.e., sends a 200 OK) was muted, a new offer–answer exchange is needed to unmute it. This usually causes media clipping. Therefore, UASs need a means of performing an offer–answer exchange with the UAC to exchange early media that is independent from the offer–answer exchange used to exchange regular media. A potential solution to this need would be to establish a different dialog using a globally routable Uniform Resource Identifier (URI) to perform an independent offer–answer exchange. This dialog would be labeled as a dialog for early media and would be somehow related to the original dialog at the UAC. However, performing all the offer–answer exchanges within the original dialog has many advantages:

◾◾ It is simpler.
◾◾ It does not have synchronization problems because all the early dialogs are terminated when the session is accepted.
◾◾ It does not require globally routable URIs.
◾◾ It does not introduce service interaction issues related to services that may be wrongly applied to the new dialog.
◾◾ It makes firewall management easier.

This way of performing offer–answer exchanges for early media is referred to as the application server model specified in RFC 3960 (see Section 11.4.7). This model uses the early-session disposition type defined in the following section.

11.4.3 Early-Session Disposition Type

We define a new disposition type for the Content-Disposition header field: early-session. UAs must use early-session bodies to establish early-media sessions in the same way as they use session bodies to establish regular sessions, as described in RFCs 3261 and 3264 (see Section 3.8.4). In particular, early-session bodies must follow the offer–answer model and may appear in the same messages as session bodies do, with the exceptions of 2xx responses for an INVITE and ACKs. Nevertheless, it is not recommended that early offers be included in INVITEs because INVITEs can fork, and the UAC could receive multiple early answers establishing early-media streams at roughly the same time. Also, the use of the same transport address (Internet Protocol [IP] address plus port) in a session body and in an early-session body is not recommended. Using different transport addresses (e.g., different ports) to receive early and regular media makes it easy to detect the start of the regular media.

If a UA needs to refuse an early-session offer, it must do so by refusing all the media streams in it. When Session Description Protocol (SDP) (see Section 7.7) is used, this is done by setting the port number of all the media streams to zero. This is the same mechanism that UACs use to refuse
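The refusal rule described above (set the port of every media stream in the early-session offer to zero) can be sketched as follows. This is a minimal illustration, not a complete answer generator: the function name `refuse_all_streams` is hypothetical, the code only rewrites the `m=` lines, and a real answer would also carry the answerer's own `o=` and connection information.

```python
def refuse_all_streams(offer_sdp: str) -> str:
    """Return a copy of offer_sdp with the port of every m= line set to 0."""
    out = []
    for line in offer_sdp.splitlines():
        if line.startswith("m="):
            # An m= line has the form: m=<media> <port> <proto> <fmt> ...
            media, _port, rest = line[2:].split(" ", 2)
            line = f"m={media} 0 {rest}"
        out.append(line)
    return "\r\n".join(out) + "\r\n"


# Example early-session offer (values are illustrative).
early_offer = (
    "v=0\r\n"
    "o=Bob 2890844714 2890844714 IN IP4 host.example.org\r\n"
    "s=\r\n"
    "c=IN IP4 192.0.2.2\r\n"
    "t=0 0\r\n"
    "m=audio 30002 RTP/AVP 0\r\n"
)
refusal = refuse_all_streams(early_offer)
print(refusal)
```

Every stream has to be zeroed because the text requires that an early-session offer be refused as a whole; rejecting only some of its streams would accept the early session with fewer media.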
11.4.4 Preconditions

RFC 3312 (see Section 15.4) defines a framework for preconditions for SDP. Early sessions may contain preconditions, which are treated in the same way as preconditions in regular sessions. That is, the UAs do not exchange media, and the called user is not alerted until the preconditions are met.

11.4.5 Option Tag

We define an option tag to be used in the Require and Supported header fields: early-session. A UA adding the early-session option tag to a message indicates that it understands the early-session disposition type.

11.4.6 Example

Figure 11.1 shows the message flow between two UAs. INVITE (F1) has an early-session option tag in its Supported header field and the body shown in Figure 11.2. The UAS sends back a response with two body parts, as shown in Figure 11.3: one of disposition-type session and the other early-session. The session body part is the answer to the offer

[Figure 11.1: message flow between UA A and UA B, beginning with F1 INVITE (offer).]

    Content-Type: multipart/mixed; boundary="boundary1"
    Content-Length: 401

    --boundary1
    Content-Type: application/sdp
    Content-Disposition: session

    v=0
    o=Bob 2890844725 2890844725 IN IP4 host.example.org
    s=
    c=IN IP4 192.0.2.2
    t=0 0
    m=audio 30000 RTP/AVP 0
    --boundary1
    Content-Type: application/sdp
    Content-Disposition: early-session

    v=0
    o=Bob 2890844714 2890844714 IN IP4 host.example.org
    s=
    c=IN IP4 192.0.2.2
    t=0 0
    m=audio 30002 RTP/AVP 0
    --boundary1--

Figure 11.3 Early offer and answer.

    Content-Type: application/sdp
    Content-Disposition: early-session

    v=0
    o=alice 2890844717 2890844717 IN IP4 host.example.com
    s=
    c=IN IP4 192.0.2.1
    t=0 0
    m=audio 20002 RTP/AVP 0
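A UA that receives a two-part body like the one in Figure 11.3 has to tell the regular answer apart from the early offer by the Content-Disposition of each part. The sketch below shows one way to do that with Python's standard `email` package, which understands MIME multipart syntax. It is an illustration under stated assumptions: the function name `bodies_by_disposition` is hypothetical, the input is only the Content-Type header plus the body (a real SIP stack would hand these over after parsing the message), and Content-Length is omitted because the sample text is not byte-exact.

```python
from email import message_from_string

# Content-Type header and body modeled on Figure 11.3 (Content-Length omitted).
RAW_RESPONSE_BODY = (
    'Content-Type: multipart/mixed; boundary="boundary1"\n'
    "\n"
    "--boundary1\n"
    "Content-Type: application/sdp\n"
    "Content-Disposition: session\n"
    "\n"
    "v=0\n"
    "o=Bob 2890844725 2890844725 IN IP4 host.example.org\n"
    "s=\n"
    "c=IN IP4 192.0.2.2\n"
    "t=0 0\n"
    "m=audio 30000 RTP/AVP 0\n"
    "--boundary1\n"
    "Content-Type: application/sdp\n"
    "Content-Disposition: early-session\n"
    "\n"
    "v=0\n"
    "o=Bob 2890844714 2890844714 IN IP4 host.example.org\n"
    "s=\n"
    "c=IN IP4 192.0.2.2\n"
    "t=0 0\n"
    "m=audio 30002 RTP/AVP 0\n"
    "--boundary1--\n"
)


def bodies_by_disposition(raw: str) -> dict:
    """Map each part's Content-Disposition value to its SDP text."""
    msg = message_from_string(raw)
    bodies = {}
    if msg.is_multipart():
        for part in msg.get_payload():
            # get_content_disposition() returns the lowercased value
            # without parameters, e.g. "session" or "early-session".
            bodies[part.get_content_disposition()] = part.get_payload()
    return bodies


bodies = bodies_by_disposition(RAW_RESPONSE_BODY)
print(sorted(bodies))
```

With the two bodies separated, the UA can treat the `session` part as the answer for the regular session and the `early-session` part as a new offer for the early-media session.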
provides mechanisms for early media that are much superior to those defined in RFC 3261, and describes two models of early-media implementations using SIP:

◾◾ Application server model: The UAC indicates support for the early-session disposition type defined in RFC 3959 using the early-session option tag. The application server model consists of having the UAS behave as an application server to establish early-media sessions with the UAC. The application server model solves most of the early-media-related problems.
◾◾ Gateway model: The gateway model does not need any extension in SIP. Unlike the application server model, it does not even need to use the early-session disposition type. As a result, unlike the application server model, the gateway model cannot solve many of the issues related to forking and ringtone generation of early media. The gateway model is primarily applicable in situations where the UA cannot distinguish between early media and regular media, as in the case of devices such as an IP–public switched telephone network (IP–PSTN) gateway that supports SIP over the IP side and ISUP (Integrated Services Digital Network User Part) over the PSTN side. The IP–PSTN gateway receives media from the PSTN over a circuit, and sends it to the IP network. The gateway is not aware of the contents of the media, and it does not exactly know when the transition from early to regular media takes place. From the PSTN perspective, the circuit is a continuous source of media.

11.4.7.1 Application Server Model

The application server model consists of having the UAS behave as an application server to establish early-media sessions with the UAC. As described earlier, the UAC indicates support for the early-session disposition type using the early-session option tag. This way, UASs know that they can keep offer–answer exchanges for early media (Content-Disposition: early-session) separate from those for regular media by using a different disposition type for the early media. It requires an option tag to be used in the Require and Supported header fields: early-session. A UA adding the early-session option tag to a message indicates that it understands the early-session disposition type. Sending early media using a different offer–answer exchange than the one used for sending regular media helps avoid media clipping in cases of forking. The UAC can reject or mute new offers for early media without muting the sessions that will carry media when the original INVITE is accepted. The UAC can give priority to media received over the latter sessions. This way, the application server model transitions from early to regular media at the right moment.

Having a separate offer–answer exchange for early media also helps UACs decide whether or not local ringing should be generated. If a new early session is established and that early session contains at least an audio stream, the UAC can assume that there will be incoming early media and it can then avoid generating local ringing. An alternative model would include the addition of a new stream, with an early media label, to the original session between the UAC and the UAS using an UPDATE instead of establishing a new early session. We have chosen to establish a new early session to be coherent with the mechanism used by application servers that are not colocated with the UAS. This way, the UAS uses the same mechanism as any application server in the network to interact with the UAC.

11.4.7.1.1 In-Band versus Out-of-Band Session Progress Information

It should be noted that, even when the application server model is used, a UA will have to choose which early-media sessions are muted and which ones are rendered to the user. To make this choice easier for UAs, it is strongly recommended that information that is not essential for the session not be transmitted using early media. For instance, UAs should not use early media to send special ringing tones. The status code and the reason phrase in SIP can already inform the remote user about the progress of session establishment, without incurring the problems associated with early media.

11.4.7.1.2 Alert-Info Header Field

The Alert-Info header field allows specifying an alternative ringing content, such as ringing tone, to the UAC. This header field tells the UAC which tone should be played in case local ringing is generated; however, it does not tell the UAC when to generate local ringing. A UAC should follow the rules described above for ringing tone generation in both models. If, after following those rules, the UAC decides to play local ringing, it can then use the Alert-Info header field to generate it.

11.4.7.1.3 Security Considerations

All media-related security features defined in SIP are also applicable for the early media. An early-media-specific risk, roughly equivalent to forms of toll fraud as described earlier, arises because service providers may apply different charging policies to early media and regular media. For example, if the charging policy states that the early media is free while the regular media is chargeable during the session, rogue UAs may try to establish a bidirectional early-media session and never send a 200 OK response for the INVITE. On the other hand,
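The local-ringing rule of the application server model described above (an established early session that contains at least one audio stream means incoming early media can be expected) can be sketched as a small check on the early-session SDP. The function name `can_skip_local_ringing` is hypothetical, and treating a stream with port 0 as refused, and therefore not counting it, is an assumption based on the refusal mechanism described for early-session offers.

```python
from typing import Optional


def can_skip_local_ringing(early_sdp: Optional[str]) -> bool:
    """True if the early session carries at least one active audio stream."""
    if early_sdp is None:
        # No early session was established, so no early media is expected.
        return False
    for line in early_sdp.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "m=audio":
            port = fields[1].split("/")[0]  # "<port>" or "<port>/<count>"
            if port != "0":                 # assumption: port 0 means refused
                return True
    return False


early_sdp = (
    "v=0\r\n"
    "o=Bob 2890844714 2890844714 IN IP4 host.example.org\r\n"
    "s=\r\n"
    "c=IN IP4 192.0.2.2\r\n"
    "t=0 0\r\n"
    "m=audio 30002 RTP/AVP 0\r\n"
)
print(can_skip_local_ringing(early_sdp))
```

This check only answers whether early audio is expected; whether and when to ring locally remains a matter of the UAC's local policy.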
When, where, and how these ringing tones are generated has been standardized (i.e., the local exchange of the callee generates a standardized ringing tone while the callee is being alerted). It makes sense for a standardized approach to provide this type of feedback for the user in a homogeneous environment such as the PSTN, where all the terminals have a similar user interface. This homogeneity is not found among SIP UAs. SIP UAs have different capabilities, different user interfaces, and may be used to establish sessions that do not involve audio at all. Because of this, the way a SIP UA provides the user with information about the progress of session establishment is a matter of local policy. For example, a UA with a Graphical User Interface (GUI) may choose to display a message on the screen when the callee is being alerted, while another UA may choose to show a picture of a phone ringing instead. Many SIP UAs choose to imitate the user interface of PSTN phones.

They provide a ringing tone to the caller when the callee is being alerted. Such a UAC is supposed to generate ringing tones locally for its user as long as no early media is received from the UAS. If the UAS generates early media (e.g., an announcement or a special ringing tone), the UAC is supposed to play it rather than generate the ringing tone locally. The problem is that, sometimes, it is not an easy task for a UAC to know whether it will be receiving early media or it should generate local ringing. A UAS can send early media without using reliable provisional responses (very simple UASs do that), or it can send an answer in a reliable provisional response without any intention of sending early media (this is the case when preconditions are used). Therefore, by only looking at the SIP signaling, a UAC cannot be sure whether there will be early media for a particular session. The UAC needs to check if media packets are arriving at a given moment. An implementation could even choose to look at the contents of the media packets, since they could carry only silence or comfort noise. With this in mind, a UAC should develop its local policy regarding local ringing generation. For example, a POTS (Plain Old Telephone Service)-like SIP UA could implement the following local policy:

◾◾ Unless a 180 Ringing response is received, never generate local ringing.
◾◾ If a 180 Ringing has been received but there are no incoming media packets, generate local ringing.
◾◾ If a 180 Ringing has been received and there are incoming media packets, play them and do not generate local ringing.

Note that a 180 Ringing response means that the callee is being alerted, and a UAS should send such a response if the callee is being alerted, regardless of the status of the early-media session. At first sight, such a policy may look difficult to implement in decomposed UAs (i.e., media gateway controller and media gateway); however, this policy is the same as the one described earlier, which must be implemented by any UA. That is, any UA should play incoming media packets (and stop local ringing tone generation if it was being performed) to avoid media clipping, even if the 200 OK response has not arrived. Thus, the tools to implement this early-media policy are already available to any UA that uses SIP.

Furthermore, while it is not desirable to standardize a common local policy to be followed by every SIP UA, a particular subset of more or less homogeneous SIP UAs could use the same local policy by convention. Examples of such subsets of SIP UAs may be “all the PSTN/SIP gateways” or “every 3GPP IMS terminal.” However, defining the particular common policy that such groups of SIP devices may use is outside the scope of this document.

11.4.7.2.3 Absence of an Early-Media Indicator

SIP, as opposed to other signaling protocols, does not provide an early-media indicator. That is, there is no information about the presence or absence of early media in SIP. Such an indicator could potentially be used to avoid the generation of a local ringing tone by the UAC when the UAS intends to provide an in-band ringing tone or some type of announcement. However, in the majority of cases, such an indicator would be of little use due to the way SIP works.

One important reason limiting the benefit of a potential early-media indicator is the loose coupling between SIP signaling and the media path. SIP signaling traverses a different path than the media. The media path is typically optimized to reduce the end-to-end delay (e.g., minimum number of intermediaries), while the SIP signaling path typically traverses a number of proxies providing different services for the session. Hence, it is very likely that the media packets with early media reach the UAC before any SIP message that could contain an early-media indicator.

Nevertheless, sometimes SIP responses arrive at the UAC before any media packet. There are situations in which the UAS intends to send early media but cannot do it straight away. For example, UAs using Interactive Connectivity Establishment (RFC 5245, see Section 14.3) may need to exchange several Session Traversal Utilities for NAT (Network Address Translation) (STUN) (RFC 5389, see Section 14.3) messages before being able to exchange media. In this situation, an early-media indicator would keep the UAC from generating a local ringing tone during this time. However, while the early media is not arriving at the UAC, the user would not be aware that the remote user is being alerted, even though a 180 Ringing had been received. Therefore, a better solution would be to apply a local ringing tone until the early-media packets could be sent from the UAS to the UAC. This solution does not require any early-media indicator. Moreover, it should be
noted that migrations from local ringing tone to early media at the UAC happen in the presence of forking as well; one UAS sends a 180 Ringing response, and later, another UAS starts sending early media.

11.4.7.2.4 Limitations of the Gateway Model

Most of the limitations of the gateway model are described earlier. It produces media clipping in forking scenarios and requires media detection to generate local ringing properly. These issues are addressed by the application server model, described earlier, which is the recommended way of generating early media that is not continuous with the regular media generated during the session. The gateway model allows for individual networks to create local policy with respect to the handling of early media, but does not address the case where a network is interconnected with other networks with unknown, untrusted, or different early-media policies. In this situation, the P-Early-Media header-based model described in RFC 5009 (Section 11.5) becomes a natural extension of this gateway model that is applicable within a transitive Trust Domain.

11.5 Early-Media Solution Model with P-Early-Media Header

Like the earlier model, the early-media model with the P-Early-Media header also provides a solution for early media in a closed SIP network (e.g., 3GPP’s IMS network) under the same administrative domain, and does not need to use the Disposition-Type: Early-Session extension, as explained earlier. This header field is also useful in any SIP network that is interconnected with other SIP networks and needs to control the flow of media in the early dialog state. This document defines the use of the P-Early-Media header field for use within SIP (RFC 3261) messages in certain SIP networks to authorize the cut-through of backward or forward early media when permitted by the early-media policies of the networks involved.

11.5.1 Early-Media Policy

The private P-Early-Media header field is intended for use in a SIP network that has the following characteristics:

◾◾ Its early-media policy prohibits the exchange of early media between end users.
◾◾ It is interconnected with other SIP networks that have unknown, untrusted, or different policies regarding early media.
◾◾ It has the capability to gate (enable/disable) the flow of early media to/from user equipment.

Within an isolated SIP network, it is possible to gate early media associated with all end points within the network to enforce a desired early-media policy among network end points. However, when a SIP network is interconnected with other SIP networks, only the boundary node connected to the external network can determine which early-media policy to apply to a session established between end points on different sides of the boundary. The P-Early-Media header field provides a means for this boundary node to communicate this early-media policy decision to other nodes within the network.

11.5.2 Early-Media Application Environments

The P-Early-Media header of SIP can only be applied in closed networking environments. For example, a private SIP network that emulates a traditional circuit switched telephone network will benefit from using this header field. Despite the limitations, there are sufficiently useful specialized deployments for the use of P-Early-Media in SIP networks under the following networking conditions:

1. The use of this private P-Early-Media header field is only applicable inside a Trust Domain as defined in RFC 3324 (see Section 10.5). Nodes in such a Trust Domain are explicitly trusted by its users and end systems to authorize early-media requests only when allowed by the early-media policy within the Trust Domain.
2. The private P-Early-Media header field cannot be applied for a general early-media authorization communications model suitable for interdomain use or use in the Internet at large. Furthermore, since the early-media requests are not cryptographically certified, they are subject to forgery, replay, and falsification in any architecture that does not meet the requirements of the Trust Domain.
3. An early-media request also lacks an indication of who specifically is making or modifying the request, and so it must be assumed that the Trust Domain is making the request. Therefore, the information is only meaningful when securely received from a node known to be a member of the Trust Domain.
4. Although this extension can be used with parallel forking, it does not improve on the known problems with early media and parallel forking, as described in RFC 3960, unless one can assume the use of symmetric RTP.

11.5.3 Early-Media Authorization

PSTN networks typically provide call progress information as backward early media from the terminating switch toward the calling party. PSTN networks also use forward early media from the calling party toward the terminating switch
under some circumstances for applications, such as digit collection for secondary dialing. PSTN networks typically allow backward or forward early media since they are used for the purpose of progressing the call to the answer state and do not involve the exchange of data between end points. In a SIP network, backward early media flows from the UAS toward the UAC. Forward early media flows from the UAC toward the UAS. SIP networks by default allow both forms of early media, which may carry user data, once the media path is established. Early media is typically desirable with a PSTN gateway as UAS, but not with SIP user equipment as UAS. To prevent the exchange of user data within early media while allowing early media via PSTN gateways, a SIP network may have a policy to prohibit backward early media from SIP user equipment and to prohibit forward media toward SIP user equipment, either of which may contain user data. A SIP network containing both PSTN gateways and SIP end devices, for example, can maintain such an early-media policy by gating off any early media with a SIP end device acting as UAS, gating on early media with a SIP end device acting as UAC, and gating on early media at each PSTN gateway.

Unfortunately, a SIP network interconnected with another SIP network may have no means of assuring that the interconnected network is implementing a compatible early-media policy, thus allowing the exchange of user data within early media under some circumstances. For example, if a network A allows all early media with user equipment as UAC and an interconnected network B allows all early media with user equipment as UAS, any session established between user equipment as UAC in A and user equipment as UAS in B will allow bidirectional user data exchange as early media. Other combinations of early-media policies may also produce similar undesirable results. The purpose of the P-Early-Media header is to allow a SIP network interconnected to other SIP networks with different early-media policies to correctly identify and enable authorized early media according to its policies.

11.5.3.1 Backward Early Media

Backward early media in the PSTN typically comprises call progress information, such as ringing feedback (ringback), or announcements regarding special handling such as forwarding. It may also include requests for further information, such as a credit card number to be entered as forward early media in the form of DTMF tones or speech. Backward early media of this type provides information to the calling party strictly for the purpose of indicating that the call is progressing and involves no exchange of data between end users. The usual PSTN charging policy assumes that no data is exchanged between users until the call has been answered.

A terminating SIP UA outside of the SIP network, on the other hand, may provide any user data in a backward early-media stream. Thus, if the network implements the usual early-media policy, the network equipment gating the backward early-media flow for the originating UA must distinguish between authorized early media from a terminating SIP end point and unauthorized early media from another SIP device outside of the network. Given the assumption of a transitive trust relationship between SIP servers in the network, this can be accomplished by including some information in a backward SIP message that identifies the presence of authorized backward early media.

Since it is necessary to verify that this indication comes from a trusted source, it is necessary for each server on the path back to the originating UA to be able to verify the trust relationship with the previous server and to remove such an indication when it cannot do so. A server on the boundary to an untrusted SIP network can assure that no indication of authorized backward early media passes from an external UAS to a UAC within the network. Thus, the use of a private header field that can be modified by SIP proxies is to be preferred over the use of a Multipurpose Internet Mail Extensions attachment that cannot be modified in this way.

11.5.3.2 Forward Early Media

Forward early media is less common than backward early media in the PSTN. It is typically used to collect secondary dialed digits, to collect credit card numbers, or to collect other DTMF or speech responses for the purpose of further directing the call. Forward early media in the PSTN is always directed toward a network server for the purpose of indicating that a call is progressing and involves no exchange of data between end users.

A terminating SIP UA outside of the SIP network, on the other hand, may receive any user data in a forward early-media stream. Thus, if the network implements the usual early-media policy, the network equipment gating the forward early-media flow for the originating UA must distinguish between a terminating end point that is authorized to receive forward early media, and another SIP device outside of the network that is not authorized to receive forward early media containing user data. This authorization can be accomplished in the same manner as for backward early media by including some information in a backward SIP message that identifies that the terminating side is authorized to receive forward early media.

11.5.4 Applicability of Content-Disposition and Application/Gateway Model

The private P-Early-Media header can be applicable to the gateway model defined in RFC 3960 (see Section 11.4.7), since the PSTN gateway is the primary requestor of early media in a private SIP network of a given administrative domain (e.g., 3GPP’s IMS network). For the same reason,
neither the application server model of RFC 3960, nor the early-session disposition type defined in RFC 3959 (see Section 11.4) is applicable in this situation. The gateway model of RFC 3960 allows for individual networks to create local policy with respect to the handling of early media, but does not address the case where a network is interconnected with other networks with unknown, untrusted, or different early-media policies. In this communications environment, a header field such as P-Early-Media is essential, because without the kind of information it carries, it is not possible for the network to determine whether cut-through of early media could lead to the transfer of data between end users during session establishment. Thus, the P-Early-Media header is a natural extension of the gateway model of RFC 3960 that is applicable within a transitive Trust Domain.

11.5.5 Operation

The P-Early-Media header field is used for the purpose of requesting and authorizing requests for backward or forward early media. A UAC capable of recognizing the P-Early-Media header field may include the header field in an INVITE request. The P-Early-Media header field in an INVITE request contains the supported parameter. As members of the Trust Domain, each proxy receiving an INVITE request must decide whether to insert or delete the P-Early-Media header field before forwarding. A UAS receiving an INVITE request can use the presence of the P-Early-Media header field in the request to decide whether to request early-media authorization in subsequent messages toward the UAC.

After receiving an incoming INVITE request, the UAS requesting backward or forward early media will include the P-Early-Media header field in a message toward the UAC within the dialog, including direction parameter(s) that identify for each media line in the session whether the early-media request is for backward media, forward media, both, or neither. The UAS can change its request for early media by including a modified P-Early-Media header field in a subsequent message toward the UAC within the dialog. Each proxy in the network receiving the P-Early-Media header field in a message toward the UAC has the responsibility of assuring that the early-media request comes from an authorized source. If a P-Early-Media header field arrives from either an untrusted source, a source not allowed to send backward early media, or a source not allowed to receive forward early media, then the proxy may remove the P-Early-Media header field or alter the direction parameter(s) of the P-Early-Media header field before forwarding the message, based on local policy.

A proxy in the network not receiving the P-Early-Media header field in a message toward the UAC may insert one based on local policy. If the proxy also performs gating of early media, then it uses the parameter(s) of the P-Early-Media header field to decide whether to open or close the gates for backward and forward early-media flow(s) between the UAs. The proxy performing gating of early media may also add a gated parameter to the P-Early-Media header field before forwarding the message so that other gating proxies in the path can choose to leave open their gates.

If the UAC is a trusted server within the network (e.g., a PSTN gateway), then the UAC may use the parameter(s) of the P-Early-Media header field in messages received from the UAS to decide whether to perform early-media gating or cut-through, and to decide whether to render backward early media in preference to generating ringback based on the receipt of a 180 Ringing response. If the UAC is associated with user equipment, then the network will have assigned a proxy the task of performing early-media gating, so that the parameter(s) of the P-Early-Media header field received at such a UAC do not require that the UAC police the early-media flow(s); however, they do provide additional information that the UAC may use to render media. The UAC and proxies in the network may also insert, delete, or modify the P-Early-Media header field in messages toward the UAS within the dialog according to local policy; however, the interpretation of the header field when used in this way is a matter of local policy and not defined herein. The use of direction parameter(s) in this header field could be used to inform the UAS of the final early-media authorization status.

11.5.6 Limitations of the P-Early-Media Header Field

The P-Early-Media header field does not apply to any SDP with Content-Disposition: Early-Session defined in RFC 3959 (see Section 11.4). When parallel forking occurs, there is no reliable way to correlate early-media authorization in a dialog with the media from the corresponding end point unless one can assume the use of symmetric RTP, since the SDP messages do not identify the RTP source address of any media stream. When a UAC or proxy receives multiple early dialogs and cannot accurately identify the source of each media stream, it should use the most restrictive early-media authorization it receives on any of the dialogs to decide the policy to apply toward all received media.

When early-media usage is desired for any reason and one cannot assume the use of symmetric RTP, it is advisable to disable parallel forking using caller preferences defined in RFC 3841. Although the implementation of media gating is not defined in this specification (RFC 5009), note that media gating must be implemented carefully in the presence of NATs and protocols that aid in network address translation traversal. Media gating may also introduce a potential
for media clipping that is similar to that created during parallel forking or any other feature that may disable early media, such as custom ringback.

11.5.7 P-Early-Media Header Field

The private P-Early-Media header field with the supported parameter may be included in an INVITE request to indicate that the UAC or a proxy on the path recognizes the header field. This header is not used in the Internet. A network entity may request the authorization of early media or change a request for authorization of early media by including the P-Early-Media header field in any message allowed by Table 2.5, Section 2.8, within the dialog toward the sender of the INVITE request. The P-Early-Media header field includes one or more direction parameters where each has one of the values sendrecv, sendonly, recvonly, or inactive, following the convention used for the SDP stream directionality of RFCs 3264 (see Section 3.8.4) and 4566 (see Section 7.7).

Each parameter applies, in order, to the media lines in the corresponding SDP messages establishing session media. Unrecognized parameters shall be silently discarded. Nondirection parameters are ignored for purposes of early-media authorization. If there are more direction parameters than media lines, the excess shall be silently discarded. If there are fewer direction parameters than media lines, the value of the last direction parameter shall apply to all remaining media lines. A message directed toward the UAC containing a P-Early-Media header field with no recognized direction parameters shall not be interpreted as an early-media authorization request.

The parameter value sendrecv indicates a request for authorization of early media associated with the corresponding media line, both from the UAS toward the UAC and from the UAC toward the UAS (both backward and forward early media). The value sendonly indicates a request for authorization of early media from the UAS toward the UAC (backward early media), and not in the other direction. The value recvonly indicates a request for authorization of early media from the UAC toward the UAS (forward early media), and not in the other direction. The value inactive indicates either a request that no early media associated with the corresponding media line be authorized, or a request for revocation of authorization of previously authorized early media. The P-Early-Media header field in any message within a dialog toward the sender of the INVITE request may also include the nondirection parameter gated to indicate that a network entity on the path toward the UAS is already gating the early media, according to the direction parameter(s). When included in the P-Early-Media header field, the gated parameter shall come after all direction parameters in the parameter list.

When receiving a message directed toward the UAC without the P-Early-Media header field and no previous early-media authorization request has been received within the dialog, the default early-media authorization depends on local policy and may depend on whether the header field was included in the INVITE request. After an early-media authorization request has been received within a dialog, and a subsequent message is received without the P-Early-Media header field, the previous early-media authorization remains unchanged. The P-Early-Media header field in any message within a dialog toward the UAS may be ignored or interpreted according to local policy.

The P-Early-Media header field does not interact with SDP offer–answer procedures in any way. Early-media authorization is not influenced by the state of the SDP offer–answer procedures (including preconditions and directionality) and does not influence the state of the SDP offer–answer procedures. The P-Early-Media header field may or may not be present in messages containing SDP. The most recently received early-media authorization applies to the corresponding media line in the session established for the dialog until receipt of the 200 OK response to the INVITE request, at which point all media lines in the session are implicitly authorized. Early-media flow in a particular direction requires that early media in that direction is authorized, that media flow in that direction is enabled by the SDP direction attribute for the stream, and that any applicable preconditions (RFC 3312, see Section 15.4) are met. Early-media authorization does not override the SDP direction attribute or preconditions state, and the SDP direction attribute does not override early-media authorization.

11.5.7.1 Procedures at the UAC

A UAC may include the P-Early-Media header field with the supported parameter in an INVITE request to indicate that it recognizes the header field. A UAC receiving a P-Early-Media header field may use the parameter(s) of the header field to gate or cut-through early media, and to decide whether to render early media from the UAS to the UAC in preference to any locally generated ringback triggered by a 180 Ringing response. If a proxy is providing the early-media gating function for the UAC, then the gateway model of RFC 3960 for rendering of early media is applicable. A UAC without a proxy in the network performing early-media gating that receives a P-Early-Media header field should perform gating or cut-through of early media according to the parameter(s) of the header field.

11.5.7.2 Procedures at the UAS

A UAS that is requesting authorization to send or receive early media may insert a P-Early-Media header field with appropriate parameter(s) in any message allowed in Tables 2.5 and 2.10 (see Section 2.8) toward the UAC within the dialog.
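
The direction-parameter rules of Section 11.5.7 can be illustrated with a minimal Python sketch. It assumes that the header field value is a plain comma-separated parameter list. The function names (parse_p_early_media, authorization_per_media_line, most_restrictive) are hypothetical and are not defined by RFC 5009. Treating the "most restrictive" authorization of Section 11.5.6 as the intersection of the directions authorized on each early dialog is one possible reading, not a rule stated in the specification.

```python
# Illustrative sketch only; names and the comma-splitting approach are assumptions.
DIRECTIONS = ("sendrecv", "sendonly", "recvonly", "inactive")


def parse_p_early_media(value):
    """Return (direction parameters in order, gated flag) for a header value.

    Parameters that are not recognized (for example, supported) are
    silently discarded, as the text above requires.
    """
    params = [p.strip().lower() for p in value.split(",") if p.strip()]
    directions = [p for p in params if p in DIRECTIONS]
    return directions, "gated" in params


def authorization_per_media_line(value, media_line_count):
    """Map direction parameters, in order, onto the SDP media lines.

    Returns None when no direction parameter is recognized, because such a
    message is not an early-media authorization request.
    """
    directions, _gated = parse_p_early_media(value)
    if not directions:
        return None
    # Excess direction parameters are discarded.
    directions = directions[:media_line_count]
    # With too few parameters, the last one applies to the remaining lines.
    while len(directions) < media_line_count:
        directions.append(directions[-1])
    return directions


def most_restrictive(directions):
    """Combine authorizations from several early dialogs (assumed rule:
    a direction stays authorized only if every dialog authorizes it)."""
    if not directions:
        raise ValueError("at least one authorization is required")
    backward = all(d in ("sendrecv", "sendonly") for d in directions)
    forward = all(d in ("sendrecv", "recvonly") for d in directions)
    if backward and forward:
        return "sendrecv"
    if backward:
        return "sendonly"
    if forward:
        return "recvonly"
    return "inactive"
```

Under these assumptions, a value of sendonly applied to a session with three media lines authorizes backward early media on all three lines, whereas a value carrying only supported is not treated as an authorization request.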
12.2 Communications Service ID

SIP carries the signaling messages that establish sessions and invoke services. A given SIP signaling message that represents a unique service must also be unique. For example, if the Session Description Protocol (SDP) message body contains a specific audio codec with its definitive functional and performance characteristics, it will represent a specific audio service and the same unique signaling scheme will always represent the same service, and all users are expected to have the same experiences for this service. There are a number of points that we can make, as follows:

◾◾ Expressions used in the signaling message for representation of the service in the SIP signaling layer.
◾◾ Processing of the signaling message for representation of the service in the service (or application) layer.
◾◾ Experiences of users for the same service at the user level. If the end users have been in control of the services, and the network or networks that interconnect the end users are not involved in providing services in the service (or application) layer, then things would be simpler.

However, the session establishment and reestablishment within a given call in SIP even for a single medium can be highly complex, and the intermediate functional entities within the network may have to play a role such as for media bridging, transfer/diversion of the call-leg, mid-call charging, digit collection, emergency call, and many other services that a caller may have to get. In many cases, the processing of a service from a given SIP signaling message knowing the SDP message body and hints from the headers, as well as user experiences of the same service, may not always be the same, especially if the services provided within the network or networks are different. If we add multimedia on top of a single medium, for each of those call features, things will get more complicated. In this respect, RFC 5897 (Sections 12.1 through 12.3) provides some guidelines for using the service ID, that is, the process of determining or signaling the user-level use case that is driving the signaling being utilized by the user agent (UA). The recommendations for service ID are summarized in the following sections.

12.2.1 Derived Service ID

Derived service ID, where an identifier for a service is obtained by inspection of the signaling and of other contextual data (such as subscriber profile), is reasonable, and when done properly, does not lead to the perils described above. However, declarative service ID, where UAs indicate what the service is, separate from the rest of the signaling, leads to the perils described above. If it appears that the signaling currently defined in standards is not sufficient to identify the service, it may be due to a lack of sufficient signaling to convey what is needed, or may be because request URIs should be used for differentiation and they are not being so used. By applying the litmus tests described later, network designers can determine whether the system is attempting to perform a declarative service ID.

12.2.2 SIP’s Expressiveness for Negotiation

One of SIP’s key strengths is its ability to negotiate a common view of a session between participants. This means that the service that is ultimately received can vary wildly, depending on the types of end points in the call and their capabilities. Indeed, this fact becomes even more evident when calls are set up between domains. As such, when performing derived service ID, domains should be aware that sessions may arrive from different networks and different end points. Consequently, the service ID algorithm must be complete, meaning that it computes the best answer for any possible signaling message that might be received and any session that might be set up.

In a homogeneous environment, the process of service ID is easy. The service provider will know the set of services they are providing, and, based on the specific call flows for each specific service, can construct rules to differentiate one service from another. However, when different providers interconnect, or when different end points are introduced, assumptions about what services are used, and how they are signaled, no longer apply. To provide the best user experience possible, a provider doing service ID needs to perform a best-match operation, such that any legal SIP signaling (not just the specific call flows running within their own network among a limited set of endpoints) is mapped to the appropriate service.

12.2.3 Presence

Presence can help a great deal with providing unique URIs for different services. If a user wishes to contact another user and knows only the AOR for the target, as is usually the case, the user can fetch the presence document for the target. That document, in turn, can contain numerous service URIs for contacting the target with different services. Those URIs can then be used in the Request-URI for differentiation. When possible, this is the best solution to the problem. It should be noted that, unlike service URIs described earlier, the URI itself will provide the address of a given service, but the service itself must be uniquely understood from the description of the services provided somewhere else.

12.2.4 Intradomain

Service identifiers themselves are not bad; derived service ID allows each domain to cache the results of the service ID process for usage by another network element within the same
domain. However, service identifiers are fundamentally useful within a particular domain, and any such header must be stripped at a network boundary. Consequently, the process of service ID and its associated service identifiers is always an intradomain operation.

12.2.5 Device Dispatch

Device dispatch should be done following the principles of RFC 3841 (see Section 9.9), using implicit preferences based on the signaling. For example, RFC 5688 (Sections 12.2.5 and 12.3.2) defines a new UA capability that can be used to dispatch requests based on different types of application media streams. However, it is a mistake to try and use a service identifier as a UA capability. Consider a service called multimedia telephony, which adds video to the existing public switched telephone network (PSTN) experience. A user has two devices, one of which is used for multimedia telephony and the other strictly for a voice-assisted game. It is tempting to have the telephony device include a UA capability (RFC 3840, see Section 3.4) called multimedia telephony in its registration. A calling multimedia telephony device can then include the Accept-Contact header field (RFC 3841, see Section 9.9) containing this feature tag. The proxy serving the called party, applying the basic algorithms of RFC 3841, will correctly route the call to the terminating device.

However, if the calling party is not within the same domain, and the calling domain does not know about or use this feature tag, there will be no Accept-Contact header field, even if the calling party was using a service that is a good match for multimedia telephony. In such a case, the call may be delivered to both devices, but it will yield a poorer user experience. That is because device dispatch was done using a declarative service ID. The best way to avoid this problem is to use feature tags that can be matched to well-defined signaling features (media types, required SIP extensions, and so on). In particular, the golden rule is that the granularity of feature tags must be equivalent to the granularity of individual features that can be signaled in SIP.

12.3 Asserted- and Preferred-Service ID

12.3.1 Overview

The concept of service within SIP has no hard and fast rules. As described earlier, RFC 5897 provides general guidance on what constitutes a service within SIP and what does not. This document also makes use of the terms derived service ID and declarative service ID as defined in Sections 12.1 through 12.3 (RFC 5897). It clearly states that declarative service ID, the process by which a UA inserts a moniker into a message that defines the desired service, separate from explicit and well-defined protocol mechanisms, is harmful. During a session setup, proxies may need to understand what service the request is related to in order to know what application server (AS) to contact or other service logic to invoke. The SIP INVITE request contains all of the information necessary to determine the service. However, the calculation of the service may be computational and database intensive. For example, a given Trust Domain’s definition of a service might include request authorization. Moreover, the analysis may require examination of the SDP.

For example, an INVITE request with video SDP directed to a video-on-demand Request-URI could be marked as an Internet Protocol (IP) television session. An INVITE request with push-to-talk over cellular (PoC) routes could be marked as a PoC session. An INVITE request with a Require header field containing an option tag of foogame could be marked as a foogame session. If the information contained within the SIP INVITE request is not sufficient to uniquely identify a service, the remedy is to extend the SIP signaling to capture the missing element.

By providing a mechanism to compute and store the results of the domain-specific service calculation, that is, the derived service ID, this optimization allows a single trusted proxy to perform an analysis of the request and authorize the requestor’s permission to request such a service. The proxy may then include a service identifier that relieves other trusted proxies and trusted UAs from performing further duplicate analysis of the request for their service ID purposes. In addition, this extension allows UA clients (UACs) outside the Trust Domain to provide a hint of the requested service. The private P-Asserted-Service and private P-Preferred-Service header fields defined in RFC 6050 (see Section 12.3) enable a network of trusted SIP servers to assert the service of authenticated users. The use of these extensions is only applicable inside an administrative domain with previously agreed-upon policies for generation, transport, and usage of such information, and does not offer a general service ID model suitable for use between different Trust Domains or for use in the Internet at large.

These headers do not provide for the dialog or transaction to be rejected if the service is not supported end-to-end. SIP provides other mechanisms, such as the option-tag and use of the Require and Proxy-Require header fields, where such functionality is required. No explicitly signaled service ID exists, and the session proceeds for each node’s definition of the service in use, on the basis of information contained in the SDP and in other SIP header fields.

This mechanism is specifically for managing the information needs of intermediate routing devices between the calling user and the user represented by the Request-URI. In support of this mechanism, a Uniform Resource Name (URN) is defined to identify the services. This URN has wider applicability to additionally identify services and
terminal applications. Between end users, caller prefer- service by including the SIP and SDP parameters that cor-
ences and callee capabilities as specified in RFC 3840 and respond to the service they require. Furthermore, since the
RFC 3841 provide an appropriate mechanism for indicat- asserted services are not cryptographically certified, they are
ing such service and application ID. These mechanisms have subject to forgery, replay, and falsification in any architec-
been extended by RFC 5688 (see Sections 9.5 through 9.8) ture that does not meet the requirements of RFC 3324 (see
to provide further capabilities in this area. The P-Asserted- Section 10.5). The asserted services also lack an indication
Service header field contains a URN. This is supported by of who specifically is asserting the service, and so it must be
the P-Preferred-Service header field that also contains a assumed that a member of the Trust Domain is asserting
URN and that allows the UA to express preferences regard- the service. Therefore, the information is only meaningful
ing the decisions made on service within the Trust Domain. when securely received from a node known to be a mem-
An example of the P-Asserted-Service header field is ber of the Trust Domain. Despite these limitations, there
are sufficiently useful specialized deployments that meet the
P-Asserted-Service: urn:urn-7:3gpp-service. assumptions described above and can accept the limitations
exampletelephony.version1 that result, to warrant the informational publication of this
mechanism.
A proxy server that handles a request can, after authenti-
cating the originating user in some way (e.g., digest authen-
tication) to ensure that the user is entitled to that service,
12.3.3 Header Fields
insert such a P-Asserted-Service header field into the request We have provided augmented Backus–Naur Form (ABNF)
and forward it to not sufficient to uniquely identify a service, syntaxes of all SIP header field in Section 2.4.1. However, we
the remedy is to extend the SIP signaling to capture the miss- are repeating syntaxes for these two header fields for conve-
ing element. We provide further explanation here per RFC nience: P-Asserted-Service and P-Preferred Service.
5897. A proxy server or UA that it does not trust removes all
the P-Asserted-Service header field values. Thus, the services
are labeled by means of an informal URN that provides a 12.3.3.1 P-Asserted-Service Header
hierarchical structure for defining services and subservices, and provides an address that can be resolvable for various purposes outside the scope of this document, for example, to obtain information about the service so described.

12.3.2 Applicability Statement
The use of the P-Asserted-Service and P-Preferred-Service header fields defined in RFC 6050 is only applicable inside a Trust Domain. All functional entities of the network create a Trust Domain, T, authenticating each other’s Network-Asserted Identity using valid identities expressed in SIP URI, SIPS URI, tel URI, or optional display name complying with a certain set of specification, spec (T), as defined in RFC 3324 (see Section 12.3). Nodes in such a Trust Domain are explicitly trusted by its users and end systems to publicly assert the service of each party, and they have common and agreed-upon definitions of services and homogeneous service offerings. The means by which the network determines the service to assert is not defined in RFC 6050 (see Section 12.3).
As explained earlier, this service model is not suitable for interdomain use or use in the Internet at large. Its assumptions about the trust relationship between the user and the network may not be suitable in many applications. For example, these extensions do not accommodate a model whereby end users can independently assert their service by use of the extensions defined here. End users assert their

The P-Asserted-Service header field is used among trusted SIP entities (typically intermediaries) to carry the service information of the user sending a SIP message. The P-Asserted-Service header field carries the information of the derived service ID. While the declarative service ID can assist in deriving the value transferred in this header field, this should be in the form of streamlining the correct derived service ID.

PAssertedService = "P-Asserted-Service" HCOLON PAssertedService-value
PAssertedService-value = Service-ID *(COMMA Service-ID)

Note the following:
◾ The definition of Service-ID has been provided in ABNF in the subsequent section.
◾ Proxies can (and will) add and remove this header field.
◾ Tables 2.5 and 2.10 (Section 2.8) show the relationship between P-Asserted-Service header field, Proxy, and Request Messages.

Syntactically, there may be multiple P-Asserted-Service header fields in a request. The semantics of multiple P-Asserted-Service header fields appearing in the same request is not defined at this time in RFC 6050. Implementations of this specification must provide only one P-Asserted-Service header field value.
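The ABNF for the P-Asserted-Service header field can be exercised with a small parser. The following Python sketch is illustrative only: the function names are hypothetical, each Service-ID is treated as an opaque token because its grammar is given in a later section, and the sample value is the service URN that appears in this chapter's usage example.

```python
def parse_p_asserted_service(header_line: str) -> list:
    """Split a P-Asserted-Service header line into its Service-ID values.

    Follows PAssertedService-value = Service-ID *(COMMA Service-ID).
    Each Service-ID is kept as an opaque string (an assumption of this sketch).
    """
    name, sep, value = header_line.partition(":")
    # SIP header field names are case-insensitive.
    if not sep or name.strip().lower() != "p-asserted-service":
        raise ValueError("not a P-Asserted-Service header field")
    service_ids = [item.strip() for item in value.split(",")]
    if not all(service_ids):
        raise ValueError("empty Service-ID in header field value")
    return service_ids


def single_service_id(header_line: str) -> str:
    """Return the Service-ID, rejecting a header field with several values."""
    service_ids = parse_p_asserted_service(header_line)
    if len(service_ids) != 1:
        raise ValueError("exactly one Service-ID value is expected")
    return service_ids[0]


header = "P-Asserted-Service: urn:urn-7:3gpp-service.exampletelephony.version1"
print(single_service_id(header))
```

The second helper reflects the rule that an implementation provides only one P-Asserted-Service value, even though the syntax allows a comma-separated list.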
Service and Served-User Identity in SIP ◾ 441
8. Validation mechanism: Validation determines whether a given string is currently a validly assigned URN provided in RFC 3406. Owing to the distributed nature of usage and since not all services are available everywhere, validation in this sense is not possible.
9. Scope: The scope for this URN can be local to a single domain, or may be more widely used.

12.3.4 Usage of Header Fields in Requests
12.3.4.1 Procedures at UACs
The UAC may insert a P-Preferred-Service in a request that creates a dialog, or a request outside of a dialog. This information can assist the proxies in identifying appropriate service capabilities to apply to the call. This information must not conflict with other SIP or SDP information included in the request. Furthermore, the SIP or SDP information needed to signal the functionality of this service must be present. Thus, if a service requires a video component, then the SDP has to include the media line associated with that video component; it cannot be assumed from the P-Preferred-Service header field value. Similarly, if the service requires particular SIP functionality for which a SIP extension and a Require header field value is defined, then the request has to include that SIP signaling as well as the P-Preferred-Service header field value. A UAC that is within the same Trust Domain as the proxy to which it sends a request (e.g., a media gateway or AS) may insert a P-Asserted-Service header field in a request that creates a dialog, or a request outside of a dialog. This information must not conflict with other SIP or SDP information included in the request. Furthermore, the SIP or SDP information needed to signal the functionality of this service must be present.

12.3.4.2 Procedures at Intermediate Proxies
A proxy in a Trust Domain can receive a request from a node that it trusts or a node that it does not trust. When a proxy receives a request from a node it does not trust and it wishes to add a P-Asserted-Service header field, the proxy must identify the service appropriate to the capabilities (e.g., SDP) in the request, and may authenticate the originator of the request (to determine whether the user is subscribed for that service). Where the originator of the request is authenticated, the proxy must use the identity that results from this checking and authentication to insert a P-Asserted-Service header field into the request.
When a proxy receives a request containing a P-Preferred-Service header field, the proxy may use the contents of that header field to assist in determining the service to be included in a P-Asserted-Service header field (for instance, to prioritize the order of comparison of filter criteria for potential services that the request could match). The proxy must not use the contents of the P-Preferred-Service header field to identify the service without first checking against the capabilities (e.g., SDP) contained in the request. If the proxy inserts a P-Asserted-Service header field in the request, the proxy must remove the P-Preferred-Service header field before forwarding the request; otherwise, the proxy should include the P-Preferred-Service header field when forwarding the request.
If the proxy receives a request from a node that it trusts, it can use the information in the P-Asserted-Service header field, if any, as if it had authenticated the user itself. If there is no P-Asserted-Service header field present, or it is not possible to match the request to a specific service as identified by the service identifier, a proxy may add one containing it using its own analysis of the information contained in the SIP request. If the proxy received the request from an element it does not trust and there is a P-Asserted-Service header present, the proxy must replace that header field’s contents with a new analysis or remove that header field. The analysis performed to identify such service identifiers is outside the scope of this document. However, it is perfectly valid as a result of the analysis not to include any service identifier in the forwarded request, and thus not include a P-Asserted-Service header field. If a proxy forwards a request to a node outside the proxy’s Trust Domain, there must not be a P-Asserted-Service header field in the forwarded request.

12.3.4.3 Procedures at UA Servers
For a UA server (UAS) outside the Trust Domain, the P-Asserted-Service header is removed before it reaches this entity; therefore, there are no procedures for such a device. However, if a UAS receives a request from a previous element that it does not trust, it must not use the P-Asserted-Service header field in any way. If a UA is part of the Trust Domain from which it received a request containing a P-Asserted-Service header field, then it can use the value freely; however, it must ensure that it does not forward the information to any element that is not part of the Trust Domain.

12.3.5 Usage of Header Fields in Responses
There is no usage of these header fields in responses.

12.3.6 Examples of Usage
In this example, proxy.example.com creates a P-Asserted-Service header field from the user identity it discovered from SIP digest authentication, the list of services appropriate to that user, and the services that correspond to the SDP information included in the request. Note that F1 and F2 are about identifying the user and do not directly form part
Call-ID: 245780247857024504
CSeq: 2 INVITE
Max-Forwards: 68
P-Asserted-Service: urn:urn-7:3gpp-service.exampletelephony.version1

v=0
o=- 2987933615 2987933615 IN IP6 5555::aaa:bbb:ccc:ddd
s=-
c=IN IP6 5555::aaa:bbb:ccc:ddd
t=0 0
m=audio 3456 RTP/AVPF 97 96
b=AS:25.4
a=curr:qos local sendrecv
a=curr:qos remote none
a=des:qos mandatory local sendrecv
a=des:qos mandatory remote sendrecv
a=sendrecv
a=rtpmap:97 AMR
a=fmtp:97 mode-set=0,2,5,7; maxframes

12.4 Served-User ID for Handling Services
We are dealing with served-user ID in particular that is expressed explicitly for a given service; however, the type of service or the service ID is known implicitly. For example, the SIP INVITE message itself expresses the kinds of services that are being requested directly in the SDP message body, and may express services or provide hints for services implicitly in the message headers. If the served-user identity is known, the kind of services that the user obtained from the SIP network can also be known directly or implicitly. However, the most important use of the P-Served-User header field (RFC 5502) lies somewhere else: routing within the SIP-aware AS farm, where a host of ASs may be used for serving a SIP call with distributed service architecture. A high-level description is provided of how this header can be used for providing SIP services by multiple ASs in inter-AS communication environments.

12.4.1 P-Served-User Header
The private P-Served-User header field defined in RFC 5502 conveys the identity of the served user and the session case parameter that applies to this particular communication session and application invocation. The header is not used in the Internet in general. The session case parameter is used to indicate the category of the served user: originating served user or terminating served user, and registered user or unregistered user. Note that there can be many kinds of ASs in the farm for providing a variety of services to the SIP users based on different subscription types.
The served user to a proxy or AS is the user whose service profile is accessed by that SIP proxy that may be known as the serving SIP proxy, for example, Serving-Call Session Control Function (S-CSCF) of Third Generation Partnership Project’s (3GPP’s) IP Multimedia Subsystem (IMS) network or AS when an initial request is received that is originated by, originated on behalf of, or terminated to that user. There may also be the distributed servers/databases, for example, home subscriber server (HSS) of 3GPP’s IMS network, that store the user profiles. This profile, in turn, provides some useful information (preferences or permissions) for processing at a proxy and, potentially, at an AS. For providing specific services to the respective users, some service-specific filter criteria (SSFC) are stored in the server/database (e.g., HSS of 3GPP’s IMS network) as part of the user profile and are downloaded to the serving SIP proxy upon user registration. For example, 3GPP’s IMS network also uses the service-specific filter criteria.
The P-Served-User header field is very useful for the large-scale distributed SIP network especially where a variety of SIP-aware ASs are employed for providing services in a more scalable way. A large distributed fixed or mobile SIP network (e.g., 3GPP’s IMS network) of a given administrative domain may have multiple SIP proxies playing different roles, and the AS farm is distributed over a large geographical area and provides services to SIP users. All functional entities of the network create a Trust Domain, T, authenticating each other’s Network-Asserted Identity using valid identities expressed in SIP URI, SIPS URI, tel URI, or optionally display name complying with a certain set of specification, spec (T), as defined in RFC 3324 (see Section 10.5). SIP proxies may have the service control interface (SCI), for example, IP multimedia Service Control (ISC) interface of 3GPP’s IMS network, with many ASs and servers/databases. A proxy may have multiple SCIs of different ASs to fulfill the service requirements of the users that are served by this particular proxy. In addition, SIP proxies may have SIP interfaces with ASs/servers.
Some implementations of the SIP network may use the original dialog identifier (ODI) to the request that will allow the serving SIP proxy to identify the message on the incoming side, even if its dialog ID has been changed by the AS (e.g., AS performing third-party call control). In addition, the originating ID presentation (OIP) service may also be implemented to provide the terminating user with the possibility of receiving trusted (i.e., network-provided) identity information in order to identify the originating user. For example, 3GPP’s IMS network has specified both ODI and OIP services. It should be noted that the SIP Identity and Identity-Info header field specified in RFC 4474 (see Sections 2.8 and 19.4.8) and connected identity specified with the from-change tag may be the standard way of doing the identity presentation instead of using OIP that is non-SIP
standard. We use both ODI and OIP later as an example for the purpose of showing the usefulness of the P-Served-User header field.

12.4.2 Application Service Invocation
We describe the following scenarios in the context of the large-scale distributed SIP network where the capability gaps of the SIP signaling messages that can be fulfilled by the P-Served-User header field are identified:

◾ General Scenario
◾ Diversion
  – Continue on terminating leg, but finish subsequent terminating SSFC first
  – Create new originating leg and provide originating SSFC processing
◾ Call out of the blue
  – On behalf of user B, but service profile of service identity C

In summary, all those scenarios describe the following SIP capability gaps:

◾ The identity of the served user can be conveyed on the SCI interface in order to be able to offer real-world application services.
◾ It is required that, in addition to the served-user identity, the session case needs to be conveyed in order to be able to offer appropriate services to the served user.

12.4.2.1 General Scenarios
A SIP proxy that serves as a registrar may handle originating and terminating session states for users allocated to it. This means that any call that is originated by a specific user or any call that is terminated to that specific user will pass through that particular proxy that is allocated to that user. At the time of servicing the call, it may also imply that this particular serving SIP proxy that is allocated for a specific user will download the profile of this particular user from the server/database (e.g., HSS of 3GPP’s IMS network). This user profile tells this serving SIP proxy whether the user is allowed to originate or terminate calls or whether an AS needs to be linked in over the particular SCI interface. The user profile information that determines whether a particular initial request needs to be sent to a particular AS is decided using the SSFC. For the serving SIP proxy to be able to meet its responsibilities, it needs to determine on which user’s behalf it is performing its tasks and which session case is applicable for the particular request. As explained earlier, the session case distinguishes the originating and terminating call cases, and determines whether or not the particular user is registered.
When the serving SIP proxy determines that for an incoming initial request the originating call case applies, it determines the served user by looking at the P-Asserted-Identity header field defined in RFC 3325 (see Sections 2.8, 10.4, and 20.3), which carries the Network-Asserted Identity of the originating user. When, after filtering through the SCI for this initial request, the serving proxy decides to forward the request to an AS, the AS has to go through a similar process of determining the session case and the served user. Since it should come to the same conclusion that this is an originating session case, it also has to look at the P-Asserted-Identity header field to determine the served user.
When the serving proxy determines that for an incoming initial request, the terminating call case applies, it determines the served user by looking at the Request-URI defined in RFC 3261, which carries the identity of the intended terminating user. When, after processing through the SSFC for this initial request, the serving proxy decides to forward the request to an AS, the AS has to go through a similar process of determining the session case and the served user. Since it should come to the same conclusion that this is a terminating session case, it also has to look at the Request-URI to determine the served user.
In the originating case, it can be observed that while the P-Asserted-Identity header field just represents the originating user when it enters the serving proxy, it is overloaded with another meaning when it is sent to an AS over the SCI interface. This other meaning is that it serves as a representation of the served user. In the terminating case, a similar overloading happens to the Request-URI; while it first only represented the identity of the intended terminating user, it is overloaded with another meaning when it is sent to an AS over the ISC interface. This other meaning is that it serves as a representation of the served user. In basic call scenarios, this does not show up as a problem; however, once more complicated service scenarios (notably forwarding services) need to be realized, it poses severe limitations. Such scenarios are brought forward in the following subsections. In those situations, the P-Served-User header field overcomes those limitations and is very useful in providing real-world application services using a distributed SIP network and a distributed AS farm, making the large-scale network scalable.

12.4.2.2 Call Diversion Continuing on Terminating Leg
This scenario deals with the diversion of the same call continuing on the terminating leg but finishing on the subsequent terminating SSFC first. Imagine a service scenario where a user B has a terminating service that diverts the call to a different destination but is required to still execute
subsequent terminating services for the same user. This means that this particular user has multiple SSFCs configured that are applicable for an incoming initial request. When the serving SIP proxy receives an initial INVITE request, it analyzes the request and determines that the session case is for a terminating registered user, and then it determines the served user to be user B by looking at the Request-URI. Now the serving proxy starts the SSFC processing. The first SSFC that matches the INVITE request causes the INVITE to be forwarded over the SCI interface to an AS that hosts user B’s diversion service by adding the AS and serving proxy’s own host names to the Route header. The serving proxy may add an ODI to the serving proxy’s own host name (e.g., ODI is used in 3GPP’s IMS network) on the Route header. This ODI, even if its dialog ID may have been changed by an AS for some reason, allows the serving proxy to correlate an INVITE coming from an AS over the SCI interface to the existing session that forwarded the INVITE to the AS in the first place.
When the AS receives the initial INVITE request, it analyzes the request and determines that the session case is for a terminating registered user, and then it determines the served user to be user B by looking at the Request-URI. On the basis of some criteria depending on implementation, the diversion service may conclude that the request needs to be diverted to another user or application C. It does this by changing the Request-URI to C. It may record the Request-URI history by using the History-Info header field defined in RFC 4244. Then, the AS removes itself from the Route header and routes the INVITE request back to the serving SIP proxy by using the topmost Route header field. When the serving SIP proxy receives the INVITE over the ISC interface, it can see that the Route header contains its own host name and an ODI that correlates to an existing terminating session for user B. This can be used by the serving SIP proxy to analyze whether there are still unexecuted SSFCs. (Note that the implementation behavior of the serving SIP proxy on receiving an INVITE with a changed Request-URI is to terminate the SSFC processing and to route the request based on the new Request-URI value.)
The process repeats itself. The INVITE is forwarded to the AS that is associated with this particular SSFC. When the AS receives the initial INVITE request, it analyzes the request and determines that the session case is for a terminating registered user, and then it determines the served user to be user C by looking at the Request-URI. This is clearly wrong, as the user being served is still user B. This scenario clearly shows the problem that occurs when the Request-URI is overloaded with the meanings intended target identity and served user with the operation as described earlier. Moreover, it shows that this use case cannot be realized without introducing a mechanism that conveys information about the served user from the serving SIP proxy to the AS. Use of the History-Info element does not solve this problem as it does not tell the AS which user is being served; it just presents a history of diversions that might not be even caused by the systems serving this particular user.

12.4.2.3 Call Diversion Creating New Originating Leg
We are now considering a scenario where the call is diverted, creating a new originating leg and providing originating SSFC processing. Imagine a service scenario where a user B has a terminating service that diverts the call to a different destination. It is required that a forwarded call leg is handled as an originating call leg and that originating services for user B are executed. This means that this particular user has one or more SSFCs configured that are applicable for an outgoing initial request. When the serving SIP proxy receives an initial INVITE request, it analyzes the request and determines that the session case is for a terminating registered user, and then it determines the served user to be user B by looking at the Request-URI. Now, the serving SIP proxy starts the SSFC processing. The first SSFC that matches the INVITE request causes the INVITE to be forwarded over the ISC interface to an AS that hosts user B’s diversion service by adding the AS and serving SIP proxy’s own host names to the Route header. The serving SIP proxy adds an ODI to the serving SIP proxy’s own host name on the Route header. This allows the serving SIP proxy to correlate an INVITE coming from an AS over the ISC interface to the existing session that forwarded the INVITE to the AS in the first place.
When the AS receives the initial INVITE request, it analyzes the request and determines that the session case is for a terminating registered user, and then it determines the served user to be user B by looking at the Request-URI. On the basis of some criteria, the diversion service concludes that the request needs to be diverted to another user or application C. It does this by changing the Request-URI to C. It may record the Request-URI history by using the History-Info header field defined in RFC 4244. Then, the AS removes itself from the Route header. To make sure that the request is handled as a new originating call on behalf of user B, the AS adds the orig parameter to the topmost Route header. Then, it routes the INVITE request back to the serving SIP proxy by using this topmost Route header field. When the serving SIP proxy receives the INVITE over the ISC interface, it can see that the topmost Route header contains its own host name and an orig parameter. Because the topmost Route header contains the orig parameter, the serving SIP proxy concludes that the INVITE should be handled as if a call is originated by the served user. The served user is determined from the P-Asserted-Identity header to be user A. This is clearly wrong, as the user being served is and should be user B. For the sake of discussion, let us assume that the serving SIP proxy can determine that the served user is user B.
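The two mis-determinations described in Sections 12.4.2.2 and 12.4.2.3 can be reproduced in a few lines. The Python sketch below is a simplified model, not an implementation of any proxy or AS: the Invite class, the function name, and the example.com URIs are hypothetical, and only the two fields relevant to the argument are modeled.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Invite:
    request_uri: str          # identity of the intended target
    p_asserted_identity: str  # network-asserted identity of the originator


def implicit_served_user(invite: Invite, session_case: str) -> str:
    """Derive the served user from the overloaded fields, as an element
    has to do when no explicit served-user information is conveyed."""
    if session_case == "orig":
        return invite.p_asserted_identity
    if session_case == "term":
        return invite.request_uri
    raise ValueError("session case must be 'orig' or 'term'")


# User A calls user B; the terminating case yields user B, as intended.
incoming = Invite("sip:userB@example.com", "sip:userA@example.com")
print(implicit_served_user(incoming, "term"))

# User B's diversion service retargets the request to C.
diverted = replace(incoming, request_uri="sip:userC@example.com")

# Continuing on the terminating leg: user C is derived, but B is still served.
print(implicit_served_user(diverted, "term"))

# New originating leg: user A is derived, but B should be served.
print(implicit_served_user(diverted, "orig"))
```

In both diverted cases neither field still names user B, which is why the served user has to be carried explicitly.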
Then, the procedure would continue as follows: The serving SIP proxy starts the originating SSFC processing; the first SSFC that matches the INVITE request causes the INVITE to be forwarded over the ISC interface to an AS that hosts an originating service of user B by adding the AS and serving SIP proxy’s own host names to the Route header. The serving SIP proxy adds an ODI to the serving SIP proxy’s own host name on the Route header. The INVITE is forwarded to the AS that is associated with this particular SSFC. When the AS receives the initial INVITE request, it analyzes the request and determines that the session case is for an originating registered user, and then it determines the served user to be user A by looking at the P-Asserted-Identity. This is clearly wrong, as the user being served is and should be user B. This scenario clearly shows the problem that occurs when the P-Asserted-Identity is overloaded with the meanings call originator and served user with the operation as described earlier. It shows that this use case cannot be realized without introducing a mechanism that conveys information about the served user from the serving SIP proxy to the AS, and from the AS to the serving SIP proxy. Use of the History-Info element does not solve this problem as it does not tell the AS which user is being served, but just presents a history of diversions that might not be even caused by the systems serving this particular user.

12.4.2.4 Call Out of the Blue
This scenario is dealing with the call that is coming out of the blue on behalf of user B, but the service profile happens to be of service identity C. There are services that need to be able to initiate a call, whereby the call appears to be coming from a user B but the service profile on behalf of service identity C needs to be executed in the serving SIP proxy. When a call needs to appear as coming from user B, it means that the P-Asserted-Identity needs to contain B’s identity. For example, 3GPP’s IMS network uses OIP service that uses the P-Asserted-Identity to present the call originator. This makes sense because that is the main meaning expressed by the P-Asserted-Identity header field. It is clear that no INVITE request can be constructed currently that would achieve both requirements expressed in the first paragraph, because the P-Asserted-Identity is overloaded with two meanings on the ISC interface. When the serving SIP proxy receives this request, it will determine that the served user is user B, which is not what we want to achieve.

12.4.3 P-Served-User Header Field Usage, Definition, and Syntax
We explain this header field here again for detailed explanations, although all header fields can be seen in Section 2.8. This header field can be added to initial requests for a dialog or stand-alone requests, which are routed between nodes in a Trust Domain for P-Served-User. The P-Served-User header field contains an identity of the user that represents the served user. The sescase parameter may be used to convey whether the initial request is originated by the served user or destined for the served user. The regstate parameter may be used to indicate whether the initial request is for a registered or unregistered user. The ABNF, defined in RFC 5234, syntax of the P-Served-User header field is as follows:

P-Served-User = "P-Served-User" HCOLON PServedUser-value *(SEMI served-user-param)
served-user-param = sessioncase-param / registration-state-param / generic-param
PServedUser-value = name-addr / addr-spec
sessioncase-param = "sescase" EQUAL "orig" / "term"
registration-state-param = "regstate" EQUAL "unreg" / "reg"

EQUAL, HCOLON, SEMI, name-addr, addr-spec, and generic-param are defined in RFC 3261.

The following is an example of a P-Served-User header field:

P-Served-User: <sip:[email protected]>; sescase=orig; regstate=reg

12.4.4 Proxy Behavior: Generating the P-Served-User Header
Proxies that support the header must only insert the header in initial requests for a dialog or in stand-alone requests when the following conditions hold:

◾ The proxy has the capability to determine the served user for the current request.
◾ The next hop is part of the same Trust Domain for P-Served-User.

When the above conditions do not hold, the proxy must not insert the header.

12.4.5 Proxy Behavior: Consuming the P-Served-User Header
A proxy that supports the header must, upon receiving from a trusted node the P-Served-User header in initial requests for a dialog or in stand-alone requests, take the value of the P-Served-User header to represent the served user in
operations that require such information. A proxy that supports the header must remove the header from requests or responses when the header was received from a node outside the Trust Domain for P-Served-User before further forwarding the message. A proxy that supports the header must remove the header from requests or responses when the next hop is a node outside the Trust Domain for P-Served-User before further forwarding the message.

12.4.6 Applicability and Limitations
The use of the P-Served-User header field extensions is only applicable inside a Trust Domain for served user and cannot be used in the Internet in general, as described earlier. Nodes in such a Trust Domain explicitly trust each other to convey the served user and to be responsible for withholding that information outside of the Trust Domain. The means by which the network determines the served user and the policies that are executed for a specific served user are not considered in RFC 5502. The served-user information lacks an indication of whom or what specifically determined the served user, and so it must be assumed that the Trust Domain determined the served user. Therefore, the information is only meaningful when securely received from a node known to be a member of the Trust Domain. Because the served user typically only has validity in one administrative domain, it is in general not suitable for interdomain use or use in the Internet at large. Despite these limitations, there are sufficiently useful specialized deployments that meet the assumptions described above, and that can accept the limitations that result, to warrant the informational publication of this mechanism, for example, a closed network like 3GPP’s IMS.

12.5 Summary
We have explained why service IDs cannot be standardized for services over the public Internet in general. However, we have described how service IDs can be created using some canonical forms based on some hints of services that are provided by the user-level service-agnostic SIP signaling. The P-Preferred service ID, service URNs, and usage of these header fields are described. In addition, the P-Served-User ID that is used for handling of services is presented. The behaviors of all SIP entities in processing the services and P-Preferred service IDs are explained in detail. All of these service-related IDs can be used in private intradomain communications, but are not suitable for interdomain or public Internet communications. Finally, the security concerns in using these identities are addressed.

PROBLEMS
1. What are IDs used in SIP service-agnostic? Why are service IDs not officially standardized for use over the public Internet?
2. How do you suggest to identify communication services from signaling messages of SIP including presence and device dispatch? Describe in detail using examples in intradomain communication environments.
3. Do the IDs in SIP signaling messages provide sufficient hints for services over the SIP network that will provide interoperability? If not, what are the mechanisms that can be used for creation of service IDs over the private SIP network? Justify your recommendation with examples.
4. What are the P-Asserted-Service and P-Preferred-Service SIP header fields? Explain the usage of these header fields in request messages in SIP for the following: UAC, UAS, and proxy. Why are these header fields not required to be used in SIP response messages? Explain in detail the usage of these headers by using examples.
5. What are the security concerns in handling of services using the P-Asserted-Service and P-Preferred-Service SIP headers? How are these security concerns mitigated?
6. What is the P-Served-User SIP header? How does it help in handling of services over the SIP network in general? Explain using general scenarios for invocation of services.
7. Explain the use of the P-Served-User field for invocation of services for the following cases: diversion (continue on terminating leg, but finish subsequent terminating SSFC first); diversion (create new originating leg and provide originating SSFC processing); and call out of the blue (on behalf of user B, but service profile of service identity C).
8. What are the security concerns in handling of services using the P-Served-User and P-Preferred-Service SIP headers? How are these security concerns mitigated?
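As a worked illustration that may help with Problems 6 and 7, the P-Served-User syntax of Section 12.4.3 and the proxy rules of Sections 12.4.4 and 12.4.5 can be sketched in Python. This is a minimal sketch under stated assumptions: all function names are hypothetical, the parser ignores quoted display names that contain "<" or ";", and the URI sip:user@example.com is a placeholder chosen for the example.

```python
from typing import Dict, Optional, Tuple

SESCASE_VALUES = {"orig", "term"}
REGSTATE_VALUES = {"unreg", "reg"}


def parse_p_served_user(value: str) -> Tuple[str, Dict[str, str]]:
    """Parse a P-Served-User header field value into (URI, parameters)."""
    value = value.strip()
    if "<" in value:                      # name-addr form
        start = value.index("<")
        end = value.index(">", start)
        uri, rest = value[start + 1:end], value[end + 1:]
    else:                                 # bare addr-spec form
        uri, _, rest = value.partition(";")
    params: Dict[str, str] = {}
    for item in rest.split(";"):
        if item.strip():
            key, _, val = item.partition("=")
            params[key.strip().lower()] = val.strip()
    # Both parameters are optional; check them only when present.
    if params.get("sescase", "orig") not in SESCASE_VALUES:
        raise ValueError("sescase must be 'orig' or 'term'")
    if params.get("regstate", "reg") not in REGSTATE_VALUES:
        raise ValueError("regstate must be 'unreg' or 'reg'")
    return uri.strip(), params


def may_insert(initial_or_standalone: bool,
               can_determine_served_user: bool,
               next_hop_trusted: bool) -> bool:
    """Generating rule: insert only when every condition holds."""
    return initial_or_standalone and can_determine_served_user and next_hop_trusted


def keep_on_receipt(header: Optional[str], from_trusted_node: bool) -> Optional[str]:
    """Consuming rule: drop a header received from outside the Trust Domain."""
    return header if from_trusted_node else None


def keep_on_forward(header: Optional[str], next_hop_trusted: bool) -> Optional[str]:
    """Forwarding rule: drop the header when the next hop is outside the Trust Domain."""
    return header if next_hop_trusted else None


uri, params = parse_p_served_user("<sip:user@example.com>; sescase=orig; regstate=reg")
print(uri, params)
```

Returning None stands for "header removed"; a real proxy would apply these checks to the message object of whatever SIP stack it uses.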
Chapter 13

13.1 Introduction
The Session Initiation Protocol (SIP) enables end-to-end communications between user agents (UAs) after establishing a session that may last from a few seconds to hours. SIP

13.2 Connections Management in SIP Network
13.2.1 Overview
The key idea of connections management in SIP specified in RFC 5626 that is described here is that when a UA sends a
REGISTER request or a dialog-forming request, the proxy can with server-provided certificates over TCP and failures and
later use this same network flow that appears to be end-to-end rebooting of servers in the network path. To illustrate different
connections between the communicating entities, whether kinds of proxies in a large SIP network, RFC 5626 has defined
this is a bidirectional stream of UDP datagrams, a TCP con- some terms like authoritative proxy and edge proxy (EP). An
nection, or an analogous concept in another transport proto- authoritative proxy that handles non-REGISTER requests for
col, to forward any incoming requests that need to go to this a specific AOR, performs the logical Location Server lookup
UA in the context of the registration or dialog. In this context, described in RFC 3263 (see Section 8.2), and forwards those
the term flow used in SIP apparently may constitute a connec- requests to specific Contact URIs. An edge proxy is any proxy
tion using the parameters defined in RFC 5626, such as flow that is located topologically between the registering UA and
token, instant-id, and reg-id that are specified by a UA at the the authoritative proxy. A UA can have multiple connections
time of registration as well as ob parameter used in the Path bindings with any number of proxies over the SIP network.
header. The flow token is an identifier that uniquely identi-
fies a flow that can be included in a SIP Uniform Resource
Identifier (URI). The reg-id refers to the value of a new header
13.2.2 Flow-Based Connections Setup
field parameter for the Contact header field. When a UA reg- In connections mechanisms, each UA has a unique instance-
isters multiple times, each for a different flow, each concur- id that stays the same for this UA even if the UA reboots or
rent registration gets a unique reg-id value. The instance-id is power cycled. Each UA can register multiple times over
refers to the value of the sip.instance media feature tag, which different flows for the same SIP AOR to achieve high reli-
appears as a +sip.instance Contact header field parameter and ability. Each registration includes the instance-id for the UA
is a Uniform Resource Name (URN) that uniquely identifies and a reg-id label that is different for each flow. The registrar
this specific UA instance. Note that the registration with mul- can use the instance-id to recognize that two different regis-
tiple telephone addresses of record (AORs) for client-initiated trations both correspond to the same UA. The registrar can
outbound connection is described in Section 3.3.5. use the reg-id label to recognize whether a UA is creating a
The ob parameter is a SIP URI parameter that has a dif- new flow or refreshing or replacing an old one, possibly after
ferent meaning depending on context. In a Path header field a reboot or a network failure. For achieving reliability, a UA
value, it is used by the first edge proxy to indicate that a flow can set up connection bindings with multiple logical out-
token was added to the URI. In a Contact or Route header bound proxies/registrars running on different hosts or mul-
field value, it indicates that the UA would like other requests tiple edge proxies of a given administrative domain. When
in the same dialog to be routed over the same flow. In addi- a proxy goes to route a message to a UA for which it has a
tion, RFC 5626 defines the outbound option tag, 430 Flow binding, it can use any one of the flows on which a successful
Failed, and 439 First Hop Lacks Outbound response code. A registration has been completed.
registrar places the outbound option tag in a Require header A failure to deliver a request on a particular flow can be
to indicate to the registering UA that the registrar used reg- tried again on an alternate flow. Proxies can determine which
istrations using the binding rules defined in RFC 5626. The flows go to the same UA by comparing the instance-id.
response 430 Flow Failed code is used by an edge proxy to Proxies can tell that a flow replaces a previously abandoned
indicate to the Authoritative Proxy that a specific flow to a flow by looking at the reg-id. When sending a dialog-forming
UA instance has failed. The 439 First Hop Lacks Outbound request, a UA can also ask its first edge proxy to route subse-
Support response code is used by a registrar to indicate that it quent requests in that dialog over the same flow. This is nec-
supports the outbound feature described in this specification, essary whether the UA has registered or not. UAs use a simple
but that the first outbound proxy that the user is attempting periodic message as a keep-alive mechanism to keep their
to register through does not. In addition, keep-alive mecha- flow to the proxy or registrar alive. For connection-oriented
nisms are also used and a UA may register over multiple flows transports such as TCP, this is based on carriage-return and
at the same time for improving reliability, and the keep-alive line-feed sequences (CRLF), while for transports that are not
schemes will make sure that flows are active. If any flow goes connection oriented, this is accomplished by using a SIP-
down and the connection is not available because of server specific usage profile of Session Traversal of UDP through
failures within the network path, the UA will able to get con- NAT (STUN) specified in RFC 5389 (see Section 14.3).
nected using any one of the remaining alternate flows.
The keep-alive mechanism is also used to keep network
address translator (NAT) bindings open and to allow the UA
13.2.3 Keep-Alive Mechanisms
to detect when a flow has failed. As a result, connections man- Two keep-alive mechanisms are specified in RFC 5626:
agement using flow mechanisms allows keeping end-to-end CRLF keep-alive and STUN keep-alive. Each of these
persistence connections with UAs despite the existence of fire- mechanisms uses a client-to-server ping keep-alive and a cor-
walls and NATs, or the use of Transport Layer Security (TLS) responding server-to-client pong message. This ping–pong
Connections Management and Overload Control in SIP ◾ 451
sequence allows the client, and optionally the server, to tell if its flow is still active and useful for SIP traffic. The server responds to pings by sending pongs. If the client does not receive a pong in response to its ping (allowing for retransmission for STUN as described in RFC 5389; see Section 8.2.4), it declares the flow dead and opens a new flow in its place. RFC 5626 (see Section 13.2) suggests timer values for these client keep-alive mechanisms. These timer values were chosen to keep most NAT and firewall bindings open, to detect unresponsive servers within 2 minutes, and to mitigate the avalanche restart problem. However, the client may choose different timer values to suit its needs, for example, to optimize battery life. In some environments, the server can also keep track of the time since a ping was received over a flow to guess the likelihood that the flow is still useful for delivering SIP messages. When the UA detects that a flow has failed or that the flow definition has changed, the UA needs to reregister and will use the backoff mechanism described later to provide congestion relief when a large number of agents simultaneously reboot. A keep-alive mechanism needs to keep NAT bindings refreshed; for connections, it also needs to detect failure of a connection; and for connectionless transports, it needs to detect flow failures including changes to the NAT public mapping. For connection-oriented transports such as TCP (RFC 0793) and Stream Control Transmission Protocol (SCTP; RFC 4960), this specification describes a keep-alive approach based on sending CRLFs. For connectionless transports, such as UDP (RFC 0768), this specification describes using STUN (RFC 5389, see Section 8.2.4) over the same flow as the SIP traffic to perform the keep-alive. UAs and proxies are also free to use native transport keep-alives; however, the application may not be able to set these timers on a per-connection basis, and the server certainly cannot make any assumption about what values are used. Use of native transport keep-alives is beyond the scope of this document.

13.2.3.1 CRLF Keep-Alive Technique

This approach can only be used with connection-oriented transports such as TCP or SCTP. The client periodically sends a double-CRLF (the ping), then waits to receive a single CRLF (the pong). If the client does not receive a pong within an appropriate amount of time, it considers the flow failed. It should be noted that sending a CRLF over a connection-oriented transport is backwards compatible because of requirements in Section 2.4.2 (RFC 3261), but only implementations that support this specification will respond to a ping with a pong.

13.2.3.2 STUN Keep-Alive Technique

This approach can only be used for connectionless transports, such as UDP. For connectionless transports, a flow definition could change because a NAT device in the network path reboots and the resulting public Internet Protocol (IP) address or port mapping for the UA changes. To detect this, STUN requests are sent over the same flow that is being used for the SIP traffic. The proxy or registrar acts as a limited STUN server on the SIP signaling port. Again, the STUN mechanism is very robust and allows the detection of a changed IP address and port.

13.2.4 Grammar

RFC 5626 defines a new header field, Flow-Timer, and new Contact header field parameters, reg-id and +sip.instance. The grammar includes the definitions from RFC 3261. Flow-Timer is an extension-header from the message-header in the RFC 3261 augmented Backus–Naur Form (ABNF). The ABNF is repeated here for convenience, although all SIP syntaxes are provided in Section 2.4.1:

Flow-Timer = "Flow-Timer" HCOLON 1*DIGIT
contact-params =/ c-p-reg / c-p-instance
c-p-reg = "reg-id" EQUAL 1*DIGIT ; 1 to (2^31 - 1)
c-p-instance = "+sip.instance" EQUAL DQUOTE "<" instance-val ">" DQUOTE
instance-val = 1*uric ; defined in RFC 3261

The value of the reg-id must not be 0 and must be less than 2^31.

13.2.5 Connections Management Procedures for SIP Entities

13.2.5.1 User Agent

13.2.5.1.1 Instance ID Creation

Each UA must have an Instance Identifier URN defined in RFC 2141 that uniquely identifies the device. Usage of a URN provides a persistent and unique name for the UA instance. It also provides an easy way to guarantee uniqueness within the AOR. This URN must be persistent across power cycles of the device. The instance ID must not change as the device moves from one network to another.

A UA should create a Universally Unique Identifier (UUID) URN specified in RFC 4122 as its instance-id. The UUID URN allows for noncentralized computation of a URN based on time, unique names such as a medium access control (MAC) address, or a random number generator. If a URN scheme other than UUID is used, the UA must only use URNs for which an RFC (from the IETF stream) defines how the specific URN needs to be constructed and used in the +sip.instance Contact header field parameter for outbound behavior. To convey its instance-id in both requests and responses, the UA includes a sip.instance media
feature tag as a UA characteristic defined in RFC 3840 (see Sections 2.11 and 3.4). This media feature tag is encoded in the Contact header field as the +sip.instance Contact header field parameter. One case where a UA could prefer to omit the sip.instance media feature tag is when it is making an anonymous request or when some other privacy concern requires that the UA not reveal its identity.

When the instance ID is used in this specification, it is extracted from the value in the sip.instance media feature tag. Thus, equality comparisons are performed using the rules for URN equality that are specific to the scheme in the URN. If the element performing the comparisons does not understand the URN scheme, it performs the comparisons using the lexical equality rules defined in RFC 2141. Lexical equality could result in two URNs being considered unequal when they are actually equal. In this specific usage of URNs, the only element that provides the URN is the SIP UA instance identified by that URN. As a result, the UA instance has to provide lexically equivalent URNs in each registration it generates. This is likely to be normal behavior in any case; clients are not likely to modify the value of the instance ID so that it remains functionally equivalent to (yet lexicographically different from) previous registrations.

13.2.5.1.2 Connection-Oriented Registrations

To provide reliability for connections management per RFC 5626, a UA must support sets with at least two outbound proxy URIs and should support sets with up to four URIs. For each outbound proxy URI in the set, the UA client (UAC) should send a REGISTER request using this URI as the default outbound proxy. Alternatively, the UA could limit the number of flows formed, for example, to conserve battery power. If the set has more than one URI, the UAC must send a REGISTER request to at least two of the default outbound proxies from the set. UAs that support RFC 5626 must include the outbound option tag in a Supported header field in a REGISTER request, in addition to the other headers and parameters used for the normal registration. REGISTER requests must include an instance-id media feature tag as specified earlier. A UAC conforming to this specification must include in the Contact header field a reg-id parameter that is distinct from other reg-id parameters used in other registrations that use the same +sip.instance Contact header field parameter and AOR. Each one of these registrations will form a new flow from the UA to the proxy. The reg-id values do not have to be sequential, but the UA must use exactly the same sequence of reg-id values each time the UA instance power cycles or reboots, so that the reg-id values will collide with the previously used reg-id values. This is so that the registrar can replace the older registrations.

For refreshing a binding and for removing a binding, subsequent REGISTER requests use the same instance-id and reg-id values as the corresponding initial registration where the binding was added. Registrations that merely refresh an existing binding are sent over the same flow as the original registration where the binding was added. If a reregistration is rejected with a recoverable error response, for example, by a 503 Service Unavailable containing a Retry-After header, the UAC should not tear down the corresponding flow if the flow uses a connection-oriented transport such as TCP. As long as pongs are received in response to pings, the flow should be kept active until a nonrecoverable error response is received. This prevents unnecessary closing and opening of connections. In an initial registration or reregistration in the case of third-party registrations, a UA must not include a reg-id header field parameter in the Contact header field if the registering UA is not the same instance as the UA referred to by the target Contact header field. This practice is occasionally used to install forwarding policy into registrars. A UAC also must not include an instance-id feature tag or reg-id Contact header field parameter in a request to unregister all Contacts (a single Contact header field value with the value of “*”).

13.2.5.1.3 Sending Connection-Oriented Non-REGISTER Requests

UAs that support this specification should include the outbound option tag in a Supported header field in a request that is not a REGISTER request. The UA then finds a protocol, IP address, and port for the next-hop URI using the Domain Name System (DNS). For protocols that do not use TLS, if the UAC has an existing flow to this IP address and port with the correct protocol, then the UAC must use the existing connection. For TLS protocols, there must also be a match between the host production in the next hop and one of the URIs contained in the subjectAltName in the peer certificate. If the UAC cannot use one of the existing flows, then it should form a new flow by sending a datagram or opening a new connection to the next hop, as appropriate for the transport protocol.

Typically, a UAC using the procedures of this document and sending a dialog-forming request will want all subsequent requests in the dialog to arrive over the same flow. If the UAC is using a Globally Routable UA URI (GRUU) specified in RFC 5627 that was instantiated using a Contact header field value that included an ob parameter, the UAC sends the request over the flow used for registration, and subsequent requests will arrive over that same flow. If the UAC is not using such a GRUU, then the UAC adds an ob parameter to its Contact header field value. This will cause all subsequent requests in the dialog to arrive over the flow instantiated by the dialog-forming request. This case is typical when the request is sent before registration, such as in the initial subscription dialog for the configuration framework.
treatment if incoming requests for the UA are received, etc.). The server must wait for an amount of time larger than the Flow-Timer in order to have a grace period to account for transport delay.

13.2.5.3 Registrar

RFC 5626 updates the definition of a binding in RFC 3261 (see Section 3.3) and RFC 3327 (see Section 2.8.2). Registrars that implement this specification must support the Path header mechanism defined in RFC 3327 (see Section 2.8.2). When receiving a REGISTER request, the registrar must check from its Via header field if the registrar is the first hop or not. If the registrar is not the first hop, it must examine the Path header of the request. If the Path header field is missing or it exists but the first URI does not have an ob URI parameter, then outbound processing must not be applied to the registration. In this case, the following processing applies: if the REGISTER request contains the reg-id and the outbound option tag in a Supported header field, then the registrar must respond to the REGISTER request with a 439 First Hop Lacks Outbound Support response (defined in RFC 5626); otherwise, the registrar must ignore the reg-id parameter of the Contact header. See Section 2.6, Table 2.4 for more information on the 439 response code. A Contact header field value with an instance-id media feature tag but no reg-id header field parameter is valid (this combination will result in the creation of a GRUU, as described in the GRUU specification in RFC 5627), but one with a reg-id but no instance-id is not valid. If the registrar processes a Contact header field value with a reg-id but no instance-id, it simply ignores the reg-id parameter.

A registration containing a reg-id header field parameter and a nonzero expiration is used to register a single UA instance over a single flow, and can also deregister any Contact header field with zero expiration. Therefore, if the Contact header field contains more than one header field value with nonzero expiration and any of these header field values contain a reg-id Contact header field parameter, the entire registration should be rejected with a 400 Bad Request response. The justification for recommending rejection versus making it mandatory is that the receiver is allowed by RFC 3261 to squelch (not respond to) excessively malformed or malicious messages. If the Contact header did not contain a reg-id Contact header field parameter or if that parameter was ignored as described above, the registrar must not include the outbound option tag in the Require header field of its response. The registrar must be prepared to receive, simultaneously for the same AOR, some registrations that use instance-id and reg-id, and some registrations that do not. The registrar may be configured with local policy to reject any registrations that do not include the instance-id and reg-id, or with Path header field values that do not contain the ob URI parameter.

If the Contact header field does not contain a +sip.instance Contact header field parameter, the registrar processes the request using the Contact binding rules defined in RFC 3261. When a +sip.instance Contact header field parameter and a reg-id Contact header field parameter are present in a Contact header field of a REGISTER request (after the Contact header validation as described above), the corresponding binding is between an AOR and the combination of the instance-id (from the +sip.instance Contact header parameter) and the value of the reg-id Contact header field parameter. The registrar must store in the binding the Contact URI, all the Contact header field parameters, and any Path header field values. (Even though the Contact URI is not used for binding comparisons, it is still needed by the authoritative proxy to form the target set.) Provided that the UAC had included an outbound option tag (defined in Section 2.10) in a Supported header field value in the REGISTER request, the registrar must include the outbound option tag in a Require header field value in its response to that REGISTER request.

If the UAC has a direct flow with the registrar, the registrar must store enough information to uniquely identify the network flow over which the request arrived. For common operating systems with TCP, this would typically be just the handle to the file descriptor, where the handle would become invalid if the TCP session was closed. For common operating systems with UDP, this would typically be the file descriptor for the local socket that received the request, the local interface, and the IP address and port number of the remote side that sent the request. The registrar may store this information by adding itself to the Path header field with an appropriate flow token. If the registrar receives a reregistration for a specific combination of AOR, instance-id, and reg-id values, the registrar must update any information that uniquely identifies the network flow over which the request arrived if that information has changed, and should update the time the binding was last updated. To be compliant with this specification, registrars that can receive SIP requests directly from a UAC without intervening edge proxies must implement the same keep-alive mechanisms as edge proxies described earlier. Registrars with a direct flow with a UA may include a Flow-Timer header in a 2xx class registration response that includes the outbound option tag in the Require header.

13.2.5.4 Authoritative Proxy Procedures: Forwarding Requests

When a proxy uses the location service to look up a registration binding and then proxies a request to a particular
contact, it selects a contact to use normally, with a few additional rules:

◾◾ The proxy must not populate the target set with more than one contact with the same AOR and instance-id at a time.
◾◾ If a request for a particular AOR and instance-id fails with a 430 Flow Failed response (defined in RFC 5626), the proxy should replace the failed branch with another target if one is available with the same AOR and instance-id, but a different reg-id.
◾◾ If the proxy receives a final response from a branch other than a 408 Request Timeout or a 430 Flow Failed response, the proxy must not forward the same request to another target representing the same AOR and instance-id. The targeted instance has already provided its response.

The proxy uses the next-hop target of the message and the value of any stored Path header field vector in the registration binding to decide how to forward and populate the Route header in the request. If the proxy is colocated with the registrar and stored information about the flow to the UA that created the binding, then the proxy must send the request over the same logical flow saved with the binding, since that flow is known to deliver data to the specific target UA instance’s network flow that was saved with the binding.

Implementation note: typically, this means that for TCP, the request is sent on the same TCP socket that received the REGISTER request. For UDP, the request is sent from the same local IP address and port over which the registration was received, to the same IP address and port from which the REGISTER was received. If a proxy or registrar receives information from the network that indicates that no future messages will be delivered on a specific flow, then the proxy must invalidate all the bindings in the target set that use that flow (regardless of AOR). Examples of this are a TCP socket closing or receiving a destination unreachable ICMP (Internet Control Message Protocol) error on a UDP flow. Similarly, if a proxy closes a file descriptor, it must invalidate all the bindings in the target set with flows that use that file descriptor.

13.2.5.5 Registration Call Flows Managing Client-Initiated Connection

We have provided an example message flow from RFC 5626 that extends the registration scheme defined in RFC 3261 (see Section 3.3) as explained earlier. These call flows illustrate most of the concepts related to the client-initiated outbound connection management by the registration server. In many cases, Via, Content-Length, and Max-Forwards headers are omitted for brevity and readability. In these examples, edge proxy 1 (EP1) and edge proxy 2 (EP2) are outbound proxies, and Proxy is the authoritative proxy. The section is subdivided into independent call flows; however, they are arranged in the sequential order of a hypothetical sequence of call flows. For the sake of simplicity, we have assumed that the outbound-proxy-set is already configured on Bob’s UA.

13.2.5.5.1 Registration

Now that Bob’s UA is configured with the outbound proxy set, whether through configuration or using the configuration framework procedures of the previous section, Bob’s UA sends REGISTER requests through each edge proxy (Figure 13.1) in the set. Once the registrations succeed, Bob’s UA begins sending CRLF keep-alive messages about every 2 minutes.

In message F1 (Figure 13.1), Bob’s UA sends its first registration through the first edge proxy in the outbound-proxy-set by including a loose route. The UA includes an instance-id and reg-id in its Contact header field value. Note the option tags in the Supported header.

Message F1:

REGISTER sip:example.com SIP/2.0
Via: SIP/2.0/TCP 192.0.2.2;branch=z9hG4bKnashds7
Max-Forwards: 70
From: Bob <sip:bob@example.com>;tag=7F94778B653B
To: Bob <sip:[email protected]>
Call-ID: 16CB75F21C70
CSeq: 1 REGISTER
Supported: path, outbound
Route: <sip:ep1.example.com;lr>
Contact: <sip:[email protected];transport=tcp>;reg-id=1
 ;+sip.instance="<urn:uuid:00000000-0000-1000-8000-AABBCCDDEEFF>"
Content-Length: 0

Message F2 is similar to F1, but EP1 removes the Route header field value, decrements Max-Forwards, and adds its Via header field value. Since EP1 is the first edge proxy, it adds a Path header with a flow token and includes the ob parameter.

Path: <sip:VskztcQ/S8p4WPbOnHbuyh5iJvJIW3ib@ep1.example.com;lr;ob>

Since the response to the REGISTER (message F3) contains the outbound option tag in the Require header field, Bob’s UA will know that the registrar used outbound
binding rules. The response also contains the currently active Contacts and the Path for the current registration.

[Message sequence diagram; recovered labels: F3 200 OK, F4 200 OK, F5 REGISTER, F6 REGISTER, F7 200 OK, F8 200 OK, F9 2CRLF, F10 CRLF, F11 2CRLF, F12 CRLF]
Figure 13.1 Registration call flows managing outbound connections. (Copyright IETF. Reproduced with permission.)

Message F3:

SIP/2.0 200 OK
Via: SIP/2.0/TCP 192.0.2.15;branch=z9hG4bKnuiqisi
Via: SIP/2.0/TCP 192.0.2.2;branch=z9hG4bKnashds7
From: Bob <sip:bob@example.com>;tag=7F94778B653B
To: Bob <sip:bob@example.com>;tag=6AF99445E44A
Call-ID: 16CB75F21C70
CSeq: 1 REGISTER
Supported: path, outbound
Require: outbound
Contact: <sip:[email protected];transport=tcp>;reg-id=1;expires=3600
 ;+sip.instance="<urn:uuid:00000000-0000-1000-8000-AABBCCDDEEFF>"
Path: <sip:VskztcQ/S8p4WPbOnHbuyh5iJvJIW3ib@ep1.example.com;lr;ob>
Content-Length: 0

The second registration through EP2 (message F5) is similar except that the Call-ID has changed, the reg-id is 2, and the Route header goes through EP2.

Message F5:

REGISTER sip:example.com SIP/2.0
Via: SIP/2.0/TCP 192.0.2.2;branch=z9hG4bKnqr9bym
Max-Forwards: 70
From: Bob <sip:bob@example.com>;tag=755285EABDE2
To: Bob <sip:[email protected]>
Call-ID: E05133BD26DD
CSeq: 1 REGISTER
Supported: path, outbound
Route: <sip:ep2.example.com;lr>
Contact: <sip:[email protected];transport=tcp>;reg-id=2
 ;+sip.instance="<urn:uuid:00000000-0000-1000-8000-AABBCCDDEEFF>"
Content-Length: 0

Likewise in message F6, EP2 adds a Path header with flow token and ob parameter.

Path: <sip:wazHDLdIMtUg6r0I/oRZ15zx3zHE1w1Z@ep2.example.com;lr;ob>

Message F8 tells Bob’s UA that outbound registration was successful, and shows both Contacts. Note that only the Path corresponding to the current registration is returned.
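The registrar-side effect of the two registrations above can be sketched, as an illustration only, with a small in-memory table. The class and method names (OutboundRegistrar, register, flows_for) and the flow labels are hypothetical; the sketch only assumes what the text states, namely that a binding is keyed by the AOR together with the instance-id and reg-id, and that a reregistration with the same key replaces the stored flow information.

```python
from dataclasses import dataclass


@dataclass
class Binding:
    contact_uri: str
    path: tuple  # stored Path header field values
    flow: str    # hypothetical label standing in for a socket or flow token


class OutboundRegistrar:
    """Toy binding store keyed by (AOR, instance-id, reg-id)."""

    def __init__(self):
        self._bindings = {}

    def register(self, aor, instance_id, reg_id, contact_uri, path, flow):
        """Add or replace a binding; return True if an older one was replaced."""
        key = (aor, instance_id, reg_id)
        replaced = key in self._bindings
        self._bindings[key] = Binding(contact_uri, tuple(path), flow)
        return replaced

    def flows_for(self, aor, instance_id):
        """Return {reg-id: flow} for every binding of one UA instance."""
        return {key[2]: self._bindings[key].flow
                for key in sorted(self._bindings)
                if key[0] == aor and key[1] == instance_id}


if __name__ == "__main__":
    registrar = OutboundRegistrar()
    aor = "sip:bob.example.com"  # hypothetical AOR label
    inst = "urn:uuid:00000000-0000-1000-8000-aabbccddeeff"
    registrar.register(aor, inst, 1, "sip:192.0.2.2", ["ep1;lr;ob"], "flow-ep1")
    registrar.register(aor, inst, 2, "sip:192.0.2.2", ["ep2;lr;ob"], "flow-ep2")
    print(registrar.flows_for(aor, inst))
```

Keeping one entry per reg-id is what later lets a proxy recognize that both flows lead to the same UA instance and that a new registration with reg-id 1 supersedes the old one.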
[Message sequence diagram; recovered labels: F13 INVITE, F14 INVITE, F16 INVITE, F17 INVITE, F18 200 OK, F19 200 OK, F20 200 OK, F21 ACK, F22 ACK, F23 BYE, F24 BYE, F25 200 OK, F26 200 OK]
Figure 13.2 Incoming call and proxy crash. (Copyright IETF. Reproduced with permission.)
Since EP1 just rebooted, it does not have the flow described in the flow token. It returns a 430 Flow Failed response.

Message F15:

Message F16:

INVITE sip:[email protected];transport=tcp SIP/2.0
To: Bob <sip:[email protected]>
From: Alice <sip:[email protected]>;tag=02935
Call-ID: klmvCxVWGp6MxJp2T2mb
CSeq: 1 INVITE
Route: <sip:wazHDLdIMtUg6r0I/oRZ15zx3zHE1w1Z@ep2.example.com;lr;ob>

At this point, both UAs have the correct route set for the dialog. Any subsequent requests in this dialog will route correctly. For example, the ACK request in message F21 is sent from Alice’s UA directly to EP2. The BYE request in message F23 uses the same route set.

Message F23:

BYE sip:[email protected];transport=tcp SIP/2.0
To: Bob <sip:[email protected]>;tag=skduk2
From: Alice <sip:[email protected]>;tag=02935
Call-ID: klmvCxVWGp6MxJp2T2mb
CSeq: 2 BYE
Route: <sip:wazHDLdIMtUg6r0I/oRZ15zx3zHE1w1Z@ep2.example.com;lr>
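The failover seen in this call flow, where the 430 Flow Failed response from EP1 is followed by a retry over the flow through EP2, can be sketched as follows. This is a minimal illustration under stated assumptions: forward_with_failover and its arguments are hypothetical names, the send callable stands in for forwarding the request and waiting for a final response, and the sketch also retries after a 408 Request Timeout, which the forwarding rules permit but do not require.

```python
RETRYABLE = {408, 430}  # Request Timeout, Flow Failed


def forward_with_failover(flows, send):
    """Try the flows of one AOR and instance-id one at a time.

    flows: list of (reg_id, flow) pairs for the same AOR and instance-id.
    send:  hypothetical callable that forwards the request over a flow and
           returns the final SIP status code.
    Returns the list of (reg_id, status) attempts in the order made.
    """
    attempts = []
    for reg_id, flow in flows:
        status = send(flow)
        attempts.append((reg_id, status))
        # Any other final response means the instance has answered, so the
        # request must not be forwarded to another target for this instance.
        if status not in RETRYABLE:
            break
    return attempts


if __name__ == "__main__":
    # EP1 has rebooted and lost its flow; EP2 still has a working flow.
    outcome = {"flow-ep1": 430, "flow-ep2": 200}
    print(forward_with_failover([(1, "flow-ep1"), (2, "flow-ep2")], outcome.get))
```

Trying the flows sequentially also respects the rule that the target set never holds more than one contact for the same AOR and instance-id at a time.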
[Message sequence diagram; recovered labels: F27 2CRLF, F28 CRLF, F29 2CRLF, "Bob discovers its flow to EP1 failed", F30 REGISTER, F31 REGISTER, F32 200 OK, F33 200 OK]
Figure 13.3 Call flows with reregistration. (Copyright IETF. Reproduced with permission.)
 ;+sip.instance="<urn:uuid:00000000-0000-1000-8000-AABBCCDDEEFF>"

In message F31, EP1 inserts a Path header with a new flow token:

(messages F37, F38, and F39), and either Bob or Alice can send in-dialog requests.

Message F34:
[Message sequence diagram; recovered labels: F37 200 OK, F38 200 OK, F39 200 OK, F40 ACK, F41 ACK, F42 BYE, F43 BYE, F44 200 OK, F45 200 OK]
Figure 13.4 Outgoing call flows after reregistration. (Copyright IETF. Reproduced with permission.)
Record-Route:
 <sip:3yJEbr1GYZK9cPYk5Snocez6DzO7w+AX@ep1.example.com;lr>

When EP1 receives the BYE (message F42) from Bob’s UA, it can tell that the request is an outgoing request (since the source of the request matches the flow in the flow token) and simply deletes its Route header field value and forwards the request on to Alice’s UA.

Message F42:

BYE sip:[email protected] SIP/2.0
From: Bob <sip:[email protected]>;tag=ldw22z
To: Alice <sip:[email protected]>;tag=plqus8
Call-ID: 95KGsk2V/Eis9LcpBYy3
CSeq: 2 BYE
Route: <sip:3yJEbr1GYZK9cPYk5Snocez6DzO7w+AX@ep1.example.com;lr>
Contact: <sip:[email protected];transport=tcp;ob>

13.2.6 Keep-Alive Mechanisms in SIP Network

Keep-alives are used for refreshing NAT/firewall bindings and detecting flow failure. Flows can fail for many reasons, including the rebooting of NATs and the crashing of edge proxies. As described earlier, a UA that registers will begin sending keep-alives after an appropriate registration response. A UA that does not register (e.g., a public switched telephone network [PSTN] gateway behind a firewall) can also send keep-alives, as described below. The UA needs to detect when a specific flow fails. The UA actively tries to detect failure by periodically sending keep-alive messages using one of the techniques described in the following sections. If a flow with a registration has failed, the UA follows the procedures described earlier to form a new flow to replace the failed one. When a successful registration response contains the Flow-Timer header field, the value of this header field is the number of seconds the server is prepared to wait without seeing keep-alives before it could consider the corresponding flow dead.

Note that the server would wait for an amount of time larger than the Flow-Timer in order to have a grace period to account for transport delay. The UA must send keep-alives at least as often as this number of seconds. If the UA uses the server-recommended keep-alive frequency, it should send its keep-alives so that the interval between each keep-alive is randomly distributed between 80% and 100% of the server-provided time. For example, if the server suggests 120 seconds, the UA would send each keep-alive after a different interval between roughly 95 and 120 seconds (80% of 120 seconds is 96 seconds). If no Flow-Timer header field was present in a register response for this flow, the UA can send keep-alives at its discretion. The sections below provide recommended default values for these keep-alives. The client needs to perform normal SIP DNS resolution described in RFC 3263 (see Section 8.2.4) on the URI from the outbound-proxy-set to pick a transport. Once a transport is selected, the UA selects the keep-alive approach that is recommended for that transport.
13.2.6.1 Keep-Alive with CRLF
keep-alives under certain circumstances. Under specific cir-
cumstances, a UAC might be allowed to send STUN keep- This approach must only be used with connection-oriented
alives even if the procedures in registration described earlier transports such as TCP or SCTP; it must not be used with
were not completed, provided that there is an explicit indica- connection-less transports such as UDP. A UA that forms
tion that the target first-hop SIP node supports STUN keep- flows checks if the configured URI to which the UA is con-
alives. For example, this applies to a nonregistering UA, or to necting resolves to a connection-oriented transport (e.g.,
a case where the UA registration succeeded but the response TCP and TLS over TCP). For this mechanism, the client
did not include the outbound option tag in the Require ping is a double-CRLF sequence, and the server pong is a
header field. It should be noted that a UA can always send a single CRLF, as defined in the ABNF below:
double CRLF (a ping) over connection-oriented transports
CRLF = CR LF
as this is already allowed by RFC 3261 (see Section 2.4.2.5).
double-CRLF = CR LF CR LF
However a UA that did not register using outbound registra- CR = %x0D
tion cannot expect a CRLF in response (a pong) unless the LF = %x0A
UA has an explicit indication that CRLF keep-alives are sup-
ported as described in this section. The ping and pong need to be sent between SIP messages
Likewise, a UA that did not successfully register with and cannot be sent in the middle of a SIP message. If send-
outbound procedures needs explicit indication that the tar- ing over TLS, the CRLFs are sent inside the TLS-protected
get first-hop SIP node supports STUN keep-alives before it channel. If sending over a SigComp compressed data stream
can send any STUN messages. A configuration option indi- described in RFC 3320, the CRLF keep-alives are sent inside
cating keep-alive support for a specific target is considered the compressed stream. The double CRLF is considered a
an explicit indication. If these conditions are satisfied, the single SigComp message. The specific mechanism for repre-
UA sends its keep-alives according to the same guidelines as senting these characters is an implementation-specific matter
those used when UAs register; these guidelines are described to be handled by the SigComp compressor at the sending
Connections Management and Overload Control in SIP ◾ 461
end. If a pong is not received within 10 seconds after sending a ping (or immediately after processing any incoming message being received when that 10 seconds expires), then the client MUST treat the flow as failed. Clients MUST support this CRLF keep-alive. This 10-second timeout was selected to be long enough that it allows plenty of time for a server to send a response even if the server is temporarily busy with an administrative activity. At the same time, it was selected to be small enough that a UA registered to two redundant servers with unremarkable hardware uptime could still easily provide very high levels of overall reliability. Although some Internet protocols are designed for round-trip times over 10 seconds, SIP for real-time communications is not really usable in these types of environments, as users often abandon calls before waiting much more than a few seconds.

When a Flow-Timer header field is not provided in the most recent successful registration response, the proper selection of keep-alive frequency is primarily a trade-off between battery usage and availability. The UA must select a random number between a fixed or configurable upper bound and a lower bound, where the lower bound is 20% less than the upper bound. The fixed upper bound or the default configurable upper bound should be 120 seconds (95 seconds for the lower bound) where battery power is not a concern, and 840 seconds (672 seconds for the lower bound) where battery power is a concern. The random number will be different for each keep-alive ping. The rationale for the selection of time values is as follows: the 120-second upper bound was chosen on the basis of the idea that for a good user experience, failures normally will be detected in this amount of time and a new connection will be set up. The 14-minute upper bound for battery-powered devices was selected on the basis of NATs with TCP timeouts as low as 15 minutes. Operators that wish to change the relationship between load on servers and the expected time that a user might not receive inbound communications will probably adjust this time. The 95-second lower bound was chosen so that the jitter introduced will result in a relatively even load on the servers after 30 minutes.

13.2.6.2 Keep-Alive with STUN

The STUN-based keep-alive approach must only be used with connectionless transports, such as UDP; it must not be used for connection-oriented transports such as TCP and SCTP. A UA that forms flows checks if the configured URI to which the UA is connecting resolves to use the UDP transport. The UA can periodically perform keep-alive checks by sending STUN Binding Requests over the flow as described in the subsequent section (STUN is specified in RFC 5389; see Section 8.2.4). Clients must support STUN-based keep-alives. When a Flow-Timer header field is not included in a successful registration response, the time between each keep-alive request should be a random number between 24 and 29 seconds.

Note on selection of time values: the upper bound of 29 seconds was selected, as many NATs have UDP timeouts as low as 30 seconds. The 24-second lower bound was selected so that after 10 minutes, the jitter introduced by different timers will make the keep-alive requests unsynchronized to evenly spread the load on the servers. Note that the short NAT timeouts with UDP have a negative impact on battery life. If a STUN Binding Error Response is received, or if no Binding Response is received after seven retransmissions (16 times the STUN RTO timer, where RTO is an estimate of round-trip time), the UA considers the flow failed. If the XOR-MAPPED-ADDRESS in the STUN Binding Response changes, the UA MUST treat this event as a failure on the flow.

13.2.6.2.1 STUN Keep-Alive Processing

The STUN keep-alive processing is only applicable to the SIP transport layer that allows SIP and STUN Binding Requests to be mixed over the same flow, and constitutes a new STUN usage. The STUN messages are used to verify that connectivity is still available over a UDP flow, and to provide periodic keep-alives. These STUN keep-alives are always sent to the next SIP hop. STUN messages are not delivered end-to-end. The only STUN messages required by this usage are Binding Requests, Binding Responses, and Binding Error Responses. The UAC sends Binding Requests over the same UDP flow that is used for sending SIP messages. These Binding Requests do not require any STUN attributes. The corresponding Binding Responses do not require any STUN attributes except the XOR-MAPPED-ADDRESS. The UA server (UAS), proxy, or registrar responds to a valid Binding Request with a Binding Response that must include the XOR-MAPPED-ADDRESS attribute. If a server compliant to this section receives SIP requests on a given interface and UDP port, it must also provide a limited version of a STUN server on the same interface and UDP port. Note that it is easy to distinguish STUN and SIP packets sent over UDP because the first octet of a STUN Binding method has a value of 0 or 1, while the first octet of a SIP message is never a 0 or 1. Because sending and receiving binary STUN data on the same ports used for SIP is a significant and nonbackwards compatible change to RFC 3261, this section requires a number of checks before sending STUN messages to a SIP node.

If a SIP node sends STUN requests (e.g., due to incorrect configuration) despite these warnings, the node could be blacklisted for UDP traffic. A SIP node must not send STUN requests over a flow unless it has an explicit indication that the target next-hop SIP server claims to support this specification. UACs must not use an ambiguous
configuration option such as “Work through NATs?” or “Do keep-alives?” to imply next-hop STUN support. A UAC may use the presence of an ob URI parameter in the Path header in a registration response as an indication that its first edge proxy supports the keep-alives defined in this document. Typically, a SIP node first sends a SIP request and waits to receive a 2xx class response over a flow to a new target destination, before sending any STUN messages. When scheduled for the next NAT refresh, the SIP node sends a STUN request to the target. Once a flow is established, failure of a STUN request (including its retransmissions) is considered a failure of the underlying flow. For SIP over UDP flows, if the XOR-MAPPED-ADDRESS returned over the flow changes, this indicates that the underlying connectivity has changed, and is considered a flow failure. The SIP keep-alive STUN usage requires no backwards compatibility with RFC 5389 (see Section 8.2.4).

13.2.6.2.2 Use with SigComp

When STUN is used together with SigComp (specified in RFC 3320) compressed SIP messages over the same flow, the STUN messages are simply sent uncompressed, outside of SigComp. This is supported by multiplexing STUN messages with SigComp messages by checking the two topmost bits of the message. These bits are always 1 for SigComp, or 0 for STUN. All SigComp messages contain a prefix (the five most significant bits of the first byte are set to 1) that does not occur in UTF-8 (RFC 3629) encoded text messages; thus, for applications that use this encoding (or ASCII encoding), it is possible to multiplex uncompressed application messages and SigComp messages on the same UDP port. The most significant two bits of every STUN Binding method are both zeroes. This, combined with the magic cookie, aids in differentiating STUN packets from other protocols when STUN is multiplexed with other protocols on the same port.

13.2.6.3 Flow-Recovery Mechanisms

RFC 5626 describes the flow-recovery mechanisms that apply when a flow used for registration fails. When a flow used for registration (through a particular URI in the outbound-proxy-set) fails, the UA needs to form a new flow to replace the old flow and replace any registrations that were previously sent over this flow. Each new registration must have the same reg-id value as the registration it replaces. This is done in much the same way as forming a brand new flow. However, if there is a failure in forming this flow, the UA needs to wait a certain amount of time before retrying to form a flow to this particular next hop.

The amount of time to wait depends on whether the previous attempt at establishing a flow was successful. For the purposes of this section, a flow is considered successful if outbound registration succeeded and, if keep-alives are in use on this flow, at least one subsequent keep-alive response was received. The number of seconds to wait is computed in the following way. If all of the flows to every URI in the outbound-proxy-set have failed, the base time is set to a lower value (with a default of 30 seconds); otherwise, in the case where at least one of the flows has not failed, the base time is set to a higher value (with a default of 90 seconds). The upper-bound wait time (W) is computed by taking 2 raised to the power of the number of consecutive registration failures for that URI, and multiplying this by the base time, up to a configurable maximum time (with a default of 1800 seconds).

$W = \min(T_m,\ T_b \cdot 2^{f})$

where $T_m$ = max time (maximum time), $T_b$ = base time (baseline time), and $f$ = consecutive failures.

These times may be configurable in the UA. The three times are

◾ $T_m$ with a default of 1800 seconds
◾ $T_b$ (if all failed) with a default of 30 seconds
◾ $T_b$ (if all have not failed) with a default of 90 seconds

For example, if the base time is 30 seconds, and there were three failures, then the upper-bound wait time is $\min(1800,\ 30 \cdot 2^{3})$, or 240 seconds. The actual amount of time the UA waits before retrying registration (the retry delay time) is computed by selecting a uniform random time between 50% and 100% of the upper-bound wait time.

The UA MUST wait for at least the value of the retry delay time before trying another registration to form a new flow for that URI (a 503 Service Unavailable response code to an earlier failed registration attempt with a Retry-After header field value may cause the UA to wait longer). To be explicitly clear on the boundary conditions: when the UA boots, it immediately tries to register. If this fails and no registration on other flows succeeds, the first retry happens somewhere between 30 and 60 seconds after the failure of the first registration request. If the number of consecutive failures is large enough that the maximum of 1800 seconds is reached, the UA will keep trying indefinitely with a random time of 15 to 30 minutes between each attempt. The default flow registration backoff times are defined in Table 13.1 (Appendix A of RFC 5626). The base times used for the flow reregistration backoff are configurable. If the base-time-all-fail value is set to the default of 30 seconds and the base-time-not-failed value is set to the default of 90 seconds, Table 13.1 shows the resulting amount of time the UA will wait to retry registration.
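The retry computation described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function and constant names are invented here, the defaults are the ones quoted in the text (1800, 30, and 90 seconds), and the zero-failure case returns 0 to match the first row of Table 13.1 rather than the raw formula.

```python
import random

# Defaults quoted in the text; the names are hypothetical, chosen for this sketch.
MAX_TIME = 1800              # Tm, seconds
BASE_TIME_ALL_FAIL = 30      # Tb when every flow has failed
BASE_TIME_NOT_FAILED = 90    # Tb when at least one flow is still usable


def upper_bound_wait(failures, all_flows_failed):
    """Return W = min(Tm, Tb * 2**f) in seconds.

    With zero consecutive failures the UA registers immediately,
    so 0 is returned (first row of Table 13.1).
    """
    if failures <= 0:
        return 0
    base = BASE_TIME_ALL_FAIL if all_flows_failed else BASE_TIME_NOT_FAILED
    return min(MAX_TIME, base * 2 ** failures)


def retry_delay(failures, all_flows_failed, rng=random):
    """Pick a uniform random delay between 50% and 100% of W."""
    w = upper_bound_wait(failures, all_flows_failed)
    return rng.uniform(0.5 * w, w)


if __name__ == "__main__":
    for f in range(7):
        print(f, upper_bound_wait(f, True), upper_bound_wait(f, False))
```

A 503 response carrying Retry-After may lengthen the wait, as noted above; that case is deliberately left out of the sketch.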
Table 13.1 Default Flow Registration Backoff Times

No. of Reg Failures    All Flows Unusable    >1 Nonfailed Flow
0                      0 s                   0 s
1                      30–60 s               90–180 s
2                      1–2 min               3–6 min
3                      2–4 min               6–12 min
4                      4–8 min               12–24 min
5                      8–16 min              15–30 min
6                      15–30 min             15–30 min

Source: Copyright IETF. Reproduced with permission.

13.2.7 Connection Management Example

The connection management call flow example shown in Figure 13.5 is adapted from RFC 5626. Two outbound edge proxy (EP) servers denoted by EP1 and EP2, proxy servers, one authoritative proxy server simply termed as proxy, and one configuration server (CS) are in the SIP network of the example.com administrative domain. An authoritative proxy handles non-REGISTER requests for a specific AOR and performs the location server lookup as described in RFC 3261 (see Section 3.3). An EP is the one that can be located between the registering UA and the authoritative proxy topologically. Some header fields like Via, Content-Length, and Max-Forwards are omitted for brevity and readability.

13.2.7.1 Configuration Subscription

Figure 13.5b shows the call flows of how Bob’s UA obtains the configuration package of the outbound-proxy-set assuming that Bob’s UA has not been configured yet. Bob’s UA sends a SUBSCRIBE request for the UA profile configuration package through polling (Expires is zero). After receiving the NOTIFY request, Bob’s UA fetches the external configuration and obtains a configuration file that contains the outbound-proxy-set sip:ep1.example.com;lr and sip:ep2.example.com;lr. Note that the configuration package is obtained using a protocol other than SIP, such as HTTPS, which is not shown. In this example, the DNS server happens to be configured so that sip:example.com resolves to EP1 and EP2. In this example (Figure 13.5b), the first message is as follows:

Message F1:

SUBSCRIBE sip:00000000-0000-1000-8000-[email protected] SIP/2.0
Via: SIP/2.0/TCP 192.0.2.2;branch=z9hG4bKnlsdkdj2
Max-Forwards: 70
From: <[email protected]>;tag=23324
To: <sip:00000000-0000-1000-8000-[email protected]>
Call-ID: nSz1TWN54x7My0GvpEBj
CSeq: 1 SUBSCRIBE
Event: ua-profile;profile-type=device
  ;vendor="example.com";model="uPhone";version="1.1"
Expires: 0
Supported: path, outbound
Accept: message/external-body, application/x-uPhone-config
Contact: <sip:192.0.2.2;transport=tcp;ob>
  ;+sip.instance="<urn:uuid:00000000-0000-1000-8000-AABBCCDDEEFF>"
Content-Length: 0

In message F2, EP1 adds the following Record-Route header:

Record-Route: <sip:GopIKSsn0oGLPXRdV9BAXpT3coNuiGKV@ep1.example.com;lr>

In message F5, the configuration server sends a NOTIFY with an external URL for Bob to fetch his configuration. The NOTIFY has a Subscription-State header that ends the subscription.

Message F5:

NOTIFY sip:192.0.2.2;transport=tcp;ob SIP/2.0
Via: SIP/2.0/TCP 192.0.2.5;branch=z9hG4bKn81dd2
Max-Forwards: 70
To: <[email protected]>;tag=23324
From: <sip:00000000-0000-1000-8000-[email protected]>;tag=0983
Call-ID: nSz1TWN54x7My0GvpEBj
CSeq: 1 NOTIFY
Route: <sip:GopIKSsn0oGLPXRdV9BAXpT3coNuiGKV@ep1.example.com;lr>
Subscription-State: terminated;reason=timeout
Event: ua-profile
Content-Type: message/external-body;access-type="URL"
  ;expiration="Thu, 01 Jan 2009 09:00:00 UTC"
  ;URL="http://example.com/uPhone.cfg"
  ;size=9999;hash=10AB568E91245681AC1B
Content-Length: 0

EP1 receives this NOTIFY request, strips off the Route header, extracts the flow token, calculates the correct flow, and forwards the request message F6 over that flow to Bob. Bob’s UA fetches the configuration file and learns the outbound-proxy-set.
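The step in which EP1 strips the Route header and recovers the flow from the flow token can be illustrated with a short Python sketch. It assumes, purely for illustration, that the edge proxy keeps a dictionary from flow token to flow; the function names, the dictionary, and the flow tuple with its port number are hypothetical, and how a real edge proxy encodes or validates a token is not shown here.

```python
def flow_token_from_route(route_value):
    """Extract the user part (the flow token) from a Route header value.

    Returns None when the URI has no user part, for example
    "<sip:ep1.example.com;lr>".
    """
    uri = route_value.strip()
    if uri.startswith("<"):
        uri = uri[1:uri.index(">")]
    scheme, _, rest = uri.partition(":")
    if scheme.lower() not in ("sip", "sips"):
        raise ValueError("not a SIP URI: " + route_value)
    userinfo, at_sign, _host = rest.partition("@")
    return userinfo if at_sign else None


def select_flow(route_value, flows):
    """Look up the flow for an incoming request; None means no such flow."""
    token = flow_token_from_route(route_value)
    if token is None:
        return None
    return flows.get(token)


if __name__ == "__main__":
    # Hypothetical flow table: token -> (remote IP, remote port, transport).
    flows = {"GopIKSsn0oGLPXRdV9BAXpT3coNuiGKV": ("192.0.2.2", 49152, "TCP")}
    route = "<sip:GopIKSsn0oGLPXRdV9BAXpT3coNuiGKV@ep1.example.com;lr>"
    print(select_flow(route, flows))
```

When the lookup returns None, the edge proxy has no flow for the token, which corresponds to the 430 Flow Failed case in the incoming-call example of this section.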
Figure 13.5 Connections management in SIP example: (a) SIP network and (b, c, d, e, and f) connections management call flows. (Copyright IETF. Reproduced with permission.)
AOR, the registrar replaces the old Contact URI and flow information. This allows a UA that has rebooted to replace its previous registration for each flow with minimal impact on overall system load.

When Alice sends a request to Bob, although not shown here, Bob’s authoritative proxy selects the target set. The proxy forwards the request to elements in the target set based on the proxy’s policy. The proxy looks at the target set and uses the instance-id to understand if two targets both end up routing to the same UA. When the proxy goes to forward a request to a given target, it looks and finds the flows over which it received the registration. The proxy then forwards the request over an existing flow, instead of resolving the Contact URI using the procedures in RFC 3263 (see Section 8.2.4) and trying to form a new flow to that contact.

Message F10 is similar. EP1 removes the Route header field value, decrements Max-Forwards, and adds its Via header field value. Since EP1 is the first edge proxy, it adds a Path header with a flow token and includes the ob parameter.

Path: <sip:VskztcQ/S8p4WPbOnHbuyh5iJvJIW3ib@ep1.example.com;lr;ob>

Since the 200 OK response message F11 to the REGISTER contains the outbound option tag in the Require header field, Bob’s UA will know that the registrar used outbound binding rules. The response also contains the currently active Contacts and the Path for the current registration.

Message F11:

SIP/2.0 200 OK
Via: SIP/2.0/TCP 192.0.2.15;branch=z9hG4bKnuiqisi
Via: SIP/2.0/TCP 192.0.2.2;branch=z9hG4bKnashds7
From: Bob <sip:[email protected]>;tag=7F94778B653B
To: Bob <sip:[email protected]>;tag=6AF99445E44A
Call-ID: 16CB75F21C70
CSeq: 1 REGISTER
Supported: path, outbound
Require: outbound
Contact: <sip:[email protected];transport=tcp>;reg-id=1;expires=3600
  ;+sip.instance="<urn:uuid:00000000-0000-1000-8000-AABBCCDDEEFF>"
Path: <sip:VskztcQ/S8p4WPbOnHbuyh5iJvJIW3ib@ep1.example.com;lr;ob>
Content-Length: 0

Message F13:

REGISTER sip:example.com SIP/2.0
Via: SIP/2.0/TCP 192.0.2.2;branch=z9hG4bKnqr9bym
Max-Forwards: 70
From: Bob <sip:[email protected]>;tag=755285EABDE2
To: Bob <sip:[email protected]>
Call-ID: E05133BD26DD
CSeq: 1 REGISTER
Supported: path, outbound
Route: <sip:ep2.example.com;lr>
Contact: <sip:[email protected];transport=tcp>;reg-id=2
  ;+sip.instance="<urn:uuid:00000000-0000-1000-8000-AABBCCDDEEFF>"
Content-Length: 0

Likewise in message F14, EP2 adds a Path header with flow token and ob parameter.

Path: <sip:wazHDLdIMtUg6r0I/oRZ15zx3zHE1w1Z@ep2.example.com;lr;ob>

Message F16 tells Bob’s UA that outbound registration was successful, and shows both Contacts. Note that only the Path corresponding to the current registration is returned.

Message F16:

SIP/2.0 200 OK
Via: SIP/2.0/TCP 192.0.2.2;branch=z9hG4bKnqr9bym
From: Bob <sip:[email protected]>;tag=755285EABDE2
To: Bob <sip:[email protected]>;tag=49A9AD0B3F6A
Call-ID: E05133BD26DD
Supported: path, outbound
Require: outbound
CSeq: 1 REGISTER
Contact: <sip:[email protected];transport=tcp>;reg-id=1;expires=3600
  ;+sip.instance="<urn:uuid:00000000-0000-1000-8000-AABBCCDDEEFF>"
Contact: <sip:[email protected];transport=tcp>;reg-id=2;expires=3600
  ;+sip.instance="<urn:uuid:00000000-0000-1000-8000-AABBCCDDEEFF>"
Path: <sip:wazHDLdIMtUg6r0I/oRZ15zx3zHE1w1Z@ep2.example.com;lr;ob>
Content-Length: 0
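The registrar behavior in this subsection, where a binding is replaced when a registration arrives for the same AOR with the same instance-id and reg-id, might be sketched as follows. This is a minimal sketch under assumptions: the class, its method names, and the AOR, Contact, and flow strings are hypothetical illustrations, and expiry handling is omitted.

```python
class OutboundRegistrar:
    """Toy binding store keyed by (AOR, instance-id, reg-id)."""

    def __init__(self):
        self._bindings = {}

    def register(self, aor, instance_id, reg_id, contact, flow):
        """Store a binding; return True if it replaced an older one."""
        key = (aor, instance_id, reg_id)
        replaced = key in self._bindings
        self._bindings[key] = {"contact": contact, "flow": flow}
        return replaced

    def bindings_for(self, aor):
        """Return (reg-id, binding) pairs currently stored for an AOR."""
        return sorted(
            (key[2], value)
            for key, value in self._bindings.items()
            if key[0] == aor
        )


if __name__ == "__main__":
    # All values below are illustrative placeholders.
    reg = OutboundRegistrar()
    inst = "urn:uuid:00000000-0000-1000-8000-AABBCCDDEEFF"
    aor = "sip:bob@example.com"
    reg.register(aor, inst, 1, "sip:bob@192.0.2.2;transport=tcp", "flow-via-ep1")
    reg.register(aor, inst, 2, "sip:bob@192.0.2.2;transport=tcp", "flow-via-ep2")
    # A rebooted UA reregisters with reg-id 1: the old binding is replaced.
    print(reg.register(aor, inst, 1, "sip:bob@192.0.2.2;transport=tcp", "new-flow-via-ep1"))
    print(reg.bindings_for(aor))
```

Keying on the triple rather than on the Contact URI alone is what lets the two registrations in messages F11 and F16 coexist while a reregistration with an existing reg-id overwrites only its own entry.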
the authoritative proxy. Before Bob’s UA notices that its flow to EP1 is no longer responding, Alice calls Bob sending an INVITE message F21. Bob’s authoritative proxy first tries the flow to EP1, but EP1 no longer has a flow to Bob, so it responds with a 430 Flow Failed response message F23. The proxy removes the stale registration and tries the next binding for the same instance.

Message F21:

INVITE sip:[email protected] SIP/2.0
To: Bob <sip:[email protected]>
From: Alice <sip:[email protected]>;tag=02935
Call-ID: klmvCxVWGp6MxJp2T2mb
CSeq: 1 INVITE

Bob’s proxy rewrites the Request-URI to the Contact URI used in Bob’s registration, and places the path for one of the registrations toward Bob’s UA instance into a Route header field. This Route goes through EP1.

Message F22:

INVITE sip:[email protected];transport=tcp SIP/2.0
To: Bob <sip:[email protected]>
From: Alice <sip:[email protected]>;tag=02935
Call-ID: klmvCxVWGp6MxJp2T2mb
CSeq: 1 INVITE
Route: <sip:VskztcQ/S8p4WPbOnHbuyh5iJvJIW3ib@ep1.example.com;lr;ob>

Since EP1 just rebooted, it does not have the flow described in the flow token. It returns a 430 Flow Failed response message F23.

Message F23:

Route: <sip:wazHDLdIMtUg6r0I/oRZ15zx3zHE1w1Z@ep2.example.com;lr;ob>

In message F25, EP2 needs to add a Record-Route header field value, so that any subsequent in-dialog messages from Alice’s UA arrive at Bob’s UA. EP2 can determine it needs to Record-Route since the request is a dialog-forming request and the Route header contained a flow token and an ob parameter. This Record-Route information is passed back to Alice’s UA in the responses (messages F26, F27, and F28).

Message F25:

INVITE sip:[email protected];transport=tcp SIP/2.0
To: Bob <sip:[email protected]>
From: Alice <sip:[email protected]>;tag=02935
Call-ID: klmvCxVWGp6MxJp2T2mb
CSeq: 1 INVITE
Record-Route: <sip:wazHDLdIMtUg6r0I/oRZ15zx3zHE1w1Z@ep2.example.com;lr>

Message F26:

SIP/2.0 200 OK
To: Bob <sip:[email protected]>;tag=skduk2
From: Alice <sip:[email protected]>;tag=02935
Call-ID: klmvCxVWGp6MxJp2T2mb
CSeq: 1 INVITE
Record-Route: <sip:wazHDLdIMtUg6r0I/oRZ15zx3zHE1w1Z@ep2.example.com;lr>

At this point, both UAs have the correct route set for the dialog. Any subsequent requests in this dialog will route correctly. For example, the ACK request in message F29 is sent from Alice’s UA directly to EP2. The BYE request in message F31 uses the same route set.
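The failover behavior in this incoming-call example, where the authoritative proxy tries one binding, receives 430 Flow Failed, removes the stale registration, and tries the next binding for the same instance, can be sketched as below. The data layout, the `send` callback, and the function name are assumptions made for this illustration and are not taken from RFC 5626.

```python
FLOW_FAILED = 430


def forward_with_failover(bindings, instance_id, send):
    """Try each binding for one UA instance until one does not return 430.

    bindings: mutable list of dicts with "instance", "reg_id", and "path".
    send:     callable taking a binding and returning a SIP status code.
    Returns (binding, status); binding is None if every flow failed.
    """
    candidates = [b for b in bindings if b["instance"] == instance_id]
    for binding in candidates:
        status = send(binding)
        if status == FLOW_FAILED:
            # The edge proxy no longer has this flow: drop the stale binding.
            bindings.remove(binding)
            continue
        return binding, status
    return None, FLOW_FAILED


if __name__ == "__main__":
    inst = "urn:uuid:00000000-0000-1000-8000-AABBCCDDEEFF"
    bindings = [
        {"instance": inst, "reg_id": 1, "path": "ep1.example.com"},
        {"instance": inst, "reg_id": 2, "path": "ep2.example.com"},
    ]

    def send(binding):
        # Pretend EP1 has rebooted and lost its flow to Bob.
        return FLOW_FAILED if binding["path"] == "ep1.example.com" else 200

    print(forward_with_failover(bindings, inst, send))
    print(len(bindings))
```

In the example run, the call completes through EP2 and only the reg-id 2 binding remains, which mirrors messages F22 through F26.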
13.2.7.4 Reregistration

Next, consider a reregistration scenario (Figure 13.5e). Somewhat later, Bob’s UA sends keep-alives to both its edge proxies, but it discovers that the flow with EP1 failed. Bob’s UA reregisters through EP1 using the same reg-id and Call-ID it previously used.

Message F38:

REGISTER sip:example.com SIP/2.0
From: Bob <sip:[email protected]>;tag=7F94778B653B
To: Bob <sip:[email protected]>
Call-ID: 16CB75F21C70
CSeq: 2 REGISTER
Supported: path, outbound
Route: <sip:ep1.example.com;lr>
Contact: <sip:[email protected];transport=tcp>;reg-id=1
  ;+sip.instance="<urn:uuid:00000000-0000-1000-8000-AABBCCDDEEFF>"

In message F39, EP1 inserts a Path header with a new flow token:

Path: <sip:3yJEbr1GYZK9cPYk5Snocez6DzO7w+AX@ep1.example.com;lr;ob>

13.2.7.5 Outgoing Call

Finally, consider the call flow for an outgoing call. Bob makes an outgoing call to Alice. Bob’s UA includes an ob parameter in its Contact URI in message F42. EP1 adds a Record-Route with a flow token in message F43. The route set is returned to Bob in the response (messages F45, F46, and F47), and either Bob or Alice can send in-dialog requests.

Message F42:

INVITE sip:[email protected] SIP/2.0
From: Bob <sip:[email protected]>;tag=ldw22z
To: Alice <sip:[email protected]>
Call-ID: 95KGsk2V/Eis9LcpBYy3
CSeq: 1 INVITE
Route: <sip:ep1.example.com;lr>
Contact: <sip:[email protected];transport=tcp;ob>

In message F43, EP1 adds the following Record-Route header.

Record-Route: <sip:3yJEbr1GYZK9cPYk5Snocez6DzO7w+AX@ep1.example.com;lr>

When EP1 receives the BYE message F50 from Bob’s UA, it can tell that the request is an outgoing request (since the source of the request matches the flow in the flow token) and simply deletes its Route header field value and forwards the request on to Alice’s UA.

Message F50:

BYE sip:[email protected] SIP/2.0
From: Bob <sip:[email protected]>;tag=ldw22z
To: Alice <sip:[email protected]>;tag=plqus8
Call-ID: 95KGsk2V/Eis9LcpBYy3
CSeq: 2 BYE
Route: <sip:3yJEbr1GYZK9cPYk5Snocez6DzO7w+AX@ep1.example.com;lr>
Contact: <sip:[email protected];transport=tcp;ob>

13.2.8 Connection Reuse in SIP

RFC 5923, described here, enables a pair of communicating proxies to reuse a congestion-controlled connection between themselves for sending requests in the forwards and backwards direction. Because the connection is essentially aliased for requests going in the backwards direction, reuse is predicated upon both the communicating endpoints authenticating themselves using X.509 certificates through TLS. For this reason, we only consider connection reuse for TLS over TCP and TLS over SCTP. RFC 5923 also provides guidelines on connection reuse and virtual SIP servers, and the interaction of connection reuse and DNS service (SRV) lookups in SIP. SIP entities can communicate using either unreliable/connectionless (e.g., UDP) or reliable/connection-oriented (e.g., TCP, SCTP) transport protocols. When SIP entities use a connection-oriented protocol (such as TCP or SCTP) to send a request, they typically originate their connections from an ephemeral port. In the following example, A listens for SIP requests over TLS on TCP port 5061 (the default port for SIP over TLS over TCP), but uses an ephemeral port (port 49160) for a new connection to B. These entities could be SIP UAs or SIP proxy servers. SIP includes the notion of a persistent connection, which is a mechanism to ensure that responses to a request reuse the existing connection that is typically still available, as well as reusing the existing connections for other requests sent by the originator of the connection. However, new requests sent in the backwards direction, in the example shown in Figure 13.6a, requests from B destined to A, are unlikely to reuse the existing connection. This frequently causes a pair of SIP entities to use one connection for requests sent in each direction, as shown in Figure 13.6b.

Unlike TCP, TLS connections can be reused to send requests in the backwards direction since each end can be authenticated when the connection is initially set up. Once the authentication step has been performed, the situation can be thought to resemble the picture in Figure 13.6a, except that A and B both use a single shared connection, for example, between port 49160 on A and port 5061 on B. When A wants to send a request to B, it will reuse this connection, and when B wants to send a request to A, it will reuse the same connection.
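The idea of a single shared connection can be pictured with a small lookup table kept by B. The sketch below is only an illustration under assumptions: the class and method names are hypothetical, connections are represented by plain strings, and keying the table by the peer's listening address (host, port, transport) is a simplification chosen for this example rather than the exact data model of RFC 5923.

```python
class ConnectionTable:
    """Toy table mapping a peer's listening address to a live connection."""

    def __init__(self):
        self._by_peer = {}

    def remember(self, peer_address, connection, mutually_authenticated):
        """Record an inbound connection for reuse in the backwards direction.

        Reuse is only allowed when both ends authenticated during TLS setup.
        """
        if mutually_authenticated:
            self._by_peer[peer_address] = connection

    def connection_for(self, peer_address, open_new):
        """Return the shared connection, or open and store a new one."""
        connection = self._by_peer.get(peer_address)
        if connection is None:
            connection = open_new(peer_address)
            self._by_peer[peer_address] = connection
        return connection


if __name__ == "__main__":
    table_at_b = ConnectionTable()
    a_listens_on = ("a.example.com", 5061, "TLS")
    # A connected to B from ephemeral port 49160; B remembers that connection.
    table_at_b.remember(a_listens_on, "conn from a.example.com:49160", True)
    # Later, B has a request for A and finds the existing connection.
    print(table_at_b.connection_for(a_listens_on, lambda addr: "new connection"))
```

Without mutual authentication nothing is stored, so B falls back to opening its own connection toward port 5061 on A, which is the two-connection situation of Figure 13.6b.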
Figure 13.6 Connection requests between A and B: (a) unidirectional connection for requests from A to B, and (b) two
connections for requests between A and B. (Copyright IETF. Reproduced with permission.)
For illustration purposes, the discussion below uses TCP as a transport for TLS operations. Another streaming transport, such as SCTP, can be used as well. The act of reusing a connection is initiated by P1 when it adds an alias header field parameter (see Section 2.4.1) to the Via header field. When P2 receives the request, it examines the topmost Via header field. If the Via header contained an alias header field parameter, P2 establishes a binding such that subsequent requests going to P1 will reuse the connection; that is, requests are sent over the established connection. With reference to Figure 13.8, in order for P2 to reuse a connection for requests in the backwards direction, it is important that the validation model for requests sent in this direction (i.e., P2 to P1) is equivalent to the normal connection in each direction model, wherein P2 acting as client would open up a new connection in the backwards direction and validate the connection by examining the X.509 certificate presented. The act of reusing a connection needs the desired property that requests get delivered in the backwards direction only

Figure 13.8 Proxy setup. (Copyright IETF. Reproduced with permission.)

IP address, port number, and transport of P2 as determined through RFC 3263 server selection. P1 adds an alias header field parameter to the topmost Via header field (inserted by it) before sending the request to P2. The value in the sent-by production rule of the Via header field (including the port number) and the transport over which the request was sent become the advertised address of P1:

Via: SIP/2.0/TLS p1.example.com;branch=z9hG4bKa7c8dze;alias

Assuming that P1 does not already have an existing aliased connection with P2, P1 now opens a connection with P2. P2 presents its X.509 certificate to P1 for validation (see Section 19.2.2). Upon connection authentication and acceptance, P1 adds P2 to its alias table. P1’s alias table now looks like

Destination IP Address    Destination Port    Destination Transport    Destination Identity    Alias Descriptor

When P2 receives the request, it examines the topmost Via header field to determine whether P1 is willing to use
470 ◾ Handbook on Session Initiation Protocol
this connection as an aliased connection (i.e., accept requests to different IP addresses due to round-robin DNS.
from P2 toward P1). The Via header field at P2 now looks However, the aliased connection is to be estab-
like the following (the received header field parameter is lished with the original sender of the request.
added by P2):
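The alias table introduced above, with its five columns, can be sketched as a small in-memory structure. The following Python fragment is only an illustration under stated assumptions, not the data model of any particular SIP stack: the names AliasRow, AliasTable, and find_reusable, the integer connection descriptor, and the sample address and identity are all hypothetical. The lookup applies the two reuse conditions given for client and server behavior in this section (the resolved address must be in the table, and the URI must match a stored identity); identity matching is simplified here to an exact string comparison.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Tuple

# (destination IP address, destination port, destination transport)
Address = Tuple[str, int, str]


@dataclass(frozen=True)
class AliasRow:
    """One alias table row; the fields mirror the five table columns."""
    ip: str
    port: int
    transport: str
    identities: FrozenSet[str]  # identities taken from the peer's certificate
    descriptor: int             # connection handle (hypothetical integer id)


class AliasTable:
    """Hypothetical alias table keyed by resolved address."""

    def __init__(self) -> None:
        self._rows: Dict[Address, AliasRow] = {}

    def add(self, row: AliasRow) -> None:
        # Intended to be called only after the connection is authenticated.
        self._rows[(row.ip, row.port, row.transport)] = row

    def remove(self, address: Address) -> None:
        # Intended to be called when the aliased connection no longer exists.
        self._rows.pop(address, None)

    def find_reusable(self, uri: str, resolved: Address) -> Optional[int]:
        """Return a connection descriptor only if both reuse conditions hold."""
        row = self._rows.get(resolved)
        if row is None:                # condition 1 fails: address not in table
            return None
        if uri not in row.identities:  # condition 2 fails: identity mismatch
            return None
        return row.descriptor


# Sample values are assumptions made for this example only.
table = AliasTable()
table.add(AliasRow("192.0.2.10", 5061, "TLS",
                   frozenset({"sip:p2.example.com"}), 25))
```

When find_reusable returns None, the caller would open a new connection to the resolved address and, once that connection is authenticated, add a fresh row.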
Whether or not to allow an aliased connection ultimately depends on the recipient of the request; that is, the client does not get any confirmation that its downstream peer created the alias, or indeed that it even supports this specification. Thus, clients must not assume that the acceptance of a request by a server automatically enables connection aliasing. Clients must continue receiving requests on their default port. Clients must authenticate the connection before forming an alias (see Section 19.2.2). Once the server has been authenticated, the client must cache, in the alias table, the identity (or identities) of the server as determined in RFC 5922 (see Section 19.4.6). The client must also populate the destination IP address, port, and transport of the server in the alias table; these fields are obtained by executing the RFC 3263 (see Section 8.2.4) server resolution process on the next-hop URI. Finally, the client must populate the alias descriptor field with the connection handle (or identifier) used to connect to the server. Once the alias table has been updated with a resolved address, and the client wants to send a new request in the direction of the server, the client reuses the connection only if all of the following conditions hold:

◾◾ The client uses the RFC 3263 resolution on a URI and arrives at a resolved address contained in the alias table.
◾◾ The URI used for RFC 3263 server resolution matches one of the identities stored in the alias table row corresponding to that resolved address.

Clients must be prepared for the case that the connection no longer exists when they are ready to send a subsequent request over it. In such a case, a new connection must be opened to the resolved address, and the alias table updated accordingly. This behavior has an adverse side effect when a CANCEL request or an ACK request for a non-2xx response is sent downstream. Normally, these would be sent over the same connection over which the INVITE request was sent. However, if the connection was closed between the sending of the INVITE request and the subsequent sending of the CANCEL request or the ACK request for a non-2xx response, then the client should open a new connection to the resolved address and send the CANCEL request or ACK request there instead. The client may insert the newly opened connection into the alias table.

13.2.8.5.2 Server Behavior

Servers should keep connections up unless they need to reclaim resources. Connection reuse works best when the client and the server maintain their connections for long periods of time. Servers, therefore, should not automatically drop connections on completion of a transaction or termination of a dialog. When a server receives a request over TLS whose topmost Via header field contains an alias header field parameter, it signifies that the upstream client will leave the connection open beyond the transaction and dialog lifetime, and that subsequent transactions and dialogs that are destined to a resolved address that matches the identifiers in the advertised address in the topmost Via header field can reuse this connection. Whether or not to use, in the reverse direction, a connection marked with the alias Via header field parameter ultimately depends on the policies of the server. It can choose to honor it, and thereby send subsequent requests over the aliased connection. If the server chooses not to honor an aliased connection, the server must allow the request to proceed as though the alias header field parameter was not present in the topmost Via header. This assures interoperability with RFC 3261 server behavior. Clients can include the alias header field parameter without fear that the server will reject the SIP request because of its presence.

Servers must be prepared to deal with the case that the aliased connection no longer exists when they are ready to send a subsequent request over it. This can occur if the peer ran out of operating system resources and had to close the connection. In such a case, the server must open a new connection to the resolved address and update the alias table accordingly. If the sent-by production of the Via header field contains a port, the server must use it as a destination port. Otherwise, the default port is the destination port. Servers must authenticate (see Section 19.2.2) the connection before forming an alias. The server, if it decides to reuse the connection, must cache in the alias table the identity (or identities) of the client as they appear in the X.509 certificate subjectAlternativeName extension field. The server also populates the destination IP address, port, and transport in the alias table from the topmost Via header field (using the ;received parameter for the destination IP address). If the port number is omitted, a default port number of 5061 is to be used. Finally, the server populates the alias descriptor field with the connection handle (or identifier) used to accept the connection from the client (see Section 2.1 for the contents of the alias table). Once the alias table has been updated, and the server wants to send a request in the direction of the client, it reuses the connection only if all of the following conditions hold:

◾◾ The server, which acts as a client for this transaction, uses the RFC 3263 resolution process on a URI and arrives at a resolved address contained in the alias table.
◾◾ The URI used for RFC 3263 server resolution matches one of the identities stored in the alias table row corresponding to that resolved address.

13.2.8.5.3 Closing a TLS Connection

Either the client or the server may terminate a TLS session by sending a TLS closure alert. Before closing a TLS
connection, the initiator of the closure must either wait for any outstanding SIP transactions to complete, or explicitly abandon them. After the initiator of the close has sent a closure alert, it must discard any TLS message until it has received a similar alert from its peer. The receiver of the closure alert must not start any new SIP transactions after the receipt of the closure alert.

13.2.8.6 Connection Reuse and SRV Interaction

Connection reuse has an interaction with the DNS SRV load balancing mechanism. To understand the interaction, consider Figure 13.9.

Figure 13.9 Load balancing.

Here, the proxy uses the DNS SRV to load balance across the three servers, S1, S2, and S3. Using the connection reuse mechanism specified in this document, over time the proxy will maintain a distinct aliased connection to each of the servers. However, once this is done, subsequent traffic is load balanced across the three downstream servers in the normal manner.

13.3 Loss-Based Overload Control in SIP Network

13.3.1 Overview

Like any network element, a SIP server (RFC 3261, see Section 2.4.4) can suffer from overload when the number of SIP messages it receives exceeds the number of messages it can process. Overload can pose a serious problem for a network of SIP servers. During periods of overload, the throughput of a network of SIP servers can be significantly degraded. In fact, overload may lead to a situation where the retransmissions of dropped SIP messages may overwhelm the capacity of the network. This is often called congestion collapse. Overload is said to occur if a SIP server does not have sufficient resources to process all incoming SIP messages. These resources may include CPU processing capacity, memory, input/output, or disk resources. For overload control, this document only addresses failure cases where SIP servers are unable to process all SIP requests because of resource constraints. There are other cases where a SIP server can successfully process incoming requests but has to reject them due to failure conditions unrelated to the SIP server being overloaded. For example, a PSTN gateway that runs out of trunks but still has plenty of capacity to process SIP messages should reject incoming INVITEs using a 488 Not Acceptable Here response (RFC 4412, see Section 2.8). Similarly, a SIP registrar that has lost connectivity to its registration database but is still capable of processing SIP requests should reject REGISTER requests with a 500 Server Error response (RFC 3261, see Section 2.6). Overload control does not apply to these cases, and SIP provides appropriate response codes for them. SIP provides a limited mechanism for overload control through its 503 Service Unavailable response code. However, this mechanism cannot prevent overload of a SIP server, and it cannot prevent congestion collapse. In fact, the use of the 503 Service Unavailable response code may cause traffic to oscillate and shift between SIP servers, thereby worsening an overload condition. A detailed discussion of the SIP overload problem, the problems with the 503 Service Unavailable response code, and the requirements for a SIP overload control mechanism can be found in RFC 5390 (see Section 13.3.9).

RFC 7339, which is described here, defines the protocol for communicating overload information between SIP servers and clients so that clients can reduce the volume of traffic sent to overloaded servers, avoiding congestion collapse and increasing useful throughput. It describes the Via header parameters used for overload control communication. In addition, the general behavior of SIP servers and clients involved in overload control is specified. RFC 7339 specifies the loss-based overload control scheme that is mandatory to implement for this specification. However, it allows other overload control schemes to be supported as well. To do so effectively, the expectations and primitive protocol parameters common to all classes of overload control schemes are described here.

13.3.2 Operations

We provide an overview of how the overload control mechanism operates by introducing the overload control parameters. The next section provides more details and normative behavior on the parameters listed below. Because overload control is performed hop-by-hop, the Via header parameter is attractive since it allows two adjacent SIP entities to indicate support for, and exchange information associated with, overload control; RFC 6357 describes the design considerations for handling overload conditions in SIP. Additional advantages of this choice are discussed here later. An alternative mechanism using SIP event packages was also considered, and the characteristics of that choice are further outlined here.

RFC 7339 defines four new parameters for the SIP Via header for overload control. These parameters provide
a mechanism for conveying overload control information between adjacent SIP entities. The oc parameter is used by a SIP server to indicate a reduction in the number of requests arriving at the server. The oc-algo parameter contains a token or a list of tokens corresponding to the class of overload control algorithms supported by the client. The server chooses one algorithm from this list. The oc-validity parameter establishes a time limit for which overload control is in effect, and the oc-seq parameter aids in sequencing the responses at the client. These parameters are discussed in detail in the next section.

13.3.3 Via Header Parameters for Overload Control

The four Via header parameters that are introduced by RFC 7339 are described here. Further context about how to interpret these under various conditions is provided in the next section.

13.3.3.1 Parameter: oc

This parameter is inserted by the SIP client and updated by the SIP server. A SIP client must add an oc parameter to the topmost Via header it inserts into every SIP request. This provides an indication to downstream neighbors that the client supports overload control. There must not be a value associated with the parameter (the value will be added by the server). The downstream server must add a value to the oc parameter in the response going upstream to a client that included the oc parameter in the request. Inclusion of a value in the parameter represents two things. First, upon the first contact (see the next section), addition of a value by the server to this parameter indicates (to the client) that the downstream server supports overload control as defined in this document. Second, if overload control is active, then it indicates the level of control to be applied. When a SIP client receives a response with the value in the oc parameter filled in, it must reduce, as indicated by the oc and oc-algo parameters, the number of requests going downstream to the SIP server from which it received the response.

13.3.3.2 Parameter: oc-algo

This parameter is inserted by the SIP client and updated by the SIP server. A SIP client must add an oc-algo parameter to the topmost Via header it inserts into every SIP request, with a default value of loss. This parameter contains names of one or more classes of overload control algorithms. A SIP client must support the loss-based overload control scheme and must insert at least the token loss as one of the oc-algo parameter values. In addition, the SIP client may insert other tokens, separated by a comma, in the oc-algo parameter if it supports other overload control schemes such as a rate-based scheme (see Section 13.4). Each element in the comma-separated list corresponds to the class of overload control algorithms supported by the SIP client. When more than one class of overload control algorithms is present in the oc-algo parameter, the client may indicate algorithm preference by ordering the list in a decreasing order of preference. However, the client cannot assume that the server will pick the most preferred algorithm. When a downstream SIP server receives a request with multiple overload control algorithms specified in the oc-algo parameter (optionally sorted by decreasing order of preference), it chooses one algorithm from the list and must return the single selected algorithm to the client.

Once the SIP server has chosen a mutually agreeable class of overload control algorithms and communicated it to the client, the selection stays in effect until the algorithm is changed by the server. Furthermore, the client must continue to include all the supported algorithms in subsequent requests; the server must respond with the agreed-to algorithm until the algorithm is changed by the server. The selection should stay the same for a nontrivial duration of time to allow the overload control algorithm to stabilize its behavior (described later). The oc-algo parameter does not define the exact algorithm to be used for traffic reduction; rather, the intent is to use any algorithm from a specific class of algorithms that affect traffic reduction similarly. For example, the reference default overload rate control algorithm described later can be used as a loss-based algorithm, or it can be substituted by any other loss-based algorithm that results in equivalent traffic reduction.

13.3.3.3 Parameter: oc-validity

This parameter may be inserted by the SIP server in a response; it must not be inserted by the SIP client in a request. This parameter contains a value that indicates an interval of time (measured in milliseconds) that the load reduction specified in the value of the oc parameter should be in effect. The default value of the oc-validity parameter is 500 (milliseconds). If the client receives a response with the oc and oc-algo parameters suitably filled in, but no oc-validity parameter, the SIP client should behave as if it had received oc-validity=500.

A value of 0 in the oc-validity parameter is reserved to denote the event that the server wishes to stop overload control, or to indicate that it supports overload control but is not currently requesting any reduction in traffic (as specified later). A nonzero value for the oc-validity parameter must only be present in conjunction with an oc parameter. A SIP client must discard a nonzero value of the oc-validity parameter if the client receives it in a response without the corresponding oc parameter being present as well. After the value specified in the oc-validity parameter expires and until the SIP client
receives an updated set of overload control parameters from the SIP server, overload control is not in effect between the client and the downstream SIP server.

13.3.3.4 Parameter: oc-seq

This parameter must be inserted by the SIP server in a response; it must not be inserted by the SIP client in a request. This parameter contains an unsigned integer value that indicates the sequence number associated with the oc parameter. This sequence number is used to differentiate two oc parameter values generated by an overload control algorithm at two different instants in time. The oc parameter values generated by an overload control algorithm at time t and t + 1 must have an increasing value in the oc-seq parameter. This allows the upstream SIP client to properly collate out-of-order responses. Note that a timestamp can be used as a value of the oc-seq parameter.

If the value contained in the oc-seq parameter overflows during the period in which the load reduction is in effect, then the oc-seq parameter must be reset to the current timestamp or an appropriate base value. Also note that a client implementation can recognize that an overflow has occurred when it receives an oc-seq parameter whose value is significantly less than several previous values. Note that an oc-seq parameter whose value does not deviate significantly from the last several previous values is symptomatic of a tardy packet. However, overflow will cause the oc-seq parameter value to be significantly less than the last several values. If an overflow is detected, then the client should use the overload parameters in the new message, even though the sequence number is lower. The client should also reset any internal state to reflect the overflow so that future messages following the overflow will be accepted.

13.3.4 General Behavior

When forwarding a SIP request, a SIP client uses the SIP procedures of RFC 3263 (see Section 8.2.4) to determine the next-hop SIP server. The procedures of RFC 3263 take a SIP URI as input, extract the domain portion of that URI for use as a lookup key, and query the DNS to obtain an ordered set of one or more IP addresses, with a port number and transport corresponding to each IP address in this set (the Expected Output).

After selecting a specific SIP server from the Expected Output, a SIP client determines whether overload controls are currently active with that server. If overload controls are currently active (and the oc-validity period has not yet expired), the client applies the relevant algorithm to determine whether or not to send the SIP request to the server. If overload controls are not currently active with this server (which will be the case if this is the initial contact with the server, the last response from this server had oc-validity=0, or the time period indicated by the oc-validity parameter has expired), the SIP client sends the SIP message to the server without invoking any overload control algorithm.

13.3.4.1 Determining Support for Overload Control

If a client determines that this is the first contact with a server, the client must insert the oc parameter without any value and must insert the oc-algo parameter with a list of algorithms it supports. This list must include loss and may include other algorithm names approved by the Internet Assigned Numbers Authority and described in corresponding documents. The client transmits the request to the chosen server.

If a server receives a SIP request containing the oc and oc-algo parameters, the server must determine if it has already selected the overload control algorithm class with this client. If it has, the server should use the previously selected algorithm class in its response to the message.

If the server determines that the message is from a new client or a client the server has not heard from in a long time, the server must choose one algorithm from the list of algorithms in the oc-algo parameter. It must put the chosen algorithm as the sole parameter value in the oc-algo parameter of the response it sends to the client. In addition, if the server is currently not in an overload condition, it must set the value of the oc parameter to be 0 and may insert an oc-validity=0 parameter in the response to further qualify the value in the oc parameter. If the server is currently overloaded, it must follow the procedures described below. A client that supports the rate-based overload control scheme (see Section 13.4) will consider oc=0 as an indication not to send any requests downstream at all. Thus, when the server inserts oc-validity=0 as well, it is indicating that it does support overload control, but it is not under overload mode right now.

13.3.4.2 Creating and Updating the Overload Control Parameters

A SIP server provides overload control feedback to its upstream clients by providing a value for the oc parameter to the topmost Via header field of a SIP response, that is, the Via header added by the client before it sent the request to the server. Since the topmost Via header of a response will be removed by an upstream client after processing it, overload control feedback contained in the oc parameter will not travel beyond the upstream SIP client. A Via header parameter therefore provides hop-by-hop semantics for overload control feedback (RFC 6357) even if the next-hop neighbor does not support this specification. The oc parameter can be used in all response types, including provisional, success, and failure responses. We also explained later the special
13.3.4.5 Using the Overload Control Parameter Values

A SIP client must honor the overload control values it receives from downstream neighbors. The SIP client must not forward more requests to a SIP server than allowed by the current oc and oc-algo parameter values from that particular downstream server. When forwarding a SIP request, a SIP client uses the SIP procedures of RFC 3263 to determine the next-hop SIP server. The procedures of RFC 3263 (see Section 8.2.4) take a SIP URI as input, extract the domain portion of that URI for use as a lookup key, and query the DNS to obtain an ordered set of one or more IP addresses, with a port number and transport corresponding to each IP address in this set (the Expected Output).

After selecting a specific SIP server from the Expected Output, the SIP client determines if it already has overload control parameter values for the server chosen from the Expected Output. If the SIP client has a nonexpired oc parameter value for the server chosen from the Expected Output, then this chosen server is operating in overload control mode. Thus, the SIP client determines if it can or cannot forward the current request to the SIP server based on the oc and oc-algo parameters and any relevant local policy. The particular algorithm used to determine whether or not to forward a particular SIP request is a matter of local policy and may take into account a variety of prioritization factors. However, this local policy should transmit the same number of SIP requests as the sample algorithm defined by the overload control scheme being used.

13.3.4.6 Forwarding the Overload Control Parameters

Overload control is defined in a hop-by-hop manner. Therefore, forwarding the contents of the overload control parameters is generally not recommended and should only be performed if permitted by the configuration of SIP servers. This means that a SIP proxy should strip the overload control parameters inserted by the client before proxying the request further downstream. Of course, when the proxy acts as a client and proxies the request downstream, it is free to add overload control parameters pertinent to itself in the Via header it inserted in the request.

13.3.4.7 Terminating Overload Control

A SIP client removes overload control if one of the following events occurs:

◾◾ The client is explicitly told by the server to stop performing overload control using the oc-validity=0 parameter.

A SIP server can decide to terminate overload control by explicitly signaling the client. To do so, the SIP server must set the value of the oc-validity parameter to 0. The SIP server must increment the value of oc-seq and should set the value of the oc parameter to 0.

Note that the loss-based overload control scheme described later can effectively stop overload control by setting the value of the oc parameter to 0. However, the rate-based scheme (see Section 13.4) needs an additional piece of information in the form of oc-validity=0. When the client receives a response with a higher oc-seq number than the one it most recently processed, it checks the oc-validity parameter. If the value of the oc-validity parameter is 0, this indicates to the client that overload control of messages destined to the server is no longer necessary, and the traffic can flow without any reduction. Furthermore, when the value of the oc-validity parameter is 0, the client should disregard the value in the oc parameter.

13.3.4.8 Stabilizing Overload Algorithm Selection

Realities of deployments of SIP necessitate that the overload control algorithm may be changed upon a system reboot or a software upgrade. However, frequent changes of the overload control algorithm must be avoided. Frequent changes of the overload control algorithm will not benefit the client or the server, as such flapping does not allow the chosen algorithm to stabilize. An algorithm change, when desired, is simply accomplished by the SIP server choosing a new algorithm from the list in the client’s oc-algo parameter and sending it back to the client in a response.

The client associates a specific algorithm with each server it sends traffic to, and when the server changes the algorithm, the client must change its behavior accordingly. Once the server selects a specific overload control algorithm for a given client, the server should not change the algorithm associated with that client for at least 3600 seconds (1 hour). This period may involve one or more cycles of overload control being in effect, and then being stopped depending on the traffic and resources at the server. Note that one way to accomplish this involves the server saving the time of the last algorithm change in a lookup table, indexed by the client’s network identifiers. The server only changes the oc-algo parameter when the time since the last change has surpassed 3600 seconds.
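The lookup-table approach noted above for stabilizing algorithm selection can be sketched as follows. This is a minimal Python illustration under stated assumptions rather than a prescribed implementation: the class name AlgoSelector, the use of a plain string as the client's network identifier, the injectable clock, and the fallback to loss on first contact are all hypothetical choices made for the example. Only the 3600-second minimum comes from the text.

```python
import time
from typing import Callable, Dict, List, Tuple

MIN_HOLD_SECONDS = 3600  # minimum time between algorithm changes (from the text)


class AlgoSelector:
    """Hypothetical per-client record of the chosen algorithm and change time."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # client network identifier -> (selected algorithm, time of last change)
        self._table: Dict[str, Tuple[str, float]] = {}

    def select(self, client_id: str, offered: List[str], preferred: str) -> str:
        """Return the single oc-algo token to send back to this client."""
        now = self._clock()
        entry = self._table.get(client_id)
        if entry is None:
            # First contact: use the server's preference if the client offered
            # it; otherwise fall back to loss (an assumption of this sketch,
            # relying on loss being mandatory for clients).
            chosen = preferred if preferred in offered else "loss"
            self._table[client_id] = (chosen, now)
            return chosen
        current, changed_at = entry
        wants_change = preferred != current and preferred in offered
        if wants_change and now - changed_at > MIN_HOLD_SECONDS:
            # The hold period has passed, so the change is allowed.
            self._table[client_id] = (preferred, now)
            return preferred
        # Inside the hold period (or no change wanted): keep the selection.
        return current
```

Passing the clock in as a parameter keeps the hold-period logic testable without waiting an hour; a deployment would presumably also age out entries for clients it has not heard from in a long time, which this sketch does not do.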
a timeout error is received from the transaction layer, it must be treated as if a 408 Request Timeout status code has been received. If a fatal transport error is reported by the transport layer, the condition must be treated as a 503 Service Unavailable status code. In the event of repeated timeouts or fatal transport errors, the SIP client must stop sending requests to this server. The SIP client should periodically probe whether the downstream server is alive using any mechanism at its disposal. Clients should be conservative in their probing (e.g., using an exponential backoff) so that their aliveness probes do not exacerbate an overload situation. Once a SIP client has successfully received a normal response for a request sent to the downstream server, the SIP client can resume sending SIP requests. It should, of course, honor any overload control parameters it may receive in the initial, or later, responses.

13.3.4.10 Responding to an Overload Indication

A SIP client can receive overload control feedback indicating that it needs to reduce the traffic it sends to its downstream server. The client can accomplish this task by sending some of the requests that would have gone to the overloaded element to a different destination. It needs to ensure, however, that this destination is not in overload and is capable of processing the extra load. A client can also buffer requests in the hope that the overload condition will resolve quickly and the requests can still be forwarded in time. In many cases, however, it will need to reject these requests with a 503 Service Unavailable response without the Retry-After header.

13.3.4.10.1 Message Prioritization at Hop before an Overloaded Server

During an overload condition, a SIP client needs to prioritize requests and select those requests that need to be rejected or redirected. This selection is largely a matter of local policy. It is expected that a SIP client will follow local policy as long as the resulting reduction in traffic is consistent with the overload algorithm in effect at that node. Accordingly, the normative behavior in the next three paragraphs should be interpreted with the understanding that the SIP client will aim to preserve local policy to the fullest extent possible. A SIP client should honor the local policy for prioritizing SIP requests such as policies based on message type, for example, INVITEs versus requests associated with existing sessions. A SIP client should honor the local policy for prioritizing SIP requests based on the content of the Resource-Priority header (RPH) (RFC 4412, see Section 2.8). Specific (namespace.value) RPH contents may indicate high-priority requests that should be preserved as much as possible during overload. The RPH contents can also indicate a low-priority request that is eligible to be dropped during times of overload. A SIP client should honor the local policy for prioritizing SIP requests relating to emergency calls as identified by the SOS URN (RFC 5031, see Section 16.11.2) indicating an emergency request. This policy ensures that when a server is overloaded and nonemergency calls outnumber emergency calls in the traffic arriving at the client, the few emergency calls will be given preference. If, on the other hand, the server is overloaded and the majority of calls arriving at the client are emergency in nature, then no amount of message prioritization will ensure the delivery of all emergency calls if the client is to reduce the amount of traffic as requested by the server. A local policy can be expected to combine both the SIP request type and the prioritization markings, and it should be honored when overload conditions prevail.

13.3.4.10.2 Rejecting Requests at an Overloaded Server

If the upstream SIP client to the overloaded server does not support overload control, it will continue to direct requests to the overloaded server. Thus, for the nonparticipating client, the overloaded server must bear the cost of rejecting some requests from the client as well as the cost of processing the nonrejected requests to completion. It would be fair to devote the same amount of processing at the overloaded server to the combination of rejection and processing from a nonparticipating client as the overloaded server would devote to processing requests from a participating client. This is to ensure that SIP clients that do not support this specification do not receive an unfair advantage over those that do. A SIP server that is in overload and has started to throttle incoming traffic must reject some requests from nonparticipating clients with a 503 Service Unavailable response without the Retry-After header.

13.3.4.11 Provisional Response and Overload Control

The overload control information sent from a SIP server to a client is transported in the responses. While implementations can insert overload control information in any response, special attention should be accorded to overload control information transported in a 100 Trying response. Traditionally, the 100 Trying response has been used in SIP to quench retransmissions. In some implementations, the 100 Trying message may not be generated by the transaction user (TU) nor consumed by the TU. In these implementations, the 100 Trying response is generated at the transaction layer and sent to the upstream SIP client. At the receiving SIP client, the 100 Trying is consumed at the transaction layer by inhibiting the retransmission of the corresponding request. Consequently, implementations that insert overload control information in the 100 Trying cannot assume that
478 ◾ Handbook on Session Initiation Protocol
the upstream SIP client passed the overload control information in the 100 Trying to their corresponding TU. For this reason, implementations that insert overload control information in the 100 Trying must reinsert the same (or updated) overload control information in the first non-100 Trying response being sent to the upstream SIP client.

13.3.4.12 Example
Consider a SIP client, P1, which is sending requests to another downstream SIP server, P2. The following snippets of SIP messages demonstrate how the overload control parameters work.

INVITE sips:[email protected] SIP/2.0
Via: SIP/2.0/TLS p1.example.net;
  branch=z9hG4bK2d4790.1;oc;oc-algo="loss,A"
...
SIP/2.0 100 Trying
Via: SIP/2.0/TLS p1.example.net;
  branch=z9hG4bK2d4790.1;received=192.0.2.111;
  oc=0;oc-algo="loss";oc-validity=0
...

In the messages above, the first line is sent by P1 to P2. This line is a SIP request; because P1 supports overload control, it inserts the oc parameter in the topmost Via header that it created. P1 supports two overload control algorithms: loss and an algorithm called A. The second line, a SIP response, shows the topmost Via header amended by P2 according to this specification and sent to P1. Because P2 also supports overload control and chooses the loss-based scheme, it sends loss back to P1 in the oc-algo parameter. It also sets the value of the oc and oc-validity parameters to 0 because it is not currently requesting overload control activation. Had P2 not supported overload control, it would have left the oc and oc-algo parameters unchanged, thus allowing the client to know that it did not support overload control. At some later time, P2 starts to experience overload. It sends the following SIP message indicating that P1 should decrease the messages arriving to P2 by 20% for 0.5 seconds.

  oc=0;oc-algo="loss";oc-validity=0;oc-seq=1282321892.439
...

13.3.5 Loss-Based Overload Control Scheme
Under a loss-based approach, a SIP server asks an upstream neighbor to reduce the number of requests it would normally forward to this server by a certain percentage. For example, a SIP server can ask an upstream neighbor to reduce the number of requests this neighbor would normally send by 10%. The upstream neighbor then redirects or rejects 10% of the traffic originally destined for that server. This section specifies the semantics of the overload control parameters associated with the loss-based overload control scheme. The general behavior of SIP clients and servers is specified in Section 2.1, and is applicable to SIP clients and servers that implement loss-based overload control.

13.3.5.1 Special Parameter Values
The loss-based overload control scheme is identified using the token loss. This token appears in the oc-algo parameter list sent by the SIP client. Upon entering the overload state, a SIP server that has selected the loss-based algorithm will assign a value to the oc parameter. This value must be in the range of [0, 100], inclusive. This value indicates to the client the percentage by which the client is to reduce the number of requests being forwarded to the overloaded server. The SIP client may use any algorithm that reduces the traffic it sends to the overloaded server by the amount indicated. Such an algorithm should honor the message prioritization discussed earlier. While a particular algorithm is not subject to standardization, for completeness, a default algorithm for loss-based overload control is provided in the next section.

13.3.5.2 Default Algorithm
if oc=10 and 40% of the requests should be included in the first category, then

10 / 40 * 100 = 25

Or, 25% of the requests in the first category can be reduced to obtain an overall reduction of 10%. The client uses random discard to achieve the 25% reduction of messages in the first category. Messages in the second category proceed downstream unscathed. To effect the 25% reduction rate from the first category, the client draws a random number between 1 and 100 for the request picked from the first category. If the random number is less than or equal to the converted value of the oc parameter, the request is not forwarded; otherwise, the request is forwarded. A reference algorithm is shown below.

cat1 := 80.0          // Category 1 -- Subject to reduction
cat2 := 100.0 - cat1  // Category 2 -- Under normal operations,
                      // only subject to reduction after category 1
                      // is exhausted.
// Note that the above ratio is simply a reasonable default.
// The actual values will change through periodic sampling as the
// traffic mix changes over time.

while (true) {
  // We're modeling message processing as a single work queue that
  // contains both incoming and outgoing messages.
  sip_msg := get_next_message_from_work_queue()

  update_mix(cat1, cat2)  // See Note below

  switch (sip_msg.type) {

    case outbound request:
      destination := get_next_hop(sip_msg)
      oc_context := get_oc_context(destination)

      if (oc_context == null) {
        send_to_network(sip_msg)  // Process it normally by sending the
        // request to the next hop since this particular destination is
        // not subject to overload.
      }
      else {
        // Determine if server wants to enter in overload or is in
        // overload.
        in_oc := extract_in_oc(oc_context)

        oc_value := extract_oc(oc_context)
        oc_validity := extract_oc_validity(oc_context)

        if (in_oc == false or oc_validity is not in effect) {
          send_to_network(sip_msg)  // Process it normally by sending
          // the request to the next hop since this particular
          // destination is not subject to overload. Optionally, clear
          // the oc context for this server (not shown).
        }
        else {  // Begin performing overload control.
          r := random()
          drop_msg := false

          category := assign_msg_to_category(sip_msg)

          pct_to_reduce_cat1 = oc_value / cat1 * 100

          if (oc_value <= cat1) {  // Reduce all msgs from category 1
            if (r <= pct_to_reduce_cat1 && category == cat1) {
              drop_msg := true
            }
          }
          else {  // oc_value > category 1. Reduce 100% of msgs from
                  // category 1 and remaining from category 2.
            pct_to_reduce_cat2 = (oc_value - cat1) / cat2 * 100
            if (category == cat1) {
              drop_msg := true
            }
            else {
              if (r <= pct_to_reduce_cat2) {
                drop_msg := true
              }
            }
          }

          if (drop_msg == false) {
            send_to_network(sip_msg)  // Process it normally by sending
            // the request to the next hop.
          }
          else {
            // Do not send request downstream; handle it locally by
            // generating response (if a proxy) or treating it as an
            // error (if a user agent).
          }
        }  // End perform overload control.
      }

    end case  // outbound request

    case outbound response:
      if (we are in overload) {
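As a complement to the reference pseudocode, the drop decision of the loss-based scheme can be sketched in runnable Python. This is a minimal sketch under stated assumptions, not the specification's algorithm verbatim: the function names drop_probability and should_drop are hypothetical, the two categories are passed as the integers 1 and 2, and the random draw is taken on [0, 1) instead of between 1 and 100.

```python
import random


def drop_probability(oc_value, cat1_share, category):
    """Return the probability (0..1) that a request should be dropped.

    oc_value   -- overall reduction requested by the server, in percent (0..100)
    cat1_share -- percentage of traffic currently in category 1 (0..100)
    category   -- 1 for requests eligible for reduction, 2 for protected ones
    """
    if oc_value <= 0:
        return 0.0  # no reduction requested
    cat2_share = 100.0 - cat1_share
    if oc_value <= cat1_share:
        # Category 1 alone can absorb the whole reduction.
        return oc_value / cat1_share if category == 1 else 0.0
    # Category 1 is exhausted: drop all of it, take the rest from category 2.
    if category == 1:
        return 1.0
    return (oc_value - cat1_share) / cat2_share


def should_drop(oc_value, cat1_share, category, rng=random):
    """Random discard: drop when the draw falls below the drop probability."""
    return rng.random() < drop_probability(oc_value, cat1_share, category)
```

With oc=10 and 40% of the traffic in the first category, drop_probability(10, 40, 1) evaluates to 0.25, which matches the 25% figure worked out in the text, while requests in the second category are never dropped.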
◾◾ A Via header parameter is lightweight and creates very little overhead. It does not require the transmission of additional messages for overload control and does not increase traffic or processing burdens in an overload situation.
◾◾ Overload control status can frequently be reported to upstream neighbors since it is a part of a SIP response. This enables the use of this mechanism in scenarios where the overload status needs to be adjusted frequently. It also enables the use of overload control mechanisms that use regular feedback, such as window-based overload control.
◾◾ With a Via header parameter, overload control status is inherent in SIP signaling and is automatically conveyed to all relevant upstream neighbors, that is, neighbors that are currently contributing traffic. There is no need for a SIP server to specifically track and manage the set of current upstream or downstream neighbors with which it should exchange overload feedback.
◾◾ Overload status is not conveyed to inactive senders. This avoids the transmission of overload feedback to inactive senders, which do not contribute traffic. If an inactive sender starts to transmit while the receiver is in overload, it will receive overload feedback in the first response and can adjust the amount of traffic forwarded accordingly.
◾◾ A SIP server can limit the distribution of overload control information by only inserting it into responses to known upstream neighbors. A SIP server can use transport-level authentication (e.g., via TLS) with its upstream neighbors.

send traffic again. However, these notifications do generate additional traffic, which adds to the overall load.
◾◾ A SIP entity needs to set up and maintain overload control subscriptions with all upstream and downstream neighbors. A new subscription needs to be set up before/while a request is transmitted to a new downstream neighbor. Servers can be configured to subscribe at boot time. However, this would require additional protection to avoid the avalanche restart problem for overload control. Subscriptions need to be terminated when they are not needed any more, which can be done, for example, using a timeout mechanism.
◾◾ A receiver needs to send NOTIFY messages to all subscribed upstream neighbors in a timely manner when the control algorithm requires a change in the control variable (e.g., when a SIP server is in an overload condition). This includes active as well as inactive neighbors. These NOTIFYs add to the amount of traffic that needs to be processed. To ensure that these requests will not be dropped due to overload, a priority mechanism needs to be implemented in all servers these requests will pass through.
◾◾ As overload feedback is sent to all senders in separate messages, this mechanism is not suitable when frequent overload control feedback is needed.
◾◾ A SIP server can limit the set of senders that can receive overload control information by authenticating subscriptions to this event package.
◾◾ This approach requires each proxy to implement UA functionality (UAS and UAC) to manage the subscriptions.
therefore follow the behavior outlined earlier to handle clients that do not support overload control.

13.3.9 Salient Features of Overload Control
RFC 7339, described here, provides the following benefits, meeting the overload control requirements for SIP networks specified in RFC 5390:

◾◾ The overload control mechanism allows an overloaded SIP server to maintain a reasonable level of throughput as it enters into congestion mode by requesting the upstream clients to reduce traffic destined downstream.
◾◾ When a SIP server enters overload mode, it requests the upstream clients to throttle the traffic destined to it. As a consequence of this, the overloaded SIP server itself generates proportionally less downstream traffic, thereby limiting the impact on other elements in the network.
◾◾ On the server side, the overload condition is determined by monitoring S and reporting a load feedback F as a value to the oc parameter. On the client side, a throttle T is applied to requests going downstream based on F. This specification does not prescribe any value for S nor a particular value for F. The oc-algo parameter allows for automatic convergence to a particular class of overload control algorithm. There are suggested default values for the oc-validity parameter.
◾◾ The mechanism is designed to reduce congestion when a pair of communicating entities support it. If a downstream overloaded SIP server does not respond to a request in time, a SIP client will attempt to reduce traffic destined toward the nonresponsive server.
◾◾ The mechanism does not assume that it will only be deployed in environments with completely trusted elements. The overload control information is shared between a pair of communicating entities. Consequently, a confidential and authenticated channel can be used for this communication. However, if such a channel is not available, then the needed security ramifications are specified.
◾◾ The mechanism provides a way for an element to throttle the amount of traffic it receives from an upstream element. This throttling is graded to a great extent with the current 503 Service Unavailable mechanism.
◾◾ A SIP client that has overload information from multiple downstream servers will not retry the request on another element. However, if a SIP client does not know the overload status of a downstream server, it may send the request to that server.
◾◾ The mechanism supports servers that receive requests from a large number of different upstream elements, where the set of upstream elements is not enumerable. There are no constraints on the number of upstream clients.
◾◾ The mechanism supports servers that receive requests from a finite set of upstream elements, where the set of upstream elements is enumerable. There are no constraints on the number of upstream clients.
◾◾ The mechanism works between servers in different domains. There are no inherent limitations on using overload control between domains. However, interconnection points that engage in overload control between domains will have to populate and maintain the overload control parameters as requests cross domains.
◾◾ The mechanism does not dictate a specific algorithm for prioritizing the processing of work within a proxy during times of overload. It must permit a proxy to prioritize requests based on any local policy so that certain ones, such as a call for emergency services or a call with a specific value of the RPH field (RFC 4412, see Section 2.8), are given preferential treatment, such as not being dropped, being given additional retransmission, or being processed ahead of others.
◾◾ The mechanism provides unambiguous directions to clients on when they should retry a request and when they should not. This especially applies to TCP connection establishment and SIP registrations in order to mitigate against an avalanche restart. The scheme provides normative behavior on when to retry a request after repeated timeouts and fatal transport errors resulting from communications with a nonresponsive downstream SIP server.
◾◾ The mechanism has capabilities to function properly in cases where a network element fails, is so overloaded that it cannot process messages, or cannot communicate owing to a network failure or network partition. It is not able to provide explicit indications of the nature of the failure or its levels of congestion, because it provides normative behavior on when to retry a request after repeated timeouts and fatal transport errors resulting from communications with a nonresponsive downstream SIP server.
◾◾ The mechanism attempts to minimize the overhead of the overload control messaging. Overload control messages are sent in the topmost Via header, which is always processed by the SIP elements.
◾◾ The overload mechanism tries to prevent malicious attacks, including denial-of-service and distributed denial-of-service attacks.
◾◾ Overload control information is shared between a pair of communicating entities, and a confidential and authenticated channel can be used for this communication. However, if such a channel is not available, then the security ramifications specified by the mechanism should be used.
Connections Management and Overload Control in SIP ◾ 483
◾◾ The overload mechanism is unambiguous about whether a load indication applies to a specific IP address, host, or URI so that an upstream element can determine the load of the entity to which a request is to be sent.
◾◾ The specification for the overload mechanism gives guidance on which message types might be desirable to process over others during times of overload, based on SIP-specific considerations. For example, it may be more beneficial to process a SUBSCRIBE refresh with Expires of zero than a SUBSCRIBE refresh with a nonzero expiration (since the former reduces the overall amount of load on the element) or to process re-INVITEs over new INVITEs.
◾◾ In a mixed environment of elements that do and do not implement the overload mechanism, no disproportionate benefit accrues to the users or operators of the elements that do not implement the mechanism. An element that does not implement overload control does not receive any measure of extra benefit.
◾◾ The overload mechanism ensures that the system remains stable. When the offered load drops from above the overall capacity of the network to below the overall capacity, the throughput should stabilize and become equal to the offered load. The specified overload control mechanism ensures the stability of the system.
◾◾ It is possible to disable the reporting of load information toward upstream targets based on the identity of those targets. An operator of a SIP server can configure the SIP server to only report overload control information for requests received over a confidential channel, for example. However, note that this introduces a modicum of extra configuration.
◾◾ The overload mechanism can also work in cases where there is a load balancer in front of a farm of proxies. Depending on the type of load balancer, this requirement is met. A load balancer fronting a farm of SIP proxies could be a SIP-aware load balancer or one that is not SIP-aware. If the load balancer is SIP-aware, it can make conscious decisions on throttling outgoing traffic toward the individual server in the farm based on the overload control parameters returned by the server. On the other hand, if the load balancer is not SIP-aware, then there are other strategies to perform overload control.

13.4 Rate-Based Overload Control in SIP Network

13.4.1 Overview
In the earlier section, we described a promising SIP-based overload control solution specified in RFC 7339 (see Section 13.3). That solution provides a communication scheme for overload control algorithms. It also includes a default loss-based overload control algorithm that makes it possible for a set of clients to limit offered load toward an overloaded server. However, such a loss control algorithm is sensitive to variations in load so that any increase in load would be directly reflected by the clients in the offered load presented to the overloaded servers. More important, a loss-based control scheme cannot guarantee an upper bound on the offered load from the clients toward an overloaded server, and requires frequent updates that may have implications for stability. The use of SIP in large-scale next-generation networks requires that SIP-based networks provide adequate control mechanisms for handling traffic growth. In particular, SIP networks must be able to handle traffic overloads gracefully, maintaining transaction throughput by preventing congestion collapse.
The IETF draft [1] that is described here proposes a rate-based overload control algorithm that complements the loss-based control scheme, using the same signaling within the framework defined in RFC 7339 (see Section 13.3). The rate-based control guarantees an upper bound on the rate, constant between server updates, of requests sent by clients toward an overloaded server. The trade-off is in terms of algorithmic complexity since the overloaded server is more likely to use a different target (maximum rate) for each client than the loss-based approach. The proposed rate-based overload control algorithm mitigates congestion in SIP networks while adhering to the overload signaling scheme in RFC 7339 and presenting a rate-based control as an optional alternative to the default loss-based control scheme in RFC 7339.

13.4.2 Rate-Based Algorithm Scheme

13.4.2.1 Objective
The server is the one protected by the overload control algorithm defined here, and the client is the one that throttles traffic toward the server. Following the procedures defined in RFC 7339 (see Section 13.3), the server and clients signal to one another their support for rate-based overload control. Then, periodically, the server relies on internal measurements (e.g., CPU utilization or queuing delay) to evaluate its overload state and estimate a target maximum SIP request rate in number of requests per second (as opposed to target percent loss in the case of loss-based control). When in overload, the server uses the oc parameter (RFC 7339) in the Via header field of SIP responses to inform the clients of its overload state and of the target maximum SIP request rate for that client. Upon receiving the oc parameter with a target maximum SIP request rate, each client throttles new SIP requests toward the overloaded server.
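The signaling step described above, in which a client reads the overload control parameters from the topmost Via header field of a response, can be sketched as follows. This is a minimal sketch under stated assumptions: the helper names parse_oc_params and active_control are hypothetical, parameters are split on semicolons without handling quoted semicolons, a missing oc-algo is read as loss, and a missing oc-validity is treated like 0 (a real implementation should follow RFC 7339 for defaults).

```python
OC_NAMES = {"oc", "oc-algo", "oc-validity", "oc-seq"}


def parse_oc_params(via_value):
    """Extract overload control parameters from one Via header field value."""
    params = {}
    # Everything after the first ';' is a parameter; the first part is the
    # protocol and sent-by host (e.g., "SIP/2.0/TLS p1.example.net").
    for part in via_value.split(";")[1:]:
        name, sep, value = part.strip().partition("=")
        if name in OC_NAMES:
            # A bare "oc" (no '=') only advertises support for overload control.
            params[name] = value.strip('"') if sep else None
    return params


def active_control(params):
    """Return (algorithm, oc value, unit, validity in ms), or None if inactive."""
    oc = params.get("oc")
    validity_ms = int(params.get("oc-validity") or 0)  # assumption: missing == 0
    if oc is None or validity_ms == 0:
        return None
    algo = params.get("oc-algo") or "loss"  # assumption: loss when absent
    unit = "requests/s" if algo == "rate" else "percent"
    return algo, float(oc), unit, validity_ms
```

For a hypothetical response carrying oc=150;oc-algo="rate";oc-validity=1000, active_control reports a rate limit of 150 requests per second for one second; for oc=0 with oc-validity=0, it reports that no control is active.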
13.4.2.2 Via Header Field Parameters for Overload Control
The Via header field oc parameters inform clients of the desired maximum rate. They are defined in RFC 7339 (see Section 13.3) and summarized below:

◾◾ oc: Used by clients in SIP requests to indicate support for RFC 7339 (see Section 13.3) and by servers to indicate the load reduction amount in the loss algorithm, and the maximum rate, in messages per second, for the rate-based algorithm described here.
◾◾ oc-algo: Used by clients in SIP requests to advertise supported overload control algorithms and by servers to notify clients of the algorithm in effect. Values: loss (default), rate (optional).
◾◾ oc-validity: Used by servers in SIP responses to indicate an interval of time (milliseconds) that the load reduction should be in effect. A value of 0 is reserved for the server to stop overload control. A nonzero value is required in all other cases.
◾◾ oc-seq: A sequence number associated with the oc parameter.

Consult Section 13.3.3 for an illustration of the Via header field oc parameter usage.

13.4.2.3 Client and Server Rate-Control Algorithm Selection
Per RFC 7339 (see Section 13.3), new clients indicate supported overload control algorithms to servers by inserting oc and oc-algo, with the names of the supported algorithms, in the Via header field of SIP requests destined to servers. The inclusion by the client of the token rate indicates that the client supports a rate-based algorithm. Conversely, servers notify clients of the selected overload control algorithm through the oc-algo parameter in the Via header field of SIP responses to clients. The inclusion by the server of the token rate in the oc-algo parameter indicates that the rate-based algorithm has been selected by the server. Support of rate-based control must be indicated by clients including the token rate in the oc-algo list. Selection of rate-based control must be indicated by servers by setting oc-algo to the token rate.

13.4.2.4 Server Operation
The actual algorithm used by the server to determine its overload state and estimate a target maximum SIP request rate is beyond the scope of this document. However, the server must periodically evaluate its overload state and estimate a target SIP request rate beyond which it would become overloaded. The server must determine how it will allocate the target SIP request rate among its clients. The server may set the same rate for every client, or may set different rates for different clients. The maximum rate determined by the server for a client applies to the entire stream of SIP requests, even though throttling may only affect a particular subset of the requests, since as per RFC 7339 (see Section 13.3), request prioritization is a client's responsibility. When setting the maximum rate for a particular client, the server may need to take into account the workload (e.g., CPU load per request) of the distribution of message types from that client. Furthermore, because the client may prioritize the specific types of messages it sends while under overload restriction, this distribution of message types may be different from (e.g., either higher or lower CPU load) the message distribution for that client under nonoverload conditions.
Note that the oc parameter for the rate algorithm is an upper bound (in messages per second) on the traffic sent by the client to the server. The client may send traffic at a rate significantly lower than the upper bound for a variety of reasons. In other words, when multiple clients are being controlled by an overloaded server, at any given time some clients may receive requests at a rate below their target (maximum) SIP request rate, while others above that target rate. However, the resulting request rate presented to the overloaded server will converge toward the target SIP request rate. Upon detection of overload and the determination to invoke overload controls, the server must follow the specifications in RFC 7339 (see Section 13.3) to notify its clients of the allocated target SIP request rate, and that rate-based control is in effect. The server must use the oc parameter (RFC 7339) to send a target SIP request rate to each of its clients. When a client supports the default loss algorithm and not the rate algorithm, the client would be handled in the same way as described here [1].

13.4.2.5 Client Operation

13.4.2.5.1 Default Algorithm
In determining whether or not to transmit a specific message, the client may use any algorithm that limits the message rate to the oc parameter in units of messages per second. For ease of discussion, we define T = 1/[oc parameter] as the target inter-SIP request interval. The algorithm may be strictly deterministic, or it may be probabilistic. It may, or may not, have a tolerance factor, to allow for short bursts, as long as the long-term rate remains below 1/T. The algorithm may have provisions for prioritizing traffic. If the algorithm requires other parameters (in addition to T, which is 1/[oc parameter]), they may be set autonomously by the client, or they may be negotiated between client and server independently of the SIP-based overload control solution. In either case, the coordination is out of scope for this document. The default algorithms presented here (one without provisions for
prioritizing traffic, one with provisions) are only examples. To throttle new SIP requests at the rate specified by the oc parameter sent by the server to its clients, the client may use the proposed default algorithm for rate-based control or any other equivalent algorithm that forwards messages in conformance with the upper bound of 1/T messages per second. The default leaky bucket algorithm is presented here [2]. The algorithm makes it possible for clients to deliver SIP requests at a rate specified by the oc parameter with tolerance parameter TAU (preferably configurable).
Conceptually, the leaky bucket algorithm can be viewed as a finite capacity bucket whose real-valued content drains out at a continuous rate of 1 unit of content per time unit and whose content increases by the increment T for each forwarded SIP request. T is computed as the inverse of the rate specified by the oc parameter, namely T = 1/[oc parameter]. Note that when the oc parameter is 0 with a nonzero oc-validity, then the client should reject 100% of SIP requests destined to the overloaded server. However, when the oc-validity value is 0, the client should immediately stop throttling. If, at a new SIP request arrival, the content of the bucket is less than or equal to the limit value TAU, then the SIP request is forwarded to the server; otherwise, the SIP request is rejected. Note that the capacity of the bucket (the upper bound of the counter) is (T + TAU). The tolerance parameter TAU determines how close the long-term admitted rate is to an ideal control that would admit all SIP requests for arrival rates less than 1/T, and then admit SIP requests precisely at the rate of 1/T for arrival rates above 1/T. In particular, at mean arrival rates close to 1/T, it determines the tolerance to deviation of the interarrival time from T (the larger the TAU, the more tolerance to deviations from the interdeparture interval T).
This deviation from the interdeparture interval influences the admitted rate burstiness, or the number of consecutive SIP requests forwarded to the server (burst size proportional to TAU over the difference between 1/T and the arrival rate). In situations where clients are configured with some knowledge about the server (e.g., operator preprovisioning), it can be beneficial to choose a value of TAU based on how many clients will be sending requests to the server. Servers with a very large number of clients, each with a relatively small arrival rate, will generally benefit from a smaller value for TAU in order to limit queuing (and hence response times) at the server when subjected to a sudden surge of traffic from all clients. Conversely, a server with a relatively small number of clients, each with a proportionally larger arrival rate, will benefit from a larger value of TAU. Once the control has been activated, at the arrival time of the kth new SIP request, ta(k), the content of the bucket is provisionally updated to the value

X′ = X − (ta(k) − LCT)

where X is the value of the leaky bucket counter after arrival of the last forwarded SIP request, and LCT is the time at which the last SIP request was forwarded.
If X′ is less than or equal to the limit value TAU, then the new SIP request is forwarded and the leaky bucket counter X is set to X′ (or to 0 if X′ is negative) plus the increment T, and LCT is set to the current time ta(k). If X′ is greater than the limit value TAU, then the new SIP request is rejected, and the values of X and LCT are unchanged. When the first response from the server has been received, indicating control activation (oc-validity > 0), LCT is set to the time of activation, and the leaky bucket counter is initialized to the parameter TAU0 (preferably configurable), which is 0 or larger but less than or equal to TAU. TAU can assume any positive real number value and is not necessarily bounded by T. TAU = 4 * T is a reasonable compromise between burst size and throttled rate adaptation at low offered rates. Note that specification of a value for TAU and any communication or coordination between servers are beyond the scope of this document. A reference algorithm is shown below.

No priority case:

// T: inter-transmission interval, set to 1 / [oc parameter]
// TAU: tolerance parameter
// ta: arrival time of the most recent arrival received by the client
// LCT: arrival time of last SIP request that was sent to the server
//      (initialized to the first arrival time)
// X: current value of the leaky bucket counter (initialized to TAU0)

// After most recent arrival, calculate auxiliary variable Xp
Xp = X - (ta - LCT);

if (Xp <= TAU) {
  // Transmit SIP request
  // Update X and LCT
  X = max(0, Xp) + T;
  LCT = ta;
} else {
  // Reject SIP request
  // Do not update X and LCT
}

13.4.2.5.2 Priority Treatment
As with the loss-based algorithm of RFC 7339 (see Section 13.3), a client implementing the rate-based algorithm also prioritizes messages into two or more categories of requests: requests that are candidates for reduction and requests not subject to reduction (except under extenuating circumstances when there are no messages in the first category that can be
reduced). Accordingly, the proposed leaky bucket implementation is modified to support priority using two thresholds for SIP requests in the set of request candidates for reduction. With two priorities, the proposed leaky bucket requires two thresholds TAU1 < TAU2:
◾◾ All new requests would be admitted when the leaky bucket counter is at or below TAU1.
◾◾ Only higher-priority requests would be admitted when the leaky bucket counter is between TAU1 and TAU2.
◾◾ All requests would be rejected when the bucket counter is above TAU2.
This can be generalized to n priorities using n thresholds for n > 2 in the obvious way. With a priority scheme that relies on two tolerance parameters (TAU2 influences the priority traffic, TAU1 influences the nonpriority traffic), always set TAU1 ≤ TAU2 (TAU is replaced by TAU1 and TAU2). Setting both tolerance parameters to the same value is equivalent to having no priority. TAU1 influences the admitted rate the same way as TAU does when no priority is set. Moreover, the larger the difference between TAU1 and TAU2, the closer the control is to strict priority queuing. TAU1 and TAU2 can assume any positive real number value and are not necessarily bounded by T. Reasonable values for TAU0, TAU1, and TAU2 are
TAU0 = 0, TAU1 = (1/2) * TAU2, and TAU2 = 10 * T
Note that specification of a value for TAU1 and TAU2, and any communication or coordination between servers are beyond the scope of this document. A reference algorithm is shown below.
Priority case:
//T: inter-transmission interval, set to 1 / [oc parameter]
//TAU1: tolerance parameter of no-priority SIP requests
//TAU2: tolerance parameter of priority SIP requests
//ta: arrival time of the most recent arrival received by the client
//LCT: arrival time of last SIP request that was sent to the server
//     (initialized to the first arrival time)
//X: current value of the leaky bucket counter (initialized to TAU0)
//After most recent arrival, calculate auxiliary variable Xp
Xp = X − (ta − LCT);
if ((AnyRequestReceived && Xp ≤ TAU1) ||
    (PriorityRequestReceived && Xp ≤ TAU2 && Xp > TAU1)) {
    //Transmit SIP request
    //Update X and LCT
    X = max(0, Xp) + T;
    LCT = ta;
} else {
    //Reject SIP request
    //Do not update X and LCT
}
13.4.2.5.3 Optional Enhancement: Avoidance of Resonance
As the number of client sources of traffic increases or the throughput of the server decreases, the maximum rate admitted by each client needs to decrease, and therefore the value of T becomes larger. Under some circumstances, for example, if the traffic arises very quickly simultaneously at many sources, the occupancies of each bucket can become synchronized, resulting in the admissions from each source being close in time, and batched or very peaky arrivals at the server, which not only gives rise to control instability but also to very poor delays and even lost messages. An appropriate term for this is resonance [3]. If the network topology is such that resonance can occur, then a simple way to avoid resonance is to randomize the bucket occupancy at two appropriate points: at the activation of control and whenever the bucket empties, as follows. After updating the value of the leaky bucket to X′, generate a value u as follows:
if X′ > 0, then u = 0
else if X′ ≤ 0, then let u be set to a random value uniformly distributed between −1/2 and +1/2.
Then, (only) if the arrival is admitted, increase the bucket by an amount T + uT, which will therefore be just T if the bucket has not emptied, or lie between T/2 and 3T/2 if it has. This randomization should also be done when control is activated; that is, instead of simply initializing the leaky bucket counter to TAU0, initialize it to TAU0 + uT, where u is uniformly distributed as stated above. Since activation would have been a result of response to a request sent by the client, the second term in this expression can be interpreted as being the bucket increment following that admission. This method has the following characteristics:
◾◾ If TAU0 is chosen to be equal to TAU and all sources were to activate control at the same time owing to an extremely high request rate, then the time until the
formly distributed over [T/2, 3T/2], and the mean time between admissions is the same, that is, T + 1/R, where R is the request arrival rate.
◾◾ At high load, randomization rarely occurs, so there is no loss of precision of the admitted rate, even though the randomized phasing of the buckets remains.
13.4.3 Example
Adapting the example in RFC 7339 (see Section 13.3), where client P1 sends requests to a downstream server P2:
INVITE sips:[email protected] SIP/2.0
Via: SIP/2.0/TLS p1.example.net;
  branch=z9hG4bK2d4790.1;received=192.0.2.111;
  oc;oc-algo="loss,rate"
...
SIP/2.0 100 Trying
Via: SIP/2.0/TLS p1.example.net;
  branch=z9hG4bK2d4790.1;received=192.0.2.111;
  oc=0;oc-algo="rate";oc-validity=0;
  oc-seq=1282321615.781
...
In the messages above, the first message is sent by P1 to P2. It is a SIP request; because P1 supports overload control, it inserts the oc parameter in the topmost Via header field that it created. P1 supports two overload control algorithms: loss and rate. The second message, a SIP response, shows the topmost Via header field amended by P2 according to this specification and sent to P1. Because P2 also supports overload control, it chooses the rate-based scheme and sends that back to P1 in the oc-algo parameter. It uses oc-validity=0 to indicate no overload control. In this example, oc=0; however, oc could be any value as oc is ignored when oc-validity=0. At some later time, P2 starts to experience overload. It sends the following SIP message indicating P1 should send SIP requests at a rate no greater than 150 SIP requests per second and for a duration of 1000 milliseconds.
SIP/2.0 180 Ringing
Via: SIP/2.0/TLS p1.example.net;
  branch=z9hG4bK2d4790.1;received=192.0.2.111;
  oc=150;oc-algo="rate";oc-validity=1000;
  oc-seq=1282321615.782
...
13.5 Summary
We have defined the notion of logical connection, known as network flows of SIP messages, in the form of dialog forming and registration between the SIP entities for end-to-end communications in the SIP application layer. These flow-based logical connections can be reused for routing of messages by SIP entities, especially proxies. The flow-based connection setup, keep-alive mechanisms, connection management procedures, registration of client-initiated procedures by SIP entities, flow-recovery mechanisms, and connection management examples are explained in great detail. Like connection management, the growing needs of overload control for the next-generation large-scale SIP networks have resulted in extensions in SIP. The loss-based and rate-based overload control mechanisms that complement each other are explained along with their load control algorithms. The general behavior of SIP entities related to determining support for overload control, creating and updating the control parameters, processing and using, terminating, stabilizing, self-limiting, responding with overload indication, and dealing with provisional response by the overload control is described. It is explained that the rate-based overload mechanism uses a different load control algorithm, but uses the same SIP signaling messages that are used by the loss-based control. The examples for both control mechanisms are discussed.
PROBLEMS
1. What is the logical connection known as the network flows of SIP messages? Why is it useful? Articulate with examples. How does it differ from that of TCP or UDP?
2. How are the flow-based connections set up over the SIP network? Describe in detail the roles that are played by SIP entities in setting up the connection.
3. What is the keep-alive mechanism? Describe the CRLF and STUN keep-alive techniques. How do they help in managing the logical connection of network flows?
4. Describe in detail the procedures for connection management by UA, edge proxy, and registrar for both REGISTER and non-REGISTER messages as applicable.
5. Describe the connection management procedures by the authoritative proxy.
6. Describe the registration call flows managing client-initiated connection for registration, incoming call and proxy crash, reregistration, and outgoing call.
7. Describe the keep-alive mechanism in the SIP network with CRLF and STUN.
8. Describe the flow-recovery mechanisms in the SIP network with call flow example: configuration subscription, registration, and incoming proxy in view of proxy crash, reregistration, and outgoing call.
9. Why is overload control needed in the SIP network? How does the overload control mechanism operate? What are the requirements that the overload control mechanism in the SIP network needs to satisfy?
10. Explain the functions of the parameters that are being extended for the SIP Via header: oc, oc-algo, oc-validity, and oc-seq.
11. Explain the general behavior in determining support for overload control, creating and updating the control parameters, processing and using, terminating, stabilizing, self-limiting, responding with overload indication, and dealing with provisional response by the overload control along with examples.
12. Describe the loss-based overload control scheme. What are its special parameter values? What is its default algorithm?
13. Develop a detailed design architecture for implementation of the loss-based overload control mechanism along with SIP mechanism and backward compatibility.
14. Describe in detail the security features that the overload control mechanism needs to take care of.
15. What is the rate-based overload control mechanism? How does it complement the loss-based overload control mechanism?
16. Describe the rate-based control algorithm scheme along with Via header field, client and server rate control algorithm selection, server, and client operation with examples.
References
1. Noel, E. and Williams, P.M., “Session Initiation Protocol (SIP) rate control,” draft-ietf-soc-overload-rate-control-10.txt, IETF Draft, Work in Progress, April 10, 2015.
2. “Traffic control and congestion control in B-ISDN,” ITU-T Recommendation I.371, 2000.
3. Erramilli, A. and Forys, L.J., “Traffic synchronization effects in teletraffic systems,” ITC-13, 1991.
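The priority-case reference algorithm of Section 13.4.2.5.2 can be sketched in runnable form as follows. This is a minimal Python sketch under stated assumptions, not the normative text: the class name LeakyBucket, the method name admit, and the field names are hypothetical choices made for illustration, and the optional randomization of Section 13.4.2.5.3 is omitted. Setting tau1 equal to tau2 reduces it to the no-priority case.

```python
from dataclasses import dataclass


@dataclass
class LeakyBucket:
    """Client-side rate control following the priority-case pseudocode.

    All names here are illustrative assumptions, not part of any SIP library.
    """
    t: float           # T: inter-transmission interval, 1 / [oc parameter]
    tau1: float        # TAU1: tolerance for non-priority requests
    tau2: float        # TAU2: tolerance for priority requests (tau1 <= tau2)
    x: float = 0.0     # X: leaky bucket counter, initialized to TAU0
    lct: float = 0.0   # LCT: time at which the last request was forwarded

    def admit(self, ta: float, priority: bool = False) -> bool:
        """Return True if a request arriving at time ta may be forwarded."""
        # Provisional counter value: the bucket drains one unit per time unit.
        xp = self.x - (ta - self.lct)
        # A priority request passes up to TAU2, any other request up to TAU1.
        limit = self.tau2 if priority else self.tau1
        if xp <= limit:
            # Forward the request, then update X and LCT.
            self.x = max(0.0, xp) + self.t
            self.lct = ta
            return True
        # Reject the request; X and LCT are left unchanged.
        return False


if __name__ == "__main__":
    bucket = LeakyBucket(t=1.0, tau1=5.0, tau2=10.0)
    burst = [bucket.admit(0.0) for _ in range(8)]
    print("non-priority admissions in burst:", sum(burst))
```

With T = 1, TAU1 = 5, and TAU2 = 10 (the reasonable values given above, scaled to T = 1), a burst arriving at a single instant admits non-priority requests until the counter exceeds TAU1, after which only priority requests pass until the counter exceeds TAU2.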
Chapter 14
Interworking Services in SIP
functional entity known as the SBC that often carries both SIP signaling and media to meet some requirements, as follows:
◾◾ Perimeter defense, for example, access control, topology hiding, and denial-of-service (DoS) prevention and detection
◾◾ Functionality not available in the end points, for example, NAT traversal and protocol interworking or repair
◾◾ Traffic management, for example, media monitoring and QOS
Even though many SBCs currently behave in ways that can break end-to-end security and influence feature negotiations, there is clearly a market for them. Network operators need many of the features current SBCs provide, and often there are no standard mechanisms available to provide them. SBCs are typically deployed at the border between two networks. The reason for this is that network policies are typically enforced at the edge of the network.
On the basis of this market trend of SIP deployment, Request for Comments (RFC) 5853 has been created to provide some implementation guidance for SBCs so that SBC products from different vendors may interoperate abiding by this framework, preventing the proliferation of non-interoperable proprietary SBC products. This specification describes functions implemented in SBCs. A special focus is given to those practices that conflict with SIP architectural principles in some way. This RFC also explores the underlying requirements of network operators that have led to the use of these functions and practices in order to identify protocol requirements and determine whether those requirements are satisfied by existing specifications, or if additional work on standards is required.
14.2.2 Background on SBCs
The term SBC is relatively nonspecific since it is not standardized or defined anywhere. Nodes that may be referred to as SBCs but do not implement SIP are outside the scope of this document. SBCs usually sit between two service provider networks in a peering environment, or between an access network and a backbone network to provide service to residential or enterprise customers. They provide a variety of functions to enable or enhance session-based multimedia services (e.g., SIP-based voice over IP) as described above. Some of these functions may also get integrated into other SIP elements, such as prepaid platforms, Third Generation Partnership Project (3GPP) Proxy-Call Session Control Function (P-CSCF), Interrogating-CSCF (I-CSCF) [1], and others.
SIP-based SBCs can implement behavior that is equivalent to a privacy service (RFC 3323, see Section 20.2) performing both Header Privacy and Session Privacy because, as stated earlier, they usually handle both signaling and media acting as the B2BUAs. SBCs often modify certain SIP headers and message bodies that proxies are not allowed to modify. The transparency of these B2BUAs varies depending on the functions they perform. For example, some SBCs modify the session description carried in the message and insert a Record-Route entry. Other SBCs replace the value of the Contact header field with the SBCs’ address, and generate a new Call-ID and new To and From tags. Figure 14.1 shows the logical architecture of an SBC, which includes a signaling and a media component. In this chapter, the terms outer network and inner network are used to describe these two networks.
[Figure 14.1: SBC with a signaling component and a media component]
An SBC is logically associated with the inner network, and it typically provides functions such as controlling and protecting access to the inner network from the outer network. The SBC itself is configured and managed by the organization operating the inner network. In some scenarios, SBCs operate with users’ (implicit or explicit) consent; however, in others, they operate without users’ consent (this latter case can potentially cause problems). For example, if an SBC in the same administrative domain as a set of enterprise users performs topology hiding, the enterprise users can choose to
route their SIP messages through it. If they choose to route through the SBC, then the SBC can be seen as having the users’ implicit consent. Another example is a scenario where a service provider has broken gateways and it deploys an SBC in front of them for protocol repair reasons. Users can choose to configure the SBC as their gateway and, thus, the SBC can be seen as having the users’ implicit consent.
14.2.2.1 Peering Scenario
A typical peering scenario involves two network operators who exchange traffic with each other. An example peering scenario is illustrated in Figure 14.2. An originating gateway (GW-A1) in operator A’s network sends an INVITE that is routed to the SBC in operator B’s network. Then, the SBC forwards it to the SoftSwitch (SS-B). The SoftSwitch responds with a redirect (3xx) message back to the SBC that points to the appropriate terminating gateway (GW-B1) in operator B’s network. If operator B did not have an SBC, the redirect message would go to operator A’s originating gateway. After receiving the redirect message, the SBC sends the INVITE to the terminating gateway.
From the SBC’s perspective, operator A is the outer network and operator B is the inner network. Operator B can use the SBC, for example, to control access to its network, protect its gateways and SoftSwitches from unauthorized use and DoS attacks, and monitor the signaling and media traffic. It also simplifies network management by minimizing the number of Access Control List (ACL) entries in the gateways. The gateways do not need to be exposed to the peer network, and they can restrict access (both media and signaling) to the SBCs. The SBC helps ensure that only media from sessions the SBC authorizes will reach the gateway.
[Figure: Operator A (SS-A, GW-A1, GW-A2) and Operator B (SS-B, SBC, GW-B1, GW-B2); message flows F1 INVITE, F2 INVITE, F3 3xx (Redirect), F4 INVITE]
Figure 14.2 Peering with SBC. (Copyright IETF. Reproduced with permission.)
14.2.2.2 Access Scenario
[Figure: UA 1, UA 2, UA 3, NAT, SBC, Proxy]
Figure 14.3 Access scenario with SBC. (Copyright IETF. Reproduced with permission.)
In an access scenario, presented in Figure 14.3, the SBC is placed at the border between the access network (outer network) and the operator’s network (inner network) to control access to the operator’s network, protect its components
(media servers, application servers, gateways, etc.) from unauthorized use and DoS attacks, and monitor the signaling and media traffic. Also, since the SBC is call stateful, it may provide access control functions to prevent oversubscription of the access links. End points are configured with the SBC as their outbound proxy address. The SBC routes requests to one or more proxies in the operator network.
The SBC may be hosted in the access network (e.g., this is common when the access network is an enterprise network), or in the operator network (e.g., this is common when the access network is a residential or small business network). Regardless of where the SBC is hosted, it is managed by the organization maintaining the operator network. Some end points may be behind enterprise or residential NATs. In cases where the access network is a private network, the SBC is a NAT for all traffic. It is noteworthy that SIP traffic may have to traverse more than one NAT. The proxy usually does authentication or authorization for registrations and outbound calls. The SBC modifies the REGISTER request so that subsequent requests to the registered address of record are routed to the SBC. This is done either with a Path header or by modifying the Contact to point at the SBC. The scenario presented in this section is a general one, and it applies also to other similar settings. One example from a similar setting is the one where an access network is the open Internet, and the operator network is the network of a SIP service provider.
14.2.3 Functions of SBCs
This section lists those functions that are used in SBC deployments in current communication networks. Each subsection describes a particular function or feature, the operators’ requirements for having it, explanation of any impact to the end-to-end SIP architecture, and a concrete implementation example. Each section also discusses potential concerns specific to that particular implementation technique. Suggestions for alternative implementation techniques that may be more architecturally compatible with SIP are outside the scope of this document. All the examples given in this section are simplified; only the relevant header lines from SIP and Session Description Protocol (SDP) messages are displayed.
14.2.3.1 Topology Hiding
14.2.3.1.1 General Information and Requirements
Topology hiding consists of limiting the amount of topology information given to external parties. Operators have a requirement for this functionality because they do not want the IP addresses of their equipment (proxies, gateways, application servers, and others) to be exposed to outside parties. This may be because they do not want to expose their equipment to DoS attacks, they may use other carriers for certain traffic and do not want their customers to be aware of it, or they may want to hide their internal network architecture from competitors or partners. In some environments, the operator’s customers may wish to hide the addresses of their equipment, or the SIP messages may contain private, nonroutable addresses. The most common form of topology hiding is the application of header privacy (RFC 3323, see Section 20.2), which involves stripping Via and Record-Route headers, replacing the Contact header, and even changing Call-IDs. However, in deployments that use IP addresses instead of domain names in headers that cannot be removed (e.g., From and To headers), the SBC may replace these IP addresses with its own IP address or domain name. For reference, there are also other ways of hiding topology information than inserting an intermediary, like an SBC, into the signaling path. One of the ways is the user agent (UA)-driven privacy mechanism (RFC 5767, see Section 20.6.2), where the UA can facilitate the concealment of topology information.
14.2.3.1.2 Architectural Issues
Performing topology hiding, as described above, by SBCs that do not have the users’ consent presents some issues. This functionality is based on a hop-by-hop trust model as opposed to an end-to-end trust model. The messages are modified without the subscriber’s consent and could potentially modify or remove information about the user’s privacy, security requirements, and higher-layer applications that are communicated end-to-end using SIP. Neither UA in an end-to-end call has any way to distinguish the SBC actions from a man-in-the-middle (MITM) attack. The topology hiding function does not work well with Authenticated Identity Management (RFC 4474, see Sections 2.8 and 19.4.8) in scenarios where the SBC does not have any kind of consent from the users. The Authenticated Identity Management mechanism is based on a hash value that is calculated from parts of From, To, Call-ID, CSeq, Date, and Contact header fields plus the whole message body. If the authentication service is not provided by the SBC itself, the modification of the aforementioned header fields and the message body is in violation of RFC 4474. Some forms of topology hiding are in violation, because they are, for example, replacing the Contact header of a SIP message.
14.2.3.1.3 Example
The current way of implementing topology hiding consists of having an SBC act as a B2BUA and removing all traces of topology information (e.g., Via and Record-Route entries) from outgoing messages. Imagine the following example
scenario: the SBC (p4.domain.example.com) receives an INVITE request from the inner network, which in this case is an operator network. The received SIP message is shown in Figure 14.4.
Then, the SBC performs a topology hiding function. In this scenario, the SBC removes and stores all existing Via and Record-Route headers, and then inserts Via and Record-Route header fields with its own SIP Uniform Resource Identifier (URI). After the topology hiding function, the message could appear as shown in Figure 14.5.
Like a regular proxy server that inserts a Record-Route entry, the SBC handles every single message of a given SIP dialog. If the SBC loses state (e.g., the SBC restarts for some reason), it may not be able to route messages properly (note: some SBCs preserve the state information also on restart). For example, if the SBC removes Via entries from a request and then restarts, thus losing state, the SBC may not be able to route responses to that request, depending on the information that was lost when the SBC restarted. This is only one example of topology hiding. Besides topology hiding (i.e., information related to the network elements is being hidden), SBCs may also do identity hiding (i.e., information related to the identity of subscribers is being hidden). While performing identity hiding, SBCs may modify Contact header field values and other header fields containing identity information. The header fields containing identity information are listed in RFC 3323 (see Section 20.2). Since the publication of RFC 3323, the following header fields containing identity information have been defined: P-Asserted-Identity, Referred-By, Identity, and Identity-Info.
14.2.3.2 Media Traffic Management
14.2.3.2.1 General Information and Requirements
Media traffic management is the function of controlling media traffic. Network operators may require this functionality in order to control the traffic being carried on their network on behalf of their subscribers. Traffic management helps the creation of different kinds of billing models (e.g., video telephony can be priced differently than voice-only calls), and it also makes it possible for operators to enforce the usage of selected codecs. One of the use cases for media traffic management is the implementation of intercept capabilities that are required to support audit or legal obligations. It is noteworthy that the legal obligations mainly apply to operators providing voice services, and those operators typically have infrastructure (e.g., SIP proxies acting as B2BUAs) for providing intercept capabilities even without SBCs. Since the media path is independent of the signaling path, the media may not traverse through the operator’s network unless the SBC modifies the session description. By modifying the session description, the SBC can force the media to be sent through a media relay that may be colocated with the SBC.
This kind of traffic management can be done, for example, to ensure a certain QOS level, or to ensure that subscribers are using only allowed codecs. It is noteworthy that the SBCs do not have direct ties to routing topology and they do not, for example, change bandwidth reservations on Traffic Engineering tunnels, nor do they have direct interaction with the routing protocol. Some operators do not want to manage the traffic, but only to monitor it to collect statistics and make sure that they are able to meet any business-service-level agreements with their subscribers or partners. The protocol techniques, from the SBC’s viewpoint, needed for monitoring media traffic are the same as for managing media traffic. SBCs on the media path are also capable of dealing with the lost BYE issue if either end point dies in the middle of the session. The SBC can detect that the media has stopped flowing, and issue a BYE to both sides to clean up any state in other intermediate elements and the end points.
One possible form of media traffic management is that SBCs terminate media streams and SIP dialogs by generating BYE requests. This kind of procedure can take place, for example,
example, an SBC can enable a plain SIP (RFC 3261) UA to connect to a 3GPP network, or enable a connection between UAs that support different IP versions, different codecs, or that are in different address realms. Operators have a requirement and a strong motivation for performing capability mismatch fixing, so that they can provide transparent communication across different domains. In some cases, different SIP extensions or methods to implement the same SIP application (like monitoring session liveness, call history/diversion, etc.) may also be interworked through the SBC.
14.2.3.3.2 Architectural Issues
SBCs that are fixing capability mismatches do it by inserting a media element into the media path using the procedures described earlier. Therefore, these SBCs have the same concerns as SBCs performing traffic management: the SBC may modify SIP messages without consent from any of the UAs. This may break end-to-end security and application extensions negotiation. The capability mismatch fixing is a fragile function in the long term. The number of incompatibilities built into various network elements is increasing the fragility and complexity over time. This might lead to a situation where SBCs need to be able to handle a large number of capability mismatches in parallel.
14.2.3.3.3 Example
Consider the following example scenario where the inner network is an access network using IPv4 and the outer network is using IPv6. The SBC receives an INVITE request with a session description from the access network (Figure 14.8).
Then, the SBC performs a capability mismatch fixing function. In this scenario, the SBC inserts Record-Route and Via headers and rewrites the c= line of the session description. Figure 14.9 shows the request after the capability mismatch adjustment.
This message is then sent by the SBC to the onward IPv6 network.
14.2.3.4 Maintaining SIP-Related NAT Bindings
14.2.3.4.1 General Information and Requirements
NAT traversal in this instance refers to the specific message modifications required to assist a UA in maintaining SIP and media connectivity when there is a NAT device located between a UA and a proxy/registrar and, possibly, any other UA. The primary purpose of the NAT traversal function is to keep up a control connection to UAs behind NATs. This can, for example, be achieved by generating periodic network traffic that keeps bindings in NATs alive. SBCs’ NAT traversal function is required in scenarios where the NAT is outside the SBC (i.e., not in cases where the SBC itself acts as a NAT).
An SBC performing a NAT traversal function for a UA behind a NAT sits between the UA and the registrar of the domain. NATs are widely deployed in various access networks today, so operators have a requirement to support NAT traversal. When the registrar receives a REGISTER request from the UA and responds with a 200 OK response, the SBC modifies such a response, decreasing the validity of the registration (i.e., the registration expires sooner). This forces the UA to send a new REGISTER to refresh the registration sooner than it would have done on receiving the original response from the registrar. The REGISTER requests sent by the UA refresh the binding of the NAT before the binding expires.
Note that the SBC does not need to relay all the REGISTER requests received from the UA to the registrar. The SBC can generate responses to REGISTER requests received before the registration is about to expire at the registrar. Moreover, the SBC needs to deregister the UA if the UA fails to refresh its registration in time, even if the registration at the registrar would still be valid. SBCs can also force
Consider the following example scenario: the SBC resides between the UA and Registrar. Previously, the UA has sent a REGISTER request to the Registrar, and the SBC receives the registration response shown in Figure 14.10.
◾◾ Preventing packets from unregistered users to mitigate chances of DoS attack
◾◾ Prioritization or rerouting of traffic based on user or service, like E911 as it enters the network
◾◾ Performing a load-balancing function or reducing the load on other network equipment
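The registration-refresh technique of Section 14.2.3.4, in which the SBC shortens the validity of the registrar's 200 OK so that the UA re-registers before the NAT binding expires, can be sketched as follows. This is a minimal Python sketch under stated assumptions: the function name shorten_registration, the 60-second limit, and the example address are hypothetical, and the string handling stands in for the full SIP parser that a real SBC would use. The sketch also omits the bookkeeping an SBC needs to remember the registrar's original expiry.

```python
import re

# Matches an "expires" parameter on a Contact header value, e.g. ";expires=3600".
_EXPIRES_PARAM = re.compile(r";\s*expires\s*=\s*(\d+)", re.IGNORECASE)


def shorten_registration(response: str, limit: int) -> str:
    """Return the response text with every registration lifetime capped at limit.

    Hypothetical helper for illustration; header lines are separated by CRLF.
    """
    def cap(match):
        # Keep the smaller of the registrar's value and the SBC's limit.
        return f";expires={min(int(match.group(1)), limit)}"

    out = []
    in_headers = True
    for line in response.split("\r\n"):
        if line == "":
            in_headers = False  # a blank line ends the header section
        if in_headers:
            name, sep, value = line.partition(":")
            header = name.strip().lower()
            if sep and header in ("contact", "m"):
                # Cap the per-binding expiry carried on the Contact header.
                line = name + sep + _EXPIRES_PARAM.sub(cap, value)
            elif sep and header == "expires" and value.strip().isdigit():
                # Cap the message-wide Expires header, if present.
                line = f"{name}{sep} {min(int(value.strip()), limit)}"
        out.append(line)
    return "\r\n".join(out)


if __name__ == "__main__":
    ok = ("SIP/2.0 200 OK\r\n"
          "Contact: <sip:bob@client.example.com>;expires=3600\r\n\r\n")
    print(shorten_registration(ok, 60))
```

A lifetime that is already below the limit is left unchanged, so applying the function twice gives the same result as applying it once.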
answers. With this information, the SBC can reject sessions before the available bandwidth is exhausted to allow existing sessions to maintain acceptable QOS. Otherwise, the link could become oversubscribed and all sessions would experience a deterioration in QOS. SBCs may contact a policy server to determine whether sufficient bandwidth is available on a per-session basis.

14.2.3.5.2 Architectural Issues

Since the SBC needs to handle all SIP messages, this function has scalability implications. In addition, the SBC is a single point of failure from an architectural point of view. However, in practice, many current SBCs are capable of supporting a redundant configuration, which prevents the loss of calls or sessions in the event of a failure on a single node. If access control is performed only on behalf of signaling, then the SBC is compatible with general SIP architectural principles; however, if it is performed for signaling and for media, then there are similar problems as described earlier.

14.2.3.5.3 Example

Figure 14.12 shows a call flow where the SBC is providing both signaling and media access control (ACKs omitted for brevity). In this scenario, the SBC first identifies the caller, so it can determine whether or not to give signaling access to the caller. This might be achieved using information gathered during registration, or by other means. Some SBCs may rely on the proxy to authenticate the UA placing the call. After identification, the SBC modifies the session descriptors in INVITE and 200 OK messages in such a way that the media is going to flow through the SBC itself. When the media starts flowing, the SBC can inspect whether the callee and caller use the codec(s) upon which they had previously agreed.

14.2.3.6 Protocol Repair

14.2.3.6.1 General Information and Requirements

SBCs are also used to repair protocol messages generated by clients that are not fully standard compliant, or are badly implemented. Operators may wish to support protocol repair if they want to support as many clients as possible. It is noteworthy that this function affects only the signaling component of an SBC, and that the protocol repair function is not the same as protocol conversion (i.e., translation between two completely different protocols).

14.2.3.6.2 Architectural Issues

In many cases, doing protocol repair for SIP header fields can be seen as being compatible with SIP architectural principles, and it does not violate the end-to-end model of SIP. An SBC that repairs protocol messages behaves as a proxy server that is liberal in what it accepts and strict in what it sends. However, protocol repair may break security mechanisms that do cryptographic computations on SIP header values. Attempting protocol repair for SIP message bodies that contain, for example, SDP is incompatible with Authenticated Identity Management (RFC 4474, see Sections 2.8 and 19.4.8) and end-to-end security mechanisms such as S/MIME. A similar problem related to increasing complexity, as explained earlier, also affects the protocol repair function.
Figure 14.12 Example access call flow. (Copyright IETF. Reproduced with permission.)
Figure 14.14 Media encryption example. (Copyright IETF. Reproduced with permission.)
14.2.4 Derived Requirements for Future SIP Standardization Work

Some of the functions listed in this chapter are more SIP-unfriendly than others. This list of requirements is derived from the functions that break the principles of SIP in one way or another when performed by SBCs that do not have the users’ consent. The derived requirements are as follows:

◾◾ There should be a SIP-friendly way to hide network topology information. Currently, this is done by stripping and replacing header fields, which, for some header fields, is against the principles of SIP.
◾◾ There should be a SIP-friendly way to direct media traffic through intermediaries. Currently, this is done by modifying session descriptors, which is against the principles of SIP.
◾◾ There should be a SIP-friendly way to fix capability mismatches in SIP messages. This requirement is harder to fulfill on complex mismatch cases, like the 3GPP/SIP (RFC 3261) network mismatch. Currently, this is done by modifying SIP messages, which, for some header fields, may violate end-to-end security.

The first two requirements do not have an existing standardized solution today. There is ongoing work in the Internet Engineering Task Force (IETF) for addressing the second requirement, such as SIP session policies, TURN (see Section 14.3), and ICE (see Section 14.3). Nonetheless, future work is needed in order to develop solutions to these requirements.

14.3 NAT Crossing by SIP

14.3.1 Overview

Being an application-layer signaling protocol, SIP’s messages usually consist of two parts: header and message body. The header part, which contains the public addresses of the SIP functional entities for routing of SIP signaling messages, consists of SIP, SIP Security, or telephone URIs. These URIs can then be translated into public IP addresses along with the port numbers of the corresponding transport protocols such as UDP, TCP, or others. Similarly, the SIP message body may contain the public addresses of media such as audio, video, and data applications, in terms of public IP addresses as well as ports of UDP, TCP, or other transport protocols expressed in SDP or other protocols for communications between the parties once the session is established. It should be noted that the IP addresses and port numbers that are dynamically allocated during the call/session in SIP are also communicated between the calling and called parties.

However, network address translation is used by network administrators for their private internal networks, which use private IP addresses and port numbers, to hide the internal network topologies for security reasons or because public IP addresses are not available. A NAT, which is considered a network-layer (or, in some cases, even layer 4) device, usually maps the internal private IP addresses and port numbers into public IP addresses and port numbers. The key problem is that when the calling party, the called party, or both reside behind NATs, their private IP addresses and port numbers are not known a priori, first for routing of the SIP signaling messages before the session is set up and then for routing of the media between the communicating parties once the session is established. Moreover, there are different kinds of NATs, and the behavior of one kind of NAT significantly differs from that of another.

NAT crossing by SIP has consumed huge resources in the IETF, including extensions to SDP, and it breaks the basic tenet of the Open Systems Interconnection (OSI) protocol stack, where each layer is supposed to work independently of all other layers for scalability. A key issue has been that a NAT opens and closes its pinholes for communications based on policies that depend on layer 3/4 properties and parameters on the order of a few seconds to minutes, while a SIP session and the media flows of the session may continue for hours if not days. The pinhole opening and closing policies are again handled by the upper-layer firewall (FW). In fact, the NAT and FW work hand in hand. If the NAT and FW are not collocated, an application protocol needs to be used between the two, and, often, these protocols are proprietary.

It turns out that SIP, being an upper application-layer protocol, needs to be aware of the NAT, a lower-layer (layer 3 or 4) device, in order to cross NATs. That is, the NAT functionality needs to be enhanced to act like a gateway that is aware of all layers from the network layer up to the SIP application layer, as if acting as a SIP B2BUA. The Session Traversal Utilities for NAT (STUN) (RFC 5389), Traversal Using Relays around NAT (TURN) (RFC 5766), and Interactive Connectivity Establishment (ICE) (RFC 5245) protocols have been developed to let SIP cross NATs so that, first, the session is established and then the media is transferred between the conferencing end points. Each of these NAT-crossing protocols has its own pros, cons, and particular usages depending on different network settings. We will only discuss some basic features of NAT crossing.

14.3.2 NAT-Crossing Protocols

14.3.2.1 STUN

The primary purpose of the STUN protocol is to let applications like SIP discover the presence and types of NATs and firewalls between them and the public IP network, including the Internet. In addition, applications such as SIP can
determine the public IP addresses allocated to them by the NAT. However, STUN works only with NATs, such as full-cone NATs, that use the UDP transport protocol, and it cannot work with symmetric NATs, thereby raising security concerns. The STUN functional entity, for example, implemented in a stand-alone server, must have a public IP address. Figure 14.15 shows an example of a STUN server configuration that can be used in the SIP network by SIP functional entities (e.g., UAs and proxies) that are aware of the STUN protocol and act as the STUN client/server.

Figure 14.15a depicts a STUN client on the SIP phone or other end-point device sending packets via the NAT/FW to the STUN server that has the public IP address. The STUN server answers back with information about the IP address and ports from which the packets were received (Figure 14.15a). In this way, the STUN server detects the type of NAT device through which the packets were sent. The STUN client in the SIP end point uses this information in constructing its headers so that external contacts can reach it without the need for any other device or technique. Figure 14.15b shows the communications between hosts that are residing behind NATs once the public addresses are discovered using the STUN protocol. It should be noted that the media goes to the hosts directly via the respective NATs without any involvement of the STUN server.

Figure 14.15 STUN protocol: (a) address discovery by STUN clients residing behind NATs and (b) communications between hosts residing behind NATs.

14.3.2.2 TURN

We have seen that STUN does not work for NATs that are symmetric or that use a connection-oriented transport protocol such as TCP, where the duration of the transport connection, unlike with the connectionless UDP protocol, is dependent on the application type. In these situations, if a host like the SIP end point is located behind a NAT, then in certain situations it can be impossible for that host to communicate directly with other SIP hosts (peers). The TURN protocol is an extension of STUN and allows the SIP host to control the operation of the relay and to exchange packets with its peers using the relay. Working for both connection-oriented (e.g., TCP) and connectionless (e.g., UDP) transport protocols along with full-cone and symmetric NATs, TURN provides the same protection as that created by symmetric NATs/firewalls because it connects clients behind a NAT to a single peer. Note that TURN differs from some other relay control protocols in that it allows a client to communicate with multiple peers using a single relay address.

Any data received by the TURN server is forwarded, as it acts as the relay. However, it can be seen that the TURN client on the inside of the NAT/FW can then be on the receiving end (but not the sending end, thereby limiting applicability) of a connection that is requested by the client on the inside. That is, it only allows the inbound traffic through a NAT/FW where the TURN client is in control. In addition, the media must go through the TURN server because it relays both incoming and outgoing media streams, consuming significant bandwidth. Extra media delays are introduced because of relaying and the extra hops that the media needs to traverse over the network. Consequently, the use of the TURN server limits the scalability of NAT-crossing implementations of multimedia applications like SIP over the IP network.

14.3.2.3 ICE

ICE allows application end points like SIP to discover other peers residing behind NATs/FWs and then establish a connection. ICE itself is a complex protocol that encompasses all the functionalities of STUN, TURN, and other protocols
for solving NAT/FW-crossing problems that are faced by applications like SIP. It encompasses multiple solutions and is regarded as one that will always enable the connection, regardless of the number of NATs involved. ICE essentially incorporates all the functionalities that are needed by applications like SIP for NAT/FW crossing. It requires that the SIP entities (e.g., UAs) be ICE-aware. However, ICE can be used by any protocol utilizing the offer–answer model, such as SIP. ICE takes over the control for opening and closing of pinholes to let the communication through the NAT/FW. Figure 14.15 depicts the NAT/FW crossing by SIP end points using ICE. In some complicated scenarios, ICE may not work where the NAT/FW deviates from the behaviors that are expected by STUN, TURN, and ICE. In those cases, extensions are needed in ICE.

14.4 SIP–PSTN/ISDN Protocols Interworking

14.4.1 Overview

SIP is a new-generation networked multimedia protocol that separates the signaling/call/session control from the media flows for both point-to-point and multipoint communications. A SIP call/session can have many legs/subsessions, and each of these legs/subsessions can be controlled, established, and torn down independently without affecting others. A family of PSTN/ISDN protocols has emerged over many years, such as Channel Associated Signaling (CAS), ISDN, ISDN User Part (ISUP), QSIG (ITU-T Q-Series Signaling Standard), and others. In addition, many variations of each of these protocols have been created in each region/country worldwide. However, in PSTN/ISDN protocols, the signaling/call/session and media flows are tightly coupled together, including the physical circuits of the PSTN/ISDN.

As a result, PSTN/ISDN protocols are extremely inflexible in dealing with multiple media of a given multimedia application where each media needs to be controlled and manipulated independently within a given call/session, not to mention the individual call-leg/subsession. In view of this, the interworking between SIP and PSTN/ISDN protocols has become very difficult to map on a one-to-one basis. The PSTN/ISDN protocol information that cannot be translated or mapped to SIP messages will be lost if the call routes back to the PSTN. In many cases, the part of the PSTN/ISDN protocol information that cannot be translated/mapped into SIP is encapsulated and carried in MIME bodies within the SIP message body. Many IETF interworking standards such as SIP for Telephony (SIP-T) (RFC 3372), SIP–ISUP (RFC 3398), SIP-QSIG (RFC 4497), and others have been created. We are only providing a brief framework of SIP interworking with PSTN/ISDN protocols following SIP-T here.

14.4.2 SIP-PSTN/ISDN Protocols Interworking Framework

RFC 3372 provides a framework for the integration of legacy PSTN/ISDN protocols into SIP messages, using a couple of basic SIP–ISUP interworking call flow examples. SIP-T provides two characteristics, transparency of ISUP signaling and routability of SIP messages, through techniques known as encapsulation and translation/mapping, respectively. At a SIP–ISUP gateway, ISUP messages are encapsulated within SIP so that information necessary for services is not discarded in the SIP request. However, intermediaries like proxy servers that make routing decisions for SIP requests cannot be expected to understand ISUP; thus, simultaneously, some critical information is translated from an ISUP message into the corresponding SIP headers in order to determine how the SIP request will be routed. Table 14.1 shows the summary of the interworking framework.

Let us take a simple interworking example, as shown in Figure 14.16, in which an IP network connects two different PSTN/ISDN networks. The call originates in one PSTN/ISDN network and terminates in another PSTN/ISDN network via the IP network, while the ISUP and SIP call signaling protocols are used for the PSTN/ISDN and the IP network, respectively. We have shown two media gateway controllers (MGC) that act as the gateways between the IP and PSTN/ISDN networks. It should be noted that the MGC will have both signaling and media interfaces. The SIP proxy servers are usually responsible for routing SIP requests (based on the Request-URI) to the eventual end points. As a result, the originating MGC does not know in advance at which destination end point a SIP call will terminate.

Table 14.1 SIP–PSTN/ISUP Protocols Interworking Framework Summary

SIP-PSTN/ISDN Protocols Interworking Requirements       SIP-T Functions (RFC 3372)
Transparency of ISUP signaling                          Encapsulation of ISUP in the SIP body
Routability of SIP messages with dependencies on ISUP   Translation of ISUP information into the SIP header
Transfer of mid-call ISUP signaling messages            Use of the SIP INFO method (RFC 6086) for mid-call signaling

Source: Copyright IETF. Reproduced with permission.
[Figure 14.16 elements: PSTN/ISDN, ISUP, MGC1, SIP, SIP proxy, SIP, MGC2, ISUP, PSTN/ISDN; MGC1, the SIP proxy, and MGC2 are attached to the IP network]
Figure 14.16 PSTN origination–PSTN termination (SIP bridging). (Copyright IETF. Reproduced with permission.)
Therefore, the originator does not select from the flows described in this section, as a matter of static configuration or on a per-call basis; rather, each call is routed by the SIP network independently, and it may instantiate any of the flows below as the routing logic of the network dictates. When a call destined for the SIP network originates in the PSTN, an ISUP message will eventually be received by the gateway that is the point of interconnection with the PSTN network. From the perspective of SIP, this gateway is the UAC for this call setup request.

Traditional SIP routing is used in the IP network to determine the appropriate point of termination (in this instance, a gateway) and to establish a SIP dialog and begin negotiation of a media session between the origination and termination end points. The egress gateway then signals ISUP to the PSTN/ISDN, reusing any encapsulated ISUP present in the SIP request it receives as appropriate. A very elementary call flow for SIP bridging is shown in Figure 14.17.

This scenario (Figures 14.16 and 14.17), which shows calls originating and terminating in PSTN/ISDN networks interconnected by the IP backbone network, is also known as the SIP trunking configuration. SIP trunking has been very successful in early deployment of SIP because the large traffic volume passing over the IP backbone network interconnecting PSTN/ISDN networks requires only a small portion of network bandwidth because of packet switching as opposed to circuit switching. In addition, it is easy to integrate the equipment of different vendors for interoperability by using a single SIP standard, while there are different variations of PSTN/ISDN standards worldwide. Many other scenarios of
PSTN/ISDN   MGC1   Proxy   MGC2   PSTN/ISDN
F1. IAM
F2. INVITE
F3. IAM
F4. 100 Trying
F5. ACM
F6. 18x
F7. ACM
F8. ANM
F9. 200 OK
F10. ANM
F11. ACK
Conversation
F12. REL
F13. RLC
F14. BYE
F15. REL
F16. 200 OK
F17. RLC
Figure 14.17 Elementary call flows. (Copyright IETF. Reproduced with permission.)
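
The message sequence of Figure 14.17 can also be written down as data, which makes the split between ISUP and SIP hops explicit. In the Python sketch below, the message names and their order come from the figure, while the sender and receiver of each message are an assumption based on the usual ladder layout (originating PSTN/ISDN on the left, terminating PSTN/ISDN on the right). The labels PSTN-A and PSTN-B and the helper names are hypothetical, and the proxy that relays SIP messages between the two MGCs is left implicit.

```python
"""The elementary SIP bridging flow as data; hops and directions are assumed."""

# (step, message, sender, receiver). SIP messages between the two MGCs are
# relayed by the proxy, which is not modeled as a separate hop here.
FLOW = [
    ("F1", "IAM", "PSTN-A", "MGC1"),
    ("F2", "INVITE", "MGC1", "MGC2"),
    ("F3", "IAM", "MGC2", "PSTN-B"),
    ("F4", "100 Trying", "MGC2", "MGC1"),
    ("F5", "ACM", "PSTN-B", "MGC2"),
    ("F6", "18x", "MGC2", "MGC1"),
    ("F7", "ACM", "MGC1", "PSTN-A"),
    ("F8", "ANM", "PSTN-B", "MGC2"),
    ("F9", "200 OK", "MGC2", "MGC1"),
    ("F10", "ANM", "MGC1", "PSTN-A"),
    ("F11", "ACK", "MGC1", "MGC2"),
    ("F12", "REL", "PSTN-A", "MGC1"),
    ("F13", "RLC", "MGC1", "PSTN-A"),
    ("F14", "BYE", "MGC1", "MGC2"),
    ("F15", "REL", "MGC2", "PSTN-B"),
    ("F16", "200 OK", "MGC2", "MGC1"),
    ("F17", "RLC", "PSTN-B", "MGC2"),
]


def protocol(sender: str, receiver: str) -> str:
    """ISUP is spoken on any hop that touches a PSTN/ISDN network, SIP elsewhere."""
    return "ISUP" if "PSTN" in sender or "PSTN" in receiver else "SIP"


def ladder(flow=FLOW) -> list[str]:
    """Render one text line per message, tagged with the protocol of its hop."""
    return [f"{step:>3} {protocol(s, r):4} {s:>6} -> {r:<6} {msg}"
            for step, msg, s, r in flow]
```

Printing "\n".join(ladder()) gives a one-line-per-message view in which each ISUP message on the originating side is followed by its SIP counterpart and then by the regenerated ISUP message on the terminating side.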
it also allows the preemption of calls for (scarce) resource reasons if the resources are insufficient for a particular session to continue.

However, the resources that are required to support the needs of the multimedia applications in terms of bandwidth, and others, reside primarily in the network layer. It implies that session establishment should not take place until there are enough resources in the network for meeting the upper-layer performance requirements. This situation imposes a kind of precondition for setting up the session. RFC 3312 (see Section 15.4), which is described here, provides mechanisms to define those preconditions in meeting the quality-of-service (QOS) requirements before setting up SIP sessions. The network-layer QOS signaling protocols, like the Resource ReSerVation Protocol (RSVP), that reserve the network resources to meet the SIP application-layer QOS requirements are described in detail. The multimedia application QOS parameters are carried over the Session Description Protocol (SDP) used in the message body of SIP signaling messages. Different media streams may have different QOS requirements, and the QOS flows need to be mapped to the appropriate media streams. In addition, the SDP offer–answer model allows negotiating appropriate QOS for each of the media streams between the SIP end points. RFCs 3524 (see Section 15.5) and 5432 (see Section 15.6), which are described here, specify QOS mapping to media streams and QOS negotiations between the end points, respectively.

15.2 Communications Resource Priority in SIP

15.2.1 Overview

During emergencies, communications resources (including telephone circuits, Internet Protocol [IP] bandwidth, and gateways between the circuit-switched and IP networks) may become congested. Congestion can occur due to heavy usage, loss of resources caused by a natural or man-made disaster, and attacks on the network during man-made emergencies. This congestion may make it difficult for persons charged with emergency assistance, recovery, or law enforcement to coordinate their efforts. As IP networks become part of converged or hybrid networks, along with public and private circuit-switched (telephone) networks, it becomes necessary to ensure that these networks can assist during such emergencies. Also, users may want to interrupt their lower-priority communications activities and dedicate their end-system resources to the high-priority communications attempt if a high-priority communications request arrives at their end system.

There are many IP-based services that can assist during emergencies. RFC 4412 covers real-time communications applications involving SIP (RFC 3261), including Voice over IP, multimedia conferencing, instant messaging, and presence. SIP applications may involve at least five different resources that may become scarce and congested during emergencies. These resources include gateway resources, circuit-switched network resources, IP network resources, receiving end-system resources, and SIP proxy resources. IP network resources are beyond the scope of SIP signaling and are therefore not considered here.

Even if the resources at the SIP element itself are not scarce, a SIP gateway may mark outgoing calls with an indication of priority, for example, on an ISUP (Integrated Services Digital Network User Part) Initial Address Message (IAM) originated by a SIP gateway with the public switched telephone network (PSTN). To improve emergency response, it may become necessary to prioritize access to SIP-signaled resources during periods of emergency-induced resource scarcity. We call this resource prioritization. The mechanism itself may well be in place at all times, but may only materially affect call handling during times of resource scarcity. Currently, SIP does not include a mechanism that allows a request originator to indicate to a SIP element that it wishes the request to invoke such resource prioritization. To address this need, RFC 4412 adds a SIP protocol element that labels certain SIP requests.

RFC 4412 defines two new SIP header fields for communications resource priority, called Resource-Priority and Accept-Resource-Priority. The Resource-Priority header field may be used by SIP user agents (UAs), including PSTN gateways and terminals, and SIP proxy servers to influence their treatment of SIP requests, including the priority afforded to PSTN calls. For PSTN gateways, the behavior translates into analogous schemes in the PSTN, for example, the ITU Recommendation Q.735.3 [1] prioritization mechanism, in both the PSTN-to-IP and IP-to-PSTN directions. ITU Recommendation I.255.3 [2] is another example. A SIP request with a Resource-Priority indication can be treated differently in these situations:

◾◾ The request can be given elevated priority for access to PSTN gateway resources, such as trunk circuits.
◾◾ The request can interrupt lower-priority requests at a user terminal, such as an IP phone.
◾◾ The request can carry information from one multilevel priority domain in the telephone network (e.g., using the facilities of Q.735.3 [1]) to another, without the SIP proxies themselves inspecting or modifying the header field.
◾◾ In SIP proxies and back-to-back UAs, requests of higher priorities may displace existing signaling requests or bypass PSTN gateway capacity limits in effect for lower priorities.
This header field is related to, but differs in semantics from, the Priority header field of RFC 3261 (see Section 2.8). The Priority header field describes the importance that the SIP request should have for the receiving human or its agent. For example, that header may be factored into decisions about call routing to mobile devices and assistants, and about call acceptance when the call destination is busy. The Priority header field does not affect the usage of PSTN gateway or proxy resources, for example. In addition, any SIP user agent client (UAC) can assert any Priority value, whereas usage of Resource-Priority header field values is subject to authorization. While the Resource-Priority header field does not directly influence the forwarding behavior of IP routers or the use of communications resources such as packet forwarding priority, procedures for using this header field to cause such influence may be defined in other documents.

Existing implementations of RFC 3261 that do not participate in the resource-priority mechanism follow the normal rules of RFC 3261 (repeating from Section 3.1.3.2): “If a UAS does not understand a header field in a request (that is, the header field is not defined in this specification or in any supported extension), the server must ignore that header field and continue processing the message.” Thus, the use of this mechanism is wholly invisible to existing implementations unless the request includes the Require header field with the resource-priority option tag. The mechanism described here can be used for emergency preparedness in emergency telecommunications systems, but is only a small part of an emergency preparedness network and is not restricted to such use. The mechanism aims to satisfy the requirements in RFC 3487. It is structured so that it works in all networks that are transparent to SIP and to the Real-Time Transport Protocol (RTP) specified in RFC 3550 (see Section 7.2).

In such networks, all network elements and SIP proxies let valid SIP requests pass through unchanged. This is important since it is likely that this mechanism will often be deployed in networks where the edge networks are unaware of the resource-priority mechanism and provide no special privileges to such requests. The request then reaches a PSTN gateway or set of SIP elements that are aware of the mechanism. For conciseness, we refer to SIP proxies and UAs that act on the Resource-Priority header field as resource-priority actors (RP actors). It is likely to be common that the same SIP element will handle requests that bear the Resource-Priority header fields and those that do not. Government entities and standardization bodies have developed several different priority schemes for their networks. Users would like to be able to obtain authorized priority handling in several of these networks, without changing SIP clients. Also, a single call may traverse SIP elements that are run by different administrations and subject to different priority mechanisms. Since there is no global ordering among those priorities, we allow each request to contain more than one priority value drawn from these different priority lists, each called a namespace in this document.

Typically, each SIP element only supports one such namespace, but we discuss what happens if an element needs to support multiple namespaces. Since gaining prioritized access to resources offers opportunities to deny service to others, it is expected that all such prioritized calls are subject to authentication and authorization, using standard SIP security or other appropriate mechanisms. Since calls may traverse multiple administrative domains with different namespaces or multiple elements with the same namespace, it is strongly suggested that all such domains and elements apply the same algorithms for the same namespace, as otherwise the end-to-end experience of privileged users may be compromised.

15.2.2 Resource-Priority SIP Header Field

We describe the Resource-Priority and Accept-Resource-Priority SIP header field syntax in more detail here for convenience, although all SIP syntaxes are described in Section 2.4.1. The behaviors of SIP entities for processing these headers are described in the next section.

15.2.2.1 Resource-Priority Header Field

The Resource-Priority request header field marks a SIP request as desiring prioritized access to resources, as described in Section 15.1. There is no protocol requirement that all requests within a SIP dialog or session use the Resource-Priority header field. Local administrative policy may mandate the inclusion of the Resource-Priority header field in all requests. Implementations of this specification must allow inclusion to be either by explicit user request or automatic for all requests. The syntax of the Resource-Priority header field is described below. The token-nodot production is used from SIP Event Notification RFC 6665 (see Section 5.2).

    Resource-Priority = "Resource-Priority" HCOLON r-value *(COMMA r-value)
    r-value           = namespace "." r-priority
    namespace         = token-nodot
    r-priority        = token-nodot
    token-nodot       = 1*( alphanum / "-" / "!" / "%" / "*" / "_" / "+" / "`" / "'" / "~" )

An example Resource-Priority header field is shown below:

    Resource-Priority: dsn.flash

The r-value parameter in the Resource-Priority header field indicates the resource priority desired by the request originator. Each resource value (r-value) is formatted as
namespace “.” priority value. The value is drawn from the namespace identified by the namespace token. Namespaces and priorities are case-insensitive ASCII tokens that do not contain periods. Thus, dsn.flash and DSN.Flash, for example, are equivalent. Each namespace has at least one priority value. Namespaces and priority values within each namespace must be registered with the Internet Assigned Numbers Authority (IANA). Initial namespace registrations are described here. Since a request may traverse multiple administrative domains with multiple different namespaces, it is necessary to be able to enumerate several different namespaces within the same message.

However, a particular namespace must not appear more than once in the same SIP message. These may be expressed equivalently as either comma-separated lists within a single header field, as multiple header fields, or as some combination. The ordering of r-values within the header field has no significance. Thus, for example, the following header snippets are equivalent:

    Resource-Priority: dsn.flash, wps.3

    Resource-Priority: wps.3, dsn.flash

    Resource-Priority: wps.3
    Resource-Priority: dsn.flash

15.2.2.2 Accept-Resource-Priority Header Field

The Accept-Resource-Priority response header field enumerates the resource values (r-values) a SIP user agent server (UAS) is willing to process. (This does not imply that a call with such values will find sufficient resources and succeed.) The syntax of the Accept-Resource-Priority header field is as follows:

    Accept-Resource-Priority = "Accept-Resource-Priority" HCOLON [r-value *(COMMA r-value)]

An example is given below:

    Accept-Resource-Priority: dsn.flash-override, dsn.flash, dsn.immediate, dsn.priority, dsn.routine

15.2.2.3 Resource-Priority Header Field Usage

The usage of the Resource-Priority and Accept-Resource-Priority header fields by all SIP methods and functional entities is shown in Table 2.5, Section 2.8.

15.2.2.4 Resource-Priority Option Tag

RFC 4412 also defines the resource-priority option tag. The behavior is described in Section 15.2.

15.2.3 Behavior of SIP Elements That Receive Prioritized Requests

15.2.3.1 General Rules

The Resource-Priority header field is potentially applicable to all SIP request messages. At a minimum, implementations of the following request types MUST support the Resource-Priority header to be in compliance with this specification: INVITE, ACK, PRACK, UPDATE, and REFER. Implementations should support the Resource-Priority header field in the following request types: MESSAGE, SUBSCRIBE, and NOTIFY.

Note that this does not imply that all implementations have to support all request methods listed. If a SIP element receives the Resource-Priority header field in a request other than those listed above, the header may be ignored, according to the rules of RFC 3261. In short, an RP actor performs the following steps when receiving a prioritized request:

◾◾ If the RP actor recognizes none of the namespaces, it treats the request as if it had no Resource-Priority header field.
◾◾ It ascertains that the request is authorized according to local policy to use the priority levels indicated. If the request is not authorized, it rejects it. Examples of authorization policies are discussed in Security Considerations.
◾◾ If the request is authorized and resources are available (no congestion), it serves the request as usual. If the request is authorized but resources are not available (congestion), it either preempts other current sessions or inserts the request into a priority queue.

15.2.3.2 Usage of Require Header with Resource-Priority
Some administrative domains may choose to disable the Following standard SIP behavior, if a SIP request contains
use of the Accept-Resource-Priority header for revealing too the Require header field with the resource-priority option
much information about that domain in responses. However, tag, a SIP UA MUST respond with a 420 Bad Extension
this behavior is not recommended, as this header field aids in if it does not support the SIP extensions described in this
troubleshooting. document. It then lists resource-priority in the Unsupported
header field included in the response. The use of the resource-priority option tag in Proxy-Require header field is not recommended.

15.2.3.3 OPTIONS Request with Resource-Priority
An OPTIONS request can be used to determine if an element supports the mechanism. A compliant implementation should return an Accept-Resource-Priority header field in OPTIONS responses enumerating all valid resource values, but an RP actor may be configured not to return such values or only to return them to authorized requestors. Following standard SIP behavior, OPTIONS responses must include the Supported header field that includes the resource-priority option tag. According to RFCs 3261 and 3840 (see Sections 3.4 and 3.5), proxies that receive a request with a Max-Forwards header field value of zero may answer the OPTIONS request, allowing a UAC to discover the capabilities of both proxy and UASs.

15.2.3.4 Approaches for Preferential Treatment of Requests
SIP elements may use the resource-priority mechanism to modify a variety of behaviors, such as routing requests, authentication requirements, override of network capacity controls, or logging. The resource-priority mechanism may influence the treatment of the request itself, the marking of outbound PSTN calls at a gateway, or of the session created by the request. (Here, we use the terms session and call interchangeably, both implying a continuous data stream between two or more parties. Sessions are established by SIP dialogs.) Below, we define two common algorithms, namely, preemption and priority queuing. Preemption applies only to sessions created by SIP requests, while both sessions and request handling can be subject to priority queuing. Both algorithms can sometimes be combined in the same element, although none of the namespaces described in this document do this. Algorithms can be defined for each namespace or, in some cases, can be specific to an administrative domain. Other behavior, such as request routing or network management controls, is not defined by this specification. Naturally, only SIP elements that understand this mechanism and the namespace and resource value perform these algorithms. We also discuss what happens if an RP actor does not understand priority values contained in a request.

15.2.3.4.1 Preemption
An RP actor following a preemption policy may disrupt an existing session to make room for a higher-priority incoming session. Since sessions may require different amounts of bandwidth or a different number of circuits, a single higher-priority session may displace more than one lower-priority session. Unless otherwise noted, requests do not preempt other requests of equal priority. As noted above, the processing of SIP requests itself is not preempted. Thus, since proxies do not manage sessions, they do not perform preemption. RFC 4411 (see Section 15.3) contains more details and examples of this behavior. UAS behavior for preemption is discussed here.

15.2.3.4.2 Priority Queuing
In a priority queuing policy, requests that find no available resources are queued to the queue assigned to the priority value. Unless otherwise specified, requests are queued in a first-come, first-served order. Each priority value may have its own queue, or several priority values may share a single queue. If a resource becomes available, the RP actor selects the request from the highest-priority nonempty queue according to the queue service policy. For first-come, first-served policies, the request from that queue that has been waiting the longest is served. Each queue can hold a finite number of pending requests. If the per-priority-value queue for a newly arriving request is full, the request is rejected immediately, with the status codes specified here. In addition, a priority queuing policy may impose a waiting time limit for each priority class, whereby requests that exceed a specified waiting time are ejected from the queue and a 408 Request Timeout failure response is returned to the requestor. Finally, an RP actor may impose a global queue size limit summed across all queues and drop waiting lower-priority requests with a 408 Request Timeout failure response. This does not imply preemption, since the session has not been established yet. UAS behavior for queuing is discussed later.

15.2.3.5 Error Conditions
We describe the error behavior that is shared among multiple types of RP actors (including various instances of UAS such as trunk gateways, line gateways, and IP phones) and proxies. A request containing a resource-priority indication can fail for four reasons:

◾◾ The RP actor does not understand the priority value.
◾◾ The requestor is not authenticated.
◾◾ An authenticated requestor is not authorized to make such a request.
◾◾ There are insufficient resources for an authorized request.

We treat these error cases in the order that they typically arise in the processing of requests with Resource-Priority headers. However, this order is not mandated. For example,
an RP actor that knows that a particular resource value cannot be served or queued may, as a matter of local policy, forgo authorization, since it would only add processing load without changing the outcome.

15.2.3.5.1 No Known Namespace or Priority Value
If an RP actor does not understand any of the resource values in the request, the treatment depends on the presence of the Require resource-priority option tag:

1. Without the option tag, the RP actor treats the request as if it contained no Resource-Priority header field and processes it with default priority. Resource values that are not understood must not be modified or deleted.
2. With the option tag, it must reject the request with a 417 Unknown Resource-Priority response code.

Making the first case the default is necessary since otherwise there would be no way to successfully complete any calls in the case where a proxy on the way to the UAS shares no common namespaces with the UAC, even though the UAC and UAS do have such a namespace in common. In general, as noted, a SIP request can contain more than one Resource-Priority header field. This is necessary if a request needs to traverse different administrative domains, each with its own set of valid resource values. For example, the ETS namespace might be enabled for US government networks that also support the Defense Switched Network (DSN) or Defense Red Switched Network (DRSN) namespaces for most individuals in those domains. A 417 Unknown Resource-Priority response may, according to local policy, include an Accept-Resource-Priority header field enumerating the acceptable resource values.

15.2.3.5.2 Authentication Failure
If the request is not authenticated, a 401 Unauthorized or 407 Proxy Authentication Required response is returned to allow the requestor to insert appropriate credentials.

15.2.3.5.3 Authorization Failure
If the RP actor receives an authenticated request with a namespace and priority value it recognizes but the originator is not authorized for that level of service, the element must return a 403 Forbidden response.

15.2.3.5.4 Insufficient Resources
Insufficient resource conditions can occur on proxy servers and UASs, typically trunk gateways, if an RP actor receives an authorized request, has insufficient resources, and the request neither preempts another session nor is queued. A request can fail because the RP actor has either insufficient processing capacity to handle the SIP request, or insufficient bandwidth or trunk capacity to establish the requested session for session-creating SIP requests. If the request fails because the RP actor cannot handle the signaling load, the RP actor responds with 503 Service Unavailable. If there is not enough bandwidth, or if there is an insufficient number of trunks, a 488 Not Acceptable Here response indicates that the RP actor is rejecting the request due to media path availability, such as insufficient gateway resources. In that case, RFC 3261 advises that a 488 response (see Section 2.6) should include a Warning header field with a reason for the rejection; warning code 370 Insufficient Bandwidth is typical. For systems implementing queuing, if the request is queued, the UAS will return 408 Request Timeout if the request exceeds the maximum configured waiting time in the queue.

15.2.3.5.5 Busy
Resource contention also occurs when a call request arrives at a UAS that is unable to accept another call, because the UAS either has just one line appearance or has active calls on all line appearances. If the call request indicates an equal- or lower-priority value when compared with all active calls present on the UAS, the UAS returns a 486 Busy Here response. If the request is queued instead, the UAS will return a 408 Request Timeout if the request exceeds the maximum configured waiting time in the device queue. If a proxy gets 486 Busy Here responses on all branches, it can then return a 600 Busy Everywhere response to the caller.

15.2.4 UAC Behavior
SIP UACs supporting this specification must be able to generate the Resource-Priority header field for requests that require elevated resource access priority. As stated previously, the UAC should be able to generate more than one resource value in a single SIP request. Upon receiving a 417 Unknown Resource-Priority response, the UAC may attempt a subsequent request with the same or different resource value. If available, it should choose authorized resource values from the set of values returned in the Accept-Resource-Priority header field.

15.2.4.1 Preemption Algorithm
A UAC that requests a priority value that may cause preemption must understand a Reason header field in the BYE request explaining why the session was terminated, as discussed in RFC 4411 (see Section 15.3).
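The error cases of Section 15.2.3.5 can be summarized as a small decision procedure. The following Python sketch is illustrative only: the RequestState class, its field names, and the select_response function are hypothetical, and the checks follow the typical order described in the text, which is not mandated. Queue timeouts (408) and the busy case (486) are left out to keep the sketch short.

```python
from dataclasses import dataclass


@dataclass
class RequestState:
    """Hypothetical summary of what an RP actor has learned about a request."""
    namespace_known: bool = True        # at least one r-value is understood
    require_tag: bool = False           # Require: resource-priority is present
    authenticated: bool = True
    authorized: bool = True
    signaling_overloaded: bool = False  # cannot handle the SIP signaling load
    media_available: bool = True        # enough bandwidth or trunks
    is_proxy: bool = False              # proxies challenge with 407, others with 401


def select_response(state: RequestState) -> int:
    """Return a SIP failure status code, or 0 if processing may continue."""
    if not state.namespace_known:
        # Without the option tag the request is handled with default priority;
        # this sketch simply lets it continue.
        return 417 if state.require_tag else 0
    if not state.authenticated:
        return 407 if state.is_proxy else 401
    if not state.authorized:
        return 403
    if state.signaling_overloaded:
        return 503
    if not state.media_available:
        return 488  # a Warning header (for example code 370) would accompany this
    return 0
```

For example, select_response(RequestState(namespace_known=False, require_tag=True)) yields 417, while the same request without the option tag yields 0 and is processed with default priority.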
the request like any other request. Naturally, the request may still succeed.

15.2.9 Examples
The SDP message body and the BYE and ACK exchanges/call flows are the same as in RFC 3665 and are omitted for brevity.

15.2.9.1 Simple Call
In this scenario (Figure 15.1), user A completes a call to user B directly. The call from A to B is marked with a resource-priority indication.

F1 INVITE User A -> User B

INVITE sip:[email protected] SIP/2.0
Via: SIP/2.0/TCP client.atlanta.example.com:5060;branch=z9hG4bK74bf9
Max-Forwards: 70
From: BigGuy <sip:[email protected]>;tag=9fxced76sl
To: LittleGuy <sip:[email protected]>
Call-ID: [email protected].com
CSeq: 1 INVITE
Resource-Priority: dsn.flash
Contact: <sip:[email protected].com;transport=tcp>
Content-Type: application/SDP
Content-Length:...
...

F2 180 Ringing User B -> User A

SIP/2.0 180 Ringing
Via: SIP/2.0/TCP client.atlanta.example.com:5060;branch=z9hG4bK74bf9
  ;received=192.0.2.101
From: BigGuy <sip:[email protected]>;tag=9fxced76sl
To: LittleGuy <sip:[email protected]>;tag=8321234356
Call-ID: [email protected].com
CSeq: 1 INVITE
Contact: <sip:[email protected].com;transport=tcp>
Content-Length: 0

F3 200 OK User B -> User A

SIP/2.0 200 OK
Via: SIP/2.0/TCP client.atlanta.example.com:5060;branch=z9hG4bK74bf9
  ;received=192.0.2.101
From: BigGuy <sip:[email protected]>;tag=9fxced76sl
To: LittleGuy <sip:[email protected]>;tag=8321234356
Call-ID: [email protected].com
CSeq: 1 INVITE
Contact: <sip:[email protected].com;transport=tcp>
Content-Type: application/SDP
Content-Length:...
...

15.2.9.2 Receiver Does Not Understand Namespace
In this example (Figure 15.2), the receiving UA does not understand the dsn namespace and thus returns a 417 Unknown Resource-Priority status code. We omit the message details for messages F5 through F7, since they are essentially the same as in the first example.
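Before a receiver can decide whether it understands a namespace such as dsn, it has to collect the r-values from the request. The sketch below shows one way this might be done in Python, following the rules stated earlier in Section 15.2.2: tokens are case-insensitive, r-values may be spread over comma-separated lists and multiple header fields, ordering carries no meaning, and a namespace must not appear twice. The function name parse_resource_priority and the dictionary it returns are assumptions made for illustration.

```python
def parse_resource_priority(header_values):
    """Collect r-values from one or more Resource-Priority header field values.

    header_values: iterable of strings such as "dsn.flash, wps.3".
    Returns a dict mapping lowercased namespace -> lowercased priority value.
    Raises ValueError for a malformed r-value or a repeated namespace.
    """
    r_values = {}
    for value in header_values:
        for item in value.split(","):
            item = item.strip()
            if not item:
                continue
            namespace, sep, priority = item.partition(".")
            # Neither token may be empty, and tokens contain no periods.
            if not sep or not namespace or not priority or "." in priority:
                raise ValueError(f"malformed r-value: {item!r}")
            namespace, priority = namespace.lower(), priority.lower()
            if namespace in r_values:
                raise ValueError(f"namespace appears more than once: {namespace}")
            r_values[namespace] = priority
    return r_values
```

With this normalization, the three equivalent header snippets shown in Section 15.2.2 all produce the same dictionary, and a receiver can test whether any key of the result belongs to its own set of supported namespaces.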
Figure 15.2 Call flows where receiver does not understand namespace. (Copyright IETF. Reproduced with permission.)

F2 417 Resource-Priority failed User B -> User A

SIP/2.0 417 Unknown Resource-Priority
Via: SIP/2.0/TCP client.atlanta.example.com:5060;branch=z9hG4bK74bf9
  ;received=192.0.2.101
From: BigGuy <sip:[email protected]>;tag=9fxced76sl
To: LittleGuy <sip:[email protected]>;tag=8321234356
Call-ID: [email protected].com
CSeq: 1 INVITE
Accept-Resource-Priority: q735.0, q735.1, q735.2, q735.3, q735.4
Contact: <sip:[email protected].com;transport=tcp>
Content-Type: application/SDP
Content-Length: 0

F3 ACK User A -> User B

ACK sip:[email protected] SIP/2.0
Via: SIP/2.0/TCP client.atlanta.example.com:5060;branch=z9hG4bK74bd5
Max-Forwards: 70
From: BigGuy <sip:[email protected]>;tag=9fxced76sl
To: LittleGuy <sip:[email protected]>;tag=8321234356
Call-ID: [email protected].com
CSeq: 1 ACK
Content-Length: 0

15.2.10 Handling Multiple Concurrent Namespaces

15.2.10.1 General Rules
A single SIP request may contain resource values from multiple namespaces. As noted earlier, an RP actor disregards all namespaces it does not recognize. This specification only addresses the case where an RP actor then selects one of the remaining resource values for processing, usually choosing the one with the highest relative priority. If an RP actor understands multiple namespaces, it must create a local total ordering across all resource values from these namespaces, maintaining the relative ordering within each namespace. It is recommended that the same ordering be used across an administrative domain. However, there is no requirement that such ordering be the same across all administrative domains.

15.2.10.2 Examples of Valid Orderings
Table 15.1 shows a set of examples of an RP actor that supports two namespaces, foo and bar. Foo’s priority-values are 3 (highest), then 2, and then 1 (lowest), and bar’s priority-values are C (highest), then B, and then A (lowest). Five lists of acceptable priority orders the SIP element may use are shown in Table 15.1.

15.2.10.3 Examples of Invalid Orderings
On the basis of the priority order of the namespaces in Table 15.1, the combinations in Table 15.2 are examples of orderings that are not acceptable and must not be configurable. These examples are invalid since the global orderings are not consistent with the namespace-internal order.
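The consistency rule for a local total ordering can be checked mechanically. The following Python sketch assumes a strict ordering with no ties, written as a list of r-values from highest to lowest priority; the function name is_valid_total_ordering and both argument layouts are hypothetical. The sample orderings in the usage note are illustrations built from the foo and bar namespaces described above and are not copied from Tables 15.1 and 15.2.

```python
def is_valid_total_ordering(global_order, namespace_orders):
    """Check that a global ordering keeps each namespace's internal order.

    global_order: list of r-values such as "foo.3", highest priority first.
    namespace_orders: dict mapping namespace -> list of its priority values,
        highest priority first.
    """
    # Position of every r-value in the global list (case-insensitive).
    position = {r.lower(): i for i, r in enumerate(global_order)}
    for namespace, priorities in namespace_orders.items():
        last = -1
        for priority in priorities:
            r_value = f"{namespace}.{priority}".lower()
            if r_value not in position:
                continue  # r-values that are not configured are skipped
            if position[r_value] <= last:
                return False  # a lower value was placed above a higher one
            last = position[r_value]
    return True
```

Given namespace_orders = {"foo": ["3", "2", "1"], "bar": ["C", "B", "A"]}, the interleaved ordering bar.C, foo.3, foo.2, bar.B, foo.1, bar.A passes the check, whereas any ordering that places foo.1 above foo.2 fails it.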
Table 15.1 Valid Examples of Resource-Priority Actor Supporting Two Namespaces
Table 15.2 Invalid Examples of Resource-Priority Actor Supporting Two Namespaces
<label> <# of levels> <preemption <new warn <new resp. code> <RFC>
or queue> code>
Source: Copyright IETF. Reproduced with permission.
algorithmic aspect from the DSN and Q735 namespaces. The behavior for the flash-override-override priority value differs from the other values. Normally, requests do not preempt those of equal priority, but a newly arriving flash-override-override request will displace another one of equal priority if there are insufficient resources. This can also be expressed as saying that flash-override-override requests defend themselves as flash-override only.

15.2.12.4 Q735 Namespace
Q.735.3 [1] was created to be a commercial version of the operationally equivalent DSN specification for Multilevel Precedence and Preemption (MLPP). The Q735 namespace is defined here in the same manner. The Q735 namespace defines the following resource values, listed from lowest priority to highest priority:

(lowest)  q735.4
          q735.3
          q735.2
          q735.1
(highest) q735.0

The Q735 namespace operates according to the preemption algorithm described earlier.

15.2.12.5 ETS Namespace
The ETS namespace derives its name indirectly from the name of the US government telecommunications service, called Government Emergency Telecommunications Service (or GETS), though the organization responsible for the GETS service chose the acronym ETS for its GETS over IP service, which stands for Emergency Telecommunications Service. The ETS namespace defines the following resource values, listed from lowest priority to highest priority:

(lowest)  ets.4
          ets.3
          ets.2
          ets.1
(highest) ets.0

The ETS namespace operates according to the priority queuing algorithm described earlier.

15.2.12.6 WPS Namespace
The WPS namespace derives its name from the Wireless Priority Service, defined in GSM and other wireless technologies. The WPS namespace defines the following resource values, listed from lowest priority to highest priority:

(lowest)  wps.4
          wps.3
          wps.2
          wps.1
(highest) wps.0

The WPS namespace operates according to the priority queuing algorithm described earlier.

15.3 Preemption Events in SIP

15.3.1 Overview
With the introduction of the SIP Resource-Priority header in RFC 4412 (see Section 15.2), it became possible for sessions to be torn down for (scarce) resource reasons, meaning there were not enough resources for a particular session to continue. Certain domains will implement this mechanism where resources may become constrained either at the SIP UA or at congested router interfaces where more important sessions are to be completed at the expense of less important sessions. Which sessions are more or less important than others will not be discussed here. What is proposed in RFC 4411 that is described here is a SIP (RFC 3261) extension to synchronize SIP elements as to why a preemption event occurred and which type of preemption event occurred, as viewed by the element that performed the preemption of a session.

RFC 4411 proposes an IANA registration extension to the SIP Reason header to be included in a BYE Method Request as a result of a session preemption event, either at a UA, or somewhere in the network involving a reservation-based protocol such as the RSVP or Next Steps in Signaling (NSIS). However, this specification does not attempt to address routers failing in the packet path; instead, it addresses a deliberate tear down of a flow between UAs, and informs the terminated UA(s) with an indication of what occurred.

The SIP Reason header is an application layer feedback mechanism to synchronize SIP elements of events; the particular event explained here deals with preemption of a session. Q.850 [3] provides an indication for preemption (cause=8) and for preemption circuit reserved for reuse (cause=9). Q.850 cause=9 does not apply to IP, as IP has no concept of circuits. Some domains wish to differentiate appropriate IP reasons for preemption of sessions and to indicate topologically where the preemption event occurred. No other means exists today to give feedback as to why a session was torn down on preemption grounds.

In the event that a session is terminated for a specific reason that can (or should) be shared with SIP servers and UAs sharing a dialog, the Reason header (RFC 3326, see Section 2.8) was created to be included in the BYE Request. This was not the only Method for this new header; RFC 3326 also discusses the CANCEL Method usage.
RFC 4411 defines two use cases in which new preemption Reason values are necessary:

◾◾ Access Preemption Event (APE): This is when a UA receives a new SIP session request message with a valid RP (Resource-Priority) value that is higher than the one associated with the currently active session at that UA. The UA must discontinue the existing session in order to accept the new one (according to local policy of some domains).
◾◾ Network Preemption Event (NPE): This is when a network element, such as a router, reaches capacity on a particular interface and has the ability to statefully choose which session(s) will remain active when a new session/reservation is signaled for under the parameters outlined in SIP preconditions per RFC 3312 (see Section 15.4) that would otherwise overload that interface (perhaps adversely affecting all sessions). In this case, the router must terminate one or more reservations of lower priority in order to allow this higher-priority reservation access to the requested amount of bandwidth (according to local policy of some domains).

The semantics for these two cases have been registered with IANA for the new protocol value Preemption for the Reason Header field, with four cause values for the above preemption conditions. Additionally, this document will create a new IANA registry for reason-text strings that are not currently defined through existing SIP Response codes or Q.850 [3] cause codes. This new registry will be useful for future protocols used by the SIP Reason header.

We are using an existing SIP RFC 3312 (see Section 15.4) as the starting point for NPEs. RFC 3312 set rules surrounding SIP interaction using a reservation protocol for QOS preconditions, using RSVP as the example protocol. That effort did not preclude other preconditions or future protocol work from becoming a means of preconditions. NSIS is a new reservation protocol effort that specifies a preemption operation similar to RSVP’s ResvErr message involving the NSIS NOTIFY message in RFC 5974 with a Transient error code 0x04000005 (Resources Preempted). Note that SIP itself does not cause RSVP or NSIS reservation signaling to start or end. That operation is part of a separate API within each UA.

15.3.2 Access Preemption Events
As mentioned previously, APEs occur at the UA. It does not matter which UA in a unicast or multicast session this happens to (the UAC or UAS of a session). If local policy dictates in a particular domain rules regarding the functionality of a UA, there must be a means by which that UA (not the user) informs the other UA(s) why a session was just torn down prematurely. The appropriate mechanism is the BYE method. The user of the other far side UA will not understand why that session “just went away” without there being a means of informing the UA of what occurred (if this event was purposeful). Through this type of indication to the preempted UA, it can indicate to the user of that device appropriately.

The rules within a domain surrounding the UA to be informed can be different from the rules for informing the user. Local policy should determine if the user should be informed of the specific reason. This indication in SIP will provide a means for the UA to react in a locally determined way, if appropriate (play a certain tone or tone sequence, point toward a special announcement URI, cause the UA’s visual display to do something, etc.). Figure 15.3 illustrates the scenario. UA1 invites UA2 to a session with the resource-priority level of 3 (levels 1 and 2 are higher in this domain, and the namespace element is not necessary for this discussion).

After the session between UA1 and UA2 is established, UA3 invites UA2 to a new session with an RP of 2 (a higher priority than the current session between UA1 and UA2). Local policy within this domain dictates that UA2 must preempt all existing calls of lower priority in order to accept a higher-priority call. What Reason value could be inserted above to mean preemption at a UA? There are several choices: 410 Gone, 480 Temporarily Unavailable, 486 Busy Here, and 503 Service Unavailable. The use of any of these here is questionable because the session is already established. It is further complicated if there needs to be a difference in the Reason value for an APE versus an NPE (which is a requirement here). The limits of Q.850 [3] have been stated previously. It should be possible to configure UAs receiving a preemption indication to indicate to the user that no particular type of preemption occurred. There are some domains that might prefer their users to remain unaware of the specifics of network behavior. This should not ever prevent a known preemption indication from being sent in a BYE from a UA.

15.3.2.1 Effects of Preemption at the UA
If two UAs are in a session and one UA must preempt that session to accept another session, a BYE method message is the appropriate mechanism to perform this task. However, taking this a step further, if a UA is the common point of a three-way (or more) ad hoc conference and must preempt all sessions in that conference due to receipt of a higher-priority session request (that this UA must accept), then a BYE message must be sent to all UAs in that ad hoc conference.
Figure 15.3 Access preemption with obscure reason. Messages shown: F1 INVITE (R-P:3), F2 200 OK, F3 ACK, RTP, F5 BYE (Reason: ?), F6 200 OK, F7 200 OK, F8 ACK, RTP. (Copyright IETF. Reproduced with permission.)
15.3.2.2 Reason Header Requirements for APEs
The addition of an appropriate Reason value for an APE as described above and shown in Figure 15.3 meets the following requirements:

(based on local policy) to the receiving UA such that this UA cannot display more information than the domain wants the user to see.
Figure 15.5 Network preemption with obscure reason. Messages shown: F2 200 OK, F3 ACK, RTP, F4 BYE (Reason: ?), F5 200 OK. (Copyright IETF. Reproduced with permission.)
2  Reserved Resources Preempted: The session preemption has been initiated within the network via a purposeful RSVP preemption occurrence, and not a link error.
3  Generic Preemption: This is a limited-use preemption indication to be used on the final leg to the preempted UA to generalize the event.
4  Non-IP Preemption: Session preemption has occurred in a non-IP portion of the infrastructure, and this is the Reason cause code given by the SIP gateway.
Figure 15.7 Access preemption with reason: UA preemption. Messages shown: F1 INVITE (R-P:3), F2 200 OK, F3 ACK, RTP, F6 200 OK, F7 200 OK, F8 ACK, RTP. (Copyright IETF. Reproduced with permission.)
priority in order to accept a higher-priority call. UA2 sends a BYE Request message with a Reason header with a value of UA Preemption. This will inform the far-end UA (UA1) and all relevant SIP elements (e.g., SIP Proxies). The cause code is unique to what is proposed in the RSVP Preemption Event for differentiation purposes.

15.3.5.2 NPE Reason Code
A more elaborate description of the Reserved Resources Preempted Event cause=2 is as follows:

    A router has preempted a reservation flow and generated a reservation error message: a ResvErr traveling downstream in RSVP, and a NOTIFY in NSIS. The UA receiving the preemption error message generates a BYE request toward the far-side UA with a Reason Header with this value, indicating that somewhere between two or more UAs, a router has administratively preempted this session.

An example usage of this header value would be

Reason: preemption ;cause=2 ;text="Reserved Resources Preempted"

15.3.5.2.1 NPE Call Flow
Figure 15.8 replicates the call flow from Figure 15.7, but with an appropriate Reason value indication that was proposed earlier. Above is the call flow with router 2 from Figure 15.4 included at the RSVP layer sending the Resv messages (RSVP).

A complete call flow including all UAs and routers is not included for diagram complexity reasons. The signaling between UA3 and UA4 is also not included. Upon receipt of the ResvErr message (RSVP) with the preemption error code, UA2 can now appropriately inform UA1 why this event occurred. This BYE message will also inform all relevant SIP elements, synchronizing them. The cause value is unique to that proposed earlier for APEs for differentiation purposes.
Figure 15.8 Network preemption with Reserved Resources Preempted. Messages shown: F2 200 OK, F3 ACK, RTP, F5 200 OK. (Copyright IETF. Reproduced with permission.)
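The preemption Reason values used in these call flows follow a simple pattern: the protocol value preemption, a numeric cause, and a text string. The Python sketch below formats and parses such a header line using the four cause values and default texts named in this section. The function names and the regular expression are illustrative assumptions; the expression covers only the layout of the examples shown here, not the full Reason header grammar of RFC 3326.

```python
import re

# Cause values and default texts as named in this section.
PREEMPTION_CAUSES = {
    1: "UA Preemption",
    2: "Reserved Resources Preempted",
    3: "Generic Preemption",
    4: "Non-IP Preemption",
}

_REASON_RE = re.compile(
    r'^Reason\s*:\s*(?P<protocol>[^\s;]+)\s*;\s*cause\s*=\s*(?P<cause>\d+)'
    r'(?:\s*;\s*text\s*=\s*"(?P<text>[^"]*)")?\s*$',
    re.IGNORECASE,
)


def format_preemption_reason(cause):
    """Build a Reason header line for one of the known preemption causes."""
    return f'Reason: preemption ;cause={cause} ;text="{PREEMPTION_CAUSES[cause]}"'


def parse_preemption_reason(line):
    """Return (cause, text) for a preemption Reason header line, else None."""
    match = _REASON_RE.match(line.strip())
    if match is None or match.group("protocol").lower() != "preemption":
        return None
    return int(match.group("cause")), match.group("text")
```

A UA receiving a BYE could pass the Reason line to parse_preemption_reason and then decide, according to local policy, whether and how to tell the user what happened.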
    This cause code is for infrastructures that do not wish to provide the preempted UA with a more precise reason than just preemption. It is possible that UAs will have code that will indicate the type of preemption event that is contained in the Reason header, and certain domains have expressed this as not being optimal, and wanted to generalize the indication. This must not be the initial indication within these domains, as valuable traffic analysis and other NM applications will be generalized as well. If this cause value is to be implemented, it should only be done at the final SIP Proxy in such a way that the cause value indicating which type of preemption event actually occurred is changed to this generalized preemption indication to be received by the preempted UA.

An example usage of this header value would be

Reason: preemption ;cause=3 ;text="Generic Preemption"

    A session exists in a hybrid IP/non-IP infrastructure and the preemption event occurs in the non-IP portion, and was indicated by that portion that this call termination was due to preemption. This is the indication that would be generated by a SIP gateway toward the SIP UA that is being preempted, traversing whichever SIP proxies are involved in session signaling (a question of server state).

An example usage of this header value would be

Reason: preemption ;cause=4 ;text="Non-IP Preemption"

15.3.5.4.1 Non-IP Preemption Event Call Flow
Figure 15.9 is a simple call flow diagram of the Non-IP Preemption Event. In this case, UA1 signals user 3 to a session. Once established, there is a preemption event in the non-IP portion of the session/call, and the TDM portion has the ability to inform the SIP gateway of this type of event.
Figure 15.9 Non-IP preemption flow between UA1, SIP GW 1, and User 3 (reached through the non-IP network). Messages shown: F1 INVITE (R-P:1), F2 200 OK, F3 ACK, RTP, preemption indication from the non-IP network, F4 BYE (Reason: preemption ;cause=4 ;text="Non-IP Preemption"), F5 200 OK. (Copyright IETF. Reproduced with permission.)
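The generalization step described for cause value 3, in which the final SIP proxy replaces the specific preemption cause with the generic one before the BYE reaches the preempted UA, might be sketched as follows. This is a minimal Python illustration: the function name generalize_preemption_reason is hypothetical, and the pattern handles only header lines laid out like the examples in this section.

```python
import re

# Matches lines such as: Reason: preemption ;cause=4 ;text="Non-IP Preemption"
_PREEMPTION_RE = re.compile(
    r'^Reason\s*:\s*preemption\s*;\s*cause\s*=\s*\d+'
    r'(\s*;\s*text\s*=\s*"[^"]*")?\s*$',
    re.IGNORECASE,
)

GENERIC_REASON = 'Reason: preemption ;cause=3 ;text="Generic Preemption"'


def generalize_preemption_reason(reason_line):
    """Replace a specific preemption cause with the generic cause value 3.

    Header lines that are not preemption Reason values are returned unchanged.
    """
    if _PREEMPTION_RE.match(reason_line.strip()) is None:
        return reason_line
    return GENERIC_REASON
```

Because the specific cause is lost after this rewrite, the text above restricts it to the final proxy, so that upstream elements still see which type of preemption event actually occurred.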
This non-IP signal can be translated into SIP signaling (into the BYE session termination message). Within this BYE, there should be a Reason header indicating such an event to synchronize all SIP elements.

15.4 QOS in SIP
15.4.1 Overview
RFC 3312, which is described here, defines a generic framework for preconditions, which are extensible through IANA registration. We describe how network QOS can be made a precondition for the establishment of sessions initiated by SIP. These preconditions require that the participant reserve network resources before continuing with the session. We do not define new QOS reservation mechanisms; these preconditions simply require a participant to use existing resource reservation mechanisms before beginning the session. Some architectures require that at session establishment time, once the callee has been alerted, the chances of a session establishment failure are minimal. One source of failure is the inability to reserve network resources for a session. To minimize ghost rings, it is necessary to reserve network resources for the session before the callee is alerted. However, the reservation of network resources frequently requires learning the IP address, port, and session parameters from the callee. This information is obtained as a result of the initial offer–answer exchange carried in SIP.
This exchange normally causes the phone to ring, thus introducing a chicken-and-egg problem: resources can-

Figure 15.10 (call flow between A and B, partially recovered): F1. INVITE (SDP 1); F2. 183 Session Progress (SDP 2); F3. PRACK; resource reservation at A and at B.

is reached or surpassed, session establishment resumes. For example, the following values for current and desired status would not allow session establishment to resume:

◾◾ Current status = resources reserved in the send direction
◾◾ Desired status = resources reserved in both (sendrecv) directions

On the other hand, the values of the example below would make session establishment resume:

◾◾ Current status = resources reserved in both (sendrecv) directions
◾◾ Desired status = resources reserved in the send direction

These two state variables define a certain piece of state of a media stream the same way the direction attribute or the codecs in use define other pieces of state. Consequently, we treat these two new variables in the same way as other SDP media attributes are treated in the offer–answer model used by SIP (see Section 3.8.4): they are exchanged between two UAs using an offer and an answer in order to have a shared view of the status of the session. Figure 15.10 shows
a typical message exchange between two SIP UAs using preconditions.
A includes QOS preconditions in the SDP of the initial INVITE. A does not want B to be alerted until there are network resources reserved in both directions (sendrecv) end-to-end. B agrees to reserve network resources for this session before alerting the callee. B will handle resource reservation in the B->A direction, but needs A to handle the A->B direction. To indicate so, B returns a 183 (Session Progress) response to A asking A to start resource reservation and to confirm to B as soon as the A->B direction is ready for the session. A and B both start resource reservation. B finishes reserving resources in the B->A direction, but does not alert the user yet, because network resources in both directions are needed. When A finishes reserving resources in the A->B direction, it sends an UPDATE to B. B returns a 200 OK response for the UPDATE, indicating that all the preconditions for the session have been met. At this point in time, B starts alerting the user, and session establishment completes normally.

15.4.2 SDP Parameters
We define the following media-level SDP attributes:

current-status = "a=curr:" precondition-type SP status-type SP direction-tag
desired-status = "a=des:" precondition-type SP strength-tag SP status-type SP direction-tag
confirm-status = "a=conf:" precondition-type SP status-type SP direction-tag
precondition-type = "qos" / token
strength-tag = ("mandatory" | "optional" | "none" | "failure" | "unknown")
status-type = ("e2e" | "local" | "remote")
direction-tag = ("none" | "send" | "recv" | "sendrecv")

◾◾ Current status: The current status attribute carries the current status of network resources for a particular media stream.
◾◾ Desired status: The desired status attribute carries the preconditions for a particular media stream. When the direction tag of the current status attribute with a given precondition type/status type for a particular stream is equal to (or better than) the direction tag of the desired status attribute with the same precondition type/status type for that stream, then the preconditions are considered to be met for that stream.
◾◾ Confirmation status: The confirmation status attribute carries threshold conditions for a media stream. When the status of network resources reach these conditions, the peer UA will send an update of the session description containing an updated current status attribute for this particular media stream.
◾◾ Precondition type: This document defines QOS preconditions. Extensions may define other types of preconditions.
◾◾ Strength tag: The strength tag indicates whether or not the callee can be alerted in case the network fails to meet the preconditions.
◾◾ Status type: We define two types of status: end-to-end and segmented. The end-to-end status reflects the status of the end-to-end reservation of resources. The segmented status reflects the status of the access network reservations of both UAs. The end-to-end status corresponds to the tag e2e defined above and the segmented status to the tags local and remote. End-to-end status is useful when end-to-end resource reservation mechanisms are available. The segmented status is useful when one or both UAs perform resource reservations on their respective access networks.
◾◾ Direction tag: The direction tag indicates the direction in which a particular attribute (current, desired or confirmation status) is applicable to.

The values of the tags send, recv, local, and remote represent the point of view of the entity generating the SDP description. In an offer, send is the direction offerer->answerer and local is the offerer’s access network. In an answer, send is the direction answerer->offerer and local is the answerer’s access network. The following example shows these new SDP attributes in two media lines of a session description:

m=audio 20000 RTP/AVP 0
a=curr:qos e2e send
a=des:qos optional e2e send
a=des:qos mandatory e2e recv
m=audio 20002 RTP/AVP 0
a=curr:qos local sendrecv
a=curr:qos remote none
a=des:qos optional local sendrecv
a=des:qos mandatory remote sendrecv

15.4.3 Usage of Preconditions with Offer–Answer
Parameter negotiation in SIP is carried out using the offer–answer model described in RFC 3264 (see Section 3.8.2). The idea behind this model is to provide a shared view of the session parameters for both UAs once the answer has been received by the offerer. This section describes which values our new SDP attributes can take in an answer, depending
on their value in the offer. To achieve a shared view of the status of a media stream, we define a model that consists of three tables: both UAs implement a local status table, and each offer–answer exchange has a transaction status table associated to it. The offerer generates a transaction status table identical to its local status table and sends it to the answerer in the offer. The answerer uses the information of this transaction status table to update its local status table. The answerer also updates the transaction status table fields that were out of date and returns this table to the offerer in the answer. The offerer can then update its local status table with the information received in the answer. After this offer–answer exchange, the local status tables of both UAs are synchronized. They now have a common view of the status of the media stream. Sessions that involve several media streams implement these tables per media stream. Note, however, that this is a model of UA behavior, not of software. An implementation is free to take any approach that replicates the external behavior this model defines.

15.4.3.1 Generating an Offer
Both UAs must maintain a local precondition status, which is referred to as a local status table. Tables 15.6 and 15.7 show the format of these tables for both the end-to-end and the segmented status types. For the end-to-end status type, the table contains two rows; one for each direction (i.e., send and recv). A value of yes in the Current field indicates the successful reservation of that resource in the corresponding direction. No indicates that resources have not been reserved yet. The Desired Strength field indicates the strength of the preconditions in the corresponding direction. The table for the segmented status type contains four rows: both directions in the local access network and in the peer’s access network. The meaning of the fields is the same as in the end-to-end case.

Table 15.6 Table for the End-to-End Status Type
Direction      Current    Desired Strength
Send           No         Mandatory
Recv           No         Mandatory

Table 15.7 Table for the Segmented Status Type
Direction      Current    Desired Strength
Local send     No         None
Local recv     No         None
Remote send    No         Optional
Remote recv    No         None

Before generating an offer, the offerer must build a transaction status table with the current and the desired status for each media stream. The different values of the strength tag for the desired status attribute have the following semantics:

◾◾ None: no resource reservation is needed.
◾◾ Optional: the UAs should try to provide resource reservation, but the session can continue regardless of whether or not this provision is possible.
◾◾ Mandatory: the UAs must provide resource reservation. Otherwise, session establishment must not continue.

The offerer then decides whether it is going to use the end-to-end status type or the segmented status type. If the status type of the media line will be end-to-end, the UA generates records with the desired status and the current status for each direction (send and recv) independently, as shown in Table 15.6.
If the status type of the media line will be segmented, the UA generates records with the desired status, and the current status for each direction (send and recv) and each segment (local and remote) independently, as shown in Table 15.7.
At the time of sending the offer, the offerer’s local status table and the transaction status table contain the same values. With the transaction status table, the UA must generate the current-status and the desired-status lines following the syntax of Section 15.4.2 and the rules described below.

15.4.3.1.1 SDP Encoding
For the end-to-end status type, the UA must generate one current status line with the tag e2e for the media stream. If the strength tags for both directions are equal (e.g., both mandatory) in the transaction status table, the UA must add one desired status line with the tag sendrecv. If both tags are different, the UA must include two desired status lines, one with the tag send and the other with the tag recv. The semantics of two lines with the same strength tag, one with a send tag and the other with a recv tag, is the same as one sendrecv line. However, to achieve a more compact encoding, we have chosen to make the latter format mandatory.
For the segmented status type, the UA must generate two current status lines: one with the tag local and the other with the tag remote. The UA must add one or two desired status lines per segment (i.e., local and remote). If, for a particular segment (local or remote), the tags for both directions in the transaction status table are equal (e.g., both mandatory), the UA must add one desired status line with the tag sendrecv. If both tags are different, the UA must include two desired status lines, one with the tag send and the other with the tag recv. Note that the rules above apply to the desired strength
tag none as well. This way, a UA that supports QOS but does not intend to use them adds desired-status lines with the strength tag none. Since this tag can be upgraded in the answer, as described later, the answerer can request QOS reservation without a need of another offer–answer exchange. The example below shows the SDP corresponding to Tables 15.6 and 15.7.

Table 15.9 Values of Tags in Offers and Answers
Offer      Answer
Send       Recv
Recv       Send
Local      Remote

No No No/no
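The SDP encoding rules of Section 15.4.3.1.1 can be sketched in a few lines of Python. This is an illustrative sketch, not a reference encoder: the function name `encode_status` and the row representation are hypothetical, only the qos precondition type is handled, and the mapping from per-direction yes/no values to the direction tag of the current-status line (none, send, recv, sendrecv) is an assumption drawn from the syntax in Section 15.4.2.

```python
from typing import List, Tuple

# One table row: (current is "yes", desired strength tag).
Row = Tuple[bool, str]


def _current_tag(send_ok: bool, recv_ok: bool) -> str:
    """Direction tag for the current-status line (assumed mapping)."""
    if send_ok and recv_ok:
        return "sendrecv"
    if send_ok:
        return "send"
    if recv_ok:
        return "recv"
    return "none"


def encode_status(status_type: str, send: Row, recv: Row) -> List[str]:
    """Encode one status type (e2e, local, or remote) as SDP attribute lines.

    One current-status line is always produced. Equal strength tags in both
    directions collapse into a single sendrecv desired-status line; different
    tags produce one send line and one recv line.
    """
    lines = [f"a=curr:qos {status_type} {_current_tag(send[0], recv[0])}"]
    if send[1] == recv[1]:
        lines.append(f"a=des:qos {send[1]} {status_type} sendrecv")
    else:
        lines.append(f"a=des:qos {send[1]} {status_type} send")
        lines.append(f"a=des:qos {recv[1]} {status_type} recv")
    return lines


# Table 15.6 (end-to-end) and Table 15.7 (segmented) as inputs.
e2e_lines = encode_status("e2e", (False, "mandatory"), (False, "mandatory"))
local_lines = encode_status("local", (False, "none"), (False, "none"))
remote_lines = encode_status("remote", (False, "optional"), (False, "none"))
```

With the values of Table 15.6, the end-to-end stream yields one current-status line and one sendrecv desired-status line; with Table 15.7, the remote segment yields separate send and recv desired-status lines because its two strength tags differ.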
a PSTN gateway reserves resources without sending signaling to the PSTN.) A UAS may receive an INVITE request with no offer in it. In this case, the UAS will provide an offer in a reliable 1xx response. The UAC will send the answer in another SIP request (i.e., the PRACK for the 1xx). If the offer and the answer contain preconditions, the UAS should not alert the user until all the mandatory preconditions in the answer are met. Note that in this case, where a UAS provides an initial offer with preconditions, a 180 Ringing response with preconditions will never be sent, since the UAS cannot alert the user until all the preconditions are met.
A UAS that is not capable of unilaterally meeting all of the mandatory preconditions must include a confirm-status attribute in the SDP (offer or answer) that it sends. Furthermore, the SDP (offer or answer) that contains this confirm-status attribute must be sent as soon as allowed by the SIP offer–answer rules. While session establishment is suspended, UAs should not send any data over any media stream. In the case of RTP (see Section 7.7), neither RTP nor Real-Time Transport Control Protocol (RTCP) packets are sent. A UAS knows that all the preconditions are met for a media line when its local status table has a value of yes in all the rows whose strength tag is mandatory. When the preconditions of all the media lines of the session are met, session establishment should resume. For an initial INVITE, suspending and resuming session establishment is very intuitive. The callee will not be alerted until all the mandatory preconditions are met. However, offers containing preconditions sent in the middle of an ongoing session need further explanation. Both UAs should continue using the old session parameters until all the mandatory preconditions are met. At that moment, the UAs can begin using the new session parameters.

15.4.5 Status Confirmation
The confirm-status attribute may be used in both offers and answers. This attribute represents a threshold for the resource reservation. When this threshold is reached or surpassed, the UA must send an offer to the peer UA, reflecting the new current status of the media line as soon as allowed by the SIP offer–answer rules. If this threshold is crossed again (e.g., the network stops providing resources for the media stream), the UA must send a new offer as well, as soon as allowed by the SIP offer–answer rules. If a peer has requested confirmation on a particular stream, an agent must mark that stream with a flag in its local status table. When all the rows with this flag have a Current value of yes, the UA must send a new offer to the peer. This offer will contain the current status of resource reservation in the current-status attributes. Later, if any of the rows with this flag transitions to no, a new offer must be sent as well. Confirmation attributes are not negotiated. The answerer uses the value of the confirm-status attribute in the offer and the offerer uses the value of this attribute in the answer. For example, if a UA receives an SDP description with the following attributes:

m=audio 20002 RTP/AVP 0
a=curr:qos local none
a=curr:qos remote none
a=des:qos mandatory local sendrecv
a=des:qos mandatory remote sendrecv
a=conf:qos remote sendrecv

It will send an offer as soon as it reserves resources in its access network (remote tag in the received message) for both directions (sendrecv).

15.4.6 Refusing an Offer
We define a new SIP status code:

Server-Error = "580"; Precondition Failure

When a UAS acting as an answerer cannot meet or is not willing to meet the preconditions in the offer, it should reject the offer by returning a 580 Precondition Failure response. Using the 580 Precondition Failure status code to refuse an offer is useful when the offer comes in an INVITE or in an UPDATE request. However, SIP does not provide a means to refuse offers that arrive in a response (1xx or 2xx) to an INVITE. If a UAC generates an initial INVITE without an offer and receives an offer in a 1xx or 2xx response that is not acceptable, it should respond to this offer with a correctly formed answer, and immediately send a CANCEL or a BYE. If the offer comes in a 1xx or 2xx response to a re-INVITE, A would not have a way to reject it without terminating the session at the same time. The same recommendation given in RFC 3261 (see Section 3.8) applies here:

    The UAS must ensure that the session description overlaps with its previous session description in media formats, transports, or other parameters that require support from the peer. This is to avoid the need for the peer to reject the session description. If, however, it is unacceptable to A, A should generate an answer with a valid session description, and then send a BYE to terminate the session.

580 Precondition Failure responses and BYE and CANCEL requests indicating failure to meet certain preconditions should contain an SDP description indicating which desired status triggered the failure. Note that this SDP description is not an offer or an answer, since it does not lead to the establishment of a session. The format of such a description is based on the last SDP (an offer or an answer) received from the remote UA.
For each m= line in the last SDP description received, there must be a corresponding m= line in the SDP description
indicating failure. This SDP description must contain exactly the same number of m= lines as the last SDP description received. The port number of every m= line must be set to zero, but the connection address is arbitrary. The desired status line corresponding to the precondition that triggered the failure must use the failure strength tag, as shown in the example below:

m=audio 0 RTP/AVP 0

15.4.8 Multiple Preconditions per Media Stream
A media stream may contain multiple preconditions. Different preconditions MAY have the same precondition type and different status types (e.g., end-to-end and segmented QOS preconditions) or different precondition types (this document only defines the qos precondition type, but extensions may define more precondition types in the future). All the preconditions for a media stream must be met in order to resume session establishment. The following example shows a session description that uses both end-to-end and segmented status types for a media stream.

m=audio 20000 RTP/AVP 0
a=curr:qos local none
a=curr:qos remote none
a=des:qos mandatory local sendrecv
a=des:qos mandatory remote sendrecv
a=curr:qos e2e none
a=des:qos optional e2e sendrecv

a=rtpmap:0 PCMU/8000
a=des:foo none e2e sendrecv
a=des:qos none local sendrecv

Note that when this document was published, the precondition type foo had not been registered. It is used here in the session description above to provide an example with multiple precondition types. A UA that supports this framework should add a precondition tag to the Supported header field of its responses to OPTIONS requests.

15.4.11 Examples
The following examples cover both status types: end-to-end and segmented.
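The rule that all mandatory preconditions of a media stream must be met before session establishment resumes can be checked directly from the a=curr and a=des lines. The sketch below is illustrative only: the function name `preconditions_met` is hypothetical, "equal to (or better than)" is modeled as a superset test on the set of reserved directions, and only desired-status lines with the mandatory strength tag are treated as blocking.

```python
import re
from collections import defaultdict

# Directions covered by each direction tag.
DIRECTIONS = {
    "none": set(),
    "send": {"send"},
    "recv": {"recv"},
    "sendrecv": {"send", "recv"},
}

CURR_RE = re.compile(r"^a=curr:(\S+) (e2e|local|remote) (none|send|recv|sendrecv)$")
DES_RE = re.compile(
    r"^a=des:(\S+) (mandatory|optional|none|failure|unknown) "
    r"(e2e|local|remote) (none|send|recv|sendrecv)$"
)


def preconditions_met(media_lines) -> bool:
    """Return True when every mandatory desired status is covered by the
    current status with the same precondition type and status type."""
    current = defaultdict(set)   # (precondition type, status type) -> reserved directions
    required = defaultdict(set)  # (precondition type, status type) -> mandatory directions
    for raw in media_lines:
        line = raw.strip()
        curr = CURR_RE.match(line)
        if curr:
            current[(curr.group(1), curr.group(2))] |= DIRECTIONS[curr.group(3)]
            continue
        des = DES_RE.match(line)
        if des and des.group(2) == "mandatory":
            required[(des.group(1), des.group(3))] |= DIRECTIONS[des.group(4)]
    return all(needed <= current[key] for key, needed in required.items())


# The multiple-precondition example of Section 15.4.8.
pending = [
    "m=audio 20000 RTP/AVP 0",
    "a=curr:qos local none",
    "a=curr:qos remote none",
    "a=des:qos mandatory local sendrecv",
    "a=des:qos mandatory remote sendrecv",
    "a=curr:qos e2e none",
    "a=des:qos optional e2e sendrecv",
]
# The same stream once both access networks have reserved resources.
ready = [line.replace("local none", "local sendrecv").replace("remote none", "remote sendrecv")
         for line in pending]
```

Here `preconditions_met(pending)` is false because neither segment has reserved anything yet, while `preconditions_met(ready)` is true even though the optional end-to-end precondition is still unmet.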
(F2), it starts performing resource reservation as well. Both

Call flow figures between A and B (partially recovered labels): F1. INVITE (SDP 1); F2. 183 Session Progress (SDP 2); F3. PRACK; resource reservation at A and at B; F5. UPDATE (SDP 3).
The call flow of Figure 15.13 shows a basic session establishment using the segmented status type. The SDP descriptions of this example are shown below:

SDP 1: A includes local and remote QOS preconditions in the initial offer. Before sending the initial offer, A reserves resources in the access network. This is indicated in the local current status of the SDP below:

m=audio 20000 RTP/AVP 0 8
c=IN IP4 192.0.2.1

Let us assume that after receiving this response, A decides that it wants to use only PCM u-law (payload 0), as opposed to both PCM u-law and A-law (payload 8). It would send an UPDATE to B, possibly before receiving the 200 OK for the INVITE (F5). The SDP would look like

m=audio 20000 RTP/AVP 0
c=IN IP4 192.0.2.1
a=curr:qos local sendrecv
a=curr:qos remote sendrecv
a=des:qos mandatory local sendrecv
a=des:qos mandatory remote sendrecv
B would generate an answer for this offer and place it in the 200 OK for the UPDATE. Note that this last offer–answer to reduce the number of supported codecs may arrive to the UAS after the 200 OK response has been generated. This would mean that the session is established before A has reduced the number of supported codecs. To avoid this situation, the UAC could wait for the first answer from the UA before setting its local current status to sendrecv.

15.4.11.3 Offer in a SIP Response
The call flow of Figure 15.14 shows a basic session establishment where the initial offer appears in a reliable 1xx response. This example uses the end-to-end status type. The SDP descriptions of this example are shown below.
The first INVITE (F1) does not contain a session description. Therefore, the initial offer is sent by B in a reliable 183 Session Progress response.

Figure 15.14 (call flow between A and B, partially recovered): F1. INVITE; F2. 180 Ringing (SDP 1).

SDP 1: B includes end-to-end QOS preconditions in the initial offer. Since B uses RSVP, it can know when resources in its send direction are available, because it will receive RESV messages from the network. However, it does not know the status of the reservations in the other direction. B requests confirmation for resource reservations in its recv direction, to the peer UA A, in its answer.

m=audio 30000 RTP/AVP 0
c=IN IP4 192.0.2.4
a=curr:qos e2e none
a=des:qos mandatory e2e sendrecv
a=conf:qos e2e recv

SDP 2: A includes its answer in the PRACK for the 183 Session Progress response.

m=audio 20000 RTP/AVP 0
c=IN IP4 192.0.2.1
a=curr:qos e2e none
a=des:qos mandatory e2e sendrecv

After having sent the answer, A starts reserving network resources for the media stream. When B receives this answer (F3), it starts performing resource reservation as well. Both UAs use RSVP, so A sends PATH messages toward B and B sends PATH messages toward A.

SDP 3: When A receives RESV messages it sends an updated offer (F5) to B:
that is described here updates some of the procedures in RFC 3312 (see Sections 15.4.1 through 15.4.11) for using SIP preconditions in situations that involve session mobility. RFC 3312 focuses on media sessions that do not move around. That is, media is sent between the same end points throughout the duration of the session. Nevertheless, media sessions established by SIP are not always static. SIP offers mechanisms to provide session mobility, namely re-INVITEs and UPDATEs (see Section 3.8.3). While existing implementations of RFC 3312 can probably handle session mobility, there is a need to explicitly point out the issues involved and make a slight update on some of the procedures defined therein. With the updated procedures defined in this specification (RFC 4032), messages carrying precondition information become more explicit about the current status of the preconditions. Specifically, we now allow answers to downgrade current status values (this was disallowed by RFC 3312). We consider moving an existing stream to a new location as equivalent to establishing a new stream. Therefore, answers moving streams to new locations set all the current status values in their answers to no and start a new precondition negotiation from scratch.

15.4.12.1 Defining New Precondition Types
Specifications defining new precondition types need to discuss the topics described in this section. Having clear definitions of new precondition types is essential to ensure interoperability among different implementations.

15.4.12.1.1 Precondition Type Tag
New precondition types must have an associated precondition type tag (e.g., qos is the tag for QOS preconditions). Any new preconditions must be registered with IANA per the RFC 3312 guidelines.

15.4.12.1.2 Status Type
RFC 3312 defines two status types: end-to-end and segmented. Specifications defining new precondition types must indicate which status applies to the new precondition. New preconditions can use only one status type or both. For example, the QOS preconditions defined in RFC 3312 (see Sections 15.4.1 through 15.4.11) can use both.

15.4.12.1.3 Precondition Strength
RFC 3312 (see Sections 15.4.1 through 15.4.11) defines optional and mandatory preconditions. Specifications defining new precondition types must describe whether or not optional preconditions are applicable, and in case they are, what the expected behavior of a UA is on reception of optional preconditions.

15.4.12.1.4 Suspending and Resuming Session Establishment
RFC 3312 (see Section 15.4.4) describes the behavior of UAs from the moment session establishment is suspended, due to a set of preconditions, until it is resumed when these preconditions are met. In general, the called user is not alerted until the preconditions are met. In addition to not alerting the user, each precondition type must define any extra actions UAs should perform or refrain from performing when session establishment is suspended. The behavior of media streams during session suspension is therefore part of the definition of a particular precondition type. Some precondition types may allow media streams to send and receive packets during session suspension; others may not. Consequently, the following paragraph from RFC 3312 only applies to QOS preconditions:

    While session establishment is suspended, UAs should not send any data over any media stream. In the case of RTP, neither RTP nor RTCP packets are sent.

To clarify the previous paragraph, the control messages used to establish connections in connection-oriented transport protocols (e.g., Transmission Control Protocol [TCP] SYNs) are not affected by the previous rule. Thus, UAs follow standard rules, for example, the SDP setup attribute (RFC 4145), to decide when to establish the connection, regardless of QOS preconditions. New precondition types must also describe the behavior of UAs on reception of a re-INVITE or an UPDATE with preconditions for an ongoing session.

15.4.12.2 Issues Related to Session Mobility
RFC 3312 (see Section 15.4.3) describes how to use SIP preconditions with the offer–answer model (RFC 3264, see Section 3.8.4). RFC 3312 gives a set of rules that allow a UA to communicate changes in the current status of the preconditions to the remote UA. The idea is that a given UA knows about the current status of some part of the preconditions (e.g., send direction of the QOS precondition) through local information (e.g., an RSVP RESV is received indicating that resource reservation was successful). The UAC informs the UAS about changes in the current status by sending an offer to the UAS. The UAS, in turn, could (if needed) send an offer to the UAC informing it about the status of the part of the preconditions the UAS has local information about.
Note that UASs do not usually send updates about the current status to the UAC because UASs are the ones resuming session establishment when all the preconditions are met. Therefore, rather than performing an offer–answer exchange to inform the UAC that all the preconditions are met, they
Figure 15.15 Session mobility using 3PCC. Top: dialog 1 and dialog 2, with media between the two end points. Bottom: dialog 1 and dialog 3, with media moved to the new end point. (Copyright IETF. Reproduced with permission.)
simply send a 180 Ringing response indicating that session establishment has been resumed. While RFC 3312 (see Sections 15.4.1 through 15.4.11) allows updating current status information using the methods described above, it does not allow downgrading current status values in answers, as shown in the third row of Table 15.8 (Table 3 of RFC 3312). Figure 15.15 shows how performing such a downgrade in an answer would sometimes be needed.
The Third-Party Call Control (3PCC) (RFC 3725, see Section 18.3) controller in Figure 15.15 has established a session between A and B using dialog 1 toward A and dialog 2 toward B. At that point, the controller wants A to have a session with C instead of B. To transfer A to C (configuration shown at the bottom of Figure 15.15), the controller sends an empty (no offer) re-INVITE to A. Since A does not know that the session will be moved, its offer in the 200 OK states that the current status of the media stream in the send direction is yes. After contacting C and establishing dialog 3, the controller sends back an answer to A. This answer contains a new destination for the media (C) and should have downgraded the current status of the media stream to no, since there is no reservation of resources between A and C.

15.4.12.2.1 Update to RFC 3312
Below is a set of new rules that update RFC 3312 to address the issues above. The rule below applies to offerers moving a media stream to a new address:

    When a stream is being moved to a new transport address, the offerer must set all current status values about which it does not have local information to no.

Note that for streams using segmented status (as opposed to end-to-end status), the fact that the address for the media stream at the local segment changes may or may not affect the status of preconditions at the remote segment. However, moving an existing stream to a new location, from the preconditions point of view, is like establishing a new stream. Therefore, it is appropriate to set all the current status values to no and start a new precondition negotiation from scratch. The updated Table 15.10 and rules below apply to an answerer that is moving a media stream. It implies that Table 15.8 (Table 3 of RFC 3312) needs to be updated to allow answerers to downgrade current status values (as shown in Table 15.10 per RFC 4032). Note that the offerer was not aware of the move when it generated the offer.
An answerer must downgrade the current status values received in the offer if it has local information about them or if the media stream is being moved to a new transport address. Note that for streams using segmented status, the address change at the answerer may or may not affect the status of the preconditions at the offerer’s segment. However, as stated above, moving an existing stream to a new location, from the preconditions point of view, is like establishing a new stream. Therefore, it is appropriate to set all the current status values to no and start a new precondition negotiation from scratch. Table 15.11 applies to an offerer that receives an answer that updates or downgrades its local status tables.

Table 15.10 Possible Values for the Current Fields (Updated Version of Table 15.8, Section 15.4.3.2 per RFC 4032)
Transaction Status Table    Local Status Table    New Values Transaction/Local
No                          No                    No/No
Yes                         Yes                   Yes/Yes
Yes                         No                    Depends on local information
No                          Yes                   Depends on local information
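The answerer rules around Table 15.10 can be expressed as a small decision function. This is a sketch under stated assumptions rather than a normative algorithm: the name `answerer_current_value` is hypothetical, and "depends on local information" is interpreted here as "the local value wins when the answerer has local information about that field, otherwise the value received in the transaction status table wins". The text itself does not spell out that interpretation.

```python
def answerer_current_value(transaction: bool, local: bool,
                           has_local_info: bool,
                           stream_moved: bool = False) -> bool:
    """Return the new Current value written to both the transaction status
    table and the local status table (True = yes, False = no)."""
    if stream_moved:
        # Moving a stream is treated like establishing a new stream:
        # every current status value restarts at no.
        return False
    if transaction == local:
        # Rows 1 and 2 of Table 15.10: both tables already agree.
        return local
    # Rows 3 and 4: "depends on local information" (assumed interpretation).
    return local if has_local_info else transaction
```

For example, an answerer that knows from local information that a reservation is gone would call `answerer_current_value(True, False, has_local_info=True)` and downgrade the value received in the offer to no.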
Table 15.11 Offerer Updated Local Status
Transaction Status Table    Local Status Table    New Values Local Status

numerical objectives or acceptance threshold values for the SIP performance metrics.
overloaded and is unable to respond to the request. Note that, like RRD, the IRA using GRUU (see Section 4.3), client-initiated connection management (see Section 13.2), and the variations of different authentications schemes (see Section 19.4) in SIP may differ significantly from that of the usual registration described in RFC 3261 (see Section 3.3), and RFC 6076 has not discussed the implications of IRAs in those contexts.

15.4.13.3 Session Request Delay
The Session Request Delay (SRD) metric is utilized to detect failures or impairments causing delays in responding to a UA session request. SRD is measured for both successful and failed session setup requests independently as this metric usually relates to a user experience. The duration associated with success and failure responses will likely vary substantially, and the desired output time associated with each will be significantly different in many cases. It is measured at the originating UA only. The output value of this metric must indicate whether the output is for successful or failed session requests and should be stated in units of seconds. The SRD is calculated using the following formula:

SRD = Time of status indicative response − time of INVITE request

containing the necessary information is sent by the originating agent or user to the intended mediation or destination agent, until the last bit of the first provisional response or a failure indication response. A failure response is described as a 4xx (excluding 401, 402, and 407 nonfailure challenge response codes), 5xx, or possible 6xx message. A change in the metric output might indicate problems in downstream signaling functions, which may be impairing the INVITE message from reaching the intended UA or may indicate changes in end-point behavior. While this metric calculates the delay associated with a failed session request, the metric Ineffective Session Attempts (ISAs) described later is used for calculating a ratio of session attempt failures.

15.4.13.4 Session Disconnect Delay
The Session Disconnect Delay (SDD) metric is utilized to detect failures or impairments delaying the time necessary to end a session. SDD is measured independently for both successful and failed session disconnects. The SDD is calculated using the following formula:

SDD = Time of 2xx or timeout − time of completion message (BYE)
SDD is defined as the interval between the first bit of the
sent session completion message, such as a BYE, and the last
15.4.13.3.1 Successful Session Setup SRD
bit of the subsequently received 2xx response. In some cases,
In a successful request attempt, SRD is defined as the time a recoverable error response, such as a 503 Retry-After, may
interval from when the first bit of the initial INVITE message be received. In such situations, these responses should not be
containing the necessary information is sent by the originating used as the end time for this metric calculation. Instead, the
UA to the intended mediation or destination agent, until the successful (2xx) response related to the recovery message is
last bit of the first provisional response is received indicating used.
an audible or visual status of the initial session setup request.
(Note: In some cases, the initial INVITE may be forked. In
15.4.13.5 Session Duration Time
this case, all dialogs along with 200 OK answers need to be
considered. In many forking cases, the situations might be more The Session Duration Time (SDT) metric is used to detect
complicated. Each case, need to be taken into account accord- problems (e.g., poor audio quality) causing short session
ingly.) In SIP, the message indicating status would be a non-100 durations. SDT is measured for both successful and failed
Trying provisional message received in response to an INVITE session completions independently. It can be measured from
request. In some cases, a non-100 Trying provisional message is either end-point UA involved in the SIP dialog. The SDT is
not received, but rather a 200 message is received as the first sta- calculated using the following formula:
tus message instead. In these situations, the 200 message would
be used to calculate the interval. In most circumstances, this SDT = Time of BYE or timeout − time of 200 OK response
metric relies on receiving a non-100 Trying message. The use of to INVITE request
the PRACK method (RFC 3262, see Section 2.5) may improve
the quality and consistency of the results. This metric does not calculate the duration of sessions
leveraging early media. For example, some automated
response systems only use early media by responding with a
15.4.13.3.2 Failed Session Setup SRD
SIP 183 Session Progress message with the SDP connecting
In a failed request attempt, SRD is defined as the time inter- the originating UA with the automated message. Usually, in
val from when the first bit of the initial INVITE message these sessions, the originating UA never receives a 200 OK,
and the message exchange ends with the originating UA sending a CANCEL.

15.4.13.5.1 Successful Session Duration SDT
In a successful session completion, SDT is calculated as an average and is defined as the duration of a dialog defined by the interval between receipt of the first bit of a 200 OK response to an INVITE, and receipt of the last bit of an associated BYE message indicating dialog completion. Retransmissions of the 200 OK and ACK messages due to network impairments do not reset the metric timers.

15.4.13.5.2 Failed Session Completion SDT
In some cases, no response is received after a session completion message is sent and potentially retried. In this case, SDT is defined as the interval between receiving the first bit of a 200 OK response to an INVITE, and the resulting timer F expiration.

486, 600, and 603 were chosen because they clearly indicate the effect of an individual user of the UA. It is possible an individual user could cause a negative effect on the UA. The SEER is calculated using the following formula:

SEER = (Number of INVITE requests with associated 200, 480, 486, 600, or 603) / (Total number of INVITE requests − Number of INVITE requests with 3xx response) × 100

15.4.13.8 Ineffective Session Attempts
ISAs occur when a proxy or agent internally releases a setup request with a failed or overloaded condition. This metric is similar to Ineffective Machine Attempts (IMAs) in telephony applications of SIP. The output value of this metric is numerical and should be adjusted to indicate a percentage of ISAs. The following failure responses provide a guideline for this criterion:
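The delay formulas (SRD, SDD, SDT) and the SEER ratio given in Section 15.4.13 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the use of floating-point seconds for timestamps, and the sample values are hypothetical and are not defined by RFC 6076. The set of response codes counted in the SEER numerator (200, 480, 486, 600, 603) follows RFC 6076.

```python
from collections import Counter

# Final response codes counted in the SEER numerator (per RFC 6076).
SEER_EFFECTIVE_CODES = (200, 480, 486, 600, 603)


def srd(t_invite_first_bit, t_status_response_last_bit):
    """Session Request Delay: status-indicative response time minus INVITE time."""
    return t_status_response_last_bit - t_invite_first_bit


def sdd(t_bye_first_bit, t_2xx_or_timeout):
    """Session Disconnect Delay: 2xx (or timeout) time minus BYE time."""
    return t_2xx_or_timeout - t_bye_first_bit


def sdt(t_200_ok, t_bye_or_timeout):
    """Session Duration Time: BYE (or timeout) time minus 200 OK time."""
    return t_bye_or_timeout - t_200_ok


def seer(final_codes):
    """SEER as a percentage.

    final_codes holds one final status code per INVITE request
    (a hypothetical input format chosen for this sketch).
    """
    counts = Counter(final_codes)
    # INVITE requests answered with a 3xx are removed from the denominator.
    redirected = sum(n for code, n in counts.items() if 300 <= code <= 399)
    denominator = len(final_codes) - redirected
    if denominator <= 0:
        raise ValueError("no non-redirected INVITE requests to evaluate")
    effective = sum(counts[code] for code in SEER_EFFECTIVE_CODES)
    return effective / denominator * 100


if __name__ == "__main__":
    print(srd(10.0, 10.5))                    # 0.5 seconds
    print(seer([200, 200, 486, 302, 503]))    # 75.0 percent
```

In the usage lines, five INVITE requests are observed; the 302 is excluded from the denominator, and three of the remaining four end in a code from the SEER set.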
15.5 SDP Media Streams Mapping to QOS Flows

15.5.1 Overview
Resource reservation protocols assign network resources to particular flows of IP packets. When a router receives an IP packet, it applies a filter in order to map the packet to the flow to which it belongs. The router provides the IP packet with the QOS corresponding to its flow. Routers typically use the source and the destination IP addresses and port numbers to filter packets. Multimedia sessions typically contain multiple media streams (e.g., an audio stream and a video stream). To provide QOS for a multimedia session, it is necessary to map all the media streams to resource reservation flows. This mapping can be performed in different ways. Two possible ways are to map all the media streams to a single resource reservation flow or to map every single media stream to a different resource reservation flow. Some applications require that the former type of mapping is performed, while other applications require the latter. It is even possible that a mixture of both mappings is required for a particular media session. For instance, a multimedia session with three media streams might require that two of them are mapped into a single reservation flow while the third media stream uses a second reservation flow. RFC 3524, described here, defines the SDP syntax needed to express how media streams need to be mapped into reservation flows. It allows requesting a group of media streams to be mapped into a single resource reservation flow. It also specifies a new semantics attribute called Single Reservation Flow (SRF).

15.5.3 Applicability Statement
The way resource reservation works in some scenarios makes it unnecessary to use the mechanism described in this document. Some resource reservation protocols allow the entity generating the SDP session description to allocate resources in both directions (i.e., sendrecv) for the session. In this case, the generator of the session description can choose any particular mapping of media flows and reservation flows. The mechanism described in this document is useful when the remote party needs to be involved in the resource reservation.

15.5.4 Examples
For this example, we have chosen to use SIP to transport SDP sessions and RSVP (RFC 2205) to establish reservation flows. However, other protocols or mechanisms could be used instead without affecting the SDP syntax. A UA receives a SIP INVITE with the SDP below:

v=0
o=Laura 289083124 289083124 IN IP4 one.example.com
t=0 0
c=IN IP4 192.0.0.1
a=group:SRF 1 2
m=audio 30000 RTP/AVP 0
a=mid:1
m=video 30002 RTP/AVP 31
a=mid:2

This UA uses RSVP to perform resource reservation. Since both media streams are part of an SRF group, the UA will establish a single RSVP session. An RSVP session is defined by the triple: (DestAddress, ProtocolId[, DstPort]). Table 15.12 shows the parameters used to establish the RSVP session. If the same UA received an SDP session description with the same media streams but without the group line, it would be free to map the two media streams into two different RSVP sessions.

Table 15.12 Parameters Needed to Establish the RSVP Session

Session Number   DestAddress   ProtocolId   DstPort
1                192.0.0.1     UDP          Any
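The mapping in the example of Section 15.5.4 can be sketched in Python. This is a minimal illustration, not an RSVP implementation: the function name plan_rsvp_sessions and the dictionary layout are hypothetical, only a session-level c= line is handled, and ungrouped streams are given one session each, which is just one of the choices the UA is free to make.

```python
SDP = """\
v=0
o=Laura 289083124 289083124 IN IP4 one.example.com
t=0 0
c=IN IP4 192.0.0.1
a=group:SRF 1 2
m=audio 30000 RTP/AVP 0
a=mid:1
m=video 30002 RTP/AVP 31
a=mid:2
"""


def plan_rsvp_sessions(sdp):
    """Return one planned RSVP session per SRF group or ungrouped stream."""
    dest = None
    srf_groups = []  # each entry is the list of mid values in one SRF group
    media = []       # one dict per m= line
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("c=") and dest is None:
            dest = line.split()[-1]            # connection address
        elif line.startswith("a=group:SRF"):
            srf_groups.append(line.split()[1:])
        elif line.startswith("m="):
            kind, port = line[2:].split()[:2]
            media.append({"kind": kind, "port": int(port), "mid": None})
        elif line.startswith("a=mid:") and media:
            media[-1]["mid"] = line[len("a=mid:"):]

    sessions = []
    grouped = set()
    # All streams of an SRF group share a single reservation flow.
    for mids in srf_groups:
        sessions.append({"dest": dest, "protocol": "UDP",
                         "dst_port": "Any", "mids": mids})
        grouped.update(mids)
    # Assumption for this sketch: each ungrouped stream gets its own session.
    for m in media:
        if m["mid"] not in grouped:
            sessions.append({"dest": dest, "protocol": "UDP",
                             "dst_port": m["port"], "mids": [m["mid"]]})
    return sessions


if __name__ == "__main__":
    for session in plan_rsvp_sessions(SDP):
        print(session)
```

With the group line present, the sketch yields a single session whose parameters match Table 15.12; with the group line removed, it yields two sessions, one per media stream.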
15.6 QOS Mechanism Selection in SDP

15.6.1 Overview
The offer–answer model (RFC 3264, see Section 3.8.4) for SDP (RFC 4566, see Section 7.7) does not provide any mechanism for end points to negotiate the QOS mechanism to be used for a particular media stream. Even when QOS preconditions (RFC 3312, see Section 15.4) are used, the choice of the QOS mechanism is left unspecified and is up to the end points. End points that support more than one QOS mechanism need a way to negotiate which one to use for a particular media stream. Examples of QOS mechanisms are RSVP (RFC 2205) and NSIS (RFC 5974). RFC 5432, described herein, defines a mechanism that allows end points to negotiate the QOS mechanism to be used for a particular media stream. However, the fact that end points agree on a particular QOS mechanism does not imply that that particular mechanism is supported by the network. In any case, the information the end points exchange to negotiate QOS mechanisms, as defined in RFC 5432, can be useful for a network operator to resolve a subset of the QOS interoperability problem, namely, to ensure that a mechanism commonly acceptable to the end points is chosen and make it possible to debug potential misconfiguration situations. Note that discovering which QOS mechanisms are supported at the network layer is out of the scope of RFC 5432.

15.6.2 SDP Attribute Definitions
RFC 5432 defines the qos-mech-send and qos-mech-recv session and media-level SDP (RFC 4566, see Section 7.7) attributes. The following is their Augmented Backus-Naur Form (RFC 5234) syntax, which is based on the SDP (RFC 4566) grammar:

attribute          =/ qos-mech-send-attr
attribute          =/ qos-mech-recv-attr
qos-mech-send-attr = "qos-mech-send" ":" [[SP] qos-mech *(SP qos-mech)]
qos-mech-recv-attr = "qos-mech-recv" ":" [[SP] qos-mech *(SP qos-mech)]
qos-mech           = "rsvp" / "nsis" / extension-mech
extension-mech     = token

The qos-mech token identifies a QOS mechanism that is supported by the entity generating the session description. A token that appears in a qos-mech-send attribute identifies a QOS mechanism that can be used to reserve resources for traffic sent by the entity generating the session description. A token that appears in a qos-mech-recv attribute identifies a QOS mechanism that can be used to reserve resources for traffic received by the entity generating the session description. The qos-mech-send and qos-mech-recv attributes are not interdependent; one can be used without the other. The following is an example of an m= line with qos-mech-send and qos-mech-recv attributes:

m=audio 50000 RTP/AVP 0
a=qos-mech-send: rsvp nsis
a=qos-mech-recv: rsvp nsis

15.6.3 Offer–Answer Behavior
Through the use of the qos-mech-send and qos-mech-recv attributes, an offer–answer exchange allows end points to come up with a list of common QOS mechanisms sorted by preference. However, note that end points negotiate in which direction QOS is needed using other mechanisms, such as preconditions (RFC 3312, see Section 15.4). End points may also use other mechanisms to negotiate, if needed, the parameters to use with a given QOS mechanism (e.g., bandwidth to be reserved).

15.6.3.1 Offerer Behavior
Offerers include a qos-mech-send attribute with the tokens corresponding to the QOS mechanisms (in order of preference) that are supported in the send direction. Similarly, offerers include a qos-mech-recv attribute with the tokens corresponding to the QOS mechanisms (in order of preference) that are supported in the receive direction.

15.6.3.2 Answerer Behavior
On receiving an offer with a set of tokens in a qos-mech-send attribute, the answerer takes those tokens corresponding to QOS mechanisms that it supports in the receive direction and includes them, in order of preference, in a qos-mech-recv attribute in the answer. On receiving an offer with a set of tokens in a qos-mech-recv attribute, the answerer takes those tokens corresponding to QOS mechanisms that it supports in the send direction and includes them, in order of preference, in a qos-mech-send attribute in the answer. When ordering the tokens in a qos-mech-send or a qos-mech-recv attribute by preference, the answerer may take into account its own preferences and those expressed in the offer. However, the exact algorithm to be used to order such token lists is outside the scope of this specification. Note that if the answerer does not have any QOS mechanism in common with the offerer, it will return empty qos-mech-send and qos-mech-recv attributes.
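The answerer behavior of Section 15.6.3.2 can be sketched in Python. This is an illustration under assumptions: the helper names (parse_qos_mech, format_qos_mech, build_answer_attributes) are hypothetical, and the answer simply keeps the offerer's order of preference, which is one possible choice since the exact ordering algorithm is outside the scope of the specification.

```python
def parse_qos_mech(sdp_lines, attr):
    """Return the token list of the a=<attr>: line, or None if it is absent."""
    prefix = "a=" + attr + ":"
    for line in sdp_lines:
        if line.startswith(prefix):
            return line[len(prefix):].split()
    return None


def format_qos_mech(attr, tokens):
    """Build an attribute line; an empty token list gives an empty attribute."""
    return "a=" + attr + ":" + (" " + " ".join(tokens) if tokens else "")


def build_answer_attributes(offer_lines, supported_send, supported_recv):
    """Derive the answer's qos-mech attributes from the offer.

    Offered send tokens are matched against what the answerer supports in
    the receive direction, and offered recv tokens against what it
    supports in the send direction.
    """
    answer = []
    offered_send = parse_qos_mech(offer_lines, "qos-mech-send")
    offered_recv = parse_qos_mech(offer_lines, "qos-mech-recv")
    if offered_send is not None:
        common = [m for m in offered_send if m in supported_recv]
        answer.append(format_qos_mech("qos-mech-recv", common))
    if offered_recv is not None:
        common = [m for m in offered_recv if m in supported_send]
        answer.append(format_qos_mech("qos-mech-send", common))
    return answer


if __name__ == "__main__":
    offer = [
        "m=audio 50000 RTP/AVP 0",
        "a=qos-mech-send: rsvp nsis",
        "a=qos-mech-recv: rsvp nsis",
    ]
    # Hypothetical answerer: receives with rsvp or nsis, sends with nsis only.
    for line in build_answer_attributes(offer, ["nsis"], ["rsvp", "nsis"]):
        print(line)
```

When the two sides share no mechanism, both attributes are still returned, with empty token lists.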
540 ◾ Handbook on Session Initiation Protocol
SDP. When compression is used in SIP, the compression achieves its maximum rate once a few message exchanges have taken place. This is because the first message the compressor sends to the decompressor is only partially compressed, as there is not a previously stored state to compress against. As the goal is to compress as much as possible, it seems sensible to investigate a mechanism to boost the compression rate from the first message. RFC 3485 defines a static dictionary standard for the text-based SIP and SDP protocols. The dictionary is to be used in conjunction with SIP, SDP, and SigComp (RFC 3320). The static SIP/SDP dictionary constitutes a SigComp state that can be referenced in the first SIP message that the compressor sends out. The dictionary is compression algorithm independent and provides higher efficiency for compression of SIP/SDP messages.

Note that a SIP client sending a request to a SIP server typically may perform a DNS lookup for the domain name of the server. When Naming Authority Pointer or Service records are available for the server, the client can specify the type of service it wants. The service in this context is the transport protocol to be used by SIP (e.g., User Datagram Protocol [UDP], TCP, or Stream Control Transmission Protocol). A SIP server that supports different transport protocols will have different DNS entries. Since it is foreseen that the number of transport protocols supported by a particular application layer protocol is not going to grow dramatically, having a DNS entry per transport seems like a scalable enough solution. However, sometimes it is necessary to include new layers between the transport protocol and the application layer protocol. Examples of these layers are transport layer security and compression. If DNS were used to discover the availability of these layers for a particular server, the number of DNS entries needed for that server would grow dramatically. It can be seen that, in supporting all different transport protocols, it will not be scalable with a growing list of DNS entries because each entry for each transport protocol needs to show, for example, whether or not the transport layer security protocol and signaling compression (SigComp) algorithms along with their port numbers are used for each transport protocol or a combination of different transport protocols.

RFC 3486 describes a mechanism to signal that compression is desired for one or more SIP messages. It also states when it is appropriate to send compressed SIP messages to a SIP entity. The SIP/SDP statically defined dictionary standard for compression uses the SigComp (RFC 3320) framework for compressing messages generated by application protocols. The SigComp framework is intentionally designed to be application agnostic so that it can be applied to any application protocol. Consequently, many application-dependent specifics are left out of the base standard. It is intended that a separate specification be used to describe those specifics when SigComp is applied to a particular application protocol. As a result, the SIP/SDP compression standard needs to be more specific to the SIP/SDP signaling messages for optimizing the compression further. RFC 5049 describes some specifics that apply when Signaling Compression is applied to SIP, such as default minimum values of SigComp parameters, compartment and state management, and a few issues on SigComp over TCP. Any implementation of SigComp for use with SIP must conform to this specification (RFC 5049) and SigComp, and in addition, support the SIP and SDP static dictionary.

Note that the static SIP/SDP dictionary constitutes a SigComp state that can be referenced in the first SIP message that the compressor sends out, boosting the compression of SIP and SDP, but, unfortunately, it does not have any effect on XML-based presence documents. SIP is extended by the SIP events notification framework to provide subscriptions and notifications of SIP events. One example of such an event notification mechanism is presence, which is expressed in XML documents called presence documents. Typically, presence documents can contain large amounts of data. The size of this data is dependent on the number of presentities that a watcher is subscribed to and the amount of information supplied by the presentity. As a result, this can impose a problem in environments where resources are scarce (e.g., low-bandwidth links with high latency), and the presence service is offered at low or no cost. This is the case, for example, of some wireless networks and devices. It is reasonable to try to minimize the impact of bringing the presence service to wireless networks under these circumstances. Work has been done to mitigate the impact of transferring large amounts of presence documents between end points. For example, the partial PIDF (Presence Information Data Format) reduces the amount of data transferred between the end points. RFC 5112 defines the presence-specific static dictionary that SigComp can use in order to compress presence documents to achieve higher efficiency. The dictionary is compression algorithm independent. The detailed description of SIP, SDP, SIP event packages, and Presence is beyond the scope of this section.

15.8 Summary
We have described both resource-priority and QOS of the SIP network that are related to the application layer. The different SIP calls that will have different priority levels are expressed in the SIP Resource-Priority header. However, these priority levels are defined by different authorities based on their policies. We have described the IANA-registered namespaces, namely DSN, DRSN, Q735, ETS, and WPS, that can be designated in the Resource-Priority header. The implication of allocations of network resources is that SIP calls may even need to be preempted for providing services based on priority
levels if there are insufficient resources to support all the calls concurrently. We have described the behavior of SIP entities in handling the priority-based calls, including dealing with preemption. In addition, preemption events that represent instances where network routers between SIP UAs preempt calls/sessions for the interfaces that handle SIP traffic are explained in two areas: NPEs and APEs.

The SIP session cannot be successfully established if there are insufficient resources over the network should QOS requirements in the SIP application layer need to be met. We have described the preconditions in SIP in detail for supporting QOS over SIP networks. We have explained how SIP application QOS requirements need to be met using preconditions invoking the network layer QOS signaling protocols such as RSVP for appropriate resource reservation at the time of the session setup. However, SIP QOS requirements for different media streams are negotiated between the SIP end points. SDP that is carried over the SIP message body of the signaling messages expresses the QOS parameters for different media used by multimedia applications. We have described how the SDP offer–answer model is used for negotiating QOS between SIP UAs in addition to mapping of QOS flows to the SDP media streams appropriately. The SIP and SDP compression that saves the bandwidth for the signaling traffic is described. Finally, authentication and authorization, confidentiality and integrity, anonymity, and denial-of-service attacks that are very critical for users for offering QOS over SIP networks are explained elaborately.

PROBLEMS
1. Why is priority of the call/session needed over the SIP network? Describe the detailed features of the SIP Resource-Priority and Accept-Resource-Priority headers, including the option tags.
2. Describe the behavior of all SIP elements that receive prioritized requests, including preemption and priority queuing. What can be the possible error conditions in handling the priority over the SIP network? Explain in detail.
3. Explain the SIP UAC and UAS behavior in detail with preemption algorithm and queuing priority. What is the proxy behavior in handling the SIP Resource-Priority header?
4. How does the third-party authentication work in requesting the resource priority? Are the Resource-Priority and Accept-Resource-Priority headers backwards compatible in SIP? Explain with call flows.
5. How are the multiple concurrent namespaces handled in SIP? Explain the general rules, valid and invalid orderings.
6. What is the QOS namespace? Why are there multiple namespaces? Explain with detailed examples the following namespaces: DSN, DRSN, Q735, ETS, and WPS.
7. Explain the security aspects in offering resource-priority over the SIP networks: authentication and authorization, confidentiality and integrity, anonymity, and denial-of-service attacks.
8. What is a preemption event in SIP? What are the access, network, and hybrid-infrastructure preemption events? Explain in detail including their appropriate reason codes.
9. Why is the precondition needed in SIP in offering QOS over the SIP network? How is the precondition in QOS related to the preemption? How is preemption related to the priority levels in SIP networks?
10. How is the SDP offer–answer model used in QOS negotiations with preconditions between the SIP end points? Explain with call flows.
11. Explain with call flows how the suspension, resumption, and rejection of session establishment is done in offering QOS with preconditions over the SIP network.
12. How is the unknown precondition type used over the SIP network in offering QOS? How are the multiple preconditions per media handled? How is the option tag for precondition used? How are the capabilities of the SIP end points indicated? Explain with examples the following: end-to-end status type, segmented status type, and offer in a SIP response.
13. Explain in detail with examples how SDP media streams are mapped to QOS flows.
14. Explain how QOS mechanisms are selected in SDP, including the definition of SDP attributes. Describe the SDP offer–answer behavior of offerer and answerer with examples. How is resource reservation done? How are the subsequent offer–answer exchanges done?
15. Explain in brief all aspects of SDP and SIP compression.

References
1. International Telecommunications Union, “Stage 3 description for community of interest supplementary services using Signaling System No. 7: Multi-level Precedence and Preemption,” Recommendation Q.735.3, March 1993.
2. International Telecommunications Union, “Integrated Services Digital Network (ISDN) - General structure and service capabilities - Multi-level Precedence and Preemption,” Recommendation I.255.3, July 1990.
3. Usage of cause and location in the Digital Subscriber Signalling System No. 1 and the Signalling System No. 7 ISDN User Part. ITU-T Recommendation Q.850, 1998.
Chapter 16
more multimedia-rich call services using SIP. Some examples of SIP INFO Packages are provided to show the creation of more multimedia-rich intelligent services. Interestingly, UUI call services, which transfer user information at the time of the call setup as opposed to after the establishment of the session as in the case of INFO, are also described. A new SIP header has been created specifically to indicate this UUI that usually restricts the transfer of user information to a few hundred bytes for providing interoperability with public switched telephone network/Integrated Services Digital Network (PSTN/ISDN) protocols. Call services using DTMF are also briefly explained. Finally, emergency call services using SIP and IP networks are also described.

16.2 Call Transfer and Related Call Services

16.2.1 Overview
Some examples of relatively simple SIP call services are provided in Request for Comment (RFC) 5359: Call Hold, 3-Way Conference, Find-Me, Music on Hold, Incoming Call Screening, Unattended Transfer, Outgoing Call Screening, Attended Transfer, Call Park, Instant Messaging Transfer, Call Pickup, Unconditional Call Forwarding, Automatic Redial, Call Forwarding on Busy, Click to Dial, and Call Forwarding on No Answer. However, the more complicated call services of RFC 5589 are described here. We describe the Call Transfer capabilities in SIP using the REFER method and Replaces header to provide a number of transfer services, including blind transfer, consultative transfer, and attended transfer. This work is part of the SIP multiparty call control framework. The mechanisms discussed here are most closely related to traditional, basic, and consultation hold transfers. This chapter details the use of the REFER method and Replaces header field to achieve call transfer. A user agent (UA) that fully supports the transfer mechanisms described in this chapter supports REFER and Replaces in addition to other capabilities of SIP. A UA should use a Contact Uniform Resource Identifier (URI) and support the Target-Dialog header field (see Section 2.8).

16.2.2 Actors and Roles
There are three actors in a given transfer event, each playing one of the following roles:

◾◾ Transferee: the party being transferred to the Transfer Target
◾◾ Transferor: the party initiating the transfer
◾◾ Transfer Target: the new party being introduced into a call with the Transferee

The following roles are used to describe transfer requirements and scenarios:

◾◾ Originator: wishes to place a call to the Recipient. This actor is the source of the first INVITE in a session, to either a Facilitator or a Screener.
◾◾ Facilitator: receives a call or out-of-band request from the Originator, establishes a call to the Recipient through the Screener, and connects the Originator to the Recipient. Typically, a Facilitator acts on behalf of the Originator.
◾◾ Screener: receives a call ultimately intended for the Recipient and transfers the calling party to the Recipient, if appropriate. Typically, a Screener acts on behalf of the Recipient.

16.2.3 Requirements
1. Any party in a SIP session must be able to transfer any other party in that session at any point in that session.
2. The Transferor and the Transferee must not be removed from a session as part of a transfer transaction. At first glance, this requirement may seem to indicate that the user experience in a transfer must be significantly different from what a current Private Branch Exchange (PBX) or Centrex user expects. As the call flows in this document show, this is not the case. A client may preserve the current experience. In fact, without this requirement, some forms of the current experience (ringback on transfer failure, for instance) will be lost.
3. The Transferor must know whether or not the transfer was successful.
4. The Transferee must be able to replace an existing dialog with a new dialog.
5. The Transferor and Transferee should indicate their support for the primitives required to achieve transfer.
6. The Transferor should provide the Transfer Target and Transferee with information about the nature and progress of the transfer operation being attempted.

To meet these requirements, the transfer operation can be modeled as an ad hoc conference between three parties, as discussed later.

16.2.4 Using REFER to Achieve Call Transfer
A REFER can be issued by the Transferor to cause the Transferee to issue an INVITE to the Transfer Target. Note that a successful REFER transaction does not terminate the session between the Transferor and the Transferee. If those parties wish to terminate their session, they must do so with a subsequent BYE request. The media negotiated between the Transferee and the Transfer Target is not affected by
Call Services in SIP ◾ 545
the media that had been negotiated between the Transferor and the Transferee. In particular, the INVITE issued by the Transferee will have the same SDP body it would have if the Transferee had initiated that INVITE on its own. Furthermore, the disposition of the media streams between the Transferor and the Transferee is not altered by the REFER method. Agents may alter a session’s media through additional signaling. For example, they may make use of the SIP hold re-INVITE or conferencing extensions described in the conferencing framework (RFC 4353).

To perform the transfer, the Transferor and Transferee could reuse an existing dialog established by an INVITE to send the REFER. This would result in a single dialog shared by two uses: an invite usage and a subscription usage. The call flows for this are shown in detail later. However, the approach described in this document is to avoid dialog reuse. The issues and difficulties associated with dialog reuse are described in RFC 5057 (see Section 3.6.5). Motivations for reusing the existing dialog include the following:

1. There was no way to ensure that a REFER on a new dialog would reach the particular end point involved in a transfer. Many factors, including details of implementations and changes in proxy routing between an INVITE and a REFER, could cause the REFER to be sent to the wrong place. Sending the REFER down the existing dialog ensured it got to the end point to which we were already talking.
2. It was unclear how to associate an existing invite usage with a REFER arriving on a new dialog, where it was completely obvious what the association was when the REFER came on the INVITE usage’s dialog.
3. There were concerns with authorizing out-of-dialog REFERs. The authorization policy for REFER in most implementations piggybacks on the authorization policy for INVITE (which is, in most cases, based simply on “I placed or answered this call”).

Globally Routable UA URIs (GRUUs) (see Section 4.2) can be used to address problem 1. Problem 2 can be addressed using the Target-Dialog header field (see Section 2.8). In the immediate term, this solution to problem 2

a response), or if it satisfies the properties described in the GRUU specification (see Section 4.3).

This document does not prescribe the flows and examples precisely as they are shown, but rather the flows illustrate the principles for best practice for the transfer feature. The call flows represent well-reviewed examples of SIP usage to implement transfer with REFER, which are the Best Common Practice according to Internet Engineering Task Force (IETF) consensus. In most of the following examples, the Transferor is in the atlanta.example.com domain, the Transferee is in the biloxi.example.com domain, and the Transfer Target is in the chicago.example.com domain.

16.2.5 Basic Transfer
Basic Transfer consists of the Transferor providing the Transfer Target’s contact to the Transferee. The Transferee attempts to establish a session using that contact and reports the results of that attempt to the Transferor. The signaling relationship between the Transferor and Transferee is not terminated, so the call is recoverable if the Transfer Target cannot be reached. Note that the Transfer Target’s contact information has been exposed to the Transferee. The provided contact can be used to make new calls in the future. The participants in a basic transfer should indicate support for the REFER and NOTIFY methods in Allow header fields in INVITE, 200 OK to INVITE, and OPTIONS messages. Participants should also indicate support for Target-Dialog in the Supported header field. The diagrams below show the first line of each message. The first column of the figure shows the dialog used in that particular message. In these diagrams, media is managed through re-INVITE holds; however, other mechanisms (e.g., mixing multiple media streams at the UA or using the conferencing extensions) are valid.

Selected message details are shown, labeled as message F1, F2, etc. Each of the flows below shows the dialog between the Transferor and the Transferee remaining connected (on hold) during the REFER process. While this provides the greatest flexibility for recovery from failure, it is not necessary. If the Transferor’s agent does not wish to participate in the remainder of the REFER process and
allows the existing REFER authorization policy to be has no intention of assisting with recovery from transfer
reused. As a result, if the Transferee supports the target- failure, it could emit a BYE to the Transferee as soon as
dialog extension and the Transferor knows the Contact the REFER transaction completes. This flow is sometimes
URI is routable outside the dialog, the REFER should be known as unattended transfer or blind transfer. Figure 16.1
sent in a new dialog. If the nature of the Contact URI is shows transfer when the Transferee utilizes a GRUU and
not known or if support for the target-dialog extension is supports the target-dialog extension and indicates this to
not known, the REFER should be sent inside the existing the Transferor. As a result, the Transferor sends the REFER
dialog. A Transferee must be prepared to receive a REFER outside the INVITE dialog. The Transferee is able to match
either inside or outside a dialog. One way that a Transferor this REFER to the existing dialog using the Target-Dialog
could know that a Contact URI is routable outside a dialog header field in the REFER, which references the existing
is by validation (e.g., sending an OPTIONS and receiving dialog.
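The guidance above on where to send the REFER reduces to a small decision rule. The following is a minimal Python sketch of that rule, not a normative algorithm. The function names and the use of None to mean "not known" are assumptions made for illustration.

```python
from typing import Optional


def choose_refer_placement(supports_tdialog: Optional[bool],
                           contact_routable: Optional[bool]) -> str:
    """Decide where the Transferor sends the REFER.

    Each argument is True, False, or None (None means "not known").
    """
    # A new dialog is used only when both facts are positively known.
    if supports_tdialog is True and contact_routable is True:
        return "new-dialog"
    # Unknown or negative on either point: stay inside the existing dialog.
    return "existing-dialog"


def supports_tdialog_from_header(supported_value: str) -> bool:
    """Check a Supported header field value for the tdialog option tag."""
    tags = [tag.strip().lower() for tag in supported_value.split(",")]
    return "tdialog" in tags
```

For example, a Transferee that advertised "Supported: replaces, gruu, tdialog" and whose Contact URI is known to be routable outside the dialog would lead to the "new-dialog" result; any uncertainty falls back to the existing dialog.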
Figure 16.1 call flow (participants: Transferor, Transferee, Transfer Target; the dialog used by
each message is given in brackets):
F1. INVITE [Dialog 1]
F2. 200 OK [Dialog 1]
F3. ACK [Dialog 1]
F4. INVITE (hold) [Dialog 1]
F5. 200 OK [Dialog 1]
F6. ACK [Dialog 1]
F7. REFER (Target-Dialog 1) [Dialog 2]
F8. 202 Accepted [Dialog 2]
F9. NOTIFY (100 Trying) [Dialog 2]
F10. 200 OK [Dialog 2]
F11. INVITE [Dialog 3]
F12. 200 OK [Dialog 3]
F13. ACK [Dialog 3]
F14. NOTIFY (200 OK) [Dialog 2]
F15. 200 OK [Dialog 2]
F16. BYE [Dialog 1]
F17. 200 OK [Dialog 1]
F18. BYE [Dialog 3]
F19. 200 OK [Dialog 3]

Figure 16.1 Basic transfer call flow. (Copyright IETF. Reproduced with permission.)

16.2.5.1 Successful Transfer
Figure 16.1 shows the call flow of a successful transfer in which the Transferee uses a GRUU and
both parties support the target-dialog extension. Message details for the successful call
transfer follow.

F1 INVITE Transferee -> Transferor

INVITE sips:[email protected] SIP/2.0
Via: SIP/2.0/TLS 192.0.2.4;branch=z9hG4bKnas432
Max-Forwards: 70
To: <sips:[email protected]>
From: <sips:[email protected]>;tag=7553452
Call-ID: 090459243588173445
CSeq: 29887 INVITE
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, NOTIFY
Supported: replaces, gruu, tdialog
Contact: <sips:[email protected];gr=3413kj2ha>
Content-Type: application/sdp
Content-Length: ...

F2 200 OK Transferor -> Transferee

SIP/2.0 200 OK
Via: SIP/2.0/TLS 192.0.2.4;branch=z9hG4bKnas432
To: <sips:[email protected]>;tag=31kdl4i3k
From: <sips:[email protected]>;tag=7553452
Call-ID: 090459243588173445
CSeq: 29887 INVITE
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, NOTIFY
Supported: replaces, gruu, tdialog
Contact: <sips:4889445d8kjtk3@atlanta.example.com;gr=723jd2d>
Content-Type: application/sdp
Content-Length: ...

F3 REFER Transferor -> Transferee

REFER sips:[email protected];gr=3413kj2ha SIP/2.0
Via: SIP/2.0/TLS pc33.atlanta.example.com;branch=z9hG4bKna9
Max-Forwards: 70
To: <sips:[email protected];gr=3413kj2ha>
From: <sips:[email protected]>;tag=1928301774
Call-ID: a84b4c76e66710
CSeq: 314159 REFER
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, NOTIFY
Supported: gruu, replaces, tdialog
Require: tdialog
Refer-To: <sips:transfertarget@chicago.example.com>
Target-Dialog: 090459243588173445;local-tag=7553452;remote-tag=31kdl4i3k
Contact: <sips:4889445d8kjtk3@atlanta.example.com;gr=723jd2d>
Content-Length: 0
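How a Transferee might match an out-of-dialog REFER to the call it refers to can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not a complete SIP parser: the names DialogId, parse_target_dialog, and find_target_dialog are hypothetical, and the tags in Target-Dialog are assumed to be expressed from the recipient's point of view, which is what the F3 example above suggests (local-tag equals the tag the Transferee placed in its own From header field).

```python
from typing import Dict, NamedTuple, Optional


class DialogId(NamedTuple):
    call_id: str
    local_tag: str
    remote_tag: str


def parse_target_dialog(value: str) -> DialogId:
    """Split a Target-Dialog header field value into its three parts."""
    parts = [part.strip() for part in value.split(";")]
    call_id = parts[0]
    params = {}
    for part in parts[1:]:
        name, _, param_value = part.partition("=")
        params[name.strip().lower()] = param_value.strip()
    # Raises KeyError if either tag is missing from the header value.
    return DialogId(call_id, params["local-tag"], params["remote-tag"])


def find_target_dialog(value: str,
                       dialogs: Dict[DialogId, str]) -> Optional[str]:
    """Return the stored dialog state for the referenced dialog, if any."""
    return dialogs.get(parse_target_dialog(value))
```

With the values from F3, the header value "090459243588173445;local-tag=7553452;remote-tag=31kdl4i3k" parses to the Call-ID and tags of Dialog 1, so a Transferee that keyed its dialog table this way would find the held call and could apply its usual REFER authorization policy to it.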
After the transfer failure occurs, the Transferor takes the Transferee off hold and resumes the
session.

16.2.5.2.2 Target Busy
Figure 16.3 shows the call flow in which the transfer fails because the Transfer Target is busy.

16.2.5.2.3 Transfer Target Does Not Answer
Figure 16.4 shows the call flow in which the transfer fails because the Transfer Target does not
answer the call.

16.2.6 Transfer with Consultation Hold
Transfer with consultation hold involves a session between the Transferor and the Transfer Target
before the transfer actually takes place. This is implemented with SIP Hold and Transfer as
described above. A good feature is for the Transferor to let the target know that the session
relates to an intended transfer. Since many UAs render the display name in the From header field
to the user, a consultation INVITE could contain a string such as "Incoming consultation from
Transferor with intent to transfer Transferee," where the display names of the transferor and
transferee are included in the string.

16.2.6.1 Exposing Transfer Target
The Transferor places the Transferee on hold, establishes a call with the Transfer Target
(Figure 16.5) to alert them to the impending transfer, terminates the connection with the
Transfer Target, and then proceeds with transfer as above. This variation can be used to provide
an experience similar to that expected by current PBX and Centrex users. To (hopefully) improve
clarity, non-REFER transactions have been collapsed into one indicator with the arrow showing the
direction of the request.

Figure 16.3 call flow, messages F8 to F20 (the dialog used by each message is given in brackets):
F8. 202 Accepted [Dialog 2]
F9. NOTIFY (100 Trying) [Dialog 2]
F10. 200 OK [Dialog 2]
F11. INVITE [Dialog 3]
F12. 486 Busy Here [Dialog 3]
F13. ACK [Dialog 3]
F14. NOTIFY (486 Busy Here) [Dialog 2]
F15. 200 OK [Dialog 2]
F16. INVITE (unhold) [Dialog 1]
F17. 200 OK [Dialog 1]
F18. ACK [Dialog 1]
F19. BYE [Dialog 1]
F20. 200 OK [Dialog 1]

Figure 16.3 Failed transfer: target busy. (Copyright IETF. Reproduced with permission.)

16.2.6.2 Protecting Transfer Target
The Transferor places the Transferee on hold, establishes a call with the Transfer Target, and
then reverses their roles, transferring the original Transfer Target to the original Transferee.
This has the advantage of hiding information about the original Transfer Target from the original
Transferee. On the other hand, the Transferee's experience is different than in current systems.
The Transferee is effectively called back by the Transfer Target. One of the problems with this
simplest implementation of a target protecting transfer is that the Transferee is receiving a new
call from
the Transfer Target. Unless the Transferee's agent has a reliable way to associate this new call
with the call it already has with the Transferor, it will have to alert the new call on another
appearance. If this, or some other call-waiting-like user interface were not available, the
Transferee might be stuck returning a Busy-Here to the Transfer Target, effectively preventing
the transfer.

There are many ways that correlation could be provided. The dialog parameters could be provided
directly as header parameters in the Refer-To URI, for example. The Replaces mechanism (RFC 3891,
see Section 2.8.2) uses this approach and solves this problem nicely. For the call flows below
(Figure 16.6), dialog1 means dialog identifier 1, and consists of the parameters of the Replaces
header for dialog 1. In RFC 3891, this is the Call-ID, To-tag, and From-tag. Note that the
Transferee's agent emits a BYE to the Transferor's agent as an immediate consequence of
processing the Replaces header. The Transferor knows that both the Transferee and the Transfer
Target support the Replaces header from the Supported: replaces header contained in the 200 OK
responses from both. In this scenario, the Transferee utilizes a GRUU as a Contact URI for
reasons discussed earlier. Note that the conventions used in the SIP Torture Test Messages
(RFC 4475) document are reused, specifically the <allOneLine> tag.

Figure 16.6 call flow (participants: Transferor, Transferee, Transfer Target; the dialog used by
each message is given in brackets):
F1. INVITE [Dialog 1]
F2. 200 OK [Dialog 1]
F3. ACK [Dialog 1]
F4. INVITE (hold) [Dialog 1]
F5. 200 OK [Dialog 1]
F6. ACK [Dialog 1]
F7. INVITE [Dialog 2]
F8. 200 OK [Dialog 2]
F9. ACK [Dialog 2]
F10. INVITE (hold) [Dialog 2]
F11. 200 OK [Dialog 2]
F12. ACK [Dialog 2]
F13. REFER (Target-Dialog:2, Refer-To:sips:transferee?Replaces=1) [Dialog 3]
F14. 202 Accepted [Dialog 3]
F15. NOTIFY (100 Trying) [Dialog 3]
F16. 200 OK [Dialog 3]
F17. INVITE (Replaces:Dialog1) [Dialog 4]
F18. 200 OK [Dialog 4]
F19. ACK [Dialog 4]
F20. BYE [Dialog 1]
F21. 200 OK [Dialog 1]
F22. NOTIFY (200 OK) [Dialog 3]
F23. 200 OK [Dialog 3]
F24. BYE [Dialog 2]
F25. 200 OK [Dialog 2]
(Transferee and target converse)
F26. BYE [Dialog 4]
F27. 200 OK [Dialog 4]

Figure 16.6 Transfer protecting Transfer Target. (Copyright IETF. Reproduced with permission.)

F1 INVITE Transferee -> Transferor

INVITE sips:[email protected] SIP/2.0
Via: SIP/2.0/TLS 192.0.2.4;branch=z9hG4bKnas432
Max-Forwards: 70
To: <sips:[email protected]>
From: <sips:[email protected]>;tag=7553452
Call-ID: 090459243588173445
CSeq: 29887 INVITE
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, NOTIFY
Supported: replaces, gruu
Contact: <sips:[email protected];gr=3413kj2ha>
Content-Type: application/sdp
Content-Length: ...

F2 200 OK Transferor -> Transferee

SIP/2.0 200 OK
Via: SIP/2.0/TLS 192.0.2.4;branch=z9hG4bKnas432
To: <sips:[email protected]>;tag=31431
From: <sips:[email protected]>;tag=7553452
Call-ID: 090459243588173445
CSeq: 29887 INVITE
Target, the Require: replaces header field should be used in the triggered INVITE. (This is to
prevent an incorrect UA that does not support Replaces from ignoring the Replaces and answering
the INVITE without a dialog match.) It is possible that proxy/service routing may prevent the
triggered INVITE from reaching the same UA. If this occurs, the triggered INVITE will fail with a
timeout, a 403, a 404, or another error response (see Section 2.6). The Transferee may then retry
the transfer with the Refer-To URI set to the Contact URI.

Figure 16.7 call flow (participants: Transferor, Transferee, Transfer Target; the dialog used by
each message is given in brackets):
F1. INVITE [Dialog 1]
F2. 200 OK [Dialog 1]
F3. ACK [Dialog 1]
F4. INVITE (hold) [Dialog 1]
F5. 200 OK [Dialog 1]
F6. ACK [Dialog 1]
F7. INVITE [Dialog 2]
F8. 200 OK [Dialog 2]
F9. ACK [Dialog 2]
F10. INVITE (hold) [Dialog 2]
F11. 200 OK [Dialog 2]
F12. ACK [Dialog 2]
F13. REFER (Target-Dialog:2, Refer-To:sips:transferee?Replaces=2) [Dialog 3]
F14. 202 Accepted [Dialog 3]
F15. NOTIFY (100 Trying) [Dialog 3]
F16. 200 OK [Dialog 3]
F17. INVITE (Replaces:Dialog2) [Dialog 4]
F18. 200 OK [Dialog 4]
F19. ACK [Dialog 4]
F20. BYE [Dialog 2]
F21. 200 OK [Dialog 2]
F22. NOTIFY (200 OK) [Dialog 3]
F23. 200 OK [Dialog 3]
F24. BYE [Dialog 1]
F25. 200 OK [Dialog 1]
F26. BYE [Dialog 4]
F27. 200 OK [Dialog 4]

Figure 16.7 Attended transfer call flow. (Copyright IETF. Reproduced with permission.)

F1 INVITE Transferee -> Transferor

INVITE sips:[email protected] SIP/2.0
Via: SIP/2.0/TLS 192.0.2.4;branch=z9hG4bKnas432
Max-Forwards: 70
To: <sips:[email protected]>
From: <sips:[email protected]>;tag=7553452
Call-ID: 090459243588173445
CSeq: 29887 INVITE
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, NOTIFY
Supported: replaces, gruu, tdialog
Contact: <sips:[email protected];gr=3413kj2ha>
Content-Type: application/sdp
Content-Length: ...

F2 200 OK Transferor -> Transferee

SIP/2.0 200 OK
Via: SIP/2.0/TLS 192.0.2.4;branch=z9hG4bKnas432
To: <sips:[email protected]>;tag=31431
From: <sips:[email protected]>;tag=7553452
Call-ID: 090459243588173445
CSeq: 29887 INVITE
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, NOTIFY
Supported: replaces, gruu, tdialog
Contact: <sips:4889445d8kjtk3@atlanta.example.com;gr=723jd2d>
Content-Type: application/sdp
Content-Length: ...

F3 INVITE Transferor -> Transfer Target

INVITE sips:[email protected] SIP/2.0
Via: SIP/2.0/TLS pc33.atlanta.example.com;branch=z9hG4bKnas432
Max-Forwards: 70
To: <sips:[email protected]>
Figure 16.9 call flow (participants: Transferor, Transferee, Screening Proxy, Transfer Target;
the dialog used by each message is given in brackets):
F1. INVITE [Dialog 1]
F2. 200 OK [Dialog 1]
F3. ACK [Dialog 1]
F4. INVITE (hold) [Dialog 1]
F5. 200 OK [Dialog 1]
F6. ACK [Dialog 1]
F7. INVITE [Dialog 2]
F8. 200 OK [Dialog 2]
F9. ACK [Dialog 2]
F10. INVITE (hold) [Dialog 2]
F11. 200 OK [Dialog 2]
F12. ACK [Dialog 2]
F13. REFER (Refer-To:sips:TargetAOR?Replaces=dialog2&Require=replaces) [Dialog 1]
F14. 202 Accepted [Dialog 1]
F15. NOTIFY (100 Trying) [Dialog 1]
F16. 200 OK [Dialog 1]
F17. INVITE (Replaces:dialog2, Require:replaces) [Dialog 4]
F18. INVITE (Replaces:dialog2, Require:replaces) [Dialog 4]
F19. 200 OK [Dialog 4]
F20. 200 OK [Dialog 4]
F21. ACK [Dialog 4]
F22. ACK [Dialog 4]
F23. BYE [Dialog 2]
F24. BYE [Dialog 2]
F25. 200 OK [Dialog 2]
F26. 200 OK [Dialog 2]
F27. NOTIFY (200 OK) [Dialog 1]
F28. 200 OK [Dialog 1]
F29. BYE [Dialog 1]
F30. 200 OK [Dialog 1]
F31. BYE [Dialog 3]
F32. 200 OK [Dialog 3]

Figure 16.9 Attended transfer call flow with a contact URI not known to be globally routable.
(Copyright IETF. Reproduced with permission.)
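The Refer-To value carried by the REFER in this flow embeds Replaces and Require as URI headers, so the dialog identifier has to be percent-escaped. The following Python sketch shows one way a Transferor might assemble such a value. The function name build_refer_to and the sample Call-ID and tags are illustrative assumptions, not part of any SIP library.

```python
from urllib.parse import quote


def build_refer_to(target_uri: str, call_id: str, to_tag: str,
                   from_tag: str, require_replaces: bool = True) -> str:
    """Build a Refer-To value that asks the referee to replace a dialog."""
    # The Replaces header value identifies the dialog to be replaced.
    replaces = f"{call_id};to-tag={to_tag};from-tag={from_tag}"
    # ';' and '=' must be escaped when the value is embedded in a URI.
    embedded = "Replaces=" + quote(replaces, safe="")
    if require_replaces:
        # Forces a UA that does not support Replaces to reject the INVITE.
        embedded += "&Require=replaces"
    return f"<{target_uri}?{embedded}>"


refer_to = build_refer_to("sips:transfertarget@chicago.example.com",
                          "090459243588173445", "9m2n3wq", "763231")
print(refer_to)
```

Setting require_replaces to False would produce the plain Replaces form; the flow above uses Require=replaces because the target is addressed by AOR and the triggered INVITE might reach a different UA.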
the Transferor hangs up. Note that media must be played to the Transfer Target upon answer;
otherwise, the Target
support either REFER or Replaces, making attended transfer impossible. The Transferor then ends
dialog 2 by sending a BYE, then sends a REFER to the Transferee using the AOR URI of the Transfer
Target.

Figure 16.15 call flow (participants: Transferor, Transferee, Transfer Target; the dialog used by
each message is given in brackets):
F1. INVITE [Dialog 1]
F2. 200 OK [Dialog 1]
F3. ACK [Dialog 1]
F4. OPTIONS [Dialog 1]
F5. 200 OK [Dialog 1]
F6. INVITE (hold) [Dialog 1]
F7. 200 OK [Dialog 1]
F8. ACK [Dialog 1]
F9. INVITE [Dialog 2]
F10. 200 OK [Dialog 2]
F11. ACK [Dialog 2]
F12. OPTIONS [Dialog 2]
F13. 200 OK [Dialog 2]
F14. BYE [Dialog 2]
F15. 200 OK [Dialog 2]
F16. REFER (Target-Dialog:1, Refer-To:sips:TransferTarget) [Dialog 3]
F17. 202 Accepted [Dialog 3]
F18. NOTIFY (100 Trying) [Dialog 3]
F19. 200 OK [Dialog 3]
F20. INVITE [Dialog 4]
F21. 200 OK [Dialog 4]
F22. ACK [Dialog 4]
F23. NOTIFY (200 OK) [Dialog 3]
F24. 200 OK [Dialog 3]
F25. BYE [Dialog 1]
F26. 200 OK [Dialog 1]
F27. BYE [Dialog 4]
F28. 200 OK [Dialog 4]

Figure 16.15 Attended transfer fallback to basic transfer. (Copyright IETF. Reproduced with
permission.)

16.2.7 Transfer with Referred-By
In the previous examples, the Transfer Target does not have definitive information about what
party initiated the transfer, or, in some cases, even that transfer is taking place. The
Referred-By mechanism (RFC 3892, see Section 2.8) provides a way for the Transferor to provide
the Transferee with a way to let the Transfer Target know what party initiated the transfer. The
simplest and least secure approach just involves the inclusion of the Referred-By header field in
the REFER, which is then copied into the triggered INVITE. However, a more secure mechanism
involves the Referred-By security token, which is generated and signed by the Transferor and
passed in a message body to the Transferee then to the Transfer Target. The call flow in
Figure 16.16 shows the Referred-By header field and body in the REFER F5 and triggered INVITE F6.
Note that the Secure/Multipurpose Internet Mail Extensions (S/MIME) (see Section 2.8) signature
is not shown in the example below. The conventions used in the SIP Torture Test Messages
(RFC 4475) document are reused, specifically the <hex> and <allOneLine> tags.

F5 REFER Transferor -> Transferee

REFER sips:[email protected];gr=3413kj2ha SIP/2.0
Via: SIP/2.0/TLS pc33.atlanta.example.com;branch=z9hG4bK392039842
Max-Forwards: 70
To: <sips:[email protected];gr=3413kj2ha>
From: <sips:[email protected]>;tag=1928301774
Call-ID: a84b4c76e66710
CSeq: 314160 REFER
<allOneLine>
Refer-To: <sips:482n4z24kdg@chicago.example.com;gr=8594958
?Replaces=090459243588173445%3Bto-tag%3D9m2n3wq%3Bfrom-tag
%3D763231&Require=replaces>
</allOneLine>
Supported: gruu, replaces, tdialog
Require: tdialog
Referred-By: <sips:transferor@atlanta.example.com>
;cid="20398823.2UWQFN309shb3@atlanta.example.com"
Target-Dialog: 592435881734450904;local-tag=9m2n3wq;remote-tag=763231
Contact: <sips:4889445d8kjtk3@atlanta.example.com;gr=723jd2d>
Content-Type: multipart/mixed; boundary=unique-boundary-1
Content-Length: ...

--unique-boundary-1
Content-ID: <20398823.2UWQFN309shb3@atlanta.example.com>
Content-Length: 2961
Content-Type: multipart/signed;
protocol="application/pkcs-7-signature";
micalg=sha1;
boundary="----590F24D439B31E08745DEF0CD9397189"

------590F24D439B31E08745DEF0CD9397189
Content-Type: message/sipfrag

Date: Thu, 18 Sep 2003 13:07:43 GMT
<allOneLine>
Refer-To: <sips:482n4z24kdg@chicago.example.com;gr=8594958
?Replaces=090459243588173445%3Bto-tag%3D9m2n3wq%3Bfrom-tag%
3D763231&Require=replaces>
</allOneLine>
Referred-By: <sips:transferor@atlanta.example.com>
;cid="20398823.2UWQFN309shb3@atlanta.example.com"

------590F24D439B31E08745DEF0CD9397189
Content-Type: application/pkcs-7-signature; name="smime.p7s"
Content-Transfer-Encoding: binary
Content-Disposition: attachment; filename="smime.p7s"
<hex>
3082088806092A86
4886F70D010702A0820879308208750201013
10B300906052
B0E03021A050030
... (Signature not shown)
8E63D306487A740A197A3970594CF47DD385
643B1DC49FF767
A3D2B428388966
79089AAD95767F
</hex>
------590F24D439B31E08745DEF0CD9397189--
--unique_boundary-1

Figure 16.16 call flow (participants: Transferor, Transferee, Transfer Target; the dialog used by
each message is given in brackets):
F1. INVITE [Dialog 1]
F2. 200 OK [Dialog 1]
F3. ACK [Dialog 1]
F4. INVITE (hold) [Dialog 1]
F5. 200 OK [Dialog 1]
F6. ACK [Dialog 1]
F7. INVITE [Dialog 2]
F8. 200 OK [Dialog 2]
F9. ACK [Dialog 2]
F10. INVITE (hold) [Dialog 2]
F11. 200 OK [Dialog 2]
F12. ACK [Dialog 2]
F13. REFER (Target-Dialog:1, Referred-By:Transferor, Refer-To:sips:TransferTarget?Replaces=2) [Dialog 3]
F14. 202 Accepted [Dialog 3]
F15. NOTIFY (100 Trying) [Dialog 3]
F16. 200 OK [Dialog 3]
F17. INVITE (Replaces:Dialog2, Referred-By:Transferor) [Dialog 4]
F18. 200 OK [Dialog 4]
F19. ACK [Dialog 4]
F20. BYE [Dialog 2]
F21. 200 OK [Dialog 2]
F22. NOTIFY (200 OK) [Dialog 3]
F23. 200 OK [Dialog 3]
F24. BYE [Dialog 1]
F25. 200 OK [Dialog 1]
F26. BYE [Dialog 4]
F27. 200 OK [Dialog 4]

Figure 16.16 Attended transfer call flow with Referred-By. (Copyright IETF. Reproduced with
permission.)
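On the Transferee side, the embedded headers in the Refer-To URI of the REFER become header fields of the triggered INVITE, and the Referred-By header field is copied across. The sketch below illustrates only that header handling, in Python. The function name triggered_invite_headers is hypothetical, and carrying the signed message/sipfrag body into the INVITE is deliberately left out.

```python
from typing import List, Optional, Tuple
from urllib.parse import unquote


def triggered_invite_headers(
        refer_to: str,
        referred_by: Optional[str] = None) -> Tuple[str, List[Tuple[str, str]]]:
    """Return the Request-URI and extra headers for the triggered INVITE."""
    # Drop the surrounding angle brackets of the Refer-To value.
    uri = refer_to.strip().lstrip("<").rstrip(">")
    # Everything after '?' is a list of escaped header fields.
    target, _, embedded = uri.partition("?")
    headers: List[Tuple[str, str]] = []
    if embedded:
        for pair in embedded.split("&"):
            name, _, value = pair.partition("=")
            headers.append((name, unquote(value)))
    if referred_by:
        # Copied unchanged from the REFER so the target sees who referred.
        headers.append(("Referred-By", referred_by))
    return target, headers
```

Applied to the Refer-To value of F5 (joined onto one line as the allOneLine tag indicates), this yields the Transfer Target's GRUU as the Request-URI, a Replaces header of 090459243588173445;to-tag=9m2n3wq;from-tag=763231, and Require: replaces.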
Content-Type: application/sdp
Content-Length: 156

v=0
o=referee 2890844526 2890844526 IN IP4 referee.example
s=Session SDP
c=IN IP4 referee.example
t=0 0
m=audio 49172 RTP/AVP 0
a=rtpmap:0 PCMU/8000
--my-boundary-9
Content-Length: 2961
Content-Type: multipart/signed;
protocol="application/pkcs-7-signature";
micalg=sha1;
boundary="----590F24D439B31E08745DEF0CD9397189"

------590F24D439B31E08745DEF0CD9397189
Content-Type: message/sipfrag

Date: Thu, 18 Sep 2003 13:07:43 GMT
<allOneLine>
Refer-To: <sips:transfertarget@chicago.example.com;
Replaces=090459243588173445%3Bto-tag%3D9m2n3wq%3Bfrom-tag%
3D763231&Require=replaces>
</allOneLine>

16.2.8 Transfer as an Ad Hoc Conference
In this flow, shown in Figure 16.17, Bob does an attended transfer of Alice to Carol. To keep
both Alice and Carol fully informed of the nature and state of the transfer operation, Bob acts
as a focus (RFC 4579, see Sections 2.2 and 2.4.4.1) and hosts an ad hoc conference involving
Alice, Bob, and Carol. Alice and Carol subscribe to the conference package (RFC 4575) of Bob's
focus, which allows them to know the exact status of the operation. After the transfer operation
is complete, Bob deletes the conference. This call flow meets requirement 6 as described earlier.
NOTIFY messages related to the refer package are indicated as NOTIFY (refer), while NOTIFYs
related to the Conference INFO Package are indicated as NOTIFY (Conf-Info). Note that any type of
semi-attended transfer that requires media mixing or relaying could be implemented using this
model. In addition to simply mixing, the focus could introduce additional media signals such as
simulated ringtone or on-hold announcements to improve the user experience.

16.2.9 Transfer with Multiple Parties
In this example, shown in Figure 16.18, the Originator places a call to the Facilitator who
reaches the Recipient through the Screener. The Recipient's contact information is exposed to the
Facilitator and the Originator. This example
Figure 16.17 call flow, partial. Annotations: Bob places Alice on hold and is acting like a
focus; Carol subscribes to the conference package and learns Bob on hold; Bob begins transfer
operation. Messages:
F6. INVITE (hold) (Contact:Conf-ID; isfocus)
F7. 200 OK
F8. ACK
F18. SUBSCRIBE (sip:conf-ID)
F23. 200 OK
F24. REFER (Refer-To: Carol)
F25. Accepted
F38. 200 OK
F39. NOTIFY (Conf-Info)
F40. 200 OK

Figure 16.17 Attended transfer as an ad hoc conference. (Copyright IETF. Reproduced with permission.)
is provided for clarification of the semantics of the REFER method only, and it should not be
used as the design of an implementation.

16.2.10 Gateway Transfer Issues
A gateway in SIP acts as a UA. As a result, the entire preceding discussion and call flows apply
equally well to gateways as native SIP end points. However, there are some gateway-specific
issues that are documented in this section. While this discussion focuses on the common cases
involving PSTN gateways, similar situations exist for other gateways, such as H.323/SIP gateways.

16.2.10.1 Coerce Gateway Hairpins to the Same Gateway
To illustrate how a hairpin situation can occur in transfer, consider this example. The original
call dialog is set up with the Transferee residing on the PSTN side of a SIP gateway. The
Transferor is a SIP phone purely in the IP space. The Transfer Target is on the PSTN side of a
SIP gateway as well. After completing the transfer (regardless of consultative or blind), the
Transferee is in a call with the Transfer Target (both on the PSTN side of a gateway). It is
often desirable to remove the gateway(s) out of the loop. This is likely to only be possible if
both legs of the target call are on the same
Figure 16.18 Transfer with multiple parties: example. (Copyright IETF. Reproduced with permission.)
gateway. With both legs on the same gateway, it may be able to invoke the analogous transfer on
the PSTN side. Then, the target call would not involve the gateway.

Thus, the problem is how to give the proxy enough information so that it knows to route the call
to the same gateway. With a simple single call that hairpins, the incoming and outgoing leg have
the same dialog. The proxy should have enough information to optimize the routing. In the
consultative transfer scenario, it is desirable to coerce the consultative INVITE out the same
gateway as the original call to be transferred. However, there is no way to relate the
consultation with the original call. In the consultative case, the target call INVITE includes
the Replaces header, which contains dialog information that can be used to relate it to the
consultation. However, there is no information that relates the target call to the original. In
the blind transfer scenario, it is desirable to coerce the target call onto the same gateway as
the original call. However, the same problem exists in that the target dialog cannot be related
to the original dialog.

In either transfer scenario, it may be desirable to push the transfer operation onto the non-SIP
side of the gateway. Presumably, this is not possible unless all of the legs go out the same
gateway. If the gateway supports more than one trunk group, it might also be necessary to get all
of the legs on the same trunk group in order to perform the transfer on the non-SIP side of the
gateway. Solutions to these gateway-specific issues may involve new extensions to SIP in the
future.

16.2.10.2 Consultative-Turned-Blind Gateway Glare
In the consultative transfer case turned blind, there is a glare-like problem. The Transferor
initiates the consultation INVITE; the Transferor gets impatient and hangs up, transitioning this
to a blind transfer. The Transfer Target on the gateway (connected through a PSTN switch to a
single line or dumb analog phone) rings. The user answers the phone just after the CANCEL is
received by the Transfer Target. The REFER and INVITE for the target call are sent. The
Transferee attempts to set up the call on the PSTN side, but gets either a busy response or lands
in the user's voice mail as the user has the handset in hand and off hook. This is another
example of a race condition that this call flow can cause. The recommended behavior is to use the
approach described earlier.

16.2.11 Call Services with Shared Appearances of a SIP AOR
RFC 7463, described here, specifies the requirements and implementation of a group telephony
feature commonly known as Bridged Line Appearance (BLA), Multiple Line Appearance (MLA), or
Shared Call/Line Appearance (SCA). When implemented using SIP, it is referred to as shared
appearances of an AOR since SIP does not have the concept of lines. This feature is commonly
offered in IP Centrex services and IP Private Branch Exchange (IPBX) offerings, and is likely to
be implemented on SIP IP telephones and SIP feature servers used in a business environment. This
feature allows several UAs to share a common AOR, learn about calls placed and received by other
UAs in the group, and pick up or join calls within the group. A variant of this feature is known
as Single Line Extension. This specification discusses use cases, lists requirements, and defines
extensions to implement this feature. This specification updates RFCs 3261 and 4235.

In traditional telephony, the line is physical. A common scenario in telephony is for a number of
business telephones to share a single or a small number of lines. The sharing or appearance of
these lines between a number of phones is what gives this feature its name. A common scenario in
SIP is for a number of business telephones to share a single or a small number of AOR URIs. In
addition, an AOR can have multiple appearances on a single UA in terms of the user interface. The
appearance number relates to the user interface for the telephone; typically, each appearance of
an AOR has a visual display (lamp that can change color or blink or a screen icon) and a button
(used to select the appearance) where each appearance number is associated with a different
dialog to/from the AOR. The telephony concept of line appearance is still relevant to SIP due to
the user interface considerations. It is important to keep the appearance number construct
because

1. Human users are used to the concept and will expect it in replacement systems (e.g., an
   overhead page announcement says, "Joe, pick up line 3").
2. It is a useful structure for user interface representation.

The purpose of the appearance number is to identify active calls to facilitate sharing between
users (e.g., passing a call from one user to another). If a telephone has enough buttons/lamps,
the appearance number could be the positional sequence number of the button. If not, it may still
be desirable to present the call state, but the appearance number should be displayed so that
users know which call, for example, is on hold on which key. In this specification (RFC 7463),
except for the usage scenarios in the next section, we will use the term appearance rather than
line appearance since SIP does not have the concept of lines. Note that this does not mean that a
conventional telephony user interface (lamps and buttons) must be used: implementations may use
another metaphor as long as the appearance number is readily apparent to the user. Each AOR has a
separate appearance numbering space. As a result, a given UA user interface may have multiple
occurrences of the same appearance number, but they will be for different AORs.
Section 2.2. In addition, the functional elements required to implement services specific to the
Shared Appearances of a SIP AOR consist of:

1. UAs that support publications, subscriptions, and notifications for the SIP dialog event
   package, and the shared appearance dialog package extensions and behavior.
2. An Appearance Agent consisting of a State Agent for the dialog event package that implements
   an Event State Compositor (see Section 5.2) and the shared appearance dialog package
   extensions and behavior. The Appearance Agent also has logic for assigning and releasing
   appearance numbers and resolving appearance number contention.
3. A forking proxy server that can communicate with the State Agent.
4. A registrar that supports the registration event package.

The behavior of these elements is described normatively in the following sections after the
definitions of the dialog package extensions.

16.2.11.4 Shared Appearance Dialog Package Extensions
RFC 7463 normatively defines four new elements described below in the subsections as extensions
to the SIP Dialog Event package (RFC 4235), and the schema is defined in the next section. The
elements are <appearance>, <exclusive>, <joined-dialog>, and <replaced-dialog>, which are
subelements of the <dialog> element. The detailed description of the extended event package is
provided in this specification (RFC 7463). RFC 7463 also describes user interfaces for the Shared
Appearance/AOR, Interoperability with Non-Shared Appearance UA server (UAS), and call flows. We
have not included these here for the sake of brevity.

16.2.11.5 Alert-Info Appearance Parameter Definition
This specification (RFC 7463) extends RFC 3261 to add an appearance parameter to the Alert-Info
header field and also to allow proxies to modify or delete the Alert-Info header field. The
changes to the augmented Backus–Naur Form

A proxy inserting an appearance Alert-Info parameter follows normal Alert-Info policies. To
indicate the appearance number for this dialog, the proxy adds the Alert-Info header field with
the appearance parameter to the INVITE. If an Alert-Info is already present, the proxy adds the
appearance parameter to the Alert-Info header field. If an appearance number parameter is already
present (associated with another AOR or by mistake), the value is rewritten adding the new
appearance number. There must not be more than one appearance parameter in an Alert-Info header
field. If no special ringtone is desired, a normal ringtone should be indicated using the
urn:alert:service:normal in the Alert-Info, as per RFC 7462. The appearance number present in an
Alert-Info header field should be rendered by the UA to the user, following the guidelines of
RFC 7463. If the INVITE is forwarded to another AOR, the appearance parameter in the Alert-Info
should be removed before forwarding outside the group. The determination as to what value to use
in the appearance parameter can be done at the proxy that forks the incoming request to all the
registered UAs.

There is a variety of ways the proxy can determine what value it should use to populate this
parameter. For example, the proxy could fetch this information by initiating a SUBSCRIBE (see
Section 5.2) request with Expires: 0 to the Appearance Agent for the AOR to fetch the list of
lines that are in use. Alternatively, it could act like a UA that is a part of the shared
appearance group and SUBSCRIBE to the State Agent like any other UA. This would ensure that the
active dialog information is available without having to poll on an as-needed basis. It could
keep track of the list of active calls for the appearance AOR based on how many unique INVITE
requests it has forked to or received from the appearance AOR. Another approach would be for the
Proxy to first send the incoming INVITE to the Appearance Agent, which would redirect to the
shared appearance group URI and escape the proper Alert-Info header field for the Proxy to
recurse and distribute to the other UAs in the group. The Appearance Agent needs to know about
all incoming requests to the AOR in order to seize the appearance number. One way in which this
could be done is for the Appearance Agent to register against the AOR with a higher q-value. This
will result in the INVITE being sent to the Appearance Agent first, then being offered to the UAs
in the group.

RFC 7463 registers the two SIP header fields defining new parameters as shown below through
Internet Assigned Numbers Authority (IANA) registration:
(ABNF) in RFC 3261 (also see Section 2.4.1) are Header Parameter Predefined
Field Name Values Reference
16.2.12 Completion of Call Services in SIP

The completion of calls feature defined in RFC 6910 and described here allows the caller of a failed call to be notified when the callee becomes available to receive a call. For the realization of a basic solution without queuing, RFC 6910 references the usage of the dialog event package (RFC 4235) that is described as Automatic Redial in “Session Initiation Protocol Service Examples” (RFC 5359). For the realization of a more comprehensive solution with queuing, this specification (RFC 6910) introduces an architecture for implementing these features in SIP, where completion-of-calls implementations associated with the caller’s and callee’s end points cooperate to place the caller’s request for completion of calls into a queue at the callee’s end point; when a caller’s request is ready to be serviced, a reattempt of the original, failed call is then made. The architecture is designed to interoperate well with existing completion of call solutions in other networks. Note that RFC 6910 describes the caller’s agent and callee’s monitor behavior, along with the call completion event package, in great detail. However, we have not addressed those for the sake of brevity. Instead, we describe some examples of basic call flows to explain the call completion services.

For the purpose of this service, RFC 6910 has defined the following terminology: callee, caller, callee’s monitor, caller’s agent, completion of calls (CC), CC activation, CC to busy subscriber (CCBS), CC on no reply (CCNR), CC call, CC indicator, CC recall, CC recall events, CC recall timer, CC request, CC service duration timer, CC queue, CC entity (CCE), failed call, original call, retain option, Signaling System 7 (SS7), CC on not logged-in (CCNL), and subscriber (see Section 2.2).

16.2.12.1 Solution

16.2.12.1.1 Completion of Call Architecture

The CC architecture augments each caller’s UA (or UA client [UAC]) wishing to use the CC features with a CC agent (also written as caller’s agent). It augments each callee’s UA (or UAS) wishing to be the target of the CC features with a CC monitor (also written as callee’s monitor). The caller’s agent and callee’s monitor functions can be integrated into the respective UAs, be independent end systems, or be provided by centralized application servers. The two functions, though associated with the two UAs (caller and callee), also may be provided as services by the end points’ home proxies or by other network elements. Though it is expected that a UA that implements CC will have both functions so that it can participate in CC as both caller and callee, the two functions are independent of each other. A caller’s agent may service more than one UA as a collective group if a caller or population of users will be shared between the UAs, and especially if the UAs share an AOR. The caller’s agent monitors calls made from the caller’s UA(s) in order to determine their destinations and (potentially) their final response statuses, and the Call-Info header fields of provisional and final responses to invoke the CC feature. A callee’s monitor may service more than one UA as a collective group if a callee or population of users will be shared between the UAs, and especially if the UAs share an AOR. The callee’s monitor may supply the callee’s UAS(s) with Call-Info header field values for provisional and final responses. The callee’s monitor also instantiates a presence server used to monitor the caller’s availability for CC recall.

The callees using the UA(s) may be able to indicate to the callee’s monitor when they wish to receive CC calls. To allow flexibility and innovation, most of the interaction between the caller’s agent, the caller(s) (user(s)), and the caller’s UA(s) is out of the scope of RFC 6910. Similarly, most of the interaction between the callee’s monitor, the callee(s), and the callee’s UA(s) is out of the scope of RFC 6910, as is the policy by which the callee’s monitor arbitrates between multiple CC requests. The caller’s agent must be capable of performing a number of functions relative to the UA(s). The method by which it does so is outside the scope of RFC 6910. The callee’s monitor must be capable of performing a number of functions relative to the UA(s). The method by which it does so is also outside the scope of RFC 6910. As a proof of concept, simple caller’s agents and callee’s monitors can be devised that interact with users and UAs entirely through standard SIP mechanisms such as the event framework (RFC 6665, see Section 5.2), the dialog event package (RFC 4235), and REFER (RFC 3515, see Section 2.5). The callers using the UA(s) can indicate to the caller’s agent when they wish to avail themselves of CC for a recently made call that the callers determined to be unsuccessful.

The caller’s agent monitors the status of the caller’s UA(s) to determine when they are available to be used for a CC recall. The caller’s agent can communicate to the caller’s UA(s) that a CC recall is in progress, and inquire if the relevant caller is available for the CC recall. The callee’s monitor may utilize several methods to monitor the status of the callee’s UA(s) or their users for availability to receive a CC call. This can be achieved through monitoring calls made to the callee’s UA(s) to determine the callee’s status, the identity of callers, and the final responses for incoming calls. In a system with rich presence information, the presence information may directly provide this status. In a more restricted system, this determination can depend on the mode of the CC call in question, which is provided by the URI “m” parameter. For example, a UA is considered available for CCBS (m=BS) when it is not busy, but a UA is considered available for CCNR (m=NR) when it becomes not busy after being busy with an established call. The callee’s monitor maintains information about the set of INVITEs received by the callee’s UA(s) considered
unsuccessful by the caller. In practice, the callee’s monitor may remove knowledge about an incoming dialog from its set if local policy at the callee’s monitor establishes that the dialog is no longer eligible for CC activations.

16.2.12.1.2 Completion of Call Procedures

The caller’s UA sends an INVITE to a request-URI. One or more forks of this request reach one or more of the callee’s UAs. If the CC feature is available, the callee’s monitor (note there can be a monitor for each of the callee’s UAs) inserts a Call-Info header field with its URI and with purpose=call-completion in appropriate non-100 provisional or final responses to the initial INVITE and forwards them to the caller. The provisional response should be sent reliably if the INVITE contained a Supported header field with the option tag 100rel. On receipt of a non-100 provisional or a final response with the indication that the CC feature is available, the calling user can invoke the CC feature.

The caller indicates to the caller’s agent that he wishes to invoke CC services on the recent call. Note that from the SIP point of view, the INVITE may have been successful, but from the user’s point of view, the call may have been unsuccessful. For example, the call may have connected to the callee’s voice mail, which would return a 200 status to the INVITE but, from the caller’s point of view, is a no reply. To receive the information necessary for the caller to complete the call at the callee, the caller’s agent subscribes to the call-completion event package at the callee’s monitor.

The possibility of the caller completing the call at the callee is also known as the CC state (cc-state) of the caller. The cc-state takes the values queued and ready (for CC). To receive information from all destinations where the callee will be reachable, the caller’s agent sends a SUBSCRIBE request for the call-completion event package to the original destination URI of the call, and to all known URIs of the callees’ monitors (which are provided by Call-Info header fields in provisional and final responses to the INVITE). Each callee’s monitor uses the subscription as an indication that the caller is interested in using the CC feature with regard to the particular callee.

Each callee’s monitor keeps a list or queue of subscriptions from callers’ agents, representing the requests from the callers’ agents to the callee’s monitor for CC services. These subscriptions are created, refreshed, and terminated according to the procedures of RFC 6665. Upon receiving a SUBSCRIBE request from the caller’s agent, the callee’s monitor instantiates a presence state for the caller’s UA that can be modified by the caller’s UA to indicate its availability for the CC call. Upon instantiation, the caller’s presence status at the callee’s monitor is open. When the callee’s monitor determines that the callee or callee’s UA is available for a CC call, it selects a caller to execute the CC call and sends a CC event update (cc-state: ready) via a NOTIFY request to the selected subscription of the caller’s agent, telling it to begin the CC call to the callee’s UA. When the caller’s agent receives this update, it initiates a CC recall by calling the caller’s UA and then starts the CC call to the callee’s UA, using third-party call control (3PCC) procedures in accordance with RFC 3725 (see Section 18.3). The caller’s agent can also check by other means whether the caller is available to initiate the CC call to the callee’s UA. If the caller is available, the caller’s agent directs the caller’s UA to initiate the CC call to the callee’s UA.

The caller’s agent marks the CC call as such by adding a specific SIP URI parameter to the Request-URI, so it can be given precedence by the callee’s monitor in reaching the callee’s UA. If the caller is not available on receipt of the ready-for-recall notification, the caller’s agent suspends the CC request at the callee’s monitor by sending a PUBLISH request containing presence information to the presence server of the callee’s monitor, informing the server that the presence status is closed. Once the caller becomes available for a CC call again, the caller’s agent resumes the CC request by sending another PUBLISH request to the callee’s monitor, informing the monitor that the presence status is open. On receipt of the suspension request, the callee’s monitor performs the monitoring for the next nonsuspended CC request in the queue. On receipt of the resume from the previously suspended caller’s agent that was at the top of the queue, the callee’s monitor performs callee monitoring for this caller’s agent.

When the CC call fails, there are two possible options: the CC feature has to be activated again by the caller’s agent subscribing to the callee’s monitor, or CC remains activated and the original CC request retains its position in the queue if the retain option is supported. The retain option (see Section 1.3) determines the behavior of the callee’s monitor when a CC call fails. If the retain option is supported, CC remains activated, and the original CC request retains its position in the queue. Otherwise, the CC feature is deactivated, and the caller’s agent would have to subscribe again to reactivate it. A monitor that supports the retain option provides the cc-service-retention header in its CC events. A caller’s agent that also supports the retain option uses the presence of this header to know not to generate a new CC request after a failed CC call.

Monitors not supporting the retain option do not provide the cc-service-retention header. A failed CC call causes the CC request to be deleted from the queue, and these monitors will terminate the corresponding subscription of the caller’s agent to inform that agent that its CC request is no longer in the queue. A caller’s agent that does not support the retain option can also terminate its subscription when a CC call fails, so it is possible that both the caller’s agent and the callee’s monitor may be signaling the termination of the
subscription concurrently. This is a normal SIP events (RFC 6665, see Section 5.2) scenario. After the subscription is terminated, the caller’s agent may create a new subscription to reactivate the CC feature for the original call.

16.2.12.1.3 Automatic Redial as a Fallback

Automatic Redial is a simple end-to-end design. An Automatic Redial scenario is described in RFC 5359. This solution is based on the usage of the dialog event package. If the callee is busy when the call arrives, then the caller subscribes to the callee’s call state. The callee’s UA sends a notification when the callee’s call state changes. This means the caller is also notified when the callee’s call state changes to terminated. The caller is alerted, then the caller’s UA starts a call establishment to the callee again. If several callers have subscribed to a busy callee’s call state, they will be notified at the same time that the call state has changed to terminated. The problem with this solution is that several recalls might be started at the same time. This means it is a heuristic approach with no guarantee of success. There is no interaction between CC and Automatic Redial, as there is a difference in the behavior of the callee’s monitor and the caller when using the dialog event package for receiving dialog information or for aggregating a CC state.

16.2.12.1.4 Differences from SS7

SIP CC differs in some ways from the CCBS and CCNR features of SS7, which is used in the PSTN. For ease of understanding, we enumerate some of the differences here. As there is no equivalent to the forking mechanism in SS7, in the PSTN, calls can be clearly differentiated as successful or unsuccessful. Owing to the complex forking situations that are possible in SIP, a call may fail from the point of view of the user and yet have a success response from SIP’s point of view. (This can occur even in simple situations, e.g., a call to a busy user that falls over to his voice mail receives a SIP success response, even though the caller may consider it busy subscriber.) Thus, the caller must be able to invoke CC even when the original call appeared to succeed. To support this, the caller’s agent must record successful calls as well as unsuccessful calls.

In SIP, only the caller’s UA or service system on the originating side and the callee’s UA or service system on the terminating side need to support CC for CC to work successfully between the UAs. Intermediate SIP systems (proxies or back-to-back user agents [B2BUAs]) do not need to implement CC; they only need to be transparent to the usual range of SIP messages. In the PSTN, additionally, intermediate nodes like media gateway controllers have to implement the CC service.

16.2.12.2 Completion of Call Queue Model

The callee’s monitor manages CC for a single URI. This URI is likely to be a published AOR, or more likely a non-voice-mail AOR, but it may be as narrowly scoped as a single UA’s contact URI. The callee’s monitor manages a dynamic set of CC entities (called CCEs), which represent CC requests, or equivalently, the existing incoming CC subscriptions. This set is also called a queue, because a queue data structure often aids in implementing the policies of the callee’s monitor for selecting CCEs for CC recall. Each CCE has an availability state, determined through the caller’s presence status at the callee’s monitor. A presence status of open represents a CCE’s availability state of available, and a presence status of closed represents a CCE’s availability state of unavailable. Each CCE has a recall state that is visible via subscriptions. The recall state is either queued or ready.

Each CCE carries the From URI of the SUBSCRIBE request that caused its creation. CC subscriptions arrive at the callee’s monitor by addressing the URIs the callee’s monitor returns in Call-Info header fields. The request-URI of the SUBSCRIBE request determines the queue to which the resulting CCE is added. The resulting subscription reports the status of the queue. The base event data is the status of all the CCEs in the queue, but the data returned by each subscription is filtered to report only the status of that subscription’s CCE. (Further standardization may define means for obtaining more comprehensive information about a queue.)

When a CCE is created, it is given the availability state available and recall state queued. When the callee’s monitor receives Presence Information Data Format (PIDF) bodies (RFC 3863) via PUBLISH requests (RFC 3903, see Section 5.2), these PUBLISH requests are expected to be sent by subscribers to indirectly suspend and resume their CC requests by modifying the availability state of their CCEs.

A CCE is identified by the request-URI (if it was taken from a CC event notification that identifies the CCE) or the From URI of the request (matching the From URI recorded in the CCE). Receipt of a PUBLISH with status open sets the availability state of the CCE to available (resume); status closed sets the availability state of the CCE to unavailable (suspend). A CC request is eligible for recall only when its CCE’s availability state is available and the m value of the CCE also indicates an available state. The callee’s monitor must not select for recall any CC requests that fail to meet those criteria. Within that constraint, the selections made by the callee’s monitor are determined by its local policy. Often, a callee’s monitor will choose the acceptable CCE that has been in the queue the longest.

When the callee’s monitor has selected a CCE for recall, it changes the CCE’s recall state from queued to ready, which triggers a notification on the CCE’s subscription. If a selected subscriber then suspends its request by
sending a PUBLISH with the presence status closed, the CCE becomes unavailable, and the callee’s monitor changes the CCE’s recall state to queued. This may cause another CCE (e.g., a CCE that has been in the queue for less time) to be selected for recall. The caller’s presence status at the callee’s monitor is terminated when the caller completes its CC call or when the subscription of the caller’s agent at the callee’s monitor is terminated.

16.2.12.3 Examples

A basic call flow, with only the most significant messages of a CC activation and invocation is shown in Figure 16.19 (please note that this is an example, and there may be variations in the failure responses).

The original call is an ordinary INVITE. It fails due to no-response (ring-no-answer). In this case, the callee’s governing proxy generates a 487 response because the proxy canceled the INVITE to the UA when it rang too long without an answer. The 487 Request Terminated response carries a Call-Info header field with purpose=call-completion. The Call-Info header field positively indicates that CC is available for this failed fork of the call. The m=NR parameter indicates that it failed due to no-response, which is useful for PSTN interworking and assessing presence information in the callee’s monitor. The URI in the Call-Info header field (<sip:[email protected]>) is where the caller’s agent should subscribe for CC processing. Ideally, it is a globally routable URI for the callee’s monitor. In practice, it may be the callee’s AOR, and the SUBSCRIBE will be routed to the callee’s monitor only because it specifies Event: call-completion. CC is activated by sending a SUBSCRIBE to all known callee’s monitor URIs. These can be provided by the Call-Info header field in the response to the INVITE.
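
As an illustration of how a caller’s agent might pick out the CC indication described above, the following minimal Python sketch parses a single Call-Info header value and returns the monitor URI and the m value when purpose=call-completion is present. The function names and the example URI sip:monitor@example.com are hypothetical, and the sketch assumes one header value at a time (no comma-separated lists or quoted parameter values); it is not taken from RFC 6910.

```python
import re


def parse_call_info(value):
    """Split one Call-Info value of the form <URI>;name=value;... into parts."""
    match = re.match(r'\s*<([^>]+)>(.*)$', value)
    if match is None:
        raise ValueError("Call-Info value must start with <URI>")
    uri, rest = match.group(1), match.group(2)
    params = {}
    for item in rest.split(';'):
        item = item.strip()
        if not item:
            continue  # skip the empty piece before the first ';'
        name, _, val = item.partition('=')
        params[name.strip().lower()] = val.strip()
    return uri, params


def cc_monitor_uri(value):
    """Return (monitor_uri, m_value) if the value advertises CC, else None."""
    uri, params = parse_call_info(value)
    if params.get('purpose') != 'call-completion':
        return None
    # m may be absent; the caller's agent then has no mode hint.
    return uri, params.get('m')


# Hypothetical header value as it might appear in a 487 response.
print(cc_monitor_uri('<sip:monitor@example.com>;purpose=call-completion;m=NR'))
```

A caller’s agent built along these lines would add the returned URI, together with the original request-URI, to the set of destinations for its SUBSCRIBE requests.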
[Figure 16.19: call-flow diagram between Caller (sip:[email protected]) and Callee (sip:[email protected]); the message sequence is not reproduced here.]
Figure 16.19 Basic call flow of CC activation and invocation (only the most significant messages). (Copyright IETF. Reproduced with permission.)
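
The queue model of Section 16.2.12.2, which the callee’s monitor applies in flows such as the one in Figure 16.19, can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the class and method names are hypothetical, the local policy is assumed to be "oldest eligible CCE first," and the set of modes for which the callee is currently available is passed in by the caller of the function rather than derived from real presence or dialog state.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class CCE:
    from_uri: str                 # From URI of the SUBSCRIBE that created the CCE
    mode: str                     # value of the m parameter, e.g. "BS" or "NR"
    available: bool = True        # availability state: starts as available
    recall_state: str = "queued"  # recall state: "queued" or "ready"
    order: int = 0                # arrival order, used by the local policy


class CalleeMonitorQueue:
    """Hypothetical model of one callee's monitor queue."""

    def __init__(self):
        self._entries = {}
        self._seq = count()

    def subscribe(self, from_uri, mode="BS"):
        # A new CC subscription creates a CCE that is available and queued.
        cce = CCE(from_uri, mode, order=next(self._seq))
        self._entries[from_uri] = cce
        return cce

    def publish(self, from_uri, status):
        # PUBLISH with status "open" resumes, "closed" suspends the CC request.
        cce = self._entries[from_uri]
        cce.available = (status == "open")
        if not cce.available and cce.recall_state == "ready":
            cce.recall_state = "queued"  # a suspended CCE falls back to queued

    def select_for_recall(self, callee_available_for):
        # Only CCEs that are available and whose mode the callee can serve
        # are eligible; among those, pick the one queued the longest.
        eligible = [c for c in self._entries.values()
                    if c.available and c.mode in callee_available_for]
        if not eligible:
            return None
        chosen = min(eligible, key=lambda c: c.order)
        chosen.recall_state = "ready"  # would trigger a NOTIFY in a real monitor
        return chosen
```

In this sketch, suspending the selected caller with publish(uri, "closed") and calling select_for_recall again moves the recall to the next eligible CCE, which mirrors the suspension behavior described in the text.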
Additionally, the caller’s agent needs to include the original request-URI in its set of callee’s monitor URIs, because the call may have forked to additional callees whose responses the caller has not seen. (A SUBSCRIBE to the request-URI alone is used in cases where the caller’s agent has not received or cannot remember any callee’s monitor URI.) The caller’s agent adds to these URIs an m parameter (if possible). In this case, the caller’s agent forks the SUBSCRIBE to two destinations as defined by RFC 3261 (see Section 3.1.3.2.2), with appropriate Request-Disposition. The first SUBSCRIBE is to the URI from Call-Info. The second SUBSCRIBE is to the original request-URI and reaches the same callee’s monitor. Because it has the same Call-ID as the SUBSCRIBE that has already reached the callee’s monitor, the callee’s monitor rejects it with a 482, thus avoiding redundant subscriptions.

The initial NOTIFY for the successful SUBSCRIBE has cc-state: queued in its body. Eventually, this caller is selected for CC and is informed of this via a NOTIFY containing cc-state: ready. This NOTIFY carries a URI to which the INVITE for the CC call should be sent. In practice, this may be the AOR of the callee. The caller generates a new INVITE to the URI specified in the NOTIFY, or if there was no such URI or if the caller’s agent cannot remember it, it may use the original request-URI. The caller adds the m parameter (if possible) to specify CC processing. Finally, the subscription for the CC request is terminated by the callee’s monitor.

Another flow, with only the most significant messages of CC suspension and resumption shown, is demonstrated in Figure 16.20.

The caller is selected for CC and is informed of this via a NOTIFY request containing cc-state: ready. At this time, the caller is not available for the CC recall. To update its presence event state at the callee’s presence server, the caller sends a PUBLISH request informing the presence server that the PIDF state is closed. The PUBLISH request is sent (in order of preference) as follows: (1) out-of-dialog to the CC URI as received in the NOTIFY, (2) within the corresponding SUBSCRIBE dialog, (3) out-of-dialog to the corresponding callee’s monitor URI received in the Call-Info header field of the NOTIFY, or (4) out-of-dialog to the remote Contact address of the corresponding SUBSCRIBE dialog. When the caller is again available for the CC recall, the caller updates his presence event state at the callee’s presence server by generating a PUBLISH request informing the
[Figure 16.20: call-flow diagram between Caller (sip:[email protected]) and Callee (sip:[email protected]). Recoverable message labels: F1. NOTIFY sip:[email protected] (CC notification, caller not available for CC recall; Body: cc-state: ready; URI: sip:[email protected]); F2. 200 OK; F4. 200 OK; F6. 200 OK.]
Figure 16.20 Basic call flow of CC suspension and resumption (only the most significant messages). (Copyright IETF. Reproduced with permission.)
server that the PIDF state is open; this request will otherwise be constructed in the same way as the PUBLISH request used for suspension.

16.2.12.4 IANA Considerations

16.2.12.4.1 SIP Event Package Registration for CC

This specification registers an event package, based on the registration procedures defined in RFC 6665 (see Section 5.2). The following information is required for such a registration:

Package name: call-completion
Is this registration for a Template-Package: No
Published specification: RFC 6910

package. The parameter may have a value that describes the type of the CC operation, as described in this specification.

Name of the parameter: m
Predefined values: yes
Reference: RFC 6910

16.2.12.4.4 purpose Parameter Value call-completion

This specification adds a new predefined value call-completion for the purpose header field parameter of the Call-Info header field. This modifies the registry header field parameters and parameter values by adding this RFC as a reference to the line for header field Call-Info and parameter name purpose:
Unified Messaging, Third-Party Voice Mail, and Automatic Call Distribution (ACD). SIP UAs and SIP proxies that receive diversion information may use this as supplemental information for feature invocation decisions. RFC 5589 (see Section 16.2) provides similar services to call diversion indication using REFER, Refer-To, Referred-By, NOTIFY/SUBSCRIBE, and other methods and headers. Examples provided in that RFC may be compared with those of the call diversion indication described here.

16.3.2 Diversion and History-Info Header Interworking in SIP

Although the SIP History-Info header is the solution adopted in IETF, the nonstandard Diversion header is nevertheless widely implemented and used for conveying call-diversion-related information in SIP signaling. RFC 6044 describes a recommended interworking guideline between the Diversion header and the History-Info header to handle call diversion information. In addition, an interworking policy is proposed to manage the headers’ coexistence. The History-Info header is described in RFC 4244 (see Section 2.8) and the nonstandard Diversion header is described, as Historic, in RFC 5806 (see above, Section 16.3.1). Since the Diversion header is used in many existing network implementations for the transport of call diversion information, its interworking with the SIP History-Info standardized solution is needed. This work is intended to enable the migration from nonstandard implementations and deployment toward IETF specification-based implementations and deployment.

For some Voice over IP (VoIP)-based services (e.g., voice mail, Interactive Voice Recognition, or ACD), it is helpful for the called SIP UA to identify from whom and why the session was diverted. For this information to be used by various service providers or by applications, it needs to pass through the network. This is possible with two different SIP headers: the History-Info header defined in RFC 4244 (see Sections 2.4.1 and 2.8) and the historic Diversion header defined in RFC 5806 (see above, Section 16.3.1), which are both able to transport diversion information in SIP signaling. Although the Diversion header is not standardized, it is widely used. Therefore, it is useful to have guidelines to make this header interwork with the standard History-Info header. Note that new implementation and deployment of the Diversion header is strongly discouraged. RFC 6044 provides a mechanism for header-content translation between the Diversion header and the History-Info header.

16.3.2.1 Background

The History-Info header (RFC 4244, see Section 2.8) and its extension for forming SIP service URIs (including Voice-Mail URI) (RFC 4458, see Section 4.4) are recommended by the IETF to convey redirection information. They are also recommended in the Communication Diversion (CDIV) service Third Generation Partnership Project (3GPP) specification [10]. Originally, the Diversion header was described in a document that was submitted to the SIP Working Group. It has been published now as RFC 5806 (see above, Section 16.3.1) for the historical record and to provide a reference for RFC 6044. This header contains a list of diverting URIs and associated information providing specific information such as the reason for the call diversion. Most existing SIP-based implementations implemented the Diversion header when no standard solution was ready to deploy.

The IETF has finally standardized the History-Info header, partly because it can transport general history information. This allows the receiving party to determine how and why the session is received. As the History-Info header may contain more information than call diversion information, it is critical to avoid losing information and to be able to extract the relevant data using the retargeting cause URI parameter described in RFC 4458 (see Section 4.4) for the transport of the diversion reason. The Diversion header and the History-Info header have different syntaxes, described below. Note that the main difference is that the History-Info header is written in chronological order, whereas the Diversion header applies a reverse chronology (i.e., the first diversion entry read corresponds to the last diverting user).

16.3.2.2 Problem Statement

16.3.2.2.1 Interworking Requirements and Scope

This section provides the baseline terminology used in the rest of the document and defines the scope of interworking between the Diversion header and the History-Info header. There are many ways in which SIP signaling can be used to modify a session destination before it is established, and there are many reasons for doing so. The behavior of the SIP entities that will have to further process the session downstream will sometimes vary depending on the reasons that lead to changing the destination, for example, whether it is for a simple proxy to route the session or for an application server to provide a supplementary service. The Diversion header and the History-Info header differ in the approach and scope of addressing this problem. For clarity, the following vocabulary is used in this document:

◾ Retargeting/redirecting: refers to the process of a proxy server/UAC changing a URI in a request and thus changing the target of the request. These terms are defined in RFC 4244. The History-Info header is used to capture retargeting information.
◾ Call forwarding/call diversion/communication diversion: these terms are equivalent and refer to the
CDIV supplementary services, based on the ISDN Communication diversion supplementary services and defined in 3GPP [10]. They are applicable to entities that are intended to modify the original destination of an IP multimedia session during or before the session establishment.

This document does not intend to describe when or how History-Info or Diversion headers should be used. Hereafter, we provide a clarification on the context in which interworking is required. The Diversion header has exactly the same scope as the call diversion service, and each header entry reflects a call diversion invocation. The Diversion header is used for recording call-forwarding information, which could be useful to network entities downstream. Today, this SIP header is implemented by several manufacturers and is deployed in networks.
The History-Info header is used to store all retargeting information, including call diversion information. In practice, the History-Info header (RFC 4244, see Section 2.8) is used to convey call-diversion-related information by using a cause URI parameter (RFC 4458, see Section 4.4) in the relevant entry. Note, however, that the use of the cause URI parameter (RFC 4458) in a History-Info entry for a call diversion is specific to the 3GPP specification [10]. RFC 4458 focuses on retargeting toward a voice-mail server and does not specify whether the cause URI parameter should be added in a URI for other cases. As a consequence, implementations that do not use the cause URI parameter for call-forwarding information are not considered for the mapping described in this document. Nevertheless, some recommendations are given in the next sections on how to avoid the loss of nonmapped information at the boundary between a network region using the History-Info header and one using the Diversion header.
Since both headers address call-forwarding needs, diverting information could be mixed up or be inconsistent if both are present in an uncoordinated fashion in the INVITE request. Thus, the Diversion and History-Info headers must not independently coexist in the same session signaling. This document addresses how to convert information between the Diversion header and the History-Info header, and when and how to preserve both headers to cover additional cases. For the transportation of consistent diversion information downstream, it is necessary to make the two headers interwork. Interworking between the Diversion header and the History-Info header is introduced in Sections 2.1 and 2.2. Since the coexistence scenario may vary from one use case to another, guidelines regarding header interaction are proposed.

16.3.2.2.2 Interworking Recommendations
Interworking function:
In a normal case, the network topology assumption is that the interworking described in this document should be performed by a specific SIP border device that is aware, by configuration, that it is at the border between two regions, one using the History-Info header and one using the Diversion header. As the History-Info header is a standard solution, a network using the Diversion header must be able to provide information to a network using the History-Info header. In this case, to avoid header coexistence, it is required to replace, as often as possible, the Diversion header with the History-Info header in the INVITE request during the interworking. Since the History-Info header has a wider scope than the Diversion header, it may be used for other needs and services than call diversion.
In addition to tracing call diversion information, the History-Info header also acts as a session history and can store all successive R-URI values. Consequently, even if it would be better to remove the History-Info header after the creation of the Diversion header to avoid confusion, the History-Info header must remain unmodified in the SIP signaling if it contains supplementary (nondiversion) information. It is possible to have History-Info headers that do not have values that can be mapped into the Diversion header. In this case, no interworking with the Diversion header should be performed, and it must be defined per implementation what to do in this case. This point is left out of the scope of this document. In conclusion, it is recommended to have local policies that minimize the loss of information and find the best way to keep it up to the terminating UA. The following paragraphs describe the basic common use case.
SIP Network/Terminal Using Diversion to SIP Network/Terminal Using History-Info Header:
When the Diversion header is used to create a History-Info header, the Diversion header must be removed in the outgoing INVITE. It is considered that all of the information present in the Diversion header is transferred in the History-Info header. If a History-Info header is present in the incoming INVITE (in addition to the Diversion header), the Diversion header and History-Info header present must be mixed, and only the diversion information not yet present in the History-Info header must be inserted as a last entry (more recent) in the existing History-Info header, as recommended in RFC 4244 (see Section 2.8).
SIP Network/Terminal Using History-Info Header to SIP Network/Terminal Using Diversion Header:
When the History-Info header is interpreted to create a Diversion header, some precautions must be taken. If the History-Info header contains only call-forwarding information, then it must be deleted after
could contain the gr parameter. Regardless of the rules concerning the gr parameter defined in Ref. [10], which must be applied, this parameter has no impact on the mapping and must only be copied with the served user address.

Example:

History-Info:
<sip:diverting_user1_addr?Privacy=none?Reason=SIP%3Bcause%3D302>;index=1,
<sip:diverting_user2_addr;cause=480?Privacy=history>;index=1.1,
<sip:last_diversion_target;cause=486>;index=1.1.1

Policy concerning histinfo option tag in Supported header: according to RFC 4244 (see Section 2.8), a proxy that receives a Request with the histinfo option tag in the Supported header should return captured History-Info in subsequent, provisional, and final responses to the Request. The behavior depends on whether or not the local policy supports the capture of History-Info.

16.3.2.3.2 Diversion Header Syntax
The following text restates the exact syntax that the production rules in RFC 5806 define, but using RFC 5234 ABNF:

Diversion           = "Diversion" HCOLON diversion-params
                      *(COMMA diversion-params)
diversion-params    = name-addr *(SEMI (diversion-reason /
                      diversion-counter / diversion-limit /
                      diversion-privacy / diversion-screen /
                      diversion-extension))
diversion-reason    = "reason" EQUAL
                      ("unknown" / "user-busy" / "no-answer" /
                      "unavailable" / "unconditional" /
                      "time-of-day" / "do-not-disturb" /
                      "deflection" / "follow-me" /
                      "out-of-service" / "away" /
                      token / quoted-string)
diversion-counter   = "counter" EQUAL 1*2DIGIT
diversion-limit     = "limit" EQUAL 1*2DIGIT
diversion-privacy   = "privacy" EQUAL
                      ("full" / "name" / "uri" / "off" /
                      token / quoted-string)
diversion-screen    = "screen" EQUAL
                      ("yes" / "no" / token / quoted-string)
diversion-extension = token [EQUAL (token / quoted-string)]

Note that the Diversion header could be used in the comma-separated format, as described below, and in a header-separated format. Both formats could be combined in a received INVITE as recommended in RFC 3261.

Example:

Diversion:
diverting_user2_addr; reason="user-busy"; counter=1; privacy=full,
diverting_user1_addr; reason="unconditional"; counter=1; privacy=off

16.3.2.4 Headers in SIP Method
The recommended interworking presented in this document should apply only for INVITE requests. In 3xx responses, both headers could be present. When a proxy wants to interwork with a network supporting the other header field, it should apply the interworking between the Diversion header and the History-Info header in the 3xx response. When a recursing proxy redirects an initial INVITE after receiving a 3xx response, it should add as a last entry either a Diversion header or a History-Info header (according to its capabilities) in the forwarded INVITE.
Local policies could apply to send the received header in the next INVITE. Other messages where History-Info could be present are not used for the call-forwarding service and should not be changed into a Diversion header. The destination network must be transparent to the received History-Info header. Note that the following mapping is inspired by the ISDN User Part (ISUP) to SIP interworking described in Ref. [11].

16.3.2.5 Diversion Header to History-Info Header
The following text is valid only if no History-Info is present in the INVITE request. If at least one History-Info header is present, the interworking function must adapt its behavior to respect the chronological order. For N Diversion entries, N + 1 History-Info entries must be created. To create the History-Info entries in the same order as during a session establishment, the Diversion entries must be mapped from the bottommost until the topmost. Each Diversion entry shall be mapped into a History-Info entry. An additional History-Info entry (the last one) must
be created with the diverted-to party address present in the R-URI of the received INVITE. The mapping is described below.
The first entry created in the History-Info header contains

◾◾ A hi-targeted-to-uri with the name-addr parameter of the bottommost Diversion header.
◾◾ If a privacy parameter is present in the bottommost Diversion entry, then a Privacy header could be escaped in the History-Info header as described below.
◾◾ An index set to 1.

For each following Diversion entry (from bottom to top), the History-Info entries are created as follows (from top to bottom):

Superior to 1 (i.e., N)    Create N − 1 placeholder History entry with the previous index incremented with .1. Then, the History-Info header created with the Diversion entry with the previous index incremented with .1
Privacy                    Privacy header escaped in the hi-targeted-to-uri
full                       history
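The Diversion-to-History-Info rules of Section 16.3.2.5 (N Diversion entries become N + 1 History-Info entries, processed from the bottommost entry upward, with the index extended by .1 at each step) can be sketched in Python. This is only an illustrative sketch under stated assumptions, not the mapping function of RFC 6044 itself: the function names and the dictionary layout are hypothetical, the placement of the cause parameter follows the examples in Section 16.3.2.7, the reason and privacy tables hold only the pairs visible in this section, and Diversion counters greater than 1 (placeholder entries) are not handled.

```python
# Reason-to-cause pairs limited to those visible in this section's examples.
REASON_TO_CAUSE = {"no-answer": 408, "user-busy": 486, "unconditional": 302}
# Privacy pairs limited to those visible in this section.
PRIVACY_TO_HI = {"full": "history", "off": "none"}


def diversion_to_history_info(diversion_entries, request_uri):
    """Map Diversion entries (topmost first) to History-Info entries.

    Hypothetical helper: each entry is a dict with a "uri" key and
    optional "reason" and "privacy" keys. Counters > 1 are not handled.
    """
    history = []
    index = "1"
    previous_reason = None  # reason of the Diversion entry just below
    for entry in reversed(diversion_entries):  # bottommost entry first
        hi_entry = {"uri": entry["uri"], "index": index}
        if previous_reason in REASON_TO_CAUSE:
            hi_entry["cause"] = REASON_TO_CAUSE[previous_reason]
        if entry.get("privacy") in PRIVACY_TO_HI:
            hi_entry["privacy"] = PRIVACY_TO_HI[entry["privacy"]]
        history.append(hi_entry)
        previous_reason = entry.get("reason")
        index += ".1"
    # Additional last entry: the diverted-to party taken from the R-URI.
    last = {"uri": request_uri, "index": index}
    if previous_reason in REASON_TO_CAUSE:
        last["cause"] = REASON_TO_CAUSE[previous_reason]
    history.append(last)
    return history


def render_history_info(history):
    """Render the entries in a simplified textual form (assumed layout)."""
    parts = []
    for item in history:
        uri = item["uri"]
        if "cause" in item:
            uri += f";cause={item['cause']}"
        if "privacy" in item:
            uri += f"?Privacy={item['privacy']}"
        parts.append(f"<{uri}>;index={item['index']}")
    return ",\n".join(parts)


if __name__ == "__main__":
    diversion = [
        {"uri": "sip:diverting_user2_address", "reason": "user-busy", "privacy": "full"},
        {"uri": "sip:diverting_user1_address", "reason": "no-answer", "privacy": "off"},
    ]
    print(render_history_info(
        diversion_to_history_info(diversion, "sip:last_diverting_target")))
```

Keeping the entries as dictionaries until the final rendering step makes it easy for a border device to apply local policy (for example, dropping a privacy value) before the header is serialized.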
diverting_user2_address;reason=user-busy;counter=1;privacy=full,
diverting_user1_address;reason=no-answer;counter=1;privacy=off

Mapped into:

History-Info:
<sip:diverting_user1_address;privacy=none>;index=1,
<sip:diverting_user2_address;cause=408?privacy=history>;index=1.1,
<sip:diverting_user3_address;cause=486?privacy=none>;index=1.1.1,
<sip:last_diverting_target;cause=302>;index=1.1.1.1

16.3.2.7.2 Example with History-Info Header Changed into Diversion Header

History-Info:
<sip:diverting_user1_address?privacy=history>;index=1,
<sip:diverting_user2_address;cause=302?privacy=none>;index=1.1,
<sip:last_diverting_target;cause=486>;index=1.1.1

Mapped into:

Diversion:
diverting_user2_address; reason=user-busy; counter=1; privacy=off,
diverting_user1_address; reason=unconditional; counter=1; privacy=full

Note that RFC 6044 also provides call flow examples for the following scenario: two SIP networks using a History-Info header that interwork with a SIP network using a Diversion header. We leave these to readers as an exercise.

16.4 Call Services Using Session Border Controller

16.4.1 Overview
We have modeled the outlines of the book to bind all SIP RFCs within the framework of RFC 3261 as if the book itself were a super-RFC 3261. We have deferred describing how the SIP-based scalable networking architecture can be built using all SIP functional capabilities that are described in all SIP RFCs. The scope of building scalable SIP networking architecture, especially for the global carrier network, is a huge one. A single book or several books may not be enough to address all the details of the implementation issues of SIP-based networking. One example is the enormous effort that 3GPP has devoted for decades to building the SIP-based implementation architectural specifications. However, we have made an exception to provide a brief description of building the scalable architecture of a single SIP-based functional entity such as the session border controller (SBC). We have described the SIP SBC in Section 14.2 that is used by customers in various networking environments, including enterprise networks, for the following reasons: topology hiding, media traffic management, fixing capability mismatches, maintaining SIP-related NAT bindings, access control, protocol repair, and media encryption.
We have also explained that the SBC has two separate distinct functions: signaling control and media control. We have described the overall end-to-end networking architecture considering a single monolithic functional entity along with its advantages and disadvantages. Despite the emergence of IP/Internet telephony over two decades, only a small percentage of the total telephony traffic is passed over the pure IP/Internet. More than 90% of the total IP/Internet telephony traffic may be passed over the time division multiplexing (TDM)/circuit-switched based access lines that are not considered high-speed. In recent years, with the advent of everything-over-IP networking environments, all RT and near-RT multimedia applications with new customer experiences that were unimaginable and unheard of in plain old telephone service (POTS) are becoming SIP based. With respect to signaling traffic, the SIP-based presence information and forking of SIP calls due to the use of multiple devices by a given user are expected to increase the SIP signaling traffic enormously in the future because the seamless use of SIP for all kinds of real-time (RT) and near-RT multimedia applications for simplicity and scalable operations is removing the bottleneck of multiprotocol applications.
On the other hand, with the demise of page-mode short message service (SMS), the session-mode Message Session Relay Protocol (MSRP) that may include audio and video in addition to text/graphics will increase the media traffic even for mobile environments, as high-capacity long-term evolution (LTE) wireless networks are becoming a reality. As a result, the mix of SIP signaling traffic and the media traffic generated by the SIP-based sessions that will be carried over the SBC is becoming highly disproportionate. The media traffic that may be flowing over the SBC in the future may become a couple of times higher than that of the signaling traffic depending on networking configurations and mix of SIP users. It implies that a monolithic single physical entity of an SBC carrying both signaling and media traffic will not
be scalable under all networking environments. A distributed SBC architecture meeting all the customer requirements as described earlier will be the logical choice for providing scalability. In this section, we describe the different scenarios of the distributed architectures of the SBC.

16.4.2 Distributed SBC Architecture
The distributed SBC architecture separates the signaling and media plane functions into two physical entities known as the signaling controller and the media controller, respectively, as depicted in Figure 16.21. Note that the SBC signaling plane acts as the SIP B2BUA. We have shown that the communication between these two functional entities is done using a separate protocol like H.248 [8]. The H.248 protocol was standardized in ITU-T at the time when H.323 [9] had been developed primarily for interfacing the TDM-based PSTN/ISDN telephony network to the IP-based IP/Internet H.323- and SIP-based VoIP telephony network providing scalability and interoperability. This SBC architecture is also popularly known as the SoftSwitch architecture.
Some also opine that SIP can be used instead of H.248 as SIP has much more powerful functional capabilities than those of H.248. However, no standard has been published yet for using SIP for communications between the signaling and media controllers. The separation between the signaling and media plane has solved the most important problem: scaling the signaling and media plane independently, because there is an enormous mismatch between the signaling and media traffic as explained earlier. Moreover, the routing of signaling and media traffic can be optimized independently as required without forcing one to follow the other. The NAT crossing function that handles the media traffic can also be integrated in the media plane operating under the control of the signaling plane of the SBC. For example, when the SIP session has progressed to a point where a media path needs to be established, the signaling plane sends an H.248 command for opening the gate with hole punching for NAT crossing. However, the same operation is also true for the media plane whether there is NAT or not. The logical call flows still remain the same for all operations as shown in Section 14.2, except that we have not included the operations of the H.248 protocol as shown here.
More important, the implication of separation between the signaling and media plane goes much deeper in the distributed IP/Internet telephony networking architecture should customers want to deploy SBCs for the large-scale global network meeting those requirements as described in the beginning, despite their limitations in being a closed network. For instance, the traffic and geographical distribution can be such that a single SBC signaling controller may need to control multiple media controllers for scalable operations and traffic route optimization, creating a one-to-many or many-to-one communications topology. In other examples, there can be many signaling controllers as well as many media controllers communicating in many-to-many fashion for scalable operations of the network. In the next section, we provide some examples of those distributed SBC architectures resulting in the following benefits: cost reduction with high availability configurations and scalability of both signaling and media capacity.

16.4.2.1 Single Signaling Controller with Multiple Media Controllers
When the regional and local geographical topology and traffic concentration of a SIP network for an enterprise is such that a single signaling controller is good enough to meet the optimum network design criteria, the distributed SBC-based SIP network architecture may look as shown in Figure 16.22.
Figure 16.21 Distributed SBC architecture with separation of signaling and media control function.
Figure 16.22 Distributed SBC-based SIP network architecture consisting of a single signaling controller with multiple media controllers.
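The one-to-many arrangement of Figure 16.22, in which one signaling controller steers sessions to several media controllers, can be illustrated with a small Python sketch. Everything in it is hypothetical: the class names, the region labels, and the selection policy (prefer a healthy media controller in the end point's region, otherwise the least-loaded healthy one) are assumptions made for illustration, and the H.248 exchange between the controllers is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class MediaController:
    """Hypothetical stand-in for one SBC media controller."""
    name: str
    region: str
    active_sessions: int = 0
    healthy: bool = True


@dataclass
class SignalingController:
    """Hypothetical single signaling controller owning many media controllers."""
    media_controllers: list = field(default_factory=list)

    def select_media_controller(self, endpoint_region):
        # Only healthy media controllers may take new sessions.
        candidates = [mc for mc in self.media_controllers if mc.healthy]
        if not candidates:
            raise RuntimeError("no healthy media controller available")
        # Assumed policy: keep media local to the end point when possible.
        local = [mc for mc in candidates if mc.region == endpoint_region]
        pool = local or candidates
        chosen = min(pool, key=lambda mc: mc.active_sessions)
        chosen.active_sessions += 1
        return chosen


if __name__ == "__main__":
    east = MediaController("mc-east", "east")
    west = MediaController("mc-west", "west")
    controller = SignalingController([east, west])
    print(controller.select_media_controller("west").name)
    west.healthy = False  # simulate a media controller failure
    print(controller.select_media_controller("west").name)
```

The sketch has a single SignalingController object, so it also makes the limitation of this topology visible: media controllers can cover for one another, but nothing covers for the signaling controller.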
Note that this network can be thought of as being geographically distributed with logically centralized control from the SIP signaling point of view. Sometimes, this kind of network configuration is known as on-net and off-net call distribution services. This architecture shows how the media path can be optimized for communications, thereby reducing costs between the SIP end points independently when the signaling controller is physically decoupled from the media controller. Although there are multiple media controllers that can take care of one another in case of any media controller failures, increasing the reliability and availability of media communications, the same is not true for the SIP signaling traffic.

16.4.2.2 Multiple Signaling and Media Controllers
We now provide an example of the distributed SBC architecture that uses multiple signaling and media controllers as depicted in Figure 16.23. This architecture shows that, unlike Figure 16.21, there are multiple signaling controllers that increase the reliability and availability of the signaling traffic in case of signaling controller failures.
Note that the communications protocol between the signaling controllers is also SIP. In this way, many kinds of SBC-based distributed SIP network architectures can be designed meeting all the requirements that SBCs need to meet from enterprise and home networking environments. Next, we will explain the SIP trunking architecture briefly using an example.

16.4.2.3 SBC-Based SIP Trunking Network Architecture
We now explain one of the most popular SIP network architectures, popularly known as the SIP trunking network architecture. The initial most successful deployment of SIP has been based on this SBC-based distributed network architecture as depicted in Figure 16.24.
Figure 16.23 Distributed SBC-based SIP network architecture with multiple signaling and media controllers.
Note that we have shown a high-level SoftSwitch architecture that uses decoupled SBC-based signaling and media functions along with the SIP-ISUP/CAS signaling interworking function (IWF) for interfacing both the TDM and IP networks. Again, there can be one or more media controllers that may be located near the signaling controller or may be distributed geographically depending on the network configurations. However, a SoftSwitch usually interfaces both the TDM and IP networks and acts like a gateway. The SIP trunking architecture has historically been a successful implementation architecture from the volume point of view, interconnecting multiple geographically disparate PSTN/ISDN networks over the large-scale IP backbone network and improving economies of scale. In the SIP trunking network configuration of Figure 16.24b, isolated PSTN/ISDN circuit-switched networks are interconnected via a packet-switched IP backbone network offering multiplexing gain with audio silence suppression and bursty data traffic of multimedia applications. Moreover, the centralized least-cost routing of the multimedia traffic over the IP backbone network will offer economic benefits. The bandwidth-saving codecs, audio transcoding, and audio/video media bridging that require specialized hardware will benefit from the geographically distributed SBC-based media controllers under the control of the logically centralized signaling controller, loosely termed the distributed logical SoftSwitch architecture. PSTN/ISDN telephony users may not be aware of the use of SIP trunking over the long-haul IP backbone network. There can be many variations of this star-based SIP trunking architecture in implementations, although the basic objectives may remain the same.
As more and more users start using SIP-based IP telephony, this architecture also offers a graceful migration as time goes by. For example, Figure 16.25 provides a distributed architecture where newer SIP-based IP telephony end points coexist with the SBC-based trunking networking. In this distributed SIP-based IP telephony architecture, the IP network itself offers the valuable, much richer RT and near-RT multimedia services directly to its native SIP-based end points.
Both SIP-based IP and PSTN/ISDN telephony users can also communicate, simultaneously interworking with each other. We have not explained each of these RT and near-RT multimedia services for both kinds of IP and PSTN/ISDN end points for the sake of brevity. Note that, like the distributed SBC architecture, each kind of SIP-based application server also needs to be geographically distributed while acting as a single logically centralized application server for scalability and reliability of the large-scale global SIP-based IP/Internet telephony network. It is needless to explain how urgent it is to do so, especially for the audio and video bridging media servers. All of this special attention is given to the next generation of SIP-based distributed RT and near-RT multimedia application service architecture for each kind of application for building scalable networks for offering SIP-based services. The recent completion of IETF standard activities related to the initial sets of multimedia conferencing services is pointing development in this direction. The experiences of implementations of these few
Figure 16.24 SBC-based SIP trunking architecture: (a) high-level SoftSwitch architecture and (b) trunking architecture.
Figure 16.25 Graceful migration of the distributed SBC-based SIP trunking architecture to SIP-based IP telephony
network.
sets of multimedia conferencing architectures will usher in much greater multimedia-rich experiences for new commercial and residential users. In turn, it will create a new demand for building much better multimedia applications and corresponding service architectures in the future.

16.4.3 Conclusion
We have explained how the separation between the signaling and media plane of the SBC architecture provides scalability, enhanced reliability, and economy of scale for the RT and near-RT multimedia services. The primary reason has been that the distribution of signaling and media traffic for SIP-based multimedia applications differs significantly; in particular, media traffic becomes much higher in proportion to that of the signaling traffic. It demands that the signaling and media entities scale independently. Only the physical separation between the signaling and the media entities of the SBC makes it possible to do so, where the signaling entity needs to control each of those geographically distributed media entities acting as a single logically centralized signaling controller. However, it brings another issue in that we need to use a protocol for communications between the signaling and media controllers. We have taken H.248 [8] as the protocol standardized in the ITU-T. Although SIP could be another alternative to H.248 with much richer capabilities, no SIP standard has been published thus far. We have left all the call flows using the distributed SBC architecture as an exercise for the following (as discussed in Section 14.2): topology hiding, media traffic management, fixing capability mismatches, maintaining SIP-related NAT bindings, access control, protocol repair, and media encryption.
In the context of the distributed SBC architecture, we have explained how the so-called SoftSwitch architecture has
evolved a long time ago while ITU-T developed the H.248 protocol. Later on, the SoftSwitch-based SIP trunking architecture has been the most successful in the early deployment that allows the evolution of the large-scale SIP-based native IP telephony services over the IP network, offering a graceful migration path for the legacy POTS/PSTN/ISDN telephony services. Even after almost two decades of IP telephony services, the low-speed circuit-switched lines are still carrying more than 90% of the IP telephony traffic because of the huge amount of existing legacy telephony lines built over 100 years worldwide. With the emergence of new SIP-based all-IP telephony with much richer multimedia RT and near-RT broadband applications, users with much richer multimedia experiences throughout the world are causing this situation to change. We have pointed out how a more distributed, scalable, and reliable SIP-based services architecture needs to be developed in the future.

16.5 Referring Call to Multiple Resources

16.5.1 Overview
We describe RFC 5368 that defines extensions to the SIP REFER method so that it can be used to refer to multiple resources in a single request. These extensions include the use of pointers to URI lists in the Refer-To header field and the multiple-refer SIP option tag. The SIP REFER method allows a UA to request a second UA to send a SIP request to a third party. For example, if Alice is in a call with Bob, and decides that Bob needs to talk to Carol, Alice can instruct her SIP UA to send a REFER request to Bob’s UA providing Carol’s SIP Contact information. Assuming Bob has given it permission, Bob’s UA will attempt to call Carol using that contact. That is, it will send an INVITE request to that contact. A number of applications need to request this second UA to initiate transactions toward a set of destinations. In one example, the moderator of a conference may want the conference server to send BYE requests to a group of participants. In another example, the same moderator may want the conference server to INVITE a set of new participants. We define an extension to the REFER method so that REFER requests can be used to refer other UAs (such as conference servers) to multiple destinations. In addition, this mechanism uses the suppression of the REFER method implicit subscription specified in RFC 4488 (see Section 2.8.2).

16.5.2 Operation
We describe an application of URI-list services (RFC 5363, see Section 19.7) that allows a URI-list service to receive a SIP REFER request containing a list of targets. The URI-list service invokes the requested SIP method to each of the targets contained in the list. This type of URI-list service is referred to as a REFER-Recipient throughout this document. This document defines an extension to the SIP REFER method that allows a SIP UAC to include a URI list (RFC 4826) of REFER-Targets in a REFER request and send it to a REFER-Recipient. The REFER-Recipient creates a new SIP request for each entry in the URI list and sends it to each REFER-Target.
The URI list that contains the list of targets is used in conjunction with RFC 5364 to allow the sender to indicate the role (e.g., to, cc, or anonymous) in which the REFER-Target is involved in the signaling. We represent multiple targets of a REFER request using a URI list as specified in RFC 4826. A REFER-Issuer that wants to refer a REFER-Recipient to a set of destinations creates a SIP REFER request. The Refer-To header contains a pointer to a URI list, which is included in a body part, and an option tag in the Require header field: multiple-refer. This option tag indicates the requirement to support the functionality described in this specification. When the REFER-Recipient receives such a request, it creates a new request per REFER-Target and sends them, one to each REFER-Target. This document does not provide any mechanism for REFER-Issuers to find out about the results of a REFER request containing multiple REFER-Targets. Furthermore, it does not provide support for the implicit subscription mechanism that is part of the SIP REFER method. The way REFER-Issuers are kept informed about the results of a REFER is service specific. For example, a REFER-Issuer sending a REFER request to invite a set of participants to a conference can discover which participants were successfully brought into the conference by subscribing to the conference state event package specified in RFC 4575 (see Sections 2.2 and 2.4.4.1).

16.5.3 Multiple-Refer SIP Option Tag
A new SIP option tag is defined for the Require and Supported header fields: multiple-refer. A UA including the multiple-refer option tag in a Supported header field indicates compliance with this specification. A UA generating a REFER with a pointer to a URI list in its Refer-To header field must include the multiple-refer option tag in the Require header field of the REFER.

16.5.4 Suppressing REFER’s Implicit Subscription
REFER requests with a single REFER-Target establish implicitly a subscription to the refer event. The REFER-Issuer is informed about the result of the transaction toward the REFER-Target through this implicit subscription. As described in RFC 3515 (see Section 2.5), NOTIFY requests sent as a result of an implicit subscription created by a REFER
Call flow figure (message labels recoverable from the original: F1 REFER; F4 and F5 BYE; F6, F7, and F8 200 OK).
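The header fields and body that Section 16.5 attaches to a REFER request with multiple REFER-Targets can be sketched in Python as follows. This is a minimal sketch, not a complete SIP message builder: the function names, the Content-ID value, and the example.com addresses are hypothetical, and mandatory SIP headers such as Via, From, To, Call-ID, and CSeq are omitted. The sketch uses the flat resource-lists format of RFC 4826 (media type application/resource-lists+xml) and drops repeated URIs, since the REFER-Issuer should not include a URI more than once.

```python
import xml.etree.ElementTree as ET

# XML namespace of the resource-lists document format (RFC 4826).
RESOURCE_LISTS_NS = "urn:ietf:params:xml:ns:resource-lists"


def build_recipient_list(target_uris):
    """Return a flat resource-lists XML body with each URI listed once."""
    unique = []
    for uri in target_uris:
        if uri not in unique:  # keep first occurrence, preserve order
            unique.append(uri)
    root = ET.Element("resource-lists", {"xmlns": RESOURCE_LISTS_NS})
    flat_list = ET.SubElement(root, "list")
    for uri in unique:
        ET.SubElement(flat_list, "entry", {"uri": uri})
    return ET.tostring(root, encoding="unicode")


def build_multiple_refer(content_id, target_uris):
    """Return (headers, body) for the multiple-refer part of a REFER.

    Hypothetical helper; only the header fields discussed in this
    section are produced.
    """
    body = build_recipient_list(target_uris)
    headers = {
        "Refer-To": f"<cid:{content_id}>",        # pointer to the body part
        "Refer-Sub": "false",                     # no implicit subscription
        "Require": "multiple-refer, norefersub",
        "Content-Type": "application/resource-lists+xml",
        "Content-Disposition": "recipient-list",
        "Content-ID": f"<{content_id}>",
        "Content-Length": str(len(body.encode("utf-8"))),
    }
    return headers, body


if __name__ == "__main__":
    hdrs, xml_body = build_multiple_refer(
        "list1@example.com",
        ["sip:bill@example.com?method=BYE",
         "sip:joe@example.org?method=BYE",
         "sip:bill@example.com?method=BYE"],  # duplicate is dropped
    )
    for name, value in hdrs.items():
        print(f"{name}: {value}")
    print()
    print(xml_body)
```

The Content-ID of the body part and the cid URL in Refer-To are built from the same value, which is what lets the REFER-Recipient locate the URI list.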
16.5.6 Behavior of SIP REFER-Issuers
As indicated in Sections 1.4 and 2.1, a SIP REFER-Issuer that creates a REFER request with multiple REFER-Targets includes the multiple-refer and norefersub option tags in the Require header field and, if appropriate, a Refer-Sub header field set to false. The REFER-Issuer includes the set of REFER-Targets in a recipient-list body whose disposition type is recipient-list (RFC 5363, see Section 19.7). The URI-list body is described earlier. The Refer-To header field of a REFER request with multiple REFER-Targets must contain a pointer (i.e., a Content-ID URL as per RFC 2392, see Section 2.8.2) that points to the body part that carries the URI list. The REFER-Issuer should not include any particular URI more than once in the URI list. RFC 4826 provides features such as hierarchical lists and the ability to include entries by reference relative to the XCAP root URI. However, these features are not needed by the multiple REFER service defined in this document. Therefore, when using the default resource list document, SIP REFER-Issuers generating REFER requests with multiple REFER-Targets should use flat lists (i.e., no hierarchical lists) and should not use <entry-ref> elements.

16.5.7 Behavior of REFER-Recipients
The REFER-Recipient follows the rules of RFC 3515 (see Section 2.5) to determine the status code of the response to the REFER. The REFER-Recipient should not create an implicit subscription, and should add a Refer-Sub header field set to false in the 200 OK response. The incoming REFER request typically contains a URI-list document or reference with the actual list of targets. If this URI list includes resources tagged with the copyControl attribute set to a value of to or cc, and if the request is appropriate for the service (for example, it is not received mid-dialog), the REFER-Recipient should include a URI list in each of the outgoing requests. This list should be formatted according to RFCs 4826 and 5364. The REFER-Recipient must follow the procedures specified in RFC 4826 with respect to handling of the anonymize, count, and copyControl attributes. RFC 5363 (see Section 19.7) discusses cases when duplicated URIs are found in a URI list. REFER-Recipients must take the actions specified in RFC 5363 into account to avoid sending a duplicated request to the same target. If the REFER-Recipient includes a URI list in an outgoing request, it must include a Content-Disposition header field, specified in RFC 2183 (see Section 2.8.2), with the value set to recipient-list-history and a handling parameter, specified in RFC 3204, set to optional.
Since the multiple REFER service does not use hierarchical lists or lists that include entries by reference to the XCAP root URI, a REFER-Recipient receiving a URI list with more information than what has been described in Section 2.2 may discard all the extra information. The REFER-Recipient follows the rules in RFC 3515 (see Section 2.5) to generate the necessary requests toward the REFER-Targets, acting as if it had received a regular (no URI list) REFER for each URI in the URI list.

16.5.8 Example
Figure 16.27 shows an example flow where a REFER-Issuer sends a multiple-REFER request to the focus of a conference, which acts as the REFER-Recipient. The REFER-Recipient generates a BYE request per REFER-Target. Details for using the REFER request to remove participants from a conference are specified in RFC 4579 (see Sections 2.2 and 2.4.4.1).

Figure 16.27 Example flow of a REFER request containing multiple REFER-Targets. (Copyright IETF. Reproduced with permission.)

The REFER request (F1) contains a Refer-To header field that includes a pointer to the message body, which carries a list with the URIs of the REFER-Targets. In this example, the URI list does not contain the copyControl attribute extension. The REFER’s Require header field carries the multiple-refer and norefersub option tags. The Request-URI is set to a GRUU (as a guarantee that the REFER request will not fork). The Refer-Sub header field is set to false to request the suppression of the implicit subscription. Figure 16.28 shows an example of this REFER request. The resource list document contains the list of REFER-Target URIs along with the method of the SIP request that the REFER-Recipient generates.
Figure 16.29 shows an example of the BYE request (F3) that the REFER-Recipient sends to the first REFER-Target.

Figure 16.29 BYE request.

16.6 Call Services with Content Indirection
16.6.1 Overview
SIP is a signaling protocol that is used to create, modify, or terminate sessions with one or more participants. SIP messages, like HTTP, are syntactically composed of a start line, one or more headers, and an optional body. Unlike HTTP, SIP is not designed as a general-purpose data transport protocol. There are numerous reasons why it might be desirable to specify the content of the SIP message body indirectly. For bandwidth-limited applications such as cellular wireless, indirection provides a means to annotate the (indirect) content with meta-data, which may be used by the recipient to determine whether or not to retrieve the content over a resource-limited link. It is also possible that the content size to be transferred might overwhelm intermediate signaling proxies, thereby unnecessarily increasing network latency. For time-sensitive
592 ◾ Handbook on Session Initiation Protocol
16.6.2 Use-Case Examples
There are several examples of using the content indirection mechanism. These are examples only and are not intended to limit the scope or applicability of the mechanism.

Figure 16.31 MESSAGE method with content indirection. (Copyright IETF. Reproduced with permission.)
remote party. Carrying such a document directly in the MESSAGE request is not an appropriate use of the signaling channel. Furthermore, the document to be shared may reside on a completely independent server from that of the originating party.
In this example, a user UAC wishes to exchange a JPEG image that she has stored on her web server with user UAS with whom she has an IM conversation. She intends to render the JPEG inline in the IM conversation. The recipient of the MESSAGE request launches an HTTP GET request to the web server to retrieve the JPEG image.

16.6.3 Requirements
◾◾ It must be possible to specify the location of content via a URI. Such URIs must conform to RFC 3986.
◾◾ It must be possible to specify the length of the indirect content.
◾◾ It must be possible to specify the type of the indirect content.
◾◾ It must be possible to specify the disposition of each URI independently.
◾◾ It must be possible to label each URI to identify if and when the content referred to by that URI has changed. Applications of this mechanism may send the same URI more than once. The intention of this requirement is to allow the receiving party to determine whether the content referenced by the URI has changed, without having to retrieve that content. Examples of ways the URI could be labeled include a sequence number, time stamp, and version number. When used with HTTP, the entity-tag (ETAG) mechanism, as defined in RFCs 7230–7235, may be appropriate. Note that we are labeling not the URI itself but the content to which the URI refers, and that the label is therefore effectively metadata of the content itself.
◾◾ It must be possible to specify the time span for which a given URI is valid. This may or may not be the same as the lifetime for the content itself.
◾◾ It must be possible for the UAC and the UAS to indicate support of this content indirection mechanism. A fallback mechanism should be specified in the event that one of the parties is unable to support content indirection.
◾◾ It must be possible for the UAC and UAS to negotiate the type of the indirect content when using the content indirection mechanism.
◾◾ It must be possible for the UAC and UAS to negotiate support for any URI scheme to be used in the content indirection mechanism. This is in addition to the ability to negotiate the content type.
◾◾ It should be possible to ensure the integrity and confidentiality of the URI when it is received by the remote party.
◾◾ It must be possible to process the content indirection without human intervention.
◾◾ It must allow for indirect transference of content in any SIP message that would otherwise carry that content as a body.

16.6.4 Application of MIME-URI Standard to Content Indirection
We have explained earlier that RFC 2017 provides most of the functionality needed for a SIP content indirection mechanism; however, RFC 2017 by itself is not a complete solution. The following text describes how RFC 2017 is applied to meet the requirements for content indirection in SIP.

16.6.4.1 Specifying Support for Content Indirection
A UAC/UAS indicates support for content indirection by including the message/external-body MIME type in the Accept header. The UAC/UAS may supply additional values in the Accept header to indicate the content types that it is willing to accept, either directly or through content indirection. UAs supporting content indirection must support content indirection of the application/sdp MIME type. For example:

Accept: message/external-body, image/*, application/sdp

16.6.4.2 Mandatory Support for HTTP URI
Applications that use this content indirection mechanism must support the HTTP URI scheme. Additional URI schemes may be used, but a UAC/UAS must support receiving an HTTP URI for indirect content if it advertises support for content indirection. The UAS may advertise alternate access schemes in the schemes parameter of the Contact header in the UAS response to the UAC’s session establishment request (e.g., INVITE, SUBSCRIBE), as described in RFC 3840 (see Section 3.4).

16.6.4.3 Rejecting Content Indirection
If a UAS receives a SIP request that contains a content indirection payload and the UAS cannot or does not wish to support such a content type, it must reject the request with a 415 Unsupported Media Type response as defined in SIP RFC 3261 (see Section 2.6). In particular, the UAC should note the absence of the message/external-body MIME type in the Accept header of this response to indicate that the UAS does not support content indirection, or the absence of the particular MIME type of the requested content to indicate that the UAS does not support the particular media type.

time. Rather, the supplier of the URI must specify the time period for which this URI is valid and accessible. This is done through an EXPIRATION parameter of the Content-Type. The format of this expiration parameter is an RFC 1123 date–time value. This is further restricted in this application to use only GMT time, consistent with the Date: header in SIP. This is a mandatory parameter. Note that the date–time value can range from minutes to days or even years. For example:
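As a minimal sketch of how such a value might be produced, the following Python snippet formats an expiration time as an RFC 1123 date in GMT and places it in a message/external-body Content-Type value. The function name, the example URL, and the one-hour lifetime are hypothetical; the access-type and URL parameter names are taken from RFC 2017 and are an assumption here, since this passage only names the expiration parameter.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def external_body_content_type(url: str, expires_at: datetime) -> str:
    """Build a message/external-body Content-Type value with an expiration.

    The access-type and URL parameters follow RFC 2017 (assumed here);
    the expiration parameter is the one described in the text above.
    """
    # format_datetime(usegmt=True) emits an RFC 1123 style date ending in
    # "GMT"; it requires a timezone-aware datetime in UTC.
    stamp = format_datetime(expires_at.astimezone(timezone.utc), usegmt=True)
    return (
        'message/external-body; access-type="URL"; '
        f'expiration="{stamp}"; URL="{url}"'
    )


if __name__ == "__main__":
    # Hypothetical usage: the URI stays valid for one hour from now.
    valid_until = datetime.now(timezone.utc) + timedelta(hours=1)
    print("Content-Type:",
          external_body_content_type("http://www.example.com/image.jpg",
                                     valid_until))
```

Because the lifetime is expressed as an absolute GMT date rather than a relative interval, the recipient can compare it directly with its own clock before deciding whether to retrieve the content.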
way to resolve both user level and UA level incompatibilities. Thus, the invocation mechanisms described in this document are generally applicable to any type of incompatibility related to how the information that needs to be communicated is encoded. Furthermore, although this framework focuses on transcoding, the mechanisms described are applicable to media manipulation in general. It would be possible to use them, for example, to invoke a server that simply increases the volume of an audio stream. This document does not describe media server discovery. That is an orthogonal problem that one can address using UA provisioning or other methods.

to situations where two transcoders are introduced (one by the offerer and one by the answerer) in a session that would not need any transcoding services at all. An example of the situation above is a call between two Global System for Mobile Communications (GSM) phones (without using transcoding-free operation). Both phones use a GSM codec, but the speech is converted from GSM to Pulse Code Modulation (PCM) by the originating Mobile Switching Center (MSC), and from PCM back to GSM by the terminating MSC. Note that transcoding services can be symmetric (e.g., speech-to-text plus text-to-speech) or asymmetric (e.g., a one-way speech-to-text transcoding for a hearing-impaired user that can talk).
In the 3PCC transcoding model defined here, the UA invoking the transcoding service has a signaling relationship with the transcoder and another signaling relationship with the remote UA. There is no signaling relationship between the transcoder and the remote UA, as shown in Figure 16.32.

Figure 16.32 3PCC model. (Copyright IETF. Reproduced with permission.)

This model is suitable for advanced end points that are able to perform 3PCC. It allows end points to invoke transcoding services on a stream basis. That is, the media streams that need transcoding are routed through the transcoder, while the streams that do not need it are sent directly between the end points. This model also allows invoking one transcoder for the sending direction and a different one for the receiving direction of the same stream. Invoking a transcoder in the middle of an ongoing session is also quite simple. This is useful when session changes occur (e.g., an audio session is upgraded to an audio/video session) and the end points cannot cope with the changes (e.g., they had common audio codecs but no common video codecs). The privacy level that is achieved using 3PCC is high, since the transcoder does not see the signaling between both end points. In this model, the transcoder only has access to the information that is strictly needed to perform its function.

16.7.2.2 Transcoding Call Control Flows
Given two UAs (A and B) and a transcoding server (T), the invocation of a transcoding service consists of establishing two sessions: A–T and T–B. How these sessions are established depends on which party, the caller (A) or the callee (B), invokes the transcoding services. Next, we describe the callee and caller invocation of the transcoding service. In all our 3PCC flows, we have followed the general principle that a 200 OK response from the transcoding service has to be received before contacting the callee. This tries to ensure that the transcoding service will be available when the callee accepts the session.
Still, the transcoding service does not know the exact type of transcoding it will be performing until the callee accepts the session. Thus, there is always the chance of failing to provide transcoding services after the callee has accepted the session. A system with more stringent requirements could use preconditions to avoid this situation. When preconditions are used, the callee is not alerted until everything is ready for the session. We define the following terms solely to explain the example call flows of the third-party transcoding services:

◾◾ SDP A: A session description generated by A. It contains, among other things, the transport address (IP address and port number) where A wants to receive media for each particular stream.
◾◾ SDP B: A session description generated by B. It contains, among other things, the transport address where B wants to receive media for each particular stream.
◾◾ SDP A+B: A session description that contains, among other things, the transport address where A wants to receive media and the transport address where B wants to receive media.
◾◾ SDP TA: A session description generated by T and intended for A. It contains, among other things, the transport address where T wants to receive media from A.
◾◾ SDP TB: A session description generated by T and intended for B. It contains, among other things, the transport address where T wants to receive media from B.
◾◾ SDP TA+TB: A session description generated by T that contains, among other things, the transport address where T wants to receive media from A and the transport address where T wants to receive media from B.

16.7.2.3 Callee’s Invocation
In this scenario, B receives an INVITE from A, and B decides to introduce T in the session. Figure 16.33 shows the call flow for this scenario. In Figure 16.33, A can both hear and speak, and B is a deaf user with a speech impairment. A proposes to establish a session that consists of an audio stream (F1). B wants to send and receive only text, so it invokes a transcoding service T that will perform both speech-to-text and text-to-speech conversions (F2). The session descriptions of Figure 16.33 are partially shown below.
When either A or B decides to terminate the session, it sends a BYE indicating that the session is over. If the first

Figure 16.34 Callee’s invocation after initial INVITE without SDP. (Copyright IETF. Reproduced with permission.)
of transcoding services are encouraged to return the same session description in (F8) as in (F3) in this type of scenario. The session descriptions of this flow are shown below:

(F2) INVITE SDP A+B

m=audio 20000 RTP/AVP 0
c=IN IP4 0.0.0.0
m=text 40000 RTP/AVP 96
c=IN IP4 B.example.com
a=rtpmap:96 t140/1000

(F3) 200 OK SDP TA+TB

m=audio 30000 RTP/AVP 0
c=IN IP4 T.example.com
m=text 30002 RTP/AVP 96
c=IN IP4 T.example.com
a=rtpmap:96 t140/1000

16.7.2.4 Caller’s Invocation
In this scenario, A wishes to establish a session with B using a transcoding service. A uses 3PCC to set up the session between T and B. The call flow we provide here is slightly different from the ones in RFC 3725 (see Section 18.3). In 3PCC, the controller establishes a session between two UAs, which are the ones deciding the characteristics of the streams. Here, A wants to establish a session between T and B, but A wants to decide how many and which types of streams are established. That is why A sends its session description in the first INVITE (F1) to T, as opposed to the media-less initial INVITE recommended by the 3PCC model. Figure 16.35 shows the call flow for this scenario. We do not include the session descriptions of this flow since they are very similar to those in Figure 16.34. In this flow, if T returns the same SDP TA+TB in (F8) as in (F2), messages (F9), (F10), and (F11) can be skipped.
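Deciding whether the later messages can be skipped comes down to checking that the transcoder returned the same session description again. A minimal sketch of such a check follows, under the assumption that two descriptions count as "the same" when every media section has the same media type, port, and connection address. The helper names are hypothetical, and a real UA would likely compare more of the SDP (codecs, attributes, version).

```python
def media_endpoints(sdp: str):
    """Return [(media, port, connection_address), ...], one per m= section."""
    endpoints = []
    session_addr = None  # session-level c= line, used when a section has none
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            media, port = line[2:].split()[:2]
            # A port may carry a "/count" suffix; keep only the base port.
            endpoints.append([media, int(port.split("/")[0]), session_addr])
        elif line.startswith("c="):
            addr = line[2:].split()[-1]
            if endpoints:
                endpoints[-1][2] = addr   # media-level connection address
            else:
                session_addr = addr       # session-level connection address
    return [tuple(e) for e in endpoints]


def same_endpoints(sdp_a: str, sdp_b: str) -> bool:
    """True when both descriptions expose identical transport addresses."""
    return media_endpoints(sdp_a) == media_endpoints(sdp_b)


if __name__ == "__main__":
    sdp_ta_tb = "\n".join([
        "m=audio 30000 RTP/AVP 0",
        "c=IN IP4 T.example.com",
        "m=text 30002 RTP/AVP 96",
        "c=IN IP4 T.example.com",
        "a=rtpmap:96 t140/1000",
    ])
    # If the transcoder repeats its earlier answer, no further update is needed.
    print(same_endpoints(sdp_ta_tb, sdp_ta_tb))
```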
of the speech-to-text conversion). There are various possible solutions for this problem. One solution consists of using the SDP group attribute with Flow Identification (FID) semantics (RFC 3388). FID allows requesting that a stream be sent to two different transport addresses in parallel, as shown below:

a=group:FID 1 2
m=audio 20000 RTP/AVP 0
c=IN IP4 A.example.com
a=mid:1
m=audio 30000 RTP/AVP 0
c=IN IP4 T.example.com
a=mid:2

The problem with this solution is that the majority of the SIP UAs do not support FID. Moreover, only a small fraction of the few UAs that support FID also support sending simultaneous copies of the same media stream. In addition, FID forces both copies of the stream to use the same codec. Therefore, we recommend that T (instead of a UA) replicates the media stream. The transcoder T receiving the following session description performs speech-to-text and text-to-speech conversions between the first audio stream and the text stream. In addition, T copies the first audio stream to the second audio stream and sends it to A.

m=audio 40000 RTP/AVP 0
c=IN IP4 B.example.com
m=audio 20000 RTP/AVP 0
c=IN IP4 A.example.com
a=recvonly
m=text 20002 RTP/AVP 96
c=IN IP4 A.example.com
a=rtpmap:96 t140/1000

16.7.2.6 Transcoding Services in Parallel
Transcoding services sometimes consist of human relays (e.g., a person performing speech-to-text and text-to-speech conversions for a session). If the same person is involved in both conversions (i.e., from A to B and from B to A), he/she has access to all of the conversation. To provide some degree of privacy, sometimes two different persons are allocated to do the job (i.e., one person handles A→B and the other B→A). This type of disposition is also useful for automated transcoding services, where one machine converts text to synthetic speech (text-to-speech) and another performs voice recognition (speech-to-text). The scenario described above involves four different sessions: A-T1, T1-B, B-T2, and T2-A. Figure 16.36 shows the call flow where A invokes T1 and T2. Note this example uses unidirectional media streams (i.e., sendonly or recvonly) to clearly identify which transcoder handles media in which direction. Nevertheless, nothing precludes the use of bidirectional streams in this scenario. They could be used, for example, by a human relay to ask for clarifications (e.g., “I did not get that, could you repeat, please?”) from the party he or she is receiving media from.

(F1) INVITE SDP AT1

m=text 20000 RTP/AVP 96
c=IN IP4 A.example.com
a=rtpmap:96 t140/1000
a=sendonly
m=audio 20000 RTP/AVP 0
c=IN IP4 0.0.0.0
a=recvonly

(F2) INVITE SDP AT2

m=text 20002 RTP/AVP 96
c=IN IP4 A.example.com
a=rtpmap:96 t140/1000
a=recvonly
m=audio 20000 RTP/AVP 0
c=IN IP4 0.0.0.0
a=sendonly

(F3) 200 OK SDP T1A+T1B

m=text 30000 RTP/AVP 96
c=IN IP4 T1.example.com
a=rtpmap:96 t140/1000
a=recvonly
m=audio 30002 RTP/AVP 0
c=IN IP4 T1.example.com
a=sendonly

(F5) 200 OK SDP T2A+T2B

m=text 40000 RTP/AVP 96
c=IN IP4 T2.example.com
a=rtpmap:96 t140/1000
a=sendonly
m=audio 40002 RTP/AVP 0
c=IN IP4 T2.example.com
a=recvonly

(F7) INVITE SDP T1B+T2B

m=audio 30002 RTP/AVP 0
c=IN IP4 T1.example.com
a=sendonly
m=audio 40002 RTP/AVP 0
c=IN IP4 T2.example.com
a=recvonly

(F8) 200 OK SDP BT1+BT2

m=audio 50000 RTP/AVP 0
c=IN IP4 B.example.com
a=recvonly
m=audio 50002 RTP/AVP 0
c=IN IP4 B.example.com
a=sendonly
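In the parallel flow, each unidirectional stream in an offer is answered with the opposite direction (for example, the sendonly and recvonly streams of (F7) are answered by recvonly and sendonly streams in (F8)). The following sketch checks that pairing. It is a simplification under stated assumptions: only media-level direction attributes are read, a missing attribute is treated as sendrecv, and the answer is required to be the exact mirror of the offer, which is stricter than general offer/answer processing. The function names are hypothetical.

```python
# Direction an answerer uses to mirror the offerer's direction exactly.
MIRROR = {
    "sendonly": "recvonly",
    "recvonly": "sendonly",
    "sendrecv": "sendrecv",
    "inactive": "inactive",
}


def stream_directions(sdp: str):
    """Return the direction of each m= section, in order of appearance."""
    directions = []
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            directions.append("sendrecv")       # default when no attribute
        elif line.startswith("a=") and line[2:] in MIRROR and directions:
            directions[-1] = line[2:]           # media-level direction
    return directions


def answer_mirrors_offer(offer: str, answer: str) -> bool:
    """True when every answered stream has the mirrored direction."""
    off, ans = stream_directions(offer), stream_directions(answer)
    return len(off) == len(ans) and all(
        MIRROR[o] == a for o, a in zip(off, ans)
    )


if __name__ == "__main__":
    f7 = ("m=audio 30002 RTP/AVP 0\nc=IN IP4 T1.example.com\na=sendonly\n"
          "m=audio 40002 RTP/AVP 0\nc=IN IP4 T2.example.com\na=recvonly\n")
    f8 = ("m=audio 50000 RTP/AVP 0\nc=IN IP4 B.example.com\na=recvonly\n"
          "m=audio 50002 RTP/AVP 0\nc=IN IP4 B.example.com\na=sendonly\n")
    print(stream_directions(f7), stream_directions(f8))
    print(answer_mirrors_offer(f7, f8))
```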
Figure 16.36 Transcoding services in parallel. (Copyright IETF. Reproduced with permission.)
Figure 16.37 Transcoding services in serial. (Copyright IETF. Reproduced with permission.)
Figure 16.38 Conference bridge model. (Copyright IETF. Reproduced with permission.)

16.7.3.2.3 Example
Figure 16.39 shows the message flow for the caller’s invocation of a transcoder T. The caller A sends an INVITE (F1) to the transcoder (T) to establish the session A–T. Following the procedures in RFC 5366, the caller A adds a body part whose disposition type is recipient-secured URI-list (RFC 5363, see Section 19.7).
The following example shows an INVITE with two body parts: an SDP (RFC 4566, see Section 7.7) session description and a URI-list.
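As a rough illustration of how a caller might assemble such a two-part body, the sketch below builds a flat resource-lists document and wraps it, together with an SDP description, in a multipart/mixed body. Several things here are assumptions rather than text from this section: the resource-lists namespace is the one defined in RFC 4826, the Content-Disposition value recipient-list is the one defined in RFC 5363, and the function names, boundary string, and example URI are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the resource-lists format (RFC 4826); assumed, not quoted here.
RL_NS = "urn:ietf:params:xml:ns:resource-lists"
ET.register_namespace("", RL_NS)


def uri_list_xml(uris):
    """Serialize a flat resource-lists document, dropping repeated URIs."""
    root = ET.Element(f"{{{RL_NS}}}resource-lists")
    flat = ET.SubElement(root, f"{{{RL_NS}}}list")
    for uri in dict.fromkeys(uris):  # keeps first occurrence, preserves order
        ET.SubElement(flat, f"{{{RL_NS}}}entry", {"uri": uri})
    return ET.tostring(root, encoding="unicode")


def invite_body(sdp_lines, uris, boundary="boundary1"):
    """Return (Content-Type value, multipart/mixed body) for an INVITE."""
    parts = [
        ("application/sdp", None, "\r\n".join(sdp_lines)),
        ("application/resource-lists+xml", "recipient-list",
         uri_list_xml(uris)),
    ]
    out = []
    for ctype, disposition, payload in parts:
        out.append(f"--{boundary}")
        out.append(f"Content-Type: {ctype}")
        if disposition:
            out.append(f"Content-Disposition: {disposition}")
        out.append("")  # blank line separates part headers from the payload
        out.append(payload)
    out.append(f"--{boundary}--")
    return f'multipart/mixed;boundary="{boundary}"', "\r\n".join(out) + "\r\n"


if __name__ == "__main__":
    content_type, body = invite_body(
        ["v=0", "m=audio 20000 RTP/AVP 0", "c=IN IP4 A.example.com"],
        ["sip:B@example.com"],
    )
    print("Content-Type:", content_type)
    print(body)
```

Building the parts by hand keeps the sketch free of transfer-encoding side effects; a production UA would more likely rely on its SIP stack's own MIME support.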
Figure 16.39 Successful invocation of a transcoder by the caller. (Copyright IETF. Reproduced with permission.)
Figure 16.40 Unsuccessful session establishment. (Copyright IETF. Reproduced with permission.)
the History-Info header field (RFC 4244, see Section 2.8) between the transcoder and the caller resolves the previous ambiguity. Note that this ambiguity problem could also have been resolved by having transcoders act as a pure conference bridge. The transcoder would respond with a 200 OK to the INVITE request from the caller, and it would generate an outgoing INVITE request toward the callee. The caller would get information about the result of the latter INVITE request by subscribing to the conference event package (RFC 4575) at the transcoder. Although this flow would have resolved the ambiguity problem without requiring support for the History-Info header field, it is more complex, requires a higher number of messages, and introduces higher session setup delays. That is why it was not chosen to implement transcoding services.

16.7.3.3 Callee’s Invocation
If a UA receives an INVITE with a session description that is not acceptable, it can redirect it to the transcoder by using a 302 Moved Temporarily response. The Contact header field of the 302 Moved Temporarily response contains the URI of the transcoder plus a ?body= parameter. This parameter contains a recipient-list body with B’s URI. Note that some escaping (e.g., for CRLFs) is needed to encode a recipient-list body in such a parameter. Figure 16.41 shows the message flow for this scenario.

Figure 16.41 Callee’s invocation of a transcoder. (Copyright IETF. Reproduced with permission.)

Note that the syntax resulting from encoding a body into a URI as described earlier is quite complex. It is actually simpler for callees to invoke transcoding services using the 3PCC transcoding model described earlier.

16.8 INFO Method: Mid-Call Information Transfer
16.8.1 Overview
RFC 6086, described here, defines a method, INFO, for SIP. The purpose of the INFO message is to carry application-level information between end points, using the SIP dialog signaling path. Note that the INFO method is not used to update the characteristics of a SIP dialog or session, but to allow the applications that use the SIP session to exchange information (which might update the state of those applications). Use of the INFO method does not constitute a separate dialog usage. INFO messages are always part of, and share the fate of, an invite dialog usage (RFC 5057, see Sections 3.6.5 and 16.2). INFO messages cannot be sent as part of other dialog usages, or outside an existing dialog. This document also defines an INFO Package mechanism. An INFO Package specification defines the content and semantics of the information carried in an INFO message associated with the INFO Package. The INFO Package mechanism also provides a way for UAs to indicate for which INFO Packages they are willing to receive INFO requests, and which INFO Package a specific INFO request is associated with. A UA uses the Recv-Info header field, on a per-dialog basis, to indicate for which INFO Packages it is willing to receive INFO requests. A UA can indicate an initial set of INFO Packages during dialog establishment and can indicate a new set during the lifetime of the invite dialog usage.
Note that a UA can use an empty Recv-Info header field (a header field without a value) to indicate that it is not willing to receive INFO requests for any INFO Package, while still informing other UAs that it supports the INFO Package mechanism. When a UA sends an INFO request, it uses the Info-Package header field to indicate which INFO Package is associated with the request. One particular INFO request can only be associated with a single INFO Package.

16.8.2 Motivation
A number of applications, standardized and proprietary, make use of the INFO method as it was previously defined in RFC 2976 (obsoleted by RFC 6086), here referred to as legacy INFO usage. These include but are not limited to the following:

◾◾ RFC 3372 (see Section 14.4.2) specifies the encapsulation of ISDN User Part in SIP message bodies. ITU-T and the 3GPP have specified similar procedures.
◾◾ ECMA-355 [12] specifies the encapsulation of QSIG in SIP message bodies.
◾◾ RFC 5022 specifies how INFO is used as a transport mechanism by the Media Server Control Markup Language (MSCML) protocol. MSCML uses an option tag in the Require header field to ensure that the receiver understands the INFO content.
◾◾ RFC 5707 specifies how INFO is used as a transport mechanism by the Media Server Markup Language (MSML) protocol.
◾◾ Companies have been using INFO messages in order to request fast video update. Currently, a standardized mechanism, based on the RTCP, has been specified in RFC 5168.
◾◾ Companies have been using INFO messages in order to transport DTMF tones. All mechanisms are proprietary and have not been standardized.

Some legacy INFO usages are also recognized as being shortcuts to more appropriate and flexible mechanisms. Furthermore, RFC 2976 did not define mechanisms that would enable a SIP UA to indicate (a) the types of applications and contexts in which the UA supports the INFO method or (b) the types of applications and contexts with which a specific INFO message is associated. Because legacy INFO usages do not have associated INFO Packages, it is not possible to use the Recv-Info and Info-Package header fields with legacy INFO usages. That is, a UA cannot use the Recv-Info header field to indicate for which legacy INFO usages it is willing to receive INFO requests, and a UA cannot use the Info-Package header field to indicate with which legacy INFO usage an INFO request is associated.
Owing to the problems described above, legacy INFO usages often require static configuration to indicate the types of applications and contexts for which the UAs support the INFO method, and the way they handle application
information transported in INFO messages. This has caused interoperability problems in the industry. To overcome these problems, the SIP Working Group spent significant discussion time over many years to reach agreement on whether it was more appropriate to fix INFO (by defining a registration mechanism for the ways in which it was used) or to deprecate it altogether (with the usage described in RFC 3372 being grandfathered as the sole legitimate usage). Although it required substantial consensus building and concessions by those more inclined to completely deprecate INFO, the eventual direction of the working group was to publish a framework for registration of INFO Packages as defined in this specification.

16.8.2.1 Applicability and Backwards Compatibility
We describe here an RFC 6086-specified method, INFO, for the SIP (RFC 3261), and an INFO Package mechanism. As RFC 6086 obsoletes RFC 2976, for backwards compatibility, we also specify a legacy mode of usage of the INFO method that is compatible with the usage previously defined in RFC 2976, here referred to as legacy INFO usage. For backwards compatibility purposes, this document does not deprecate legacy INFO usages, and does not require users to define INFO Packages for such usages. However,

◾◾ A UA must not insert an Info-Package header field in a legacy INFO request (as described later, an INFO request associated with an INFO Package always contains an Info-Package header field).
◾◾ Any new usage must use the INFO Package mechanism defined in this specification, since it does not share the issues associated with legacy INFO usage, and since INFO Packages can be registered with IANA.

16.8.3 UAs Are Allowed to Enable Both Legacy INFO Usages and INFO Package Usages
UAs are allowed to enable both legacy INFO usages and INFO Package usages as part of the same invite dialog usage, but UAs shall not mix legacy INFO usages and INFO Package usages in order to transport the same application-level information. If possible, UAs shall prefer the usage of an INFO Package.

16.8.4 INFO Method
16.8.4.1 General
The INFO method (RFC 6086) provides a mechanism for transporting application-level information that can further

on the types of applications for which the use of INFO is appropriate. This section describes how a UA handles INFO requests and responses, as well as the message bodies included in INFO messages.

16.8.4.2 INFO Request
16.8.4.2.1 INFO Request Sender
An INFO request can be associated with an INFO Package described later, or associated with a legacy INFO usage. The construction of the INFO request is the same as any other nontarget refresh request within an existing invite dialog usage as described in RFC 3261 (see Section 2.8.2). When a UA sends an INFO request associated with an INFO Package, it must include an Info-Package header field that indicates which INFO Package is associated with the request. A specific INFO request can be used only for a single INFO Package. When a UA sends an INFO request associated with a legacy INFO usage, there is no INFO Package associated with the request, and the UA must not include an Info-Package header field in the request. The INFO request must not contain a Recv-Info header field. A UA can only indicate a set of INFO Packages for which it is willing to receive INFO requests by using the SIP methods (and their responses) listed in Section 16.8.5. A UA must not send an INFO request outside an invite dialog usage and must not send an INFO request for an INFO Package inside an invite dialog usage if the remote UA has not indicated willingness to receive that INFO Package within that dialog.
If a UA receives a 469 Bad INFO Package response to an INFO request, based on RFC 5057 (see Sections 3.6.5 and 16.2), the response represents a Transaction Only failure, and the UA must not terminate the invite dialog usage. Owing to the possibility of forking, the UA that sends the initial INVITE request must be prepared to receive INFO requests from multiple remote UAs during the early dialog phase. In addition, the UA must be prepared to receive different Recv-Info header field values from different remote UAs. Note that if the UAS (receiver of the initial INVITE request) sends an INFO request just after it has sent the response that creates the dialog, the UAS needs to be prepared for the possibility that the INFO request will reach the UAC before the dialog-creating response, and might therefore be rejected by the UAC. In addition, an INFO request might be rejected due to a race condition, if a UA sends the INFO request at the same time that the remote UA sends a new set of INFO Packages for which it is willing to receive INFO requests.

16.8.4.2.2 INFO Request Receiver
If a UA receives an INFO request associated with an INFO
enhance a SIP application. Later, we provide more details Package that the UA has not indicated willingness to receive,
the UA must send a 469 Bad INFO Package response described later, which contains a Recv-Info header field with INFO Packages for which the UA is willing to receive INFO requests. The UA must not use the response to update the set of INFO Packages, but simply to indicate the current set. In the terminology of multiple dialog usages (RFC 5057, see Sections 3.6.5 and 16.2), this represents a Transaction Only failure, and does not terminate the invite dialog usage. If a UA receives an INFO request associated with an INFO Package, and the message-body part with Content-Disposition Info-Package described below has a MIME type that the UA supports but not in the context of that INFO Package, it is recommended that the UA send a 415 Unsupported Media Type response. The UA may send other error responses, such as Request Failure (4xx), Server Failure (5xx), and Global Failure (6xx), in accordance with the error-handling procedures (see Section 2.6). Otherwise, if the INFO request is syntactically correct and well structured, the UA must send a 200 OK response.

Note that if the application needs to reject the information that it received in an INFO request, that needs to be done on the application level. That is, the application needs to trigger a new INFO request that contains information that the previously received application data was not accepted. Individual INFO Package specifications need to describe the details for such procedures.

16.8.4.2.3 SIP Proxies

Proxies need no additional behavior beyond that described in RFC 3261 to support INFO.

16.8.4.3 INFO Message Body

16.8.4.3.1 INFO Request Message Body

The purpose of the INFO request is to carry application-level information between SIP UAs. The application information data is carried in the payload of the message body of the INFO request. Note that an INFO request associated with an INFO Package can also include information associated with the INFO Package using Info-Package header field parameters. If an INFO request associated with an INFO Package contains a message-body part, the body part is identified by a Content-Disposition header field Info-Package value. The body part can contain a single MIME type, or it can be a multipart (RFC 5621) that contains other body parts associated with the INFO Package. UAs MUST support multipart body parts in accordance with RFC 5621. Note that an INFO request can also contain other body parts that are meaningful within the context of an invite dialog usage but are not specifically associated with the INFO method and the application concerned. When a UA supports a specific INFO Package, the UA must also support message-body MIME types in accordance with that INFO Package. However, in accordance with RFC 3261, the UA still indicates the supported MIME types using the Accept header.

16.8.4.3.2 INFO Response Message Body

A UA must not include a message body associated with an INFO Package in an INFO response. Message bodies associated with INFO Packages must only be sent in INFO requests. A UA may include a message body that is not associated with an INFO Package in an INFO response.

16.8.4.4 Order of Delivery

The INFO Package mechanism does not define a delivery order mechanism. INFO Packages can rely on the CSeq header field to detect if an INFO request is received out of order. If specific applications need additional mechanisms for order of delivery, those mechanisms, and related procedures, are specified as part of the associated INFO Package (e.g., the use of sequence numbers within the application data).

16.8.5 INFO Packages

16.8.5.1 General

An INFO Package specification defines the content and semantics of the information carried in an INFO message associated with an INFO Package. The INFO Package mechanism provides a way for UAs to indicate for which INFO Packages they are willing to receive INFO requests, and with which INFO Package a specific INFO request is associated.

16.8.5.2 UA Behavior

16.8.5.2.1 General

This section describes how a UA handles INFO Packages, how a UA uses the Recv-Info header field, and how the UA acts in re-INVITE rollback situations.

16.8.5.2.2 UA Procedures

A UA that supports the INFO Package mechanism must indicate, using the Recv-Info header field, the set of INFO Packages for which it is willing to receive INFO requests for a specific session. A UA can list multiple INFO Packages in a single Recv-Info header field, and the UA can use multiple Recv-Info header fields. A UA can use an empty Recv-Info header field, that is, a header field without any header field values. A UA provides its set of INFO Packages for which it is willing to receive INFO requests during the dialog establishment. A UA can update the set of INFO Packages during the invite
dialog usage. If a UA is not willing to receive INFO requests for any INFO Packages, during dialog establishment or later during the invite dialog usage, the UA must indicate this by including an empty Recv-Info header field. This informs other UAs that the UA still supports the INFO Package mechanism. Example: If a UA has previously indicated INFO Packages foo and bar in a Recv-Info header field, and the UA during the lifetime of the invite dialog usage wants to indicate that it does not want to receive INFO requests for any INFO Packages anymore, the UA sends a message with an empty Recv-Info header field. Once a UA has sent a message with a Recv-Info header field containing a set of INFO Packages, the set is valid until the UA sends a new Recv-Info header field containing a new, or empty, set of INFO Packages.

Once a UA has indicated that it is willing to receive INFO requests for a specific INFO Package, and a dialog has been established, the UA must be prepared to receive INFO requests associated with that INFO Package until the UA indicates that it is no longer willing to receive INFO requests associated with that INFO Package. For a specific dialog usage, a UA must NOT send an INFO request associated with an INFO Package until it has received an indication that the remote UA is willing to receive INFO requests for that INFO Package, or after the UA has received an indication that the remote UA is no longer willing to receive INFO requests associated with that INFO Package. Note that when a UA sends a message that contains a Recv-Info header field with a new set of INFO Packages for which the UA is willing to receive INFO requests, the remote UA might, before it receives the message, send an INFO request based on the old set of INFO Packages. In this case, the receiver rejects the INFO request and sends a 469 Bad INFO Package response.

If a UA indicates multiple INFO Packages that provide similar functionality, it is not possible to indicate a priority order of the INFO Packages, or to indicate that the UA wishes to only receive INFO requests for one of the INFO Packages. It is up to the application logic associated with the INFO Packages, and particular INFO Package specifications, to describe application behavior in such cases. For backwards compatibility purposes, even if a UA indicates support of the INFO Package mechanism, it is still allowed to enable legacy INFO usages. In addition, if a UA indicates support of the INFO method using the Allow header field (RFC 3261), it does not implicitly indicate support of the INFO Package mechanism. Per RFC 6086, a UA must use the Recv-Info header field to indicate that it supports the INFO Package mechanism. Likewise, even if a UA uses the Recv-Info header field to indicate that it supports the INFO Package mechanism, in addition the UA still indicates support of the INFO method using the Allow header. This document does not define a SIP option tag (see Section 2.10) for the INFO Package mechanism. However, an INFO Package specification can define an option tag associated with the specific INFO Package, as described later.

16.8.5.2.3 Recv-Info Header Field Rules

The text below defines rules on when a UA is required to include a Recv-Info header field in SIP messages. Later, we list the SIP methods for which a UA can insert a Recv-Info header field in requests and responses.

◾◾ The sender of an initial INVITE request must include a Recv-Info header field in the initial INVITE request, even if the sender is not willing to receive INFO requests associated with any INFO Package.
◾◾ The receiver of a request that contains a Recv-Info header field must include a Recv-Info header field in a reliable 18x/2xx response to the request, even if the request contains an empty Recv-Info header field, and even if the header field value of the receiver has not changed since the previous time it sent a Recv-Info header field.
◾◾ A UA must not include a Recv-Info header field in a response if the associated request did not contain a Recv-Info header field.

Note that in contrast to the rules for generating SDP answers, the receiver of a request is not restricted to generating its own set of INFO Packages as a subset of the INFO Package set received in the Recv-Info header field of the request. As with SDP answers, the receiver can include the same Recv-Info header field value in multiple responses (18x/2xx) for the same INVITE/re-INVITE transaction, but the receiver MUST use the same Recv-Info header field value (if included) in all responses for the same transaction.

16.8.5.2.4 INFO Package Fallback Rules

If the receiver of a request that contains a Recv-Info header field rejects the request, both the sender and receiver of the request must roll back to the set of INFO Packages that was used before the request was sent. This also applies to the case where the receiver of an INVITE/re-INVITE request has included a Recv-Info header field in a provisional response, but later rejects the request. Note that the dialog state rollback rules for INFO Packages might differ from the rules for other types of dialog state information (SDP, target, and others).

16.8.5.3 REGISTER Processing

The INFO method (RFC 6086) allows a UA to insert a Recv-Info header field in a REGISTER request. However, a UA shall not include a header value for a specific INFO Package
unless the particular INFO Package specification describes how the header field value shall be interpreted and used by the registrar, for example, to determine request targets. Rather than using the Recv-Info header field to determine request targets, it is recommended to use more appropriate mechanisms, for example, based on RFC 3840 (see Section 3.4). However, this document does not define a feature tag for the INFO Package mechanism, or a mechanism to define feature tags for specific INFO Packages.

16.8.6 Formal INFO Method Definition and Header Fields

16.8.6.1 INFO Method

We have defined all SIP methods including the new INFO method per RFC 6086 in Section 2.5, replacing the definition and registrations found in RFC 2976 (obsoleted by RFC 6086). We have described all SIP header fields including the new INFO header fields per RFC 6086 in Section 2.8.

16.8.6.2 Info-Package Header Field

We add Info-Package per RFC 6086 to the definition of the element message-header in the SIP message grammar. We have described the Info-Package header field usage earlier. For the purposes of matching Info-Package types indicated in Recv-Info with those in the Info-Package header field value, one compares the Info-package-name portion of the Info-package-type portion of the Info-Package header field octet by octet with that of the Recv-Info header field value. That is, the INFO Package name is case sensitive. Info-package-param is not part of the comparison checking algorithm. This document does not define values for Info-Package types. Individual INFO Package specifications define these values.

16.8.7 INFO Package Considerations

16.8.7.1 General

This section covers considerations to take into account when deciding whether the usage of an INFO Package is appropriate for transporting application information for a specific use case.

16.8.7.2 Appropriateness of INFO Package Usage

When designing an INFO Package, for application-level information exchange, it is important to consider: is signaling, using INFO requests, within a SIP dialog, an appropriate mechanism for the use case? Is it because it is the most reasonable and appropriate choice, or merely because “it is easy”? Choosing an inappropriate mechanism for a specific use case can cause negative effects in SIP networks where the mechanism is used.

16.8.7.3 INFO Request Rate and Volume

INFO messages differ from many other sorts of SIP messages in that they carry application information, and the size and rate of INFO messages are directly determined by the application. This can cause application information traffic to interfere with other traffic on that infrastructure, or to self-interfere when data rates become too high. There is no default throttling mechanism for INFO requests. Apart from the SIP session establishment, the number of SIP messages exchanged during the lifetime of a normal SIP session is rather small. Some applications, like those sending DTMF tones, can generate a burst of up to 20 messages per second. Other applications, like constant GPS location updates, could generate a high rate of INFO requests during the lifetime of the invite dialog usage.

A designer of an INFO Package, and the application that uses it, need to consider the impact that the size and the rate of the INFO messages have on the network and on other traffic, since it normally cannot be ensured that INFO messages will be carried over a congestion-controlled transport protocol end-to-end. Even if an INFO message is sent over such a transport protocol, a downstream SIP entity might forward the message over a transport protocol that does not provide congestion control. Furthermore, SIP messages tend to be relatively small, on the order of 500 bytes to 32,000 bytes. SIP is a poor mechanism for direct exchange of bulk data beyond these limits, especially if the headers plus body exceed the UDP maximum transmission unit (RFC 0768). Appropriate mechanisms for such traffic include HTTP (RFCs 7230-7235), MSRP (RFC 4975, see Section 6.3.3), or other media plane data transport mechanisms. RFC 5405 provides additional guidelines for applications using UDP that may be useful background reading.

16.8.8 Alternative Mechanisms

16.8.8.1 Alternative SIP Signaling Plane Mechanisms

16.8.8.1.1 General

This subsection describes some alternative mechanisms for transporting application information on the SIP signaling plane, using SIP messages.

16.8.8.1.2 SUBSCRIBE/NOTIFY

An alternative for application-level interaction is to use subscription-based events (RFC 6665, see Section 20.2) that use the SIP SUBSCRIBE and NOTIFY methods. Using that
mechanism, a UA requests state information, such as keypad presses from a device to an application server, or key-map images from an application server to a device. Event Packages (RFC 6665) perform the role of disambiguating the context of a message for subscription-based events. The INFO Package mechanism provides similar functionality for application information exchange using invite dialog usages (RFC 5057, see Sections 3.6.5 and 16.2). While an INFO request is always part of, and shares the fate of, an existing invite dialog usage, a SUBSCRIBE request creates a separate dialog usage (RFC 5057) and is normally sent outside an existing dialog usage.

The subscription-based mechanism can be used by SIP entities to receive state information about SIP dialogs and sessions, without requiring the entities to be part of the route set of those dialogs and sessions. As SUBSCRIBE/NOTIFY messages traverse through stateful SIP proxies and B2BUAs, the resource impact caused by the subscription dialogs needs to be considered. The number of subscription dialogs per user also needs to be considered. As for any other SIP signaling plane-based mechanism for transporting application information, the SUBSCRIBE/NOTIFY messages can put a significant burden on intermediate SIP entities that are part of the dialog route set, but do not have any interest in the application information transported between the end users.

16.8.8.1.3 MESSAGE

The MESSAGE method (RFC 3428, see Sections 2.5 and 6.3.1) defines one-time instant message exchange, typically for sending MIME contents for rendering to the user.

16.8.8.2 Alternative SIP Media Plane Mechanisms

16.8.8.2.1 General

In SIP, media plane channels associated with SIP dialogs are established using SIP signaling, but the data exchanged on the media plane channel does not traverse SIP signaling intermediaries; thus, if there will be a lot of information exchanged, and there is no need for the SIP signaling intermediaries to examine the information, it is recommended to use a media plane mechanism rather than a SIP signaling-based mechanism. A low-latency requirement for the exchange of information is one strong indicator for using a media channel. Exchanging information through the SIP routing network can introduce hundreds of milliseconds of latency.

16.8.8.2.2 Media Resource Control Protocol

One mechanism for media plane exchange of application data is the Media Resource Control Protocol (MRCP) (see Section 7.6), where a media plane connection-oriented channel, such as a Transmission Control Protocol (TCP) (RFC 0793) or Stream Control Transmission Protocol (SCTP) (RFC 4960) stream, is established.

16.8.8.2.3 Message Session Relay Protocol

MSRP (RFC 4975, see Section 6.3.3) defines session-based instant messaging as well as bulk file transfer and other such large-volume uses.

16.8.8.3 Alternative Non-SIP-Related Mechanisms

Another alternative is to use a SIP-independent mechanism, such as HTTP (RFCs 7230-7235). In this model, the UA knows about a rendezvous point to which it can direct HTTP requests for the transfer of information. Examples include encoding of a prompt to retrieve in the SIP Request URI (RFC 4240, see Section 4.4.2) or the encoding of a SUBMIT target in a VoiceXML (see Section 17.2.3) script.

16.8.9 INFO Package Requirements

16.8.9.1 General

This section provides guidance on how to define an INFO Package, and what information needs to exist in an INFO Package specification. If, for an INFO Package, there is a need to extend or modify the behavior described in this document, that behavior must be described in the INFO Package specification. It is bad practice for INFO Package specifications to repeat procedures defined in this document, unless needed for purposes of clarification or emphasis. INFO Package specifications must not weaken any behavior designated with should or must in this specification. However, INFO Package specifications may strengthen should, may, or recommended requirements to must if applications associated with the INFO Package require it. INFO Package specifications must address the issues defined in the following subsections, or document why an issue is not applicable to the specific INFO Package. Earlier, we have described alternative mechanisms that should be considered as part of the process for solving a specific use case, when there is a need for transporting application information.

16.8.9.2 Overall Description

The INFO Package specification must contain an overall description of the INFO Package: what type of information is carried in INFO requests associated with the INFO Package, and for what types of applications and functionalities UAs can use the INFO Package. If the INFO Package is defined for a specific application, the INFO Package specification must state which application UAs can use the INFO Package with.
mechanisms that UAs need to use to provide the required security. If the INFO Package specification does not require any additional security, other than what the underlying SIP provides, this must be stated in the INFO Package specification.

Note that in some cases, it may not be sufficient to mandate TLS (RFC 5246) in order to secure the INFO Package payload, since intermediaries will have access to the payload, and because beyond the first hop, there is no way to assure subsequent hops will not forward the payload in clear text. The best way to ensure secure transport at the application level is to have the security at the application level. One way of achieving this is to use end-to-end security techniques such as S/MIME (see Section 19.6).

16.8.9.11 Implementation Details

It is strongly recommended that the INFO Package specification define the procedure regarding how implementers shall implement and use the INFO Package, or refer to other locations where implementers can find that information. Note that sometimes an INFO Package designer might choose to not reveal the details of an INFO Package. However, in order to allow multiple implementations to support the INFO Package, INFO Package designers are strongly encouraged to provide the implementation details.

16.8.10 Examples

It is recommended that the INFO Package specification provide demonstrative message flow diagrams, paired with complete messages and message descriptions. Note that example flows are by definition informative, and do not replace normative text.

16.8.10.1 Indication of Willingness to Receive INFO Packages

16.8.10.1.1 Initial INVITE Request

The UAC sends an initial INVITE request, where the UAC indicates that it is willing to receive INFO requests for INFO Packages P and R.

    INVITE sip:[email protected] SIP/2.0
    Via: SIP/2.0/TCP pc33.example.com;branch=z9hG4bK776
    Max-Forwards: 70
    To: Bob <sip:[email protected]>
    From: Alice <sip:[email protected]>;tag=1928301774
    Call-ID: [email protected]
    CSeq: 314159 INVITE
    Recv-Info: P, R
    Contact: <sip:[email protected]>
    Content-Type: application/sdp
    Content-Length: ...

    ...

The UAS sends a 200 OK response back to the UAC, where the UAS indicates that it is willing to receive INFO requests for INFO Packages R and T.

    SIP/2.0 200 OK
    Via: SIP/2.0/TCP pc33.example.com;branch=z9hG4bK776;received=192.0.2.1
    To: Bob <sip:[email protected]>;tag=a6c85cf
    From: Alice <sip:[email protected]>;tag=1928301774
    Call-ID: [email protected]
    CSeq: 314159 INVITE
    Contact: <sip:[email protected]>
    Recv-Info: R, T
    Content-Type: application/sdp
    Content-Length: ...

    ...

The UAC sends an ACK request.

    ACK sip:[email protected] SIP/2.0
    Via: SIP/2.0/TCP pc33.example.com;branch=z9hG4bK754
    Max-Forwards: 70
    To: Bob <sip:[email protected]>;tag=a6c85cf
    From: Alice <sip:[email protected]>;tag=1928301774
    Call-ID: [email protected]
    CSeq: 314159 ACK
    Content-Length: 0

16.8.10.1.2 Target Refresh

The UAC sends an UPDATE request within the invite dialog usage, where the UAC indicates (using an empty Recv-Info header field) that it is not willing to receive INFO requests for any INFO Packages.

    UPDATE sip:[email protected] SIP/2.0
    Via: SIP/2.0/TCP pc33.example.com;branch=z9hG4bK776
    Max-Forwards: 70
    To: Bob <sip:[email protected]>;tag=a6c85cf
    From: Alice <sip:[email protected]>;tag=1928301774
    Call-ID: [email protected]
    CSeq: 314163 UPDATE
    Recv-Info:
    Contact: <sip:[email protected]>
    Content-Type: application/sdp
    Content-Length: ...

    ...
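The Recv-Info bookkeeping that these messages illustrate can be sketched in a few lines of Python. This is a minimal, non-normative illustration: the names `parse_recv_info`, `response_code_for_info`, and `RecvInfoState` are hypothetical, header parsing is simplified rather than following the full grammar, and the rollback model is an assumption that treats a newly received Recv-Info value as pending until the request carrying it is accepted or rejected.

```python
def parse_recv_info(value):
    """Parse a Recv-Info header field value into a set of package names.

    An empty value yields an empty set (mechanism supported, no packages).
    Parameters after ';' are dropped; names stay case sensitive.
    """
    names = set()
    for item in value.split(","):
        name = item.split(";", 1)[0].strip()
        if name:
            names.add(name)
    return frozenset(names)


def response_code_for_info(info_package_value, local_recv_info):
    """Return 200 if the Info-Package name was advertised locally, else 469.

    Only the package name is compared, octet by octet; parameters are ignored.
    """
    name = info_package_value.split(";", 1)[0].strip()
    return 200 if name in local_recv_info else 469


class RecvInfoState:
    """Tracks the package set a remote UA advertised (simplified model)."""

    def __init__(self):
        self.active = frozenset()
        self._pending = None

    def offer(self, header_value):
        # A request or response carried a new Recv-Info value.
        self._pending = parse_recv_info(header_value)

    def accept(self):
        # The carrying request succeeded: the new set replaces the old one.
        if self._pending is not None:
            self.active = self._pending
            self._pending = None

    def reject(self):
        # The carrying request was rejected: roll back to the previous set.
        self._pending = None

    def may_send_info(self, package):
        return package in self.active
```

With the values from the examples, `parse_recv_info("R, T")` gives the set {R, T}, and an accepted empty Recv-Info value leaves `may_send_info` false for every package.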
The UAS sends a 200 OK response back to the UAC, where the UAS indicates that it is willing to receive INFO requests for INFO Packages R and T.

    SIP/2.0 200 OK
    Via: SIP/2.0/TCP pc33.example.com;branch=z9hG4bK893;received=192.0.2.1
    To: Bob <sip:[email protected]>;tag=a6c85cf
    From: Alice <sip:[email protected]>;tag=1928301774
    Call-ID: [email protected]
    CSeq: 314163 UPDATE
    Contact: <sip:[email protected]>
    Recv-Info: R, T
    Content-Type: application/sdp
    Content-Length: ...

    ...

16.8.10.3 Multipart INFO with INFO Package

16.8.10.3.1 Non-Info Package Body Part

SIP extensions can sometimes add body part payloads into an INFO request, independent of the INFO Package. In this case, the INFO Package payload gets put into a multipart MIME body, with a Content-Disposition header field that indicates which body part is associated with the INFO Package.

    INFO sip:[email protected] SIP/2.0
    Via: SIP/2.0/UDP 192.0.2.2:5060;branch=z9hG4bKnabcdef
    To: Alice <sip:[email protected]>;tag=1234567
    From: Bob <sip:[email protected]>;tag=abcdefg
    Call-Id: [email protected]
    CSeq: 314400 INFO
    Info-Package: foo
    Content-Type: multipart/mixed;boundary="theboundary"
    Content-Length: ...

    --theboundary
    Content-Type: application/mumble
    ...

    <mumble stuff>

    --theboundary
    Content-Type: application/foo-x
    Content-Disposition: Info-Package
    Content-length: 59

    I am a foo-x message type, and I belong to Info Package foo
    --theboundary--

    --theboundary
    Content-Type: application/foo-x
    Content-length: 59

    I am a foo-x message type, and I belong to Info Package foo
    <mumble stuff>
    --theboundary
    Content-Type: application/foo-y
    Content-length: 59

    I am a foo-y message type, and I belong to Info Package foo
    --theboundary--
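A multipart body of the kind shown in these examples can be assembled mechanically. The following Python sketch is only an illustration under stated assumptions: the helper names `build_multipart_info_body` and `info_package_parts` are hypothetical, the body is built by plain string concatenation rather than with a SIP or MIME library, and per-part Content-Length headers are omitted for brevity.

```python
CRLF = "\r\n"


def build_multipart_info_body(parts, boundary="theboundary"):
    """Build a multipart/mixed body from (content_type, disposition, payload).

    `disposition` may be None for body parts that carry no
    Content-Disposition header field.
    """
    chunks = []
    for content_type, disposition, payload in parts:
        headers = ["Content-Type: " + content_type]
        if disposition is not None:
            headers.append("Content-Disposition: " + disposition)
        chunks.append(
            "--" + boundary + CRLF
            + CRLF.join(headers) + CRLF
            + CRLF              # blank line separates part headers and payload
            + payload + CRLF
        )
    chunks.append("--" + boundary + "--" + CRLF)  # closing delimiter
    return "".join(chunks)


def info_package_parts(parts):
    """Select the body parts marked as belonging to the INFO Package."""
    return [p for p in parts if p[1] == "Info-Package"]


parts = [
    ("application/mumble", None, "<mumble stuff>"),
    ("application/foo-x", "Info-Package",
     "I am a foo-x message type, and I belong to Info Package foo"),
]
body = build_multipart_info_body(parts)
```

Here only the second part carries Content-Disposition: Info-Package, so a receiver using `info_package_parts` would hand just that part to the application bound to INFO Package foo.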
16.8.10.3.3 Single Body Part inside Multipart Body Part

The body part payload associated with the INFO Package can have a Content-Disposition header field value other than Info-Package. In this case, the body part is put into a multipart MIME body, with a Content-Disposition header field that indicates which body part is associated with the INFO Package.

    INFO sip:[email protected] SIP/2.0
    Via: SIP/2.0/UDP 192.0.2.2:5060;branch=z9hG4bKnabcdef
    To: Alice <sip:[email protected]>;tag=1234567
    From: Bob <sip:[email protected]>;tag=abcdefg
    Call-Id: [email protected]
    CSeq: 314423 INFO
    Info-Package: foo
    Content-Type: multipart/mixed;boundary="theboundary"
    Content-Disposition: Info-Package
    Content-Length: ...

    --theboundary
    Content-Type: application/foo-x
    Content-Disposition: icon
    Content-length: 59

    I am a foo-x message type, and I belong to Info Package foo
    --theboundary--

16.9 SIP Call Control UUI Transfer Services

16.9.1 Overview

RFC 6567 that is described here defines SIP UUI data as application-specific information that is related to a session being established using SIP. It is assumed that the application is running in both end points in a two-party session. That is, the application interacts with both the UAs in a SIP session. To function properly, the application needs a small piece of information, the UUI, to be transported at the time of session establishment. This information is essentially opaque data to SIP: it is unrelated to SIP routing, authentication, or any other SIP function. This application can be considered to be operating at a higher layer on the protocol stack. As a result, SIP should not interpret, understand, or perform any operations on the UUI. Should this not be the case, then the information being transported is not considered UUI, and another SIP-specific mechanism will be needed to transport the information (such as a new header field). In particular, this mechanism creates no requirements on intermediaries such as proxies, B2BUAs, and SBCs.

UUI is defined this way for two reasons. First, this definition supports a strict layering of protocols and data. Providing information and understanding of the UUI to the transport layer (SIP in this case) would not provide any benefits and instead could create cross-layer coupling. Second, it is neither feasible nor desirable for a SIP UA to understand the information; instead, the goal is for the UA to simply pass the information as efficiently as possible to the application that does understand the information. An important application is the interworking with UUI in ISDN, specifically the transport of the call control-related ITU-T Q.931 UUI Element (UUIE) [3] and ITU-T Q.763 UUI Parameter [2] data in SIP. ISDN UUI is widely used in the PSTN today in contact centers and call centers. These applications are currently transitioning away from using ISDN for session establishment to using SIP. Native SIP end points will need to implement a similar service and be able to interwork with this ISDN service.

Note that the distinction between call control UUI and non-call control UUI is very important. SIP already has a mechanism for sending arbitrary UUI data between UAs during a session or dialog: the SIP INFO (RFC 6086, see Section 16.8) method. Call control UUI, in contrast, must be exchanged at the time of setup and needs to be carried in the INVITE and a few other methods and responses. Applications that exchange UUI but do not have a requirement that it be transported and processed during call setup can simply use SIP INFO and do not need a new SIP extension. In this document, four different use-case call flows are discussed. Next, the requirements for call control UUI transport are discussed.

16.9.2 Requirements for UUI Transport

RFC 6567 states the requirements that a call control UUI transport solution needs to meet, as follows:

◾◾ REQ-1: The mechanism will allow UAs to insert and receive UUI data in SIP call setup requests and responses. SIP messages covered by this include INVITE requests and end-to-end responses to the INVITE, that is, 18x and 200 responses. UUI data may also be inserted in 3xx responses to an INVITE. However, if a 3xx response is recursed on by an intermediary proxy, the resulting INVITE will not contain the UUI data from the 3xx response. In a scenario where a proxy forks an INVITE to multiple UASs that include UUI data in 3xx responses, if a 3xx response is the best response sent upstream by the proxy, it will contain the UUI data from only one 3xx response.
◾◾ REQ-2: The mechanism will allow UAs to insert and receive UUI data in SIP dialog terminating requests and responses.
◾◾ Q.931 UUI supports inclusion in release and release non-ISDN services to allow UUI to be larger than 128
completion messages. SIP messages covered by this octets. However, users of the mechanism will need be
include BYE and 200 OK responses to a BYE. cognizant of the size of SIP messages and the ability of
◾◾ REQ-3: The mechanism will allow UUI to be inserted parsers to handle extremely large values.
and retrieved in SIP redirects and referrals. SIP mes- ◾◾ REQ-12: The recipient of UUI will be able to deter-
sages covered by this include REFER requests and 3xx mine the entity that inserted the UUI. It is acceptable
responses to INVITE requests. that this is performed implicitly where it is known that
◾◾ REQ-4: The mechanism will allow UUI to be able to there is only one other end UA involved in the dia-
survive proxy retargeting or redirection of the request. log. Where that does not exist, some other mechanism
Retargeting is a common method of call routing in SIP will need to be provided. The UUI mechanism does
and must not result in the loss of UUI. not introduce stronger authorization requirements for
◾◾ REQ-5: The mechanism should not require processing SIP; instead, the mechanism needs to be able to utilize
entities to dereference a URL in order to retrieve the existing SIP approaches for request and response iden-
UUI data. Passing a pointer or link to the UUI data tity. This requirement comes into play during redirec-
will not meet the RT processing considerations and tion, retargeting, and referral scenarios.
would complicate interworking with the PSTN. ◾◾ REQ-13: The mechanism will allow integrity protec-
◾◾ REQ-6: The mechanism will support interworking tion of the UUI. This allows the UAS to be able to
with call-control related DSS1 information elements know that the UUI has not been modified or tam-
or QSIG information elements and ISUP parameters. pered with by intermediaries. Note that there are
◾◾ REQ-7: The mechanism will allow a UAC to learn that trade-offs between this requirement and requirement
a UAS understands the UUI mechanism. REQ-9 for proxies and border elements to remove
◾◾ REQ-8: The mechanism will allow a UAC to require UUI. One possible way to satisfy both of these
that a UAS understands the call control UUI mech- requirements is to utilize hop-by-hop protection.
anism and have a request routed on the basis of this This property is not guaranteed by the protocol in
information. If the request cannot be routed to a UAS the ISDN application.
that understands the UUI mechanism, the request will ◾◾ REQ-14: The mechanism will allow end-to-end pri-
fail. This could be useful in ensuring that a request des- vacy of the UUI. Some UUI may contain private or
tined for the PSTN is routed to a gateway that supports sensitive information and may require different security
the UUI mechanism rather than an otherwise equiva- handling from the rest of the SIP message. Note that
lent PSTN gateway that does not support the ISDN this property is not available in the ISDN application.
mechanism. Note that support of the UUI mechanism ◾◾ REQ-15: The mechanism will allow both end-to-end
does not, by itself, imply that a particular application is and hop-by-hop security models. The hop-by-hop
supported (see REQ-10). model is required by the ISDN UUI service.
◾◾ REQ-9: The mechanism will allow proxies to remove
a particular application usage of UUI data from a 16.9.3 Possible Approaches
request or response. This is a common security func-
tion provided by border elements to header fields such
for UUI Transport in SIP
as Alert-Info or Call-Info URIs. There is no require- Two other possible mechanisms for transporting UUI data
ment for UAs to be able to determine if a particular will be described: MIME body and URI parameter transport.
usage of UUI data has been removed from a request
or response.
16.9.3.1 Why INFO Is Not Used
◾◾ REQ-10: The mechanism will provide the ability for
a UA to discover which application usages of UUI Since the INFO method (RFC 6086, see Section 16.8) was
another UA understands or supports. The creation of a developed for ISUP interworking of UUI, it might seem to
registry of application usages for the UUI mechanism is be the logical choice here. For non-call control UUI, INFO
implied by this requirement. The ISDN service utilizes can be utilized for end-to-end transport. However, for trans-
a field known as the protocol discriminator, which is port of call control UUI, INFO cannot be used. As the call
the first octet of the ISDN UUI data, for this purpose. flows in RFC 6567 show, the information is related to an
◾◾ REQ-11: The UUI is a sequence of octets. The solution attempt to establish a session and needs to be passed with the
will provide a mechanism of transporting at least 128 session setup request (INVITE), responses to that INVITE,
octets of user data and a one-octet protocol discrimina- or session termination requests. As a result, it is not possible
tor, that is, 129 octets in total. There is the potential for to use INFO in these cases.
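The message coverage named in REQ-1 to REQ-3 can be summarized in a few lines of Python. This is only an illustrative sketch: the function name may_carry_call_control_uui is hypothetical, and the status ranges simply restate the requirement text (18x, 200, and 3xx for INVITE; 200 for BYE). INFO does not appear among the covered messages, which is consistent with the argument above.

```python
from typing import Optional


def may_carry_call_control_uui(method: str, status: Optional[int] = None) -> bool:
    """Return True if REQ-1 to REQ-3 cover this request or response.

    method is the SIP method of the request (or of the request being
    answered); status is None for a request, or the response status code.
    Hypothetical helper written only to restate the requirement text.
    """
    method = method.upper()
    if status is None:
        # Requests: call setup (REQ-1), dialog termination (REQ-2), referral (REQ-3).
        return method in {"INVITE", "BYE", "REFER"}
    if method == "INVITE":
        # End-to-end responses (18x, 200) and redirects (3xx).
        return 180 <= status <= 189 or status == 200 or 300 <= status <= 399
    if method == "BYE":
        return status == 200
    return False


if __name__ == "__main__":
    for args in [("INVITE",), ("INVITE", 183), ("INVITE", 100), ("BYE", 200), ("INFO",)]:
        print(args, may_carry_call_control_uui(*args))
```

The 100 Trying response is excluded because it is hop-by-hop rather than end-to-end.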
16.9.3.2 Why Other Protocol Encapsulations Are Not Used
Other protocols have the ability to transport UUI data. For example, consider the ITU-T Recommendation Q.931 User–user information element [3] and the ITU-T Recommendation Q.763 UUI parameter [2]. In addition, the Narrowband Signaling System (NSS) [1] is also able to transport UUI data. Should one of these protocols be in use, and present in both UAs, then utilizing these other protocols to transport UUI data might be a logical solution. Essentially, this is just adding an additional layer in the protocol stack. In these cases, SIP is not transporting the UUI data; it is encapsulating another protocol, and that protocol is transporting the UUI data. Once a mechanism to transport that other protocol using SIP exists, the UUI data transport function is essentially obtained without any additional effort or work.
16.9.3.3 Why MIME Is Not Used
One method of transport is to use a MIME body. This is in keeping with the SIP-T architecture (RFC 3372, see Section 2.8.2) in which MIME bodies are used to transport ISUP information. Since the INVITE will normally have an SDP message body, the resulting INVITE with SDP and UUI data will be multipart MIME. This is not ideal as many SIP UAs do not support multipart MIME INVITEs. A bigger problem is the insertion of a UUI message body by a redirect server or in a REFER. The body would need to be encoded in the Contact URI of the 3xx response or the Refer-To URI of a REFER. Currently, the authors are not aware of any UAs that support this capability for any body type. As such, the complete set of semantics for this operation would need to be determined and defined. Some issues will need to be resolved, such as, do all the Content-* header fields have to be included as well? What if the included Content-Length does not agree with the included body? Since proxies cannot remove a body from a request or response, it is not clear how this mechanism could meet REQ-9.
The requirement for integrity protection could be met by the use of an S/MIME signature over the body, as defined in Securing MIME bodies, Section 23.3 of RFC 3261 (see Section 2.8.2). Alternatively, this could be achieved using RFC 4474 (see Sections 2.8 and 19.4.8). The requirement for end-to-end privacy could be met using S/MIME encryption or using encryption at the application layer. However, note that neither S/MIME nor RFC 4474 enjoys deployment in SIP today. An example:
<allOneLine>
Contact: <sip:+12125551212@gateway.
example.com?Content-Type=
application/uui&body=ZeGl9i2icVqaNVail
T6F5iJ90m6mvuTS4OK05M0vDk0Q4Xs>
</allOneLine>
As such, the MIME body approach meets REQ-1, REQ-2, REQ-4, REQ-5, REQ-7, REQ-11, REQ-13, and REQ-14. Meeting REQ-12 seems possible, although the authors do not have a specific mechanism to propose. Meeting REQ-3 is problematic but not impossible for this mechanism. However, this mechanism does not seem to be able to meet REQ-9.
16.9.3.4 Why URI Parameter Is Not Used
Another proposed approach is to encode the UUI data as a URI parameter. This UUI parameter could be included in a Request-URI or in the Contact URI or Refer-To URI. It is not clear how it could be transported in a response that does not have a Request-URI, or in BYE requests or responses.
<allOneLine>
Contact: <sip:+12125551212@gateway.
example.com;uui=ZeGl9i2icVqaNVailT
6F5iJ90m6mvuTS4OK05M0vDk0Q4Xs>
</allOneLine>
An INVITE sent to this Contact URI would contain UUI data in the Request-URI of the INVITE. The URI parameter has a drawback in that a URI parameter carried in a Request-URI will not survive retargeting by a proxy as shown in RFC 6567 described later. That is, if the URI is included with an AOR instead of a Contact URI, the URI parameter in the Request-URI will not be copied over to the Contact URI, resulting in the loss of the information. Note that if this same URI were present in a Refer-To header field, the same loss of information would occur. The URI parameter approach would meet REQ-3, REQ-5, REQ-7, REQ-9, and REQ-11. It is possible the approach could meet REQ-12 and REQ-13. The mechanism does not appear to meet REQ-1, REQ-2, REQ-4, and REQ-14.
16.9.3.5 Why SIP Extensions for UUI Transport Are Used
This section describes how the User-to-User header field meets the requirements in RFC 6567 described earlier. The header field can be included in INVITE requests and responses and BYE requests and responses, meeting REQ-1 and REQ-2. For redirection and referral use cases and REQ-3, the header field is included (escaped) within the Contact or Refer-To URI. Since SIP proxy forwarding and retargeting does not affect header fields, the header field meets REQ-4. The UUI header field will carry the UUI data and not a pointer to the data, so REQ-5 is met. Since the basic design of the UUI header field is similar to the ISDN UUI service, interworking with PSTN protocols is straightforward and is documented in a separate specification in RFC 7434 described in the next section, meeting REQ-6. Requirements REQ-7,
REQ-8, and REQ-10 relate to discovery of the mechanism and supported packages, and hence applications. REQ-7 relates to support of the UUI header field, while REQ-8 relates to routing based on support of the UUI header field. REQ-7 is met by defining a new SIP option tag uui. The use of a Require:uui in a request or Supported:uui in a SIP OPTIONS response could be used to require or discover support of the mechanism. The presence of a Supported:uui or Require:uui header field can be used by proxies to route to an appropriate UA, meeting REQ-8. However, note that only UAs are expected to understand the UUI data; proxies and other intermediaries are not. REQ-10 is met by utilizing SIP feature tags (RFC 3840, see Section 2.11). For example, the feature tag sip.uui-isdn could be used to indicate support of the ISDN UUI package, or sip.uui-pk1 could be used to indicate support for a particular package, pk1. Proxies commonly apply policy to the presence of certain SIP header fields in requests by either passing them or removing them from requests. REQ-9 is met by allowing proxies and other intermediaries to remove UUI header fields in a request or response based on policy. Carrying UUI data elements of at least 129 octets is trivial in the UUI header field, meeting REQ-11. Note that avoiding having very large UUI data elements is a good idea, as SIP header fields have traditionally not been large.
To meet REQ-12 for the redirection and referral use cases, the History-Info header field (RFC 7044, see Section 2.8) can be used. In these retargeting cases, the changed Request-URI will be recorded in the History-Info header field along with the identity of the element that performed the retargeting. The requirement for integrity protection in REQ-13 could be met by the use of an S/MIME signature over a subset of header fields, as defined in “SIP Header Privacy and Integrity using S/MIME: Tunneling SIP” of RFC 3261 (see Section 19.6). Note that the lack of deployment of S/MIME with SIP means that, in general, REQ-13 is not met. The requirement of REQ-14 for end-to-end privacy could be met using S/MIME or using encryption at the application layer. Note that the use of S/MIME to secure the UUI data will result in an additional body being added to the request. Hop-wise TLS (RFC 5246) allows the header field to meet REQ-15 for hop-by-hop security.
16.9.4 SIP Extensions for UUI Transport
RFC 7433 that is described next specifies the transport of UUI data using SIP. It defines a mechanism for the transport of general-application UUI data and for the transport of the call control-related ITU-T Recommendation Q.931 user–user information element [3] and ITU-T Recommendation Q.763 UUI parameter [2] data in SIP. UUI data is widely used in the PSTN today for contact centers and call centers. There is also a trend for the related applications to transition from ISDN to SIP. The UUI extension for SIP may also be used for native SIP UAs implementing similar services and to interwork with ISDN services. Note that, in most cases, there is an a priori understanding between the UAs in regard to what to do with received UUI data. This document enables the definition of packages and related attributes that can make such understandings more explicit. The UUI mechanism is designed to meet the use cases, requirements, and call flows for SIP call control UUI detailed in RFC 6567 described later.
The mechanism is a new SIP header field, along with a new SIP option tag. The header field carries the UUI data, along with parameters indicating the encoding of the UUI data, the UUI package, and optionally the content of the UUI data. The package definition contains details about how a particular application can utilize the UUI mechanism. The header field can be included (sometimes called escaped) into URIs supporting referral and redirection scenarios. In these scenarios, the History-Info header field is used to indicate the inserter of the UUI data. The SIP option tag can be used to indicate support for the header field. Support for the UUI header field indicates that a UA is able to extract the information in the UUI data and pass it up the protocol stack. Individual packages using the UUI mechanism can utilize SIP media feature tags to indicate that a UA supports a particular UUI package. Guidelines for defining UUI packages are provided.
16.9.5 Normative Definition
RFC 7433 defines a new SIP header field User-to-User to transport call control UUI data to meet the requirements in RFC 6567 described earlier. To help tag and identify the UUI data used with this header field, purpose, content, and encoding header field parameters are defined. The purpose header field parameter identifies the package that defines the generation and usage of the UUI data for a particular application. The value of the purpose parameter is the package name, as registered in the UUI Packages subregistry with IANA. For the case of interworking with the ISDN UUI service, the ISDN UUI service interworking package is used. The default value for the purpose header field is isdn-uui, as defined here per RFC 7434. If the purpose header field parameter is not present, the ISDN UUI must be used. The content header field parameter identifies the actual content of the UUI data. If not present, the default content defined for the package must be used.
Newly defined UUI packages must define or reference at least a default content value. The encoding header field parameter indicates the method of encoding the information in the UUI data associated with a particular content value. This specification only defines encoding=hex. If the encoding header field parameter is not present, the default
encoding defined for the package must be used. UUI data is considered an opaque series of octets. This mechanism must not be used to convey a URL or URI, since the Call-Info header field in RFC 3261 (see Section 2.8) already supports this use case.
16.9.5.1 Syntax for UUI Header Field
The UUI header field can be present in INVITE requests and responses and in BYE requests and responses. Note that when the UUI header is used in responses, it can only be utilized in end-to-end responses, for example, 1xx (excluding 100 Trying), 2xx, and 3xx responses. The following syntax specification uses the ABNF as described in RFC 5234 and extends RFC 3261 (where token, quoted-string, and generic-param are defined). It is repeated here for convenience, although all SIP header syntaxes are provided in Section 2.4.1:
UUI = "User-to-User" HCOLON uui-value *(COMMA uui-value)
uui-value = uui-data *(SEMI uui-param)
uui-data = token / quoted-string
uui-param = pkg-param / cont-param / enc-param / generic-param
pkg-param = "purpose" EQUAL pkg-param-value
pkg-param-value = token
cont-param = "content" EQUAL cont-param-value
cont-param-value = token
enc-param = "encoding" EQUAL enc-param-value
enc-param-value = token / "hex"
Each package defines how many User-to-User header fields of each package may be present in a request or a response. A sender may include multiple User-to-User header fields, and a receiver must be prepared to receive multiple User-to-User header fields. Consistent with the rules of SIP syntax, the syntax defined in this document allows any combination of individual User-to-User header fields or User-to-User header fields with multiple comma separated UUI data elements. Any size limitations on the UUI data for a particular purpose are to be defined by the related UUI package. UAs shall ignore UUI data from packages or encoding that they do not understand. For redirection use cases, the header field is included (escaped) within the Contact URI. For referral use cases, the header field is included (escaped) within the Refer-To URI. For example, if a UA supports this specification, it should include any UUI data included in a redirection URI (if the UUI data and encoding is understood).
Note that redirection can occur multiple times to a request. Currently, UAs that support attended transfer support the ability to include a Replaces header field (RFC 3891, see Section 2.8.2) into a Refer-To URI, and when acting upon this URI, UAs add the Replaces header field to the triggered INVITE. This sort of logic and behavior is utilized for the UUI header field (i.e., the UUI header field is included in the triggered INVITE). The UA processing the REFER or the 3xx response to the INVITE should support the UUI mechanism. If the REFER or redirect target does not support UUI, the UUI header will be discarded as per RFC 3261. However, this may limit the utility of use cases that depend on the UUI being supported by all elements. Here is an example of an included User-to-User header field from the redirection response F2 of Figure 16.43 in RFC 6567:
<allOneLine>
Contact: <sip:+12125551212@gateway.
example.com?User-to-User=
56a390f3d2b7310023a2%3Bencoding%3Dhex%
3Bpurpose%3Dfoo%3B
content%3Dbar>
</allOneLine>
The resulting INVITE (F4) would contain
User-to-User: 56a390f3d2b7310023a2;encoding=hex;purpose=foo;content=bar
16.9.5.2 Hex Encoding Definition
This specification defines hex encoding of UUI data. When the value of hex is used in the encoding parameter of a header field, the data is encoded using base16 encoding according to Section 8 of RFC 4648. The hex-encoded value is normally represented using the token construction from RFC 3261, although the quoted-string construction is permitted, in which case the quotes must be ignored. If a canonicalized version of a normally case-insensitive hex encoded UUI data object is needed for a digital signature or integrity checking, then the base16 encoding with all upper case must be used.
16.9.5.3 Source Identity of UUI Data
It is important for the recipient of UUI data to know the identity of the UA that inserted the UUI data. In a request without a History-Info header field, the identity of the entity that inserted the UUI data will be assumed to be the source of the SIP message. For a SIP request, typically this is the UA identified by the URI in the From header field or a P-Asserted-Identity (RFC 3325, see Sections 2.8, 10.4, and 20.3) header field. In a request with a History-Info header field, the recipient needs to parse the Targeted-to-URIs present (hi-targeted-to-uri defined in RFC 7044, see Section 2.8) to see if any included User-to-User header fields are present. If an included User-to-User header field is present and matches the UUI data in the request, this indicates that redirection has taken place, resulting in the inclusion of UUI data in the request.
The inserter of the UUI data will be the UA identified by the Targeted-to-URI of the History-Info element before
the element with the included UUI data. In a response, the inserter of the UUI data will be the identity of the UA that generated the response. Typically, this is the UA identified in the To header field of the response. Note that any updates to this identity by use of the SIP connected identity extension (RFC 4916, see Section 20.4) or other identity modifiers will update this information. For an example of History-Info and redirection, consider Figure 16.43 from RFC 6567 where the Originating UA is Carol, the Redirector Bob, and the Terminating UA Alice. The INVITE F4 containing UUI data could be
INVITE sips:[email protected] SIP/2.0
Via: SIP/2.0/TLS lab.example.com:5061
;branch=z9hG4bKnashds9
To: Bob <sips:[email protected]>
From: Carol <sips:[email protected]>
;tag=323sf33k2
Call-ID: dfaosidfoiwe83ifkdf
Max-Forwards: 70
Contact: <sips:[email protected]>
Supported: histinfo
User-to-User: 342342ef34;encoding=hex
History-Info: <sips:[email protected]>;index=1
<allOneLine>
History-Info: <sips:alice@example.
com?Reason=SIP%3Bcause%3D302
&User-to-User=342342ef34%3Bencoding%3D
hex>;index=1.1;rc=1
</allOneLine>
Without the redirection captured in the History-Info header field, Alice would conclude that the UUI data was inserted by Carol. However, the History-Info containing UUI data (index=1.1) indicates that the inserter was Bob (index=1). To enable maintaining a record of the inserter identity of UUI data, UAs supporting this mechanism should support History-Info (RFC 7044, see Section 2.8) and include Supported: histinfo in all requests and responses. If a border element such as a proxy or a B2BUA removes a History-Info header field containing a User-to-User parameter, the UA consuming the UUI data may not be able at the SIP level to identify the source of the UUI data.
◾◾ The information is generated and consumed by an application during session setup using SIP, but the application is not necessarily SIP aware.
◾◾ The behavior of SIP entities that support it is not significantly changed (as discussed in Section 4 of RFC 5727).
◾◾ UAs are the generators and consumers of the UUI data. Proxies and other intermediaries may route on the basis of the presence of a User-to-User header field or a particular package tag but do not otherwise consume or generate the UUI data.
◾◾ There are no privacy issues associated with the information being transported (e.g., geolocation or emergency-related information are examples of inappropriate UUI data).
◾◾ The UUI data is not being utilized for User-to-User Remote Procedure Calls (RPCs).
UUI packages define the semantics for a particular application usage of UUI data. The content defines the syntax of the UUI data, while the encoding defines the encoding of the UUI data for the content. Each content is defined as a stream of octets, which allows multiple encodings of that content. For example, packages may define
◾◾ The SIP methods and responses in which the UUI data may be present.
◾◾ The maximum number of UUI data elements that may be inserted into a request or response. The default is one per encoding. Note that a UA may still receive a request with more than this maximum number due to redirection. The package needs to define how to handle this situation.
◾◾ The default values for content and encoding if they are not present. If the same UUI data may be inserted multiple times with different encodings, the package needs to state this. A package may support and define multiple contents and their associated encodings and reuse contents defined by other packages.
◾◾ Any size limitations on the UUI data. Size needs to be specified in terms of the octet stream output of the content, since the size of the resulting uui-data element will vary depending on the encoding scheme.
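To make the interplay between a package definition and the User-to-User syntax of Section 16.9.5.1 more concrete, the following Python sketch parses a header field value and applies package defaults. It is a simplified illustration, not the parser of any real SIP stack: the names UuiPackage, split_unquoted, and parse_user_to_user are hypothetical, the default content value and the 129-octet limit used for isdn-uui are assumptions chosen for the example, and escaped quotes inside a quoted-string are not handled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class UuiPackage:
    """Hypothetical record of what a UUI package definition specifies."""
    name: str                         # value carried in the purpose parameter
    default_content: str              # used when the content parameter is absent
    default_encoding: str = "hex"     # used when the encoding parameter is absent
    max_octets: Optional[int] = None  # size limit on the decoded octet stream


def split_unquoted(text: str, sep: str) -> List[str]:
    """Split on sep, ignoring separators that appear inside double quotes."""
    parts, buf, quoted = [], [], False
    for ch in text:
        if ch == '"':
            quoted = not quoted
        if ch == sep and not quoted:
            parts.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
    parts.append("".join(buf))
    return [p.strip() for p in parts]


def parse_user_to_user(value: str,
                       packages: Dict[str, UuiPackage]) -> List[Tuple[str, str, bytes]]:
    """Return (purpose, content, octets) for every element that is understood."""
    understood = []
    for uui_value in split_unquoted(value, ","):
        uui_data, *raw_params = split_unquoted(uui_value, ";")
        if not uui_data:
            continue
        params = {}
        for raw in raw_params:
            name, _, val = raw.partition("=")
            params[name.strip().lower()] = val.strip()
        purpose = params.get("purpose", "isdn-uui")   # default purpose
        package = packages.get(purpose)
        if package is None:
            continue                                  # unknown package: ignore
        encoding = params.get("encoding", package.default_encoding)
        if encoding.lower() != "hex":
            continue                                  # unknown encoding: ignore
        content = params.get("content", package.default_content)
        try:
            octets = bytes.fromhex(uui_data.strip('"'))  # quotes are ignored
        except ValueError:
            continue                                  # not valid base16
        if package.max_octets is not None and len(octets) > package.max_octets:
            continue                                  # exceeds the package limit
        understood.append((purpose, content, octets))
    return understood


if __name__ == "__main__":
    known = {"isdn-uui": UuiPackage("isdn-uui", default_content="isdn-uui", max_octets=129)}
    for purpose, content, octets in parse_user_to_user("342342ef34;encoding=hex", known):
        # Upper-case base16 is the canonical form for signatures.
        print(purpose, content, octets.hex().upper())
```

Elements whose package or encoding is not recognized are skipped silently, mirroring the rule that UAs ignore UUI data they do not understand. The size check is applied to the decoded octets rather than to the hex text, as the size guideline above suggests.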
16.9.6.1 Extensibility
New content values must describe the semantics of the UUI data and valid encodings, and give some example use cases. A previously defined UUI content value can be used in a new package. In this case, the semantics and usage of the content by the new package is defined within the new package. New UUI content types cannot be added to existing packages; instead, a new package would need to be defined. New content values that are defined are added to the IANA registry with a Standards Track RFC, which needs to discuss the issues in this section. If no new encoding value is defined for a content, the encoding defaults to hex as defined in this document. In this case, the hex value will be explicitly stated via the encoding parameter as the encoding for the content.
New encoding values associated with a new content must reference a specific encoding scheme (such as hex, which is defined in this specification) or define the new encoding scheme. A previously defined UUI encoding value can be used with a newly defined content. In this case, the usage of the encoding is defined by the content definition. New UUI encodings cannot be added to existing contents; instead, a new content would need to be defined. Newly defined encoding values are added to the IANA registry with a Standards Track RFC, which needs to discuss the issues in this section.
16.9.7 Use Cases
We now discuss four use cases (UA-to-UA, Retargeting, Redirection, and Referral) for the transport of call control UUI specified in RFC 6567 (see Section 16.9.1). These use cases will help motivate the requirements for SIP call control UUI.
16.9.7.1 User Agent to User Agent
In this scenario, the originating UA includes UUI in the INVITE sent through a proxy to the terminating UA. The terminating UA can use the UUI in any way. If it is an ISDN gateway, it could map the UUI into the appropriate DSS1 [4] information element, QSIG [5] information element, or ISUP parameter. Alternatively, the using application might render the information to the user, or use it during alerting or as a lookup for a screen pop. In this case, the proxy does not need to understand the UUI mechanism; however, normal proxy rules should result in the UUI being forwarded without modification. This call flow is shown in Figure 16.42 (RFC 6567).
Figure 16.42 Call flow with UUI exchanged between originating and terminating UAs. (Copyright IETF. Reproduced with permission.)
16.9.7.2 Proxy Retargeting
In this scenario, the originating UA includes UUI in the INVITE request sent through a proxy to the terminating UA. The proxy retargets the INVITE request, changing its Request-URI to a URI that addresses the terminating UA. The UUI data is then received and processed by the terminating UA. This call flow is identical to Figure 16.42 except that the proxy retargets the request, that is, changes the Request-URI as directed by some unspecified process. The UUI in the INVITE request needs to be passed unchanged through this proxy retargeting operation. Note that the contents of the UUI are not used by the proxy for routing, as the UUI has only end-to-end significance between UAs.
16.9.7.3 Redirection
In this scenario, UUI is inserted by an application that utilizes a SIP Redirect Server. The UUI is then included in the INVITE request sent by the originating UA to the terminating UA. In this case, the originating UA does not necessarily need to support the UUI mechanism but does need to support the SIP redirection mechanism used to include the UUI data. Two examples of UUI with redirection (transfer and diversion) are defined in Refs. [6] and [7]. Note that this case may not precisely map to an equivalent ISDN service use case. This is because there is no one-to-one mapping between elements in a SIP network and elements in an ISDN network. Also, there is no exact one-to-one mapping between SIP call control and ISDN call control. However,
16.9.7.4 Referral
In this scenario, the application uses a UA to initiate a referral, which causes an INVITE request to be generated between the originating UA and terminating UA with UUI data inserted by the referrer UA. Note that this REFER method could be part of a transfer operation, or it might be unrelated to an existing call, such as out-of-dialog REFER request. In some cases, this call flow is used in place of the redirection call flow: the referrer immediately answers the call and then sends the REFER request. This scenario is shown in Figure 16.44 (RFC 6567).
Figure 16.45 DTMF keypad system. (Recoverable labels: keypad rows 7 8 9 C at 852 Hz and * 0 # D at 941 Hz; column (high group) frequencies 1209, 1336, 1477, and 1633 Hz.)
624 ◾ Handbook on Session Initiation Protocol
Also, the frequencies for DTMF are so chosen that none have Companion documents add event codes to this registry relat-
a harmonic relationship with the others and that mixing the ing to modem, fax, text telephony, and channel-associated
frequencies would not produce sum or product frequencies signaling events. The remainder of the event codes defined
that could mimic another valid tone. The high-group fre- in RFC 2833 are conditionally reserved in case other docu-
quencies (the column tones) are slightly louder than the low group to compensate for the high-frequency roll-off of voice audio systems. DTMF tones are able to represent 1 of 16 different states or symbols on the keypad. This is equivalent to 4 bits of data, also known as a nibble. DTMF has clearly been extended to purposes beyond simply dialing a telephone number. Interactive voice systems prompt us for all sorts of things that we answer with button presses. We log into our voice-mail systems and retrieve our messages with DTMF. If so inclined, even music can be played using DTMF.
Depending on the origin of the DTMF signals, they can start out in a separate stream, or that separate stream might be created by stripping the tones out of an audio conversation. An example of the latter would be a gateway that converts analog to SIP. Problems can arise from this stripping that need to be considered. The converter must hear the tone before stripping it out, and sometimes there is leakage where the very beginning of the tone makes its way through. This might cause a voice-mail system to hear two tones for a single tone. One would come from the RFC 2833 stream and the other from the voice stream. Fortunately, conversion hardware is getting better and better, and these problems have become less common (albeit a bear to debug when they occur). Thus, in terms of SIP, how is this RFC 2833 stream created and managed? Through SDP, of course. SDP is used to describe the voice stream (e.g., G.729) and it is also used to inform the recipient that RFC 2833 is available. Specifically, it uses something called telephone-event. Here is an example of an SDP media description that you might see in the body of an INVITE message. Note the format of 0–15. This represents the 10 digits plus *, #, A, B, C, and D.
ments revive their use.
RFC 4733 provides a number of clarifications to RFC 2833. However, it specifically differs from RFC 2833 by removing the requirement that all compliant implementations support the DTMF events. Instead, compliant implementations taking part in out-of-band negotiations of media stream content indicate what events they support. RFC 4733 adds three new procedures to the RFC 2833 framework: subdivision of long events into segments, reporting of multiple events in a single packet, and the concept and reporting of state events. RFC 4734 updates RFC 4733 to add event codes for modem, fax, and text telephony signals when carried in the telephony event RTP payload. It supersedes the assignment of event codes for this purpose in RFC 2833, and therefore obsoletes that part of RFC 2833. Furthermore, RFC 5244 updates RFC 4733 to add event codes for telephony signals used for channel-associated signaling when carried in the telephony event RTP payload. It supersedes and adds to the original assignment of event codes for this purpose in RFC 2833. Some of the RFC 2833 events have been deprecated because their specification was ambiguous, erroneous, or redundant. In fact, the degree of change from RFC 2833 is such that implementations of RFC 5244 are fully backwards compatible with RFC 2833 implementations only in the case of full ABCD-bit signaling. RFC 5244 further expands and improves the coverage of signaling systems compared with RFC 2833. The details of RFCs 4733, 4734, and 5244 have not been addressed here for the sake of brevity.
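The telephone-event negotiation described in this section can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the SDP text, the port number 49170, and the dynamic payload type 101 are hypothetical values chosen for the example, and the helper names are not taken from any library. The sketch finds the payload type mapped to telephone-event, reads its fmtp event range, and maps event codes 0–15 to the DTMF symbols.

```python
import re

# Hypothetical SDP media description (port and payload type 101 are assumptions).
SDP_MEDIA = """\
m=audio 49170 RTP/AVP 18 101
a=rtpmap:18 G729/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
"""

# Event codes 0-15: ten digits, then *, #, A, B, C, D.
DTMF_SYMBOLS = "0123456789*#ABCD"


def telephone_event_codes(sdp):
    """Return the sorted event codes advertised for telephone-event, or []."""
    payload_type = None
    for line in sdp.splitlines():
        m = re.match(r"a=rtpmap:(\d+)\s+telephone-event/\d+", line)
        if m:
            payload_type = m.group(1)
    if payload_type is None:
        return []  # the peer did not offer telephone-event
    codes = set()
    for line in sdp.splitlines():
        m = re.match(r"a=fmtp:%s\s+(.+)" % payload_type, line)
        if not m:
            continue
        for part in m.group(1).split(","):
            part = part.strip()
            if "-" in part:  # a range such as "0-15"
                low, high = part.split("-")
                codes.update(range(int(low), int(high) + 1))
            else:  # a single event code
                codes.add(int(part))
    return sorted(codes)


def dtmf_symbols(codes):
    """Map event codes to keypad symbols, ignoring non-DTMF codes."""
    return "".join(DTMF_SYMBOLS[c] for c in codes if c < len(DTMF_SYMBOLS))
```

Calling dtmf_symbols(telephone_event_codes(SDP_MEDIA)) yields the sixteen keypad symbols, which matches the 4-bit (nibble) capacity mentioned in the text.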
accuracy. Unlike the PSTN, support for emergency calls over packet-switched networks like IP using new call control protocols like SIP has not yet fully emerged, because the required international standards take time to develop. Of late, the IETF has created a host of standards primarily defining the Emergency Telecommunications Service (ETS) requirements and frameworks in RFCs 3523, 3689, 3690, 4190, 4375, 4542, 4958, 5012, 5031, 5069, 6061, 6443, 6881, and others for IP network infrastructures like IP-based PSAPs, policies for supporting single and multiple administrative domains, URNs for emergency services, dealing with security and threats faced by emergency calls, best current practice (BCP) for supporting ETS, extensions in SIP/SDP, call preemption, and many other functional capabilities needed for support. Going forward, many more standards need to be created to achieve the full potential for supporting media-rich, functionality-rich multimedia emergency calls over IP networks. The detailed description of all functionalities for supporting emergency calls using SIP over IP/Internet infrastructures is beyond the scope of this section. A brief outline is provided in the next section, limiting the description to the context of the IP/Internet.
Figure 16.46 shows only two root domains: SOS and Counseling. Some of the subdomains of these root domains that have been registered with the IANA are also depicted. Note that many more subdomains of these root and subroot domains can be created as required. RFCs 6116 and 6117 have specified the DNS mechanisms for the dynamic discovery and delegation system that can be used for the emergency call information of the SOS and the Counseling DNS tree for the location information. The delegation tree branch may even include the street address, street number, and cube and floor number. The support for the PSAPs over the IP/Internet is described in RFC 6443. Section 16.11 describes in detail how the DNS NAPTR records for SOS or Counseling can support automatic routing.
We have described that a SIP call is routed using the Request-URI. Chapter 9 described in detail the routing in SIP. The SOS URI shall be used for routing the emergency calls. The feature tag will indicate the type of service that has been requested by the caller. RFC 6443 describes how SIP UAs, proxy servers, and PSAPs support processing and routing for emergency calls, including intermediate devices that exist between end devices or applications and the access network.
Figure 16.46 Initial IANA registration for emergency and counseling services.
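The tree of root domains and subdomains described in this section can be sketched in Python as a lookup that walks from the most specific service label up to its root. This is a rough sketch, not the DNS-based resolution the text refers to: the urn:service: prefix follows the URN form of RFC 5031 (listed above), while the routing table, the PSAP addresses, the case-insensitive comparison, and the fallback policy are all assumptions made for illustration.

```python
PREFIX = "urn:service:"

# Hypothetical routing table: service URN -> PSAP SIP URI (addresses are made up).
PSAP_ROUTES = {
    "urn:service:sos": "sip:psap@city.example.org",
    "urn:service:sos.fire": "sip:fire@city.example.org",
    "urn:service:counseling": "sip:helpline@city.example.org",
}


def fallback_chain(urn):
    """List lookup candidates for a service URN, most specific first."""
    normalized = urn.strip().lower()  # assumption: labels compare case-insensitively
    if not normalized.startswith(PREFIX):
        raise ValueError("not a service URN: %r" % urn)
    labels = normalized[len(PREFIX):].split(".")
    if not all(labels):
        raise ValueError("empty service label in %r" % urn)
    # "sos.fire" produces ["urn:service:sos.fire", "urn:service:sos"]
    return [PREFIX + ".".join(labels[:i]) for i in range(len(labels), 0, -1)]


def route(urn, table=PSAP_ROUTES):
    """Return the first matching PSAP URI, or None if no branch of the tree matches."""
    for candidate in fallback_chain(urn):
        if candidate in table:
            return table[candidate]
    return None
```

With this table, a request for urn:service:sos.police has no dedicated entry and falls back to the root SOS entry, which mirrors how a subdomain inherits handling from its root domain when nothing more specific is provisioned.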
626 ◾ Handbook on Session Initiation Protocol
prioritized call handling service such that in times of emergency. The precedence types, in descending order of priority level, are Flash Override Override, Flash Override, Flash, Immediate, Priority, and Routine.
The resource priority that needs to be provided by SIP functional entities like UAs and proxies is defined in RFC 3487 (see Section 15.2.1). Consequently, SIP routes emergency calls via the IP-based PSAPs, which handle, process, and route the calls with appropriate priorities, even preempting lower-priority calls if needed between the source and the destination. QoS signaling protocols, such as Resource Reservation Protocol, can be used for network resource reservation-based priority precedence. The security features for emergency calls are described in RFC 5069. The description of interworking between the IP/Internet and PSTN emergency calls is beyond the scope of this chapter.
16.12 Summary
We have described some basic call services that are being created using SIP signaling messages. Service creation using SIP signaling messages is a huge area, and more intelligent multimedia-rich applications can be created in the future by new innovators. We have shown how a variety of different call services can be created using only the REFER method. Unlike non-RT services, the creation of RT multimedia services demands a different kind of solution. For example, to transfer a call to another party, we need to consider whether the other party is available, busy, or willing to accept a call at that particular moment in real time, including security/privacy. Even if the transfer is not successful, the next course of action needs to be decided on the basis of the kind of unsuccessful call transfer. In addition, if the call is transferred to multiple parties, it creates more complexity because each of these parties must be dealt with separately, knowing the status of each one.
We have explained why call services for content indirection are needed, as SIP is not a general-purpose data transfer protocol. Third-party transcoding call services provide STT, TTS, and other media transcoding. The SIP INFO method and User-to-User header are used in call services for transferring user information mid-call after session establishment and at session setup, respectively. DTMF call services have also been discussed briefly. Finally, we have described the emerging standards for supporting emergency calls over the IP/Internet. The emergency and counseling root URN and their subdomains are explained. The emergency SOS URI address resolution and the automatic routing of the call using the DNS are discussed. Emergency call priority using MLPP and resource-priority, which are essential for the successful completion of emergency calls, are also articulated. In the future, more innovative intelligent RT networked multimedia services can be created using SIP signaling mechanisms.
PROBLEMS
1. What is the basic call transfer in SIP? Explain with call flow examples using REFER to achieve call transfer under the following conditions: successful transfer and transfer with dialog reuse (failed transfer: target busy and transfer target does not answer).
2. What is call transfer with consultation hold? Explain using SIP call flows for the following scenarios: exposing and protecting transfer target, attended transfer, recovery when one party does not support REFER, attended transfer using GRUU known for routing the call to a unique UA, semi-attended transfer, and attended transfer fallback to basic transfer.
3. Explain the use of SIP call flows for the following circumstances: call transfer with Referred-By, transfer as an ad hoc conference, and transfer with multiple parties.
4. What are the issues in call transfer with a gateway? Explain with SIP call flows the call transfer issues related to the gateway: coerce gateway hairpins to the same gateway and consultative-turned-blind glare.
5. What is the call diversion indication specified in RFC 5806? Explain with detailed call flows. Why was RFC 5806 published as an informational RFC although it is not used in SIP? How is this capability offered in SIP by other means that are standardized by Standards Track RFCs?
6. Why is the distributed SBC architecture needed in SIP? Explain the logical operation of distributed SBC architecture if there are only two physical entities: signaling controller (SC) and media controller (MC). Which protocol would you choose between the SC and MC, and why? Describe the logical functional capabilities of the H.248 protocol briefly. Will the H.248 protocol satisfy the needs of SBC?
7. Develop call flows for the distributed SIP-based SBC assuming that there is only a single physical entity of SC and MC for the following, choosing your own networking topology along the lines of Section 14.2: topology hiding, media traffic management, fixing capability mismatches, maintaining SIP-related NAT bindings, access control, protocol repair, and media encryption.
8. What is a so-called SoftSwitch architecture that uses the distributed SIP-based SBC? Explain the SIP trunking architecture, developing your own topology along with call flows.
Call Services in SIP ◾ 627
9. Explain the evolution of the so-called SoftSwitch-based SIP trunking architecture to the all SIP-based IP telephony architectures along with RT and near-RT SIP application server architecture from the legacy POTS/PSTN/ISDN telephony architecture, along with call flows, choosing your own network architecture.
10. Develop a distributed SIP-based RT and near-RT service architecture for the large-scale global network architecture. Extend this SIP-based server architecture for providing end-to-end SIP-based telephony services along with distributed SBC architecture in the local centers along with call flows.
11. Define an operation scenario for referring a call to multiple parties. Why is the multiple-refer SIP option tag useful for this purpose? Explain with SIP call flows the following for referring calls to multiple resources: suppressing REFER’s implicit subscription, behavior of SIP REFER-Issuers, and behavior of REFER recipients.
12. Why is content indirection needed in SIP? Explain using use cases: presence notification and document sharing. How is RFC 2017 used in solving the content indirection problems in SIP? Explain in detail.
13. What is the media transcoding service? Why is it needed in SIP? Explain with SIP call flows the third-party transcoding call services for the following: callee’s and caller’s invocation, receiving the original stream, and transcoding services in parallel, and multiple transcoding services in series. Explain the conferencing mode transcoding call services with detailed SIP call flows.
14. Why is the INFO method needed in SIP? What are the special characteristics of the INFO method that make it fundamentally different from other SIP methods? Explain the behavior of all SIP entities in processing the INFO method. Describe in detail the INFO request and response message body and order of delivery.
15. How are the INFO Packages created? Explain the SIP UA behavior related to INFO Packages: general, procedures, Recv-Info header field rules, and INFO Package fallback rules.
16. Explain in detail the things that need to be considered for the INFO Package: appropriateness of usage, request rate and volume, alternative SIP signaling mechanisms (SUBSCRIBE/NOTIFY and MESSAGE), and alternative SIP media plane mechanisms (MRCP and MSRP).
17. What are the requirements of the INFO Package? Explain with SIP call flows for the INFO method for the following scenarios: willingness to receive INFO Packages (INVITE and Target Refresh), INFO request associated with INFO Package (Single INFO Payload and Multipart INFO [non-Info Package body part, multiple body parts inside multipart body part, and single body part inside multipart body part]).
18. What are the differences between the INFO method and User-to-User header in SIP? What are the requirements that UUI needs to meet in SIP? Why are INFO, MIME, and URI not used in transferring the call control UUI? Explain the guidelines for UUI packages. Explain with call flows UUI use cases for the following: UA-to-UA, proxy retargeting, redirection, and referral.
19. Explain with detailed SIP call flows the call services using DTMF.
20. Explain with detailed SIP call flows how emergency services are offered using SIP over the IP network including the use of emergency URNs and MLPP.
21. How does the emergency call over the IP/Internet differ from that over the PSTN network?
22. What are the IETF standards that are being developed for supporting emergency calls over the IP/Internet? Explain each one of these standards briefly.
23. Explain the emergency root SOS URN. How can the subdomains of the root SOS domain be created? How is the emergency URI resolved using the DNS? How does the DNS NAPTR record help in routing emergency calls?
24. Describe how the SIP Request-URI helps in routing emergency calls over the SIP network.
25. What is MLPP? Describe each kind of priority level as described in RFC 4542.
26. Describe how emergency calls are routed over the IP/Internet with SIP using MLPP and resource priority.
27. Describe the security aspects of emergency calls over the IP/Internet.
28. Develop an architecture that interworks between the IP/Internet and the PSTN.
References
1. ITU-T, “The Narrowband Signaling Syntax (NSS) – Syntax definition,” ITU-T Recommendation Q.1980.1. Available at http://www.itu.int/itudoc/itu-t/aap/sg11aap/history/q1980.1/q1980.1.html.
2. ITU-T, “Signaling System No. 7 – ISDN User Part formats and codes,” ITU-T Recommendation Q.763. Available at http://www.itu.int/rec/T-REC-Q.763-199912-I/en.
3. ITU-T, “ISDN user-network interface layer 3 specification for basic call control,” ITU-T Recommendation Q.931. Available at http://www.itu.int/rec/T-REC-Q.931-199805-I/en.
4. ITU-T, “ISDN Digital Subscriber Signaling System No. 1 (DSS1) – Signaling specifications for frame mode switched and permanent virtual connection control and status monitoring,” ITU-T Recommendation Q.933. Available at http://www.itu.int/rec/T-REC-Q.933/en.
5. ECMA, “Private Integrated Services Network (PISN) – Circuit Mode Bearer Services – Inter-exchange Signaling Procedures and Protocol (QSIG-BC),” Standard ECMA-143, December 2001.
6. ANSI, “Telecommunications–Integrated Services Digital Network (ISDN)–Explicit Call Transfer Supplementary Service,” ANSI T1.643-1995.
7. ETSI, “Integrated Services Digital Network (ISDN); Diversion supplementary services,” ETSI ETS 300 207-1, Ed. 1, 1994.
8. Recommendation ITU-T H.248.1 (03/2013): Gateway control protocol: Version 3.
9. Recommendation ITU-T H.323 (2009): Packet-based multimedia communications.
10. Third Generation Partnership Project, “Technical Specification Group Core Network and Terminals; Communication Diversion (CDIV) using IP Multimedia (IM) Core Network (CN) subsystem; Protocol specification (Release 8), 3GPP TS 24.604,” December 2008.
11. Third Generation Partnership Project, “Technical Specification Group Core Network and Terminals; Interworking between the IP Multimedia (IM) Core Network (CN) Subsystem and Circuit Switched (CS) networks (Release 8),” December 2008.
12. Standard ECMA-355 (www.ecma-international.org): Corporate Telecommunication Networks – Tunnelling of QSIG over SIP, 3rd edition, June 2008.
Chapter 17
17.2.2 Use Cases
The VoiceXML media service user in this document is generically referred to as an application server. In practice, it is intended that the interface defined by this document be applicable across a wide range of use cases. Several intended use cases are described below.
17.2.2.1 IVR Services with Application Servers
SIP application servers provide services to users of the network. Typically, there may be several application servers in the same network, each specialized in providing a particular service. Throughout this specification and without loss of generality, we posit the presence of an application server specialized in providing IVR services. A typical configuration for this use case is illustrated in Figure 17.1.
Assuming the application server also supports HTTP, the VoiceXML application may be hosted on it and served up via HTTP (RFCs 7230–7235). Note, however, that the
17.2.2.2 PSTN IVR Service Node
While this section is intended to enable enhanced use of VoiceXML as a component of larger systems and services, it is intended that devices that are completely unaware of this specification remain capable of invoking VoiceXML services offered by a VoiceXML media server compliant with this document. A typical configuration for this use case is shown in Figure 17.2. Note also that beyond the invocation and termination of a VoiceXML dialog, the semantics defined for call transfers using REFER are intended to be compatible with standard, existing Internet Protocol (IP)/PSTN gateways.
Figure 17.2 IVR services to IP/PSTN gateway from SIP network: an IP/PSTN gateway connected to a VoiceXML media server via SIP and RTP/SRTP. (Copyright IETF. Reproduced with permission.)
Media Server Interfaces in SIP ◾ 631
17.2.3 VoiceXML Session Establishment and Termination
This section describes how to establish a VoiceXML session, with or without preparation, and how to terminate a session. This section also addresses how session information is made available to VoiceXML applications.
17.2.3.1 Service Identification
The SIP Request-URI is used to identify the VoiceXML media service. The user part of the SIP Request-URI is fixed to dialog. This is done to ensure compatibility with RFC 4240, since this document extends the dialog interface defined in that specification and because this convention from RFC 4240 (see Section 4.4.2) is widely adopted by existing media servers. Standardizing the SIP Request-URI including the user part also improves interoperability between application servers and media servers, and reduces the provisioning overhead that would be required if use of a media server by an application server required an individually provisioned Uniform Resource Identifier (URI). In this respect, this document (and RFC 4240) do not add semantics to the user part, but rather standardize the way that targets on media servers are provisioned. Furthermore, since application servers (and not human beings) are generally the clients of media servers, issues such as interpretation and internationalization do not apply. Exposing a VoiceXML media service with a well-known address may increase the possibility of exploitation: the VoiceXML media server is RECOMMENDED to use standard SIP mechanisms to authenticate end points as discussed in Section 2.5. The initial VoiceXML document is specified with the voicexml parameter. In addition, parameters are defined that control how the VoiceXML media server fetches the specified VoiceXML document. The list of parameters defined by this specification is as follows (note that the parameter names are case insensitive):
◾◾ voicexml: URI of the initial VoiceXML document to fetch. This will typically contain an HTTP URI, but may use other URI schemes, for example, to refer to local, static VoiceXML documents. If the voicexml parameter is omitted, the VoiceXML media server may select the initial VoiceXML document by other means, such as by applying a default, or may reject the request.
◾◾ maxage: Used to set the max-age value of the Cache-Control header in conjunction with VoiceXML documents fetched using HTTP, as per RFCs 7230–7235. If omitted, the VoiceXML media server will use a default value.
◾◾ maxstale: Used to set the max-stale value of the Cache-Control header in conjunction with VoiceXML documents fetched using HTTP, as per RFCs 7230–7235. If omitted, the VoiceXML media server will use a default value.
◾◾ method: Used to set the HTTP method applied in the fetch of the initial VoiceXML document. Allowed values are get or post (case insensitive). The default is get.
◾◾ postbody: Used to set the application/x-www-form-urlencoded encoded [5] HTTP body for post requests (or is otherwise ignored).
◾◾ ccxml: Used to specify a JSON value (RFC 4627) that is mapped to the session.connection.ccxml VoiceXML session variable.
◾◾ aai: Used to specify a JSON value (RFC 4627) that is mapped to the session.connection.aai VoiceXML session variable.
Other application-specific parameters may be added to the Request-URI and are exposed in VoiceXML session variables. Formally, the Request-URI for the VoiceXML media service has a fixed user part dialog. Seven URI parameters are defined (see the definition of uri-parameter in augmented Backus–Naur Form in Section 2.4.1).
    dialog-param    = "voicexml=" vxml-url ; vxml-url follows the URI
                                           ; syntax defined in RFC 3986
    maxage-param    = "maxage=" 1*DIGIT
    maxstale-param  = "maxstale=" 1*DIGIT
    method-param    = "method=" ("get" / "post")
    postbody-param  = "postbody=" token
    ccxml-param     = "ccxml=" json-value
    aai-param       = "aai=" json-value
    json-value      = false /
                      null /
                      true /
                      object /
                      array /
                      number /
                      string ; defined in RFC 7158 (obsoletes RFC 4627)
In addition, RFC 5552 has registered the following parameters in the SIP/SIPS URI parameters registry of the Internet Assigned Numbers Authority, in accordance with the policy of RFC 3969:
    Parameter Name    Predefined Values    Reference
    maxage            No                   RFC 5552
    maxstale          No                   RFC 5552
    method            get/post             RFC 5552
    postbody          No                   RFC 5552
    ccxml             No                   RFC 5552
    aai               No                   RFC 5552
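The parameters listed in this section can be exercised with a small Python sketch that assembles a Request-URI with the fixed user part dialog and reads the parameters back. It is a sketch under stated assumptions rather than a definitive implementation: the host ms.example.com and both function names are hypothetical, the set of characters left unescaped is a conservative choice and not the exact SIP URI grammar, and the parser handles only the simple name=value form. Parameter values are percent-encoded when the URI is built and unescaped exactly once when it is parsed.

```python
import json
from urllib.parse import quote, unquote

# Hypothetical media server host; the user part "dialog" is fixed by the text.
MEDIA_SERVER = "ms.example.com"


def build_request_uri(voicexml, maxage=None, maxstale=None, method=None, aai=None):
    """Assemble a VoiceXML Request-URI, percent-encoding each value once."""
    params = [("voicexml", voicexml)]
    if maxage is not None:
        params.append(("maxage", str(int(maxage))))
    if maxstale is not None:
        params.append(("maxstale", str(int(maxstale))))
    if method is not None:
        if method.lower() not in ("get", "post"):
            raise ValueError("method must be get or post")
        params.append(("method", method.lower()))
    if aai is not None:
        params.append(("aai", json.dumps(aai, separators=(",", ":"))))
    # Assumption: leave only ":/.-_~" and alphanumerics unescaped, so that
    # "?", "=", ";" and "%" inside a value are always escaped.
    encoded = ";".join("%s=%s" % (k, quote(v, safe=":/.-_~")) for k, v in params)
    return "sip:dialog@%s;%s" % (MEDIA_SERVER, encoded)


def parse_request_uri(uri):
    """Return {lowercased name: value}; aai and ccxml are decoded as JSON."""
    _, _, rest = uri.partition(";")
    result = {}
    for item in rest.split(";"):
        if not item:
            continue
        name, _, value = item.partition("=")
        name = name.lower()      # parameter names are case insensitive
        value = unquote(value)   # unescape exactly once
        result[name] = json.loads(value) if name in ("aai", "ccxml") else value
    return result
```

For example, build_request_uri("http://appserver.example.com/promptcollect.vxml", maxage=3600, maxstale=0) produces a URI whose voicexml, maxage, and maxstale parameters round-trip through parse_request_uri. Unescaping only once matters when the VoiceXML URI itself contains percent-escapes, since a second pass would corrupt it.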
Parameters of the Request-URI in subsequent re-INVITEs are ignored. One consequence of this is that the VoiceXML media server cannot be instructed by the application server to change the executing VoiceXML application after a VoiceXML session has been started. Special characters contained in the dialog-param, postbody-param, ccxml-param, and aai-param values must be URL-encoded (escaped) as required by the SIP URI syntax, for example, “?” (%3f), “=” (%3d), and “;” (%3b). The VoiceXML media server MUST therefore unescape these parameter values before making use of them or exposing them to running VoiceXML applications. It is important that the VoiceXML media server only unescape the parameter values once since the desired VoiceXML URI value could itself be URL encoded, for example.
Since some applications may choose to transfer confidential information, the VoiceXML media server must support the sips: scheme. Informative note: With respect to the postbody-param value, since the application/x-www-form-urlencoded content itself escapes non-alphanumeric characters by inserting %HH replacements, the escaping rules above will result in the % characters being further escaped in addition to the & and = name/value separators. As an example, the following SIP Request-URI identifies the use of VoiceXML media services, with http://appserver.example.com/promptcollect.vxml as the initial VoiceXML document, to be fetched with max-age/max-stale values of 3600s/0s, respectively:
Certain header values in the INVITE message to the VoiceXML media server are mapped into VoiceXML session variables and are specified in Section 17.2.3.4. On receipt of the INVITE, the VoiceXML media server issues a provisional response, 100 Trying, and commences the fetch of the initial VoiceXML document. The 200 OK response indicates that the VoiceXML document has been fetched and parsed correctly and is ready for execution. Application execution commences on receipt of the ACK (except if the dialog is being prepared as specified later). Note that the 100 Trying response will usually be sent on receipt of the INVITE in accordance with RFC 3261, since the VoiceXML media server cannot, in general, guarantee that the initial fetch will complete in less than 200 milliseconds. However, certain implementations may be able to guarantee response times to the initial INVITE, and thus may not need to send a 100 Trying response. As an optimization, before sending the 200 OK response, the VoiceXML media server may execute the application up to the point of the first VoiceXML waiting state or prompt flush.
A VoiceXML media server, like any SIP UA, may be unable to accept the INVITE request for a variety of reasons. For instance, a Session Description Protocol (SDP) offer contained in the INVITE might require the use of codecs that are not supported by the media server. In such cases, the media server should respond as defined by RFC 3261. However, there are error conditions specific to VoiceXML, as follows:
implementations must take care either to use a transport appropriate to these larger messages (such as Transmission Control Protocol) or to use alternative means of passing the required information to the VoiceXML dialog (such as supplying a unique session identifier in the initial VoiceXML URI and later using that identifier as a key to retrieve data from the HTTP server).
17.2.3.3 Preparing a VoiceXML Session
In certain scenarios, it is beneficial to prepare a VoiceXML session for execution before running it. A previously prepared VoiceXML session is expected to execute with minimal delay when instructed to do so. If a media-less SIP dialog is established with the initial INVITE to the VoiceXML media server, the VoiceXML application will not execute after receipt of the ACK. To run the VoiceXML application, the application server must issue a re-INVITE to establish a media session. A media-less SIP dialog can be established by sending an SDP containing no media lines in the initial INVITE. Alternatively, if no SDP is sent in the initial INVITE, the VoiceXML media server will include an offer in the 200 OK message, which can be responded to with an answer in the ACK with the media port(s) set to 0. Once a VoiceXML application is running, a re-INVITE that disables the media streams (i.e., sets the ports to 0) will not otherwise affect the executing application (except that recognition actions initiated while the media streams are disabled will result in noinput timeouts).
17.2.3.4 Session Variable Mappings
The standard VoiceXML session variables are assigned values as follows:
◾◾ session.connection.local.uri: Evaluates to the SIP URI specified in the To header of the initial INVITE.
◾◾ session.connection.remote.uri: Evaluates to the SIP URI specified in the From header of the initial INVITE.
◾◾ session.connection.redirect: This array is populated by information contained in the History-Info (RFC 4244) header in the initial INVITE or is otherwise undefined. Each entry (hi-entry) in the History-Info header is mapped, in reverse order, into an element of the session.connection.redirect array. Properties of each element of the array are determined as follows:
– uri: Set to the hi-targeted-to-uri value of the History-Info entry
– pi: Set to true if hi-targeted-to-uri contains a Privacy=history parameter, or if the INVITE Privacy header includes history; false otherwise
– si: Set to the value of the si parameter if it exists, undefined otherwise
– reason: Set verbatim to the value of the Reason parameter of hi-targeted-to-uri
◾◾ session.connection.protocol.name: Evaluates to sip. Note that this is intended to reflect the use of SIP in general, and does not distinguish between whether the media server was accessed via SIP or SIPS procedures.
◾◾ session.connection.protocol.version: Evaluates to 2.0.
◾◾ session.connection.protocol.sip.headers: This is an associative array where each key in the array is the noncompact name of a SIP header in the initial INVITE converted to lowercase (note the case conversion does not apply to the header value). If multiple header fields of the same field name are present, the values are combined into a single comma-separated value. Implementations must at a minimum include the Call-ID header and may include other headers. For example, session.connection.protocol.sip.headers["call-id"] evaluates to the Call-ID of the SIP dialog.
◾◾ session.connection.protocol.sip.requesturi: This is an associative array where the array keys and values are formed from the URI parameters on the SIP Request-URI of the initial INVITE. The array key is the URI parameter name converted to lowercase (note the case conversion does not apply to the parameter value). The corresponding array value is obtained by evaluating the URI parameter value as a JSON value (RFC 4627) in the case of the ccxml-param and aai-param values and otherwise as a string. In addition, the array’s toString() function returns the full SIP Request-URI. For example, assuming a Request-URI of sip:[email protected];voicexml=http://example.com;aai=%7b"x":1%2c"y":true%7d, then session.connection.protocol.sip.requesturi["voicexml"] evaluates to http://example.com, session.connection.protocol.sip.requesturi["aai"].x evaluates to 1 (type Number), session.connection.protocol.sip.requesturi["aai"].y evaluates to true (type Boolean), and session.connection.protocol.sip.requesturi evaluates to the complete Request-URI (type String) sip:[email protected];voicexml=http://example.com;aai={"x":1,"y":true}.
◾◾ session.connection.aai: Evaluates to session.connection.protocol.sip.requesturi["aai"].
◾◾ session.connection.ccxml: Evaluates to session.connection.protocol.sip.requesturi["ccxml"].
◾◾ session.connection.protocol.sip.media: This is an array where each array element is an object with the following properties:
– type: This required property indicates the type of the media associated with the stream. The value is a string. It is strongly recommended that the following values are used for common types of media: audio for audio media and video for video media.
– direction: This required property indicates the directionality of the media relative to session.connection.originator. Defined values are sendrecv, sendonly, recvonly, and inactive.
– format: This property is optional. If defined, the value of the property is an array. Each array element is an object that specifies information about one format of the media (there is an array element for each payload type on the m-line). The object contains at least one property called name whose value is the Multipurpose Internet Mail Extension (MIME) subtype of the media format (MIME subtypes are registered in RFC 4855). Other properties may be defined with string values; these correspond to required and, if defined, optional parameters of the format.
As a consequence of this definition, there is an array entry in session.connection.protocol.sip.media for each nondisabled m-line for the negotiated media session. Note that this session variable is updated if the media session characteristics for the VoiceXML session change (i.e., due to a re-INVITE). For example, consider a connection with bidirectional G.711 mu-law audio sampled at 8 kHz. In this case, session.connection.protocol.sip.media[0].type evaluates to audio, session.connection.protocol.sip.media[0].direction to sendrecv, session.connection.protocol.sip.media[0].format[0].name evaluates to audio/PCMU, and session.connection.protocol.sip.media[0].format[0].rate evaluates to 8000. Note that when accessing SIP headers and Request-URI parameters via the session.connection.protocol.sip.headers and session.connection.protocol.sip.requesturi associative arrays defined above, applications can choose between two semantically equivalent ways of referring to the array. For example, either of the following can be used to access a Request-URI parameter named foo:
    session.connection.protocol.sip.requesturi["foo"]
    session.connection.protocol.sip.requesturi.foo
receipt of a BYE in the context of an existing VoiceXML session, the VoiceXML media server MUST send a 200 OK response and must throw a connection.disconnect.hangup event to the VoiceXML application. If the Reason header (RFC 3326, see Section 2.8) is present on the BYE request, then the value of the Reason header is provided verbatim via the _message variable within the catch element’s anonymous variable scope. The VoiceXML media server may also initiate termination of the session by issuing a BYE request. This will typically occur as a result of encountering a <disconnect> or <exit> in the VoiceXML application, due to the VoiceXML application running to completion, or due to unhandled errors within the VoiceXML application.
17.2.3.6 Examples
17.2.3.6.1 Basic Session Establishment
Figure 17.4 illustrates an application server setting up a VoiceXML session on behalf of a UA.
17.2.3.6.2 VoiceXML Session Preparation
Figure 17.5 demonstrates the preparation of a VoiceXML session. In this example, the VoiceXML session is prepared before placing an outbound call to a UA, and is started as soon as the UA answers. The [answer1:0] notation is used to indicate an SDP answer with the media ports set to 0.
Implementation detail: offer2’ is derived from offer2: it duplicates the m-lines and a-lines from offer2. However, offer2’ differs from offer2 since it must contain the same o-line as used in answer1:0 but with the version number incremented. Also, if offer1 has more m-lines than offer2, then offer2’ must be padded with extra (rejected) m-lines.
17.2.3.6.3 Media Resource Control Protocol Establishment
Media Resource Control Protocol (MRCP) (RFC 6787, see Section 7.6) is a protocol that enables clients such as a VoiceXML media server to control media service resources such as speech synthesizers, recognizers, verifiers, and identifiers residing in
However, it is important to note that not all SIP servers on the network. Figure 17.6 illustrates how a VoiceXML
header names or Request-URI parameter names are valid media server may establish an MRCP session in response to an
ECMAScript identifiers, and as such, can only be accessed initial INVITE. In Figure 17.6, the VoiceXML media server
using the first form (array notation). For example, the Call-ID is responsible for establishing a session with the MRCP (RFC
header can only be accessed as session.connection.protocol.sip. 6787, see Section 7.6) media resource server before sending the
headers; attempting to access the same value as session.connec- 200 OK response to the initial INVITE.
tion.protocol.sip.headers.call-id would result in an error. The VoiceXML media server will perform the appro-
priate offer–answer with the MRCP media resource server
based on the SDP capabilities of the application server and
17.2.3.5 Terminating a VoiceXML Session
the MRCP media resource server. The VoiceXML media
The application server can terminate a VoiceXML session server will change the offer received from step F1 to establish
by issuing a BYE to the VoiceXML media server. Upon an MRCP session in step (F5) and will re-write the SDP to
636 ◾ Handbook on Session Initiation Protocol
[Figure: message sequence diagram; participants: SIP user agent, SIP application server, VoiceXML media server, HTTP application server]
Figure 17.4 Basic session establishment for VoiceXML session. (Copyright IETF. Reproduced with permission.)
[Figure: message sequence diagram; participants: SIP user agent, SIP application server, VoiceXML media server, HTTP application server]
Figure 17.5 Preparation for VoiceXML session. (Copyright IETF. Reproduced with permission.)
Figure 17.6 VoiceXML session using MRCP server. (Copyright IETF. Reproduced with permission.)
include an m-line for each MRCP resource to be used and other required SDP modifications as specified by MRCP. Once the VoiceXML media server performs the offer–answer with the MRCP media resource server, it will establish an MRCP control channel in step (F8). The MRCP resource is deallocated when the VoiceXML media server receives or sends a BYE (not shown).

17.2.4 Media Support

This section describes the mandatory and optional media support required by this interface.

17.2.4.1 Offer–Answer

The VoiceXML media server MUST support the standard offer–answer mechanism of RFC 3264 (see Section 3.8.4). In particular, if an SDP offer is not present in the INVITE, the VoiceXML media server will make an offer in the 200 OK response listing its supported codecs.

17.2.4.2 Early Media

The VoiceXML media server may support establishment of early media streams as described in RFC 3960 (see Section 11.4.7.2). This allows the application server to establish media streams between a UA and the VoiceXML media server in parallel with the initial VoiceXML document being processed (which may involve dynamic VoiceXML page generation and interaction with databases or other systems). This is useful primarily for minimizing the delay in starting a VoiceXML session, particularly in cases where a session with the UA already exists but the media stream associated with that session needs to be redirected to a VoiceXML media server. Figure 17.7 demonstrates the use of early media (using the gateway model defined in RFC 3960, see Section 11.4.7.2).

Although RFC 3960 prefers the use of the application server model for early media over the gateway model, the primary issue with the gateway model (forking) is significantly less common when issuing requests to VoiceXML media servers. This is because VoiceXML media servers respond to all requests with 200 OK responses in the absence of unusual errors, and they typically do so within several hundred milliseconds. This makes them unlikely targets in forking scenarios, since alternative targets of the forking process would virtually never be able to respond more quickly than an automated system, unless they are themselves automated systems, in which case, there is little point in setting up a response time race between two automated systems.

Issues with ringing tone generation in the gateway model are also mitigated, both by the typically quick 200 OK response time and because this specification mandates that no media packets are generated until the receipt of an ACK (thus eliminating the need for the UA to perform media packet analysis). Note that the offer of early media by a VoiceXML media server does not imply that the referenced VoiceXML application can always be fetched and executed successfully. For instance, if the HTTP application server were to return a 4xx response in step F10 above, or if the provided VoiceXML content was not valid, the VoiceXML
[Figure: message sequence diagram; participants: SIP user agent, SIP application server, VoiceXML media server, HTTP application server]
Figure 17.7 VoiceXML session for early media services. (Copyright IETF. Reproduced with permission.)
media server would still return a 500 response. At this point, it would be the responsibility of the application server to tear down any media streams established with the media server.

17.2.4.3 Modifying the Media Session

The VoiceXML media server must allow the media session to be modified via a re-INVITE and should support the UPDATE method (RFC 3311, see Section 3.8.3) for the same purpose. In particular, it must be possible to change streams between sendrecv, sendonly, and recvonly as specified in RFC 3264 (see Section 3.8.4). Unidirectional streams are useful for announcement- or listening-only (hotword). The preferred mechanism for putting the media session on hold is specified in RFC 3264, that is, the UA modifies the stream to be sendonly and mutes its own stream. Modification of the media session does not affect VoiceXML application execution (except that recognition actions initiated while on hold will result in noinput timeouts).

17.2.4.4 Audio and Video Codecs

For the purposes of achieving a basic level of interoperability, this section specifies a minimal subset of codecs and Real-Time Transport Protocol (RTP) (RFC 3550, see Section 7.2) payload formats that MUST be supported by the VoiceXML media server. For audio-only applications, G.711 mu-law and A-law MUST be supported using the RTP payload types 0 and 8 (RFC 3551, see Section 7.2). Other codecs and payload formats may be supported. Video telephony applications, which employ a video stream in addition to the audio stream, are possible in VoiceXML 2.0/2.1 through the use of multimedia file container formats such as the .3gp [6] and .mp4 formats [7]. Video support is optional for this specification. If video is supported, then

1. H.263 Baseline (RFC 4629) must be supported. For legacy reasons, the 1996 version of H.263 may be supported using the RTP payload format defined in RFC 2190/4628 (payload type 34 specified in RFC 3551, see Section 7.2).
2. Adaptive Multirate (AMR) narrow-band audio (RFC 4867) should be supported.
3. MPEG-4 video (RFC 6416) should be supported.
4. MPEG-4 Advanced Audio Coding (AAC) audio (RFC 6416) should be supported.
5. Other codecs and payload formats may be supported.

Video record operations carried out by the VoiceXML media server typically require receipt of an intraframe before
the recording can commence. The VoiceXML media server should use the mechanism described in RFC 4585 to request that a new intraframe be sent. Since some applications may choose to transfer confidential information, the VoiceXML media server must support SRTP (RFC 3711, see Section 7.3).

17.2.4.5 DTMF

DTMF events (RFC 4733, see Section 16.10) must be supported. When the UA does not indicate support for RFC 4733, the VoiceXML media server may perform DTMF detection using other means, such as detecting DTMF tones in the audio stream. Implementation note: only telephone-events (RFC 4733) must be used when the UA indicates support for them; this avoids the risk of double detection of DTMF that would arise if detection on the audio stream were applied simultaneously.

17.2.5 Returning Data to the Application Server

This section discusses the mechanisms for returning data (e.g., collected utterance or digit information) from the VoiceXML media server to the application server.

17.2.5.1 HTTP Mechanism

At any time during the execution of the VoiceXML application, data can be returned to the application server via HTTP using standard VoiceXML elements such as <submit> or <subdialog>. Notably, the <data> element in VoiceXML 2.1 [2] allows data to be sent to the application server efficiently without requiring a VoiceXML page transition and is ideal for short VoiceXML applications such as prompt and collect. For most applications, it is necessary to correlate the information being passed over HTTP with a particular VoiceXML session. One way this can be achieved is to include the SIP Call-ID (accessible in VoiceXML via the session.connection.protocol.sip.headers array) within the HTTP POST fields. Alternatively, a unique POST-back URI can be specified as an application-specific URI parameter in the Request-URI of the initial INVITE (accessible in VoiceXML via the session.connection.protocol.sip.requesturi array). Since some applications may choose to transfer confidential information, the VoiceXML media server must support the https: scheme.

17.2.5.2 SIP Mechanism

Data can be returned to the application server via the expr or namelist attribute on <exit> or the namelist attribute on <disconnect>. A VoiceXML media server must support encoding of the expr/namelist data in the message body of a BYE request sent from the VoiceXML media server as a result of encountering the <exit> or <disconnect> element. A VoiceXML media server may support inclusion of the expr/namelist data in the message body of the 200 OK message in response to a received BYE request (i.e., when the VoiceXML application responds to the connection.disconnect.hangup event and subsequently executes an <exit> element with the expr or namelist attribute specified). Note that sending expr/namelist data in the 200 OK response requires that the VoiceXML media server delay the final response to the received BYE request until the VoiceXML application’s post-disconnect final processing state terminates. This mechanism is subject to the constraint that the VoiceXML media server must respond before the UAC’s timer F expires (defaults to 32 seconds).

Moreover, for unreliable transports, the UAC will retransmit the BYE request according to the rules of RFC 3261 (see Section 3.10). The VoiceXML media server should implement the recommendations of RFC 4320 (see Section 3.12.2.5) regarding when to send the 100 Trying provisional response to the BYE request. If a VoiceXML application executes a <disconnect> [2] and then subsequently executes an <exit> with namelist information, the namelist information from the <exit> element is discarded. Namelist variables are first converted to their JSON value equivalent (RFC 4627) and encoded in the message body using the application/x-www-form-urlencoded format content type [5]. The behavior resulting from specifying a recording variable in the namelist or an ECMAScript object with circular references is not defined. If the expr attribute is specified on the <exit> element instead of the namelist attribute, the reserved name __exit is used. To allow the application server to differentiate between a BYE resulting from a <disconnect> and one resulting from an <exit>, the reserved name __reason is used, with a value of disconnect (without brackets) to reflect the use of VoiceXML’s <disconnect> element, and a value of exit (without brackets) to reflect an explicit <exit> in the VoiceXML document. If the session terminates for other reasons (such as the media server encountering an error), this parameter may be omitted, or may take on platform-specific values prefixed with an underscore.

This specification extends the application/x-www-form-urlencoded format by replacing non-ASCII characters with one or more octets of the UTF-8 representation of the character, with each octet in turn replaced by %HH, where HH represents the uppercase hexadecimal notation for the octet value and % is a literal character. As a consequence, the Content-Type header field in a BYE message containing expr/namelist data MUST be set to application/x-www-form-urlencoded;charset=utf-8. The following table provides some examples of <exit> usage and the corresponding result content.
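As a rough illustration of the encoding rules just described, the following Python sketch builds a message body from a namelist. It is only a sketch under stated assumptions: the function name encode_exit_body and its parameters are hypothetical, each value is serialized with json.dumps as one reading of "JSON value equivalent" (so string values keep their JSON quotes), and the __reason value is appended as a bare token. The standard library's quote_plus already percent-encodes non-ASCII characters as uppercase %HH octets of their UTF-8 form.

```python
import json
from urllib.parse import quote_plus


def encode_exit_body(namelist, reason=None):
    """Encode namelist variables as an application/x-www-form-urlencoded body.

    namelist: mapping of variable name to a JSON-serializable value.
    reason:   optional bare token ("exit" or "disconnect") carried in the
              reserved name __reason (assumption: not JSON-quoted).
    """
    pairs = []
    for name, value in namelist.items():
        # Convert each variable to its JSON text before URL encoding.
        pairs.append((name, json.dumps(value, ensure_ascii=False)))
    if reason is not None:
        pairs.append(("__reason", reason))
    # quote_plus uses UTF-8 and uppercase hexadecimal escapes by default.
    return "&".join(f"{quote_plus(k)}={quote_plus(v)}" for k, v in pairs)


body = encode_exit_body({"id": 1234, "pin": 9999}, reason="exit")
content_type = "application/x-www-form-urlencoded;charset=utf-8"
content_length = len(body.encode("utf-8"))
```

With these inputs the body is id=1234&pin=9999&__reason=exit, which is 30 octets long. A value supplied through the expr attribute could be passed under the reserved name __exit in the same way.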
Assuming the following VoiceXML variables and values:

userAuthorized=true
pin=1234
errors=0

For example, consider the VoiceXML snippet:

...
<exit namelist="id pin"/>
...

If id equals 1234 and pin equals 9999, say, the BYE message would look similar to

BYE sip:[email protected] SIP/2.0
Via: SIP/2.0/UDP 192.0.2.4;branch=z9hG4bKnashds10
Max-Forwards: 70
From: sip:[email protected];tag=a6c85cf
To: sip:[email protected];tag=1928301774
Call-ID: a84b4c76e66710
CSeq: 231 BYE
Content-Type: application/x-www-form-urlencoded;charset=utf-8
Content-Length: 30

id=1234&pin=9999&__reason=exit

Since some applications may choose to transfer confidential information, the VoiceXML media server must support the S/MIME encoding of SIP message bodies.

17.2.6 Outbound Calling

Outbound calls can be triggered via the application server using third-party call control (RFC 3725, see Section 18.3). Flow IV from RFC 3725 (see Figure 18.1b, Section 18.3) is recommended in conjunction with the VoiceXML session preparation mechanism. This flow has several advantages over others, as follows:

17.2.7 Call Transfer

While VoiceXML is at its core a dialog language, it also provides optional call transfer capability. VoiceXML’s transfer capability is particularly suited to the PSTN IVR Service Node use case described earlier. It is not recommended to use VoiceXML’s call transfer capability in networks involving application servers. Rather, the application server itself can provide call routing functionality by taking signaling actions based on the data returned to it from the VoiceXML media server via HTTP or in the SIP BYE message.

If VoiceXML transfer is supported, the mechanism described in this section must be employed. The transfer flows specified here are selected on the basis that they provide the best interworking across a wide range of SIP devices. CCXML<->VoiceXML implementations, which require tight coupling in the form of bidirectional eventing to support all transfer types defined in VoiceXML, may benefit from other approaches, such as the use of SIP event packages (RFC 6665, see Section 5.2). In what follows, the provisional responses have been omitted for clarity.

17.2.7.1 Blind

The blind-transfer sequence is initiated by the VoiceXML media server via a REFER message on the original SIP dialog. The Refer-To header contains the URI for the called party, as specified via the dest or destexpr attributes on the VoiceXML <transfer> tag. If the REFER request is accepted, in which case the VoiceXML media server will receive a 2xx response, the VoiceXML media server throws the connection.disconnect.transfer event and will terminate the VoiceXML session with a BYE message. For blind transfers, implementations may use RFC 4488 (see Section 2.8) to suppress the implicit subscription associated with the REFER message. If the REFER request results in a non-2xx response, the <transfer>’s form item variable (or event raised) depends on the SIP response and is specified in the
following table. Note that this indicates that the transfer request was rejected.

SIP Response               <transfer> variable/event
404 Not Found              error.connection.baddestination
405 Method Not Allowed     error.unsupported.transfer.blind
503 Service Unavailable    error.connection.noresource
(No response)              network_busy
(Other 3xx/4xx/5xx/6xx)    unknown

An example is illustrated in Figure 17.8 (provisional responses and NOTIFY messages corresponding to provisional responses have been omitted for clarity).

If the aai or aaiexpr attribute is present on <transfer>, it is appended to the Refer-To URI as a parameter named aai in the REFER method. Reserved characters are URL encoded as required for SIP/SIPS URIs (RFC 3261, see Section 4.2). The mapping of values outside of the ASCII range is platform specific.

[Figure: message sequence diagram; participants: SIP user agent 1 (caller), VoiceXML media server, SIP user agent (callee); messages: F0. RTP/SRTP, F1. REFER, F2. 202 Accepted, F3. BYE, F4. 200 OK, Stop RTP/SRTP (F0), F5. INVITE]
Figure 17.8 Blind transfer using REFER for VoiceXML session. (Copyright IETF. Reproduced with permission.)

17.2.7.2 Bridge

The bridge transfer function results in the creation of a small multiparty session involving the caller, the VoiceXML media server, and the callee. The VoiceXML media server invites the callee to the session and will eject the callee if the transfer is terminated. If the aai or aaiexpr attribute is present on <transfer>, it is appended to the Request-URI in the INVITE as a URI parameter named aai. Reserved characters are URL encoded as required for SIP/SIPS URIs (RFC 3261, see Section 4.2). The mapping of values outside of the ASCII range is platform specific. During the transfer attempt, audio specified in the transferaudio attribute of <transfer> is streamed to UA 1. A VoiceXML media server may play early media received from the callee to the caller if the transferaudio attribute is omitted. The bridge transfer sequence is illustrated in Figure 17.9. The VoiceXML media server (acting as a UAC) makes a call to UA 2 with the same codecs used by UA 1. When the call setup is complete, RTP flows between UA 2 and the VoiceXML media server. This stream is mixed with that of UA 1.

[Figure: message sequence diagram; participants: SIP user agent 1 (caller), VoiceXML media server, SIP user agent (callee)]
Figure 17.9 Bridge transfer for VoiceXML session. (Copyright IETF. Reproduced with permission.)

If a final response to the INVITE is not received from UA 2 and the connecttimeout expires (specified as an attribute of <transfer>), the VoiceXML media server will issue a CANCEL to terminate the transaction and the <transfer>’s form item variable is set to noanswer. If the INVITE results in a non-2xx response, the <transfer>’s form item variable (or event raised) depends on the SIP response and is specified in Table 17.1.

Once the transfer is established, the VoiceXML media server can listen to the media stream from UA 1 to perform speech or DTMF hotword detection, which when matched results in a near-end disconnect, that is, the VoiceXML media server issues a BYE to UA 2 and the VoiceXML application continues with UA 1. A BYE will also be issued to UA 2 if the call duration exceeds the maximum duration specified in the maxtime attribute on <transfer>. If UA 2 issues a BYE during the transfer, the transfer terminates and the VoiceXML <transfer>’s form item variable receives the value far_end_disconnect. If UA 1 issues a BYE during the transfer, the transfer terminates and the VoiceXML event connection.disconnect.transfer is thrown.
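The blind-transfer response mapping tabulated in Section 17.2.7.1 can be sketched as a small lookup. This is a minimal illustration, not the specification's own code: the function name blind_transfer_result is hypothetical, a missing response is modeled as None, and the split between thrown events (the error.* and connection.* names) and form item variable values (network_busy, unknown) is an assumption based on VoiceXML naming.

```python
# Final REFER responses that raise a specific VoiceXML error event.
REFER_ERROR_EVENTS = {
    404: "error.connection.baddestination",
    405: "error.unsupported.transfer.blind",
    503: "error.connection.noresource",
}


def blind_transfer_result(status_code):
    """Return (kind, name) for the final response to a blind-transfer REFER.

    status_code: integer SIP status code, or None if no response arrived.
    kind is "event" for a thrown event or "variable" for the value assigned
    to the <transfer> form item variable.
    """
    if status_code is None:
        return ("variable", "network_busy")
    if status_code < 200:
        raise ValueError("provisional responses are not final outcomes")
    if status_code < 300:
        # REFER accepted: the session ends with a BYE after this event.
        return ("event", "connection.disconnect.transfer")
    if status_code in REFER_ERROR_EVENTS:
        return ("event", REFER_ERROR_EVENTS[status_code])
    # Any other 3xx/4xx/5xx/6xx response.
    return ("variable", "unknown")
```

For example, blind_transfer_result(202) yields the connection.disconnect.transfer event, while blind_transfer_result(486) falls through to the unknown value.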
17.2.7.3 Consultation
Figure 17.10 Consultation transfer for VoiceXML session. (Copyright IETF. Reproduced with permission.)
consultation). Similarly, the media (audio, video, and data application) bridging/mixing application servers also need to create many multimedia services for multiparty multimedia conferencing in a distributed manner, offering multimedia services with scalability over the large-scale next-generation SIP networks. This is a rapidly growing area for multimedia services creation, and some of the work in this area has been progressing in the Internet Engineering Task Force; however, we have not addressed these for the sake of brevity.

PROBLEMS

1. What is stimulus signaling? Explain the stimulus signaling framework in detail as specified in RFC 5629.
2. How does stimulus signaling help in the context of services creation using SIP interfaces to the VoiceXML media application server within the framework specified in RFC 5629?
3. Explain the use cases in detail for the SIP interfaces to the VoiceXML media server for the following scenarios: IVR services, PSTN IVR server node, IMS MRF, CCXML, and VoiceXML.
4. Describe in detail the VoiceXML session establishment and termination using SIP, explaining the following steps: service identification, initiating and preparing the VoiceXML session, session variable mappings, and terminating the VoiceXML session.
5. Explain in detail, using SIP call flows, the media services offered to the SIP users for the following cases: basic session establishment, VoiceXML session preparation, and MRCP establishment.
6. Explain the VoiceXML media services to the SIP users using detailed call flows, step by step, as follows: media negotiations using the SDP offer–answer model, support of early media, modifying the media session, support of audio and video codecs, and DTMF.
7. Explain how the data (collected utterance or digit information) gathered during the session between the SIP user and the VoiceXML media server is transferred to the application server using the following mechanisms: HTTP and SIP.
8. Explain, using SIP call flows, a scenario of outbound calling in which the session is established between the SIP user and the VoiceXML server in addition to between two SIP users.
9. Explain, with detailed SIP call flows, a call transfer service scenario for the session between the two SIP users that makes the VoiceXML server a part of the session for the following call services: blind, bridge, and consultation.

References
1. McGlashan, S., Burnett, D., Carter, J., Danielsen, P., Ferrans, J., Hunt, A., Lucas, B., Porter, B., Rehor, K., and Tryphonas, S., “Voice Extensible Markup Language (VoiceXML) version 2.0,” W3C Recommendation, March 2004.
2. Oshry, M., Auburn, R.J., Baggia, P., Bodell, M., Burke, D., Burnett, D., Candell, E., Kilic, H., McGlashan, S., Lee, A., Porter, B., and Rehor, K., “Voice Extensible Markup Language (VoiceXML) version 2.1,” W3C Candidate Recommendation, June 2005.
3. 3GPP, “3rd Generation Partnership Project: Network architecture (Release 6),” 3GPP TS 23.002 v6.6.0, December 2004.
4. Auburn, R.J., “Voice Browser Call Control: CCXML version 1.0,” W3C Working Draft, June 2005.
5. Raggett, D., Le Hors, A., and Jacobs, I., “HTML 4.01 specification,” W3C Recommendation, December 1999.
6. 3GPP, “Transparent end-to-end packet switched streaming service (PSS); 3GPP file format (3GP),” 3GPP TS 26.244 v6.4.0, December 2004.
7. “Information technology. Coding of audio-visual objects. MP4 file format,” ISO/IEC 14496-14:2003, October 2003.
Chapter 18
creation of a multiparty call evolving dynamically from a two-party call, where media bridging at a centralized place is needed, is not possible with the present SIP/SDP protocol. This is especially because of the way the protocol architecture of SDP, which is used for media negotiations, has been developed.

Consequently, there has been a basic assumption that a centralized functional entity such as a conference controller is known a priori to all multiparty conferences. The conference controller is dialed in to by the conferencing parties or dials out to the conferencing parties. In this way, one of the problems is knowing the central point of contact with which every multiparty participant will communicate to form the star-like topology for conferencing. This centralized controller may also take the responsibility of media bridging and other functionalities for conferencing, if not controlling. RFC 3725, known as 3PCC, was developed on the premise that a centralized controller sets up the multiparty conference. This is a simple SIP multiparty conference architecture, although it is riddled with problems, and the conference setup may fail in many circumstances. Later, RFC 5239 (see Sections 2.2, 2.4.4.1, and 4.2.1.6) described the framework of the multiparty conferencing system and its mechanisms, along with the conference controller known as the focus user agent (UA). The conference-aware UA and conference-unaware UA, along with a well-defined Conference Factory Uniform Resource Identifier (URI), have been standardized. Although we have included some definitions and addressing schemes of RFC 5239 in those sections, a detailed description of RFC 5239 is out of the scope of this book.

Multiparty conferencing with multimedia requires a host with very rich and complex functionalities, namely conference control including floor control, conference objects that are manipulated during the conference, audio and video bridging, sharing/bridging of a variety of data applications, media channel control, and many others. These functions do not belong to SIP call control functions, and non-SIP protocols are needed. In this respect, the Centralized Conference Manipulation Protocol (CCMP) (RFC 6503), Binary Floor Control Protocol (BFCP) (RFC 4582), Conference Object Model (RFC 6501), Mixer for Media Control Channel (RFC 6505), and other functionalities for multiparty multimedia conferencing have been standardized. It is beyond the scope of this chapter to deal with all those aspects of conferencing. However, we briefly outline some of the 3PCC mechanisms here.

18.3 Third-Party Multiparty Conferencing

3PCC provides mechanisms for the creation of multiparty conference calls in a star-like point-to-point fashion from a centralized controller. In fact, 3PCC refers to the general ability to manipulate calls between two or multiple parties. It is assumed that the controller knows the addresses of all parties and when to dial out to the participants, or that the address of the controller is known a priori to all conference participants for dial-in. Call control, including modifications of signaling messages, is done by the controller acting as the back-to-back UA (B2BUA). It is also possible that the media bridge can be used by the controller to dial out or dial in as another party. A controller is a SIP UA that wishes to create a session between two other UAs. 3PCC is often used for operator services (where an operator creates a call that connects two or more participants together) and multimedia conferencing.

Many SIP services are possible through 3PCC. These include the traditional ones on the public switched telephone network (PSTN), but also new ones such as click-to-dial, mid-call announcements, media transcoding, call transfer, and others. For example, click-to-dial allows a user to click on a web page when they wish to speak to a customer service representative. The web server then creates a call between the user and a customer service representative. The call can be between two phones, a phone and an Internet Protocol (IP) host, or two IP hosts. 3PCC is possible using only the mechanisms specified within RFC 3261. In addition, many other services are possible using other methods and mechanisms defined in other SIP-related RFCs. Indeed, many possible approaches can be used for 3PCC; however, we describe here some examples of the call setup following RFC 3725, each with different benefits and drawbacks. The use of 3PCC also becomes more complex when aspects of the call utilize SIP extensions or optional features of SIP.

18.3.1 3PCC Call Establishment

The controller that establishes the call between two or multiple parties is termed the third party (RFC 3725) because the establishment of the session is orchestrated by the controller, which is not a party among the conference participants. It is implied that the controller somehow has the prior intelligence to set up the multiparty conference. The mechanism by which a controller is fed this intelligence may itself be another application that RFC 3725 does not describe. The controller can play a significant role in setting up the sessions among the participants, which may range from very simple to very complex. Figure 18.1 shows different primitives of operations that a controller can perform in establishing the calls in a SIP network that consists of three parties and a controller. Subsequent call flows described here are similar to RFC 3725.

Note that we have used the connection address of 0.0.0.0, which, although recommended by RFC 3264 (see Section 3.8.4), has numerous drawbacks. It is anticipated that a future specification will recommend use of a domain within the .invalid Domain Name System top-level domain
Multiparty Conferencing in SIP ◾ 647
Figure 18.1 3PCC call flows in SIP network: (a) SIP network, (b) simplest flows, (c) flows with ping-ponging of INVITEs, (d) flows with unknown entities, (e) efficient flows with unknown entities, and (f) flows with error handling. Note: bh = “black holed” (i.e., no media stream). (Copyright IETF. Reproduced with permission.)
instead of the 0.0.0.0 IP address. As a result, implementers are encouraged to track such developments once they arise. In addition, RFC 6157 describes how the IPv4 SIP UAs can communicate with IPv6 SIP UAs (and vice versa) at the signaling layer as well as exchange media once the session has been successfully set up. Both single- and dual-stack (i.e., IPv4-only and IPv4/IPv6) UAs are also considered. We have not discussed these scenarios in the 3PCC call flows here.

18.3.1.1 Simplest Multiparty Call Flows

The simplest 3PCC is depicted in Figure 18.1b. The controller first sends a SIP INVITE message (F1) with no SDP to party A. That is, this INVITE message has no session description. Party A’s phone rings, and party A answers. This results in a 200 OK (F2) message that contains an offer as defined in RFC 3264 (see Section 3.8.4). The controller needs to send its answer in the ACK message, as mandated by RFC 3261. To obtain the answer, it sends the offer it received from party A (offer1) in an INVITE message (F3) to party B. Party B’s phone rings. When party B answers, the 200 OK message (F4) contains the answer to this offer, answer1. The controller sends an ACK message (F5) to party B, and then passes answer1 to A in an ACK message (F6) sent to it. Because the offer was generated by A and the answer generated by B, the actual media session is between A and B. Therefore, Real-time Transport Protocol (RTP) (F7) media flows between parties A and B. This flow is simple, requires no manipulation of the SDP by the controller, and works for any media types supported by both end points. However, it has a serious timeout problem. User B may not answer the call immediately.
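The simplest flow described above can be sketched as an ordered message list. This is only an illustrative model, not an implementation from RFC 3725: the `Msg` class and the helper `messages_before_ack_to_a` are hypothetical names introduced here. The sketch makes visible that the ACK to A (F6) cannot be sent until B's 200 OK (F4) has arrived, which is where the timeout risk comes from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Msg:
    """One signaling message of the simplest 3PCC flow (hypothetical model)."""
    label: str
    sender: str
    receiver: str
    method: str
    body: str


def simplest_flow():
    """Messages F1 through F6 as described in the text for Figure 18.1b."""
    return [
        Msg("F1", "controller", "A", "INVITE", "no SDP"),
        Msg("F2", "A", "controller", "200 OK", "offer1"),
        Msg("F3", "controller", "B", "INVITE", "offer1"),
        Msg("F4", "B", "controller", "200 OK", "answer1"),
        Msg("F5", "controller", "B", "ACK", ""),
        Msg("F6", "controller", "A", "ACK", "answer1"),
    ]


def messages_before_ack_to_a(flow):
    """Labels of the messages that occur between A's 200 OK and the ACK sent to A."""
    labels = [m.label for m in flow]
    start = labels.index("F2")
    end = labels.index("F6")
    return labels[start + 1:end]


if __name__ == "__main__":
    # A keeps retransmitting its 200 OK (F2) while these messages are exchanged.
    print(messages_before_ack_to_a(simplest_flow()))
```

Everything listed by the helper, including the time party B takes to pick up, happens while party A is still waiting for its ACK.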
Because B may take time to answer, the controller cannot send the ACK to A right away. This causes A to retransmit the 200 OK response periodically. The 200 OK will be retransmitted for 64*T1 seconds (RFC 3261, see Section 3.12). If an ACK does not arrive by then, the call is considered to have failed. This limits the applicability of this flow to scenarios where the controller knows that B will answer the INVITE immediately. Once the calls are established, both participants believe they are in a single point-to-point call. However, they are exchanging media directly with each other, rather than with the controller. The controller is involved in two dialogs, yet sees no media. Since the controller is still a central point for signaling, it now has complete control over the call. If it receives a BYE from one of the participants, it can create a new BYE and hang up with the other participant. This is shown in Figure 18.1b as well, in messages F8 through F11. It should be noted that this will be the general behavior of the controller in dealing with BYE messages sent by one of the participants. Later, we will also describe how continued call processing can be done by the controller through use of the re-INVITE message.

18.3.1.2 Flows with Ping-Ponging of INVITEs

The call flows with ping-ponging of INVITE messages are shown in Figure 18.1c. The controller first sends an INVITE (F1) to user A. This is a standard INVITE containing an offer (sdp1) with a single audio media line, one codec, a random port number (but not zero), and a connection address of 0.0.0.0. This creates an initial media stream that is black holed (bh), since no media (or Real-time Transport Control Protocol packets defined in RFC 3550, see Section 7.2) will flow from A. The INVITE causes A’s phone to ring. When party A answers, the 200 OK (F2) contains an answer, sdp2, with a valid address in the connection line. The controller then generates a second INVITE (F3). This INVITE is addressed to user B, and it contains sdp2 as the offer to B. Note that the role of sdp2 has changed. In the 200 OK (F2), it was an answer, but in the INVITE (F3), it is an offer. Fortunately, all valid answers are valid initial offers. At the same time, the controller also sends an ACK (F4) to party A, since the ACK is due immediately after A’s 200 OK (F2) in order to stop retransmissions and avoid a timeout. The INVITE (F3) causes B’s phone to ring. When B answers, it generates a 200 OK (F5) with an answer, sdp3. As usual, the controller then generates an ACK (F6). Next, it sends a re-INVITE (F7) to A containing sdp3 as the offer. Once again, there has been a reversal of roles: sdp3 was an answer in message F5, and now it is an offer to party A in the re-INVITE (F7).

Fortunately, an answer to an answer recast as an offer is, in turn, a valid offer. This re-INVITE generates a 200 OK (F8) with sdp2, assuming that A does not decide to change any aspects of the session as a result of this re-INVITE (F7). This 200 OK (F8) is acknowledged in ACK message F9, and then media can flow from A to B. Media from B to A could already start flowing once message F5 was sent. This flow has the advantage that all final responses are immediately acknowledged using the ACK message. It therefore does not suffer from the timeout and message-inefficiency problems of the simplest call flows in the earlier section. However, it, too, has troubles. First, it requires that the controller know the media types to be used for the call (since it must generate a bh SDP, which requires media lines). Secondly, the first INVITE (F1) to A contains media with a 0.0.0.0 connection address. The controller expects that the response contains a valid, nonzero connection address for A.

However, experience has shown that many UAs respond to an offer of a 0.0.0.0 connection address with an answer containing a 0.0.0.0 connection address. The offer–answer specification of RFC 3264 (see Section 3.8.4) explicitly tells implementers not to do this; however, at the time of publication of RFC 3725, many implementations still did. If A should respond with a 0.0.0.0 connection address in sdp2, the flow will not work. The most serious flaw in this flow is the assumption that the 200 OK (F8) to the re-INVITE (F7) contains the same SDP as in message F2. This may not be the case. If it is not, the controller needs to re-INVITE party B with that SDP (say, sdp4), which may result in getting a different SDP, say sdp5, in the 200 OK from party B. Then, the controller needs to re-INVITE party A again, and so on. The result is an infinite loop of re-INVITEs. It is possible to break this cycle by having very smart UAs that can return the same SDP whenever possible, or really smart controllers that can analyze the SDP to determine if a re-INVITE is really needed. However, RFC 3725 recommends keeping this mechanism simple, and avoids SDP awareness in the controller. As a result, this flow is not really workable. This 3PCC call flow is therefore not recommended by RFC 3725.

18.3.1.3 Flows with Unknown Entities

The call flows with the parties whose media compositions for the sessions are not known to the controller are shown in Figure 18.1d. First, the controller sends an INVITE (F1) to user A without any SDP because the controller does not need to assume anything about the media composition of the session. Party A’s phone rings. When party A answers, a 200 OK (F2) is generated containing its offer, offer1. The controller generates an immediate ACK (F3) containing an answer. This answer (F3) is a bh SDP, with its connection address equal to 0.0.0.0. The controller then sends an INVITE (F4) to B without SDP. This causes party B’s phone to ring. When party B answers, a 200 OK (F5) is sent, containing their offer, offer2. This SDP is used to create a re-INVITE (F6) back to party A. That re-INVITE (F6) is based on offer2, but may need to be reorganized to match
up media lines, or to trim media lines. For example, if offer1 contained an audio and a video line, in that order, but offer2 contained just an audio line, the controller would need to add a video line to the offer (setting its port to zero) to create offer2ʹ.

Since this is a re-INVITE (F6), it should complete quickly in the general case. That is good since user B is retransmitting their 200 OK (F5), waiting for an ACK. The SDP in the 200 OK (F7) from A, answer2ʹ, may also need to be reorganized or trimmed before sending it in an ACK (F8) to B as answer2. Finally, an ACK (F9) is sent to A, and then media can flow. This flow has many benefits. First, it will usually operate without any spurious retransmissions or timeouts (although this may still happen if a re-INVITE is not responded to quickly). Secondly, it does not require the controller to guess the media that will be used by the participants. There are some drawbacks. The controller does need to perform SDP manipulations. Specifically, it must take some SDP, and generate another SDP that has the same media composition but has connection addresses equal to 0.0.0.0. This is needed for message F3. Secondly, it may need to reorder and trim SDP Y so that its media lines match up with those in some other SDP, Z. Thirdly, the offer from B (offer2, message F5) may have no codecs or media streams in common with the offer from A (offer1, message F2). The controller will need to detect this condition and terminate the call. Finally, the flow is far more complicated than the simple and elegant call flows shown in Figure 18.1b.

18.3.1.4 Efficient Flows with Unknown Entities

The call flows with reduced complexities shown in Figure 18.1e are a variation of the call flows shown in Figure 18.1d with the parties whose media compositions for the sessions are not known to the controller. The actual message flow is identical, but the SDP placement and construction differ. The initial INVITE (F1) contains SDP with no media at all, meaning that there are no m-lines. This is valid, and implies that the media makeup of the session will be established later through a re-INVITE described in RFC 3264 (see Section 3.8.4). Once the INVITE is received, user A is alerted. When party A answers the call, the 200 OK (F2) has an answer with no media either. This is acknowledged by the controller sending an ACK message (F3). The flow from this point onwards is identical to flows shown in Figure 18.1d. However, the manipulations required to convert offer2 to offer2ʹ, and answer2ʹ to answer2, are much simpler. Indeed, no media manipulations are needed at all.

The only change that is needed is to modify the origin lines, so that the origin line in offer2ʹ is valid based on the value in offer1 (validity requires that the version increments by one, and that the other parameters remain unchanged). There are some limitations associated with this flow. First, user A will be alerted without any media having been established yet. This means that user A will not be able to reject or accept the call based on its media composition. Secondly, both A and B will end up answering the call (i.e., generating a 200 OK) before it is known whether there is compatible media. If there is no media in common, the call can be terminated later with a BYE. However, the users will have already been alerted, resulting in user annoyance and possibly resulting in billing events.

18.3.2 Recommendations for 3PCC Call Setups

The call flows shown in Figure 18.1b represent the simplest and the most efficient flows. This flow should be used by a controller if it knows with certainty that user B is actually an automaton that will answer the call immediately. This is the case for devices such as media servers, conferencing servers, and messaging servers, for example. Since we expect a great deal of 3PCC calls to be placed to automata, special casing in this scenario is reasonable. For calls to unknown entities, or to entities known to represent people, it is recommended that flows shown in Figure 18.1e be used for 3PCC. Call flows shown in Figure 18.1d may be used instead, but they provide no additional benefits over flows of Figure 18.1e. However, flows shown in Figure 18.1c should not be used because of the potential for infinite ping-ponging of re-INVITEs.

Several of these flows use a bh connection address of 0.0.0.0. This is an IPv4 address with the property that packets sent to it will never leave the host that sent them; they are just discarded. Those flows are therefore specific to IPv4. For other network or address types, an address with an equivalent property should be used. In most cases, including the recommended flows, user A will hear silence while the call to B completes. This may not always be ideal. It can be remedied by connecting the caller to a music-on-hold source while the call to B occurs. In addition, RFC 6157 can be used for IPv4 and IPv6 network scenarios that are not shown here.

18.3.3 Multiparty Call Establishment Error Handling

Numerous error cases may occur in establishing the multiparty calls, which merit discussion. With all of the call flows described earlier, one call is established to party A, and then the controller attempts to establish a call to party B. However, this call attempt may fail, for any number of reasons. User B might be busy (resulting in a 486 Busy Here response to the INVITE), there may not be any media in common, the requests may time out, and so on. If the call attempt to B should fail, it is recommended that the controller send a BYE to A. This BYE should include a Reason header defined in RFC 3326 (see Section 2.8), which carries
the status code from the error response. This will inform user A of the precise reason for the failure. The information is important from a user interface perspective. For example, if A was calling from a black phone, and B generated a 486 Busy Here, the BYE will contain a Reason code of 486 Busy Here, and this could be used to generate a local busy signal so that A knows that B is busy.

Another error condition worthy of discussion is shown in Figure 18.1f. After the controller establishes the dialog with party A (messages F1–F3), it attempts to contact B (message F4). Contacting party B may take some time. During that interval, party A could possibly attempt a re-INVITE, providing an updated offer. However, the controller cannot pass this offer on to B, since it has an INVITE transaction pending with it. As a result, the controller needs to reject the request. It is recommended that a 491 Request Pending response be used. The situation here is similar to the glare condition described in RFC 3261, and thus the same error handling is sensible. However, user A is likely to retry its request (as a result of the 491 Request Pending), and this may occur before the exchange with B is completed. In that case, the controller would respond with another 491 Request Pending.

18.3.4 Continued Call Processing in 3PCC

Continuing the call flows of Figure 18.1e after the RTP session (F10) is set up, if the controller receives a re-INVITE from one of the participants, it can forward it to the other participant. Depending on which flow was used, this may require some manipulation of the SDP before passing it on. However, the controller need not proxy the SIP messages received from one of the parties. Since it is a B2BUA, it can invoke any signaling mechanism on each dialog, as it sees fit. For example, if the controller receives a BYE from A, it can generate a new INVITE to a third party, C, and connect B to that participant instead. A call flow for this is shown in Figure 18.2a, assuming the case where C represents an end user, not an automaton.

From here, new parties can be added, removed, transferred, and so on, as the controller sees fit. In many cases, the controller will be required to modify the SDP exchanged between the participants in order to effect these changes. In particular, the version number in the SDP will need to be changed by the controller in certain cases. Should the controller issue an SDP offer on its own (e.g., to place a call on hold), it would need to increment the version number in the SDP offer. The other participant in the call will not know that the controller has done this, and any subsequent offer it generates will have the wrong version number as far as its peer is concerned. As a result, the controller will be required to modify the version number in SDP messages to match what the recipient is expecting. It is important to point out that the call need not have been established by the controller in order for the processing of this section to be used. Rather, the controller could have acted as a B2BUA during a call established by A toward B (or vice versa).

18.3.5 3PCC and Early Media

Early media represents the condition where the session is established (as a result of the completion of an offer–answer exchange), yet the call itself has not been accepted. This is usually used to convey tones or announcements regarding progress of the call. Handling of early media in a third-party call is straightforward. Figure 18.2b shows the case where user B generates early media before answering the call. The flow is almost identical to flows of Figure 18.1e. The only difference is that user B generates a reliable provisional response (F5) defined in RFC 3262 instead of a final response, and answer2 is carried in a PRACK (F9) instead of an ACK. When party B finally does accept the call (F12), there is no change in the session state, and therefore, no signaling needs to be done with user A. The controller simply acknowledges the 200 OK (F13) to confirm the dialog. The case where user A generates early media is more complicated, and is shown in Figure 18.2c. The flow is based on call flows of Figure 18.1e. The controller sends an INVITE (F1) to user A, with an offer containing no media streams. User A generates a reliable provisional response (F2) containing an answer with no media streams.

The controller acknowledges this provisional response (F2) by sending a PRACK (F3). Now, the controller sends an INVITE (F5) without SDP to user B. User B’s phone rings, and the user answers, resulting in a 200 OK (F6) with an offer, offer2. The controller now needs to update the session parameters with user A. However, since the call has not been answered, it cannot use a re-INVITE. Rather, it uses a SIP UPDATE request (F7) defined in RFC 3311, passing the offer (after modifying it to get the origin field correct). User A generates its answer in the 200 OK (F8) to the UPDATE. This answer is passed to user B in the ACK (F9) message. When user A finally answers (F11), there is no change in session state, so the controller simply acknowledges the 200 OK by sending the ACK (F12) message. Note that it is likely that there will be clipping of media in this call flow. User A is likely a PSTN gateway, and has generated a provisional response because of early media from the PSTN side. The PSTN will deliver this media even though the gateway does not have anywhere to send it, since the initial offer from the controller had no media streams. When user B answers, media can begin to flow. However, any media sent to the gateway from the PSTN up to that point will be lost.
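The origin-field handling mentioned in Sections 18.3.4 and 18.3.5 can be illustrated with a small sketch. It assumes the controller remembers the last origin (`o=`) line it sent to a given participant; the function names `bump_version` and `replace_origin`, and the example hosts and identifiers, are hypothetical and chosen only for illustration. The sketch increments the session version by one, leaves the other origin fields unchanged, and substitutes the result into an SDP body received from the other party.

```python
def bump_version(origin_line: str) -> str:
    """Return the origin line with its session version incremented by one.

    An SDP origin line has six fields:
    o=<username> <sess-id> <sess-version> <nettype> <addrtype> <unicast-address>
    """
    if not origin_line.startswith("o="):
        raise ValueError("not an SDP origin line")
    fields = origin_line[2:].split()
    if len(fields) != 6:
        raise ValueError("origin line must have six fields")
    fields[2] = str(int(fields[2]) + 1)
    return "o=" + " ".join(fields)


def replace_origin(sdp: str, origin_line: str) -> str:
    """Return the SDP with its o= line replaced; all other lines are kept as-is."""
    lines = sdp.splitlines()
    rewritten = [origin_line if line.startswith("o=") else line for line in lines]
    return "\r\n".join(rewritten) + "\r\n"


if __name__ == "__main__":
    # Hypothetical origin line the controller last used toward party A.
    last_origin_to_a = "o=controller 1001 1 IN IP4 controller.example.com"
    # Hypothetical offer received from party B.
    offer2 = (
        "v=0\r\n"
        "o=bob 2002 7 IN IP4 b.example.com\r\n"
        "s=-\r\n"
        "c=IN IP4 192.0.2.20\r\n"
        "t=0 0\r\n"
        "m=audio 49170 RTP/AVP 0\r\n"
    )
    print(replace_origin(offer2, bump_version(last_origin_to_a)))
```

Keeping the rewrite confined to the origin line matches the intent stated in the text that the controller should need as little SDP awareness as possible.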
Figure 18.2 3PCC: (a) continued call processing, (b) simple early media, (c) complex early media, (d) controller initiated
SDP preconditions, and (e) party-initiated SDP preconditions. (Copyright IETF. Reproduced with permission.)
18.3.6 3PCC and SDP Preconditions

A SIP extension has been specified that allows for the coupling of signaling and resource reservation described in RFC 3312 (see Section 15.4). This specification relies on exchanges of session descriptions before completion of the call setup. These flows are initiated when certain SDP parameters are passed in the initial INVITE. As a result, the interaction of this mechanism with 3PCC is not obvious, and worth detailing.

18.3.6.1 Initiation by Controller

In one usage scenario, the controller wishes to make use of preconditions in order to avoid the call failure scenarios documented in the earlier section. Specifically, the controller can use preconditions in order to guarantee that neither party is alerted unless there is a common set of media and codecs. It can also provide both parties with information on the media composition of the call before they decide to accept it. The flow for this scenario is shown in Figure 18.2d. In this example, we assume that user B is an automaton or agent of some sort that will answer the call immediately. Therefore, the flow is based on call flows of Figure 18.1b. The controller sends an INVITE to user A containing no SDP, but with a Require header indicating that preconditions are required. This specific scenario (an INVITE without an offer, but with a Require header indicating preconditions) is not described in RFC 3312 (see Section 15.4). It is recommended that the UA server respond with an offer in a 1xx including the media streams it wishes to use for the call, and, for each, list all preconditions it supports as optional. Of course, the user is not alerted at this time. The controller takes this offer and passes it to user B (F3).
User B does not support preconditions, or does but is not interested in them. Therefore, when it answers the call, the 200 OK contains an answer without any preconditions listed (F4). This answer is passed to user A in the PRACK (F6). At this point, user A knows that there are no preconditions actually in use for the call, and therefore, it can alert the user. When the call is answered, user A sends a 200 OK (F8) to the controller and the call is complete. In the event that the offer generated by user A was not acceptable to user B (e.g., because of nonoverlapping codecs or media), user B would immediately reject the INVITE (F3). The controller would then CANCEL the request to user A. In this situation, neither user A nor user B would have been alerted, achieving the desired effect. It is interesting to note that this property is achieved using preconditions even though it does not matter what specific types of preconditions are supported by user A. It is also entirely possible that user B does actually desire preconditions. In that case, it might generate a 1xx of its own with an answer containing preconditions. That answer would still be passed to user A, and both parties would proceed with whatever measures are necessary to meet the preconditions. Neither user would be alerted until the preconditions are met.

18.3.6.2 Initiation by Party A

In the previous section, the controller requested the use of preconditions to achieve a specific goal. It is also possible that the controller does not care (or perhaps does not even know) about preconditions, but one of the participants in the call does care. A call flow for this case is shown in Figure 18.2e. The controller follows the call flows of Figure 18.1e; it has no specific requirements for support of the preconditions specification of RFC 3312 (see Section 15.4). Therefore, it sends an INVITE (F1) with SDP that contains no media lines. User A is interested in supporting preconditions, and does not want to ring its phone until resources are reserved. Since there are no media streams in the INVITE, it cannot reserve resources for media streams, and therefore it cannot ring the phone until they are conveyed in a subsequent offer and then reserved. Therefore, it generates a 183 Session Progress (F2) message with the answer, and does not alert the user. The controller acknowledges this 183 Session Progress provisional response by sending the PRACK message (F3), and A responds to the PRACK (F3) message with 200 OK (F4). At this point, the controller attempts to bring B into the call. It sends B an INVITE without SDP (F5). B is interested in having preconditions for this call. Therefore, it generates its offer in a 183 Session Progress (F6) message that contains the appropriate SDP attributes. The controller passes this offer to A in an UPDATE request (F7).

The controller uses UPDATE because the call has not been answered yet, and therefore, it cannot use a re-INVITE. User A sees that its peer is capable of supporting preconditions. Since it desires preconditions for the call, it generates an answer in the 200 OK (F8) to the UPDATE. This answer, in turn, is passed to B in the PRACK (F9) for the provisional response. Now, both sides perform resource reservation. User A succeeds first, and passes an updated session description in an UPDATE request (F13). The controller simply passes this to B, after the manipulation of the origin field, as required in call flows of Figure 18.1e, in an UPDATE (F14), and the answer (F15) is passed back to A (F16). The same flow happens, but from B to A, when B’s reservation succeeds (F17–F20). Since the preconditions have been met, both sides ring (F21 and F22), and then both answer (F23 and F25), completing the call. What is important about this flow is that the controller does not know anything about preconditions. It merely passes the SDP back and forth as needed. The trick is using UPDATE and PRACK to pass the SDP when needed. That determination is made entirely based on the offer–answer rules described in RFC 3311 (see Section 3.8.3) and RFC 3262 (see Sections 2.5, 2.8.2, and 2.10), and is independent of preconditions.

18.3.7 3PCC Service Examples

We have considered two applications that can be offered using the 3PCC mechanisms: click-to-dial and mid-call announcement. Figure 18.3 shows the call flows for both of these services.

For the click-to-dial service, Figure 18.3a shows the SIP network that contains user A’s phone and browser, controller, and customer service, while Figure 18.3b shows the call flows. Note that both phone and browser capability can also be implemented in a single device. Figure 18.3c depicts the SIP network with user A’s phone, called party B’s phone, controller, and media server, while Figure 18.3d describes the call flows for the mid-call announcement services.

18.3.7.1 Click-to-Dial

In the click-to-dial service, a user is browsing the web page of an e-commerce site and would like to speak to a customer service representative. The user clicks on a link, and a call is placed to a customer service representative. When the representative picks up, the phone on the user’s desk rings. When the user picks up, the customer service representative is there, ready to talk to the user. The call flow for this service is given in Figure 18.3b. It is identical to that of Figure 18.1e, with the exception that the service is triggered through an HTTP POST request when the user clicks on the link. Normally, this POST request would contain neither the number of the user nor that of the customer service representative. The user’s number would typically be obtained by the web application from backend databases, since the user would have
[Figure 18.3 diagram. Entities shown: Party A (prepaid user, SIP UA), Party B (called party, SIP UA), Controller (B2BUA), and Media server, connected through the SIP network. Message sequence recoverable from the diagram: F1 INVITE (sdp, c=bh); F2 200 OK (answer1); F3 ACK; F4 INVITE (no SDP); F5 200 OK (offer2); F6 INVITE (offer2); F7 200 OK (answer2); F8 ACK (answer2); F9 ACK; F10 RTP; F11 BYE; F12 200 OK; F13 INVITE (no SDP); F14 200 OK (offer3); F15 INVITE (offer3'); F16 200 OK (answer3'); F17 ACK (answer3'); F18 ACK; F19 RTP.]
Figure 18.3 3PCC service examples: (a) click-to-dial SIP network, (b) click-to-dial call flows, (c) mid-call announcement
SIP network, and (d) mid-call announcement call flows. (Copyright IETF. Reproduced with permission.)
presumably logged into the site, giving the server the needed context. The customer service number would typically be obtained through provisioning. Thus, the HTTP POST is actually providing the server nothing more than an indication that a call is desired.

We note that this service can be provided through other mechanisms, namely PSTN/Internet Interworking (PINT) described in RFC 2848. However, there are numerous differences between the way in which the service is provided by PINT and the way in which it is provided here as described in RFC 3725:

[...] routed only over the PSTN. This may result in better-quality calls with the PINT solution, depending on the codec in use and quality-of-service capabilities of the network routing the Internet portion of the call.
◾◾ The PINT solution requires extensions to SIP (PINT is an extension to SIP), whereas the solution described here is done with baseline SIP.
◾◾ The PINT solution allows the controller (acting as a PINT client) to step out once the call is established. The solution described here requires the controller to maintain call state for the entire duration of the call.
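The click-to-dial trigger described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not code from RFC 3725: the function `plan_click_to_dial`, the in-memory `user_numbers` mapping standing in for the backend database, and the example SIP URIs are all hypothetical. It reflects two points made in the text: the HTTP POST itself carries no numbers, and the customer service representative is called before the user's phone rings.

```python
def plan_click_to_dial(logged_in_user, user_numbers, service_number):
    """Return the call legs, in order, that the controller would set up.

    logged_in_user: identity taken from the web session, not from the POST body.
    user_numbers:   hypothetical backend lookup table (user id -> SIP URI).
    service_number: provisioned SIP URI of the customer service desk.
    """
    if logged_in_user not in user_numbers:
        raise KeyError("no number on file for user " + repr(logged_in_user))
    user_number = user_numbers[logged_in_user]
    # The representative is contacted first; the user's phone rings afterwards.
    return [service_number, user_number]


if __name__ == "__main__":
    backend = {"alice": "sip:alice@example.com"}
    legs = plan_click_to_dial("alice", backend, "sip:support@example.com")
    print(legs)
```

The resulting list only names the two legs; the signaling on each leg would follow whichever 3PCC flow the controller uses.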
[...] would like the user to hear an announcement that tells them to enter a credit card to continue when this timer fires. Once they enter the credit card info, more money is added to the prepaid card, and the user is reconnected to the destination party. We consider here the usage of 3PCC just for playing the mid-call dialog to collect the credit card information. We assume the call is set up so that the controller is in the call as a B2BUA. We wish to connect the caller to a media server when the timer fires. The flow for this is shown in Figure 18.3d. When the timer expires, the controller places the called party on hold using a connection address of 0.0.0.0 (F1). This effectively disconnects the called party. The controller then sends an INVITE without SDP to the prepaid caller (F4).

The offer returned from the caller (F5) is used in an INVITE to the media server that will be collecting digits (F6). This is an instantiation of call flows of Figure 18.1b. This flow can only be used here because the media server is an automaton, and will answer the INVITE immediately. If the controller was connecting the prepaid user with another end user, call flows of Figure 18.1d would need to be used. The media server returns an immediate 200 OK (F7) with an answer, which is passed to the caller in an ACK (F8). The result is that the media server and the prepaid caller have their media streams connected. The media server plays an announcement and prompts the user to enter a credit card number. After collecting the number, the card number is validated. The media server then passes the card number to the controller (using some means beyond the scope of this specification), and then hangs up the call (F11). After hanging up with the media server, the controller reconnects the user to the original called party. To do this, the controller sends an INVITE without SDP to the called party (F13). The 200 OK (F14) contains an offer, offer3. The controller modifies the SDP as shown in Figure 18.1d, and passes the offer in an INVITE to the prepaid user (F15). The prepaid user generates an answer in a 200 OK (F16), which the controller passes to user B in the ACK (F17). At this point, the caller and called party are reconnected.

18.3.8 3PCC Implementation Recommendations

Most of the work involved in supporting 3PCC is within the controller. A standard SIP UA should be controllable using the mechanisms described here. However, 3PCC relies on a few features that might not be implemented. As such, we recommend that implementers of UA servers support the following as described in RFC 3725:

◾◾ Offers and answers that contain a connection line with an address of 0.0.0.0
◾◾ Re-INVITE requests that change the port to which media should be sent
◾◾ Re-INVITEs that change the connection address
◾◾ Re-INVITEs that add a media stream
◾◾ Re-INVITEs that remove a media stream (setting its port to zero)
◾◾ Re-INVITEs that add a codec among the set in a media stream
◾◾ SDP connection address of zero
◾◾ Initial INVITE requests with a connection address of zero
◾◾ Initial INVITE requests with no SDP
◾◾ Initial INVITE requests with SDP but no media lines
◾◾ Re-INVITEs with no SDP
◾◾ The UPDATE method described in RFC 3311 (see Section 3.8.3)
◾◾ Reliability of provisional responses in RFC 3262 (see Sections 2.5, 2.8.2, and 2.10)
◾◾ Integration of resource management and SIP described in RFC 3312 (see Section 15.4)

18.3.9 Concluding Remarks

Scalable multiparty conferencing with interoperability in multivendor environments needs further attention from researchers worldwide. The large-scale multiparty conferencing architecture is still in the early stage of development, while emerging SIP and non-SIP RFCs mentioned above will provide glimpses on how we need to proceed in offering most of the media-rich, intelligent, real-time networked multimedia services. The ease of use, like clicking on a URL, and scalable multiparty conferencing services with utmost reliability and a variety of application sharing among businesses, universities, governments, academic institutions, and domestic users worldwide are of paramount need. Should these technical challenges be met in the future, multiparty conferencing with audio, video, and data applications would create a new thrilling era of real-time presence, providing a feeling of intimacy as if all parties are close together and geographical distance is merely an illusion.

18.4 Summary

We have described the key problems of offering multiparty conferencing using SIP/SDP. The simple SDP media description architecture that has been very attractive for the initial less media-rich point-to-point call has posed a huge challenge for multiparty multimedia conferencing. The multiparty 3PCC conference architecture has been described using SIP/SDP per RFC 3725, where the controller sets up the multiparty call in a star-like, point-to-point, centralized conferencing topology under many assumptions. Even then, multiparty conferencing call establishment may fail under
many circumstances. We have explained each of those error cases that need to be taken care of. We have also clarified that a huge amount of research work still needs to be done, although there are many other multiparty conferencing-related SIP and non-SIP application protocols like CCMP, BFCP, media channel mixer, and others that are being standardized to fill many of the gaps for centralized conferencing only, but the detailed descriptions of those are beyond the scope of this chapter.

PROBLEMS

1. What are the fundamental functional differences between two-party point-to-point and multiparty conference calls?
2. What are the limitations of single or multiple media negotiations with the media description architecture of multiparty conferencing?
3. Describe the problems of developing a multiparty call control architecture that needs media bridging evolving from the initial two-party point-to-point call using SIP/SDP.
4. What are the assumptions made with the third-party centralized controller to set up the multiparty conference call using SIP/SDP, which is described in RFC 3725?
5. Describe the 3PCC call flows using SIP/SDP for the following: (a) simplest flows, (b) flows with ping-ponging of INVITEs, (c) flows with unknown entities, and (d) efficient flows with unknown entities. Provide detailed descriptions of every header and message-body field of each SIP/SDP message of each call establishment flow of (a) through (d).
6. Describe all limitations of each call establishment flow of (a) through (d) of item 5.
7. What are the recommendations of 3PCC call setups per RFC 3725?
8. How can 3PCC call establishment errors be handled in general?
9. Describe the 3PCC call flows using SIP/SDP for continued call processing along with each header and message-body field of each message.
10. Describe the 3PCC call flows with early media using SIP/SDP for continued call processing along with each header and message-body field of each message.
11. Describe the 3PCC call flows with SDP preconditions using SIP/SDP for the following along with their limitations, if any: (a) initiation by the controller and (b) initiation by the first party (say, A). Provide detailed descriptions of every header and message-body field of each SIP/SDP message of each call establishment flow of (a) and (b).
12. Describe the 3PCC call flows using SIP/SDP for the following services along with their limitations, if any: (a) click-to-dial and (b) mid-call announcement. Provide detailed descriptions of every header and message-body field of each SIP/SDP message of each call establishment flow of (a) and (b).
13. What are the implementation recommendations of 3PCC call setups per RFC 3725?
14. Describe the security recommendations using some examples for the 3PCC call setups and services per RFC 3725: authentication, authorization, end-to-end encryption, and integrity.
15. Describe the limitations, and suggest probable solutions for the worldwide large-scale 3PCC multiparty audio, video, and application sharing (including media bridging) conferencing services architecture recommended in RFC 3725 for providing scalability and interoperability in multivendor environments.
Chapter 19
Abstract

The inherent security capabilities that are available in Session Initiation Protocol (SIP) are discussed in this section. SIP security is required in two levels: session level and media level. Session-level security deals with SIP signaling messages, while SIP media-level security is handled by secure media transport protocols like SRTP and ZRTP, which are extensions of the Real-Time Transport Protocol. We do not deal with media-level security because it in itself needs more detailed treatment that is beyond the scope of this book. However, we explain the Session Description Protocol (SDP) media-level cryptographic features that are used for media streams once the session is set up, and that are negotiated using SIP/SDP messages. First, we explain the session- and media-level security of SIP. Second, the functionality for negotiating the security mechanisms used between a SIP UA and its next-hop SIP entity at the session level is provided. Third, all security mechanisms of SIP are described in detail. Fourth, we explain the SIP session setup that includes security features using an example of a call flow. Fifth, we explain the possible security threats that are being faced in the context of SIP, which is an application-layer protocol. Finally, means to mitigate security threats using existing security mechanisms in SIP are provided. In this context, how the lower-transport-layer security capabilities complement SIP application-layer security features is discussed. However, a separate chapter is devoted to describing privacy and anonymity in SIP.

19.1 Introduction

Security services in Session Initiation Protocol (SIP) are important because SIP services primarily deal with human communications in real time over fixed or mobile wireline or wireless networks. Moreover, networked multimedia RT and near-RT communications not only use SIP/SDP signaling messages for session setups and teardowns, but also use audio, video, and/or data application media protocols and payloads once sessions are set up. SIP communications may include all aspects of human life, including commercial and financial industries, governments, and private households including individuals. Countering sophisticated cyberattacks has made SIP security services more important. In general, five categories of security services are offered in SIP, described in this section as follows:

Authentication: This is the identification of a certain entity and verification of that identity using credentials (e.g., digital signature) that can assure the identification and can be trusted for the communication/transmission. The entity can be a human user, an end device, a consumer/provider of a service, or others. In the case of two parties, termed mutual authentication, the authentication can be made during communication through security key exchanges that authenticate the identity of the peer. It can also be termed identity or entity authentication. Sometimes, a message may also need to be authenticated. In this case, the authentication usually deals with the origin of the message as opposed to the message itself, and is termed data origin authentication. For example, Hypertext Transfer Protocol (HTTP) authentication is used in SIP.
Authorization: This deals with the access rights to various resources, application subsystems, functionality, and data. Various policies, access control models, and security levels are used as means of supporting authorization. They are used to support confidentiality and integrity. For example, SIP call-level priority, the kind of media to be used, resource-level priority, and many other functions in SIP may be subject to authorization. Indirectly, it also implies that the authorized entity will be authenticated before authorization is provided.

Integrity: In information security, integrity ensures that the data used maintains consistency over its entire life cycle and is free from deliberate or accidental modification. Integrity protection assures the recipient that received data or a received message has not been tampered with along the end-to-end transmission path. Integrity protects information against malicious attacks and communication errors and failures. Data/message integrity service is provided in addition to confidentiality in information security.

Confidentiality: This deals only with the secrecy of the information or data that passes from one party to another through the network. That is, data/information can only be kept confidential between the trusted entities. For example, confidentiality enables a message to be encrypted using a key so that only the recipients who possess the appropriate key can retrieve the original message by decrypting it. It does not deal with the privacy of a person, although it offers a kind of privacy service to the information/data. SIP has mechanisms for encrypting messages to provide confidentiality.

Nonrepudiation: This deals with the need to prove that a certain action has been taken by an identity without plausible deniability. It implies that one party of a transaction cannot deny having received a transaction nor can the other party deny having sent a transaction. Nonrepudiation enables holding people accountable for their actions. It provides a verifiable proof that the transaction has taken place between the entities. Nonrepudiation security services can be provided using SIP.

follow entirely different paths. There are two stages of communications: in the first stage, the session is set up and the second stage is followed by the media streams that have been negotiated using the Session Description Protocol (SDP) carried in the SIP message body during the call establishment. Signaling messages use SIP, whereas multimedia streams use entirely different sets of real-time media transport protocols (e.g., Real-Time Transport Protocol [RTP], Secure RTP [SRTP], and ZRTP) and data application-sharing protocols (e.g., ITU-T T-series, instant messaging [IM], e-mails, web applications).

In addition, both SIP signaling messages and media streams may move via different SIP and network functional entities between the source–destination paths. For multiparty conferencing, audio, video, or data application mixing/bridging is needed. Again, these application-layer signaling and media protocols also need the support of the lower-transport-layer protocols (e.g., User Datagram Protocol [UDP], Transmission Control Protocol [TCP], Stream Control Transmission Protocol [SCTP], Transport Layer Security [TLS], and Datagram Transport Layer Security [DTLS]). SIP security involves all of those application and transport layers together in addition to the two-stage approach explained earlier.

19.2.2 Session-Level Security

The security of the SIP session establishment is the most important part of SIP security because all the security features that are negotiated for the media level to be used in the second stage are decided during the session setup. We will see in the subsequent sections that if there is no adequate security for the session establishment, then media-level security cannot be provided. All security mechanisms that are addressed here are primarily focused on the SIP session-level security that includes both SIP signaling message headers and the SIP message body that includes SDP. We start with how SIP security mechanisms can be agreed upon between the communicating entities using SIP security-related headers that were later added to SIP specifications. Furthermore, authentication, authorization, integrity, confidentiality, and nonrepudiation at the SIP session level are discussed in detail in the subsequent sections.
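As a concrete illustration of the HTTP-style authentication named among the five security services, the following minimal sketch computes a Digest response of the kind a SIP UA returns when it is challenged. It assumes the original MD5-based scheme without the qop extension; the helper names (md5_hex, digest_response, verify_response) and all credential values are hypothetical and chosen only for illustration, not taken from any SIP library.

```python
import hashlib
import hmac


def md5_hex(text: str) -> str:
    """Return the lowercase hexadecimal MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce):
    """Compute a Digest response without qop (an assumption of this sketch).

    HA1 binds the user's secret to the realm, HA2 binds the request,
    and the server-supplied nonce ties the result to one challenge.
    """
    ha1 = md5_hex(f"{username}:{realm}:{password}")
    ha2 = md5_hex(f"{method}:{uri}")
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


def verify_response(received, username, realm, password, method, uri, nonce):
    """Server side: recompute the response and compare in constant time."""
    expected = digest_response(username, realm, password, method, uri, nonce)
    return hmac.compare_digest(received, expected)


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    response = digest_response(
        "alice", "example.com", "secret", "REGISTER", "sip:example.com", "abc123"
    )
    print(verify_response(
        response, "alice", "example.com", "secret",
        "REGISTER", "sip:example.com", "abc123",
    ))
```

The password itself never crosses the network; only the digest does, and a fresh nonce per challenge limits the value of a captured response. This sketch covers authentication of the requester only and provides neither confidentiality nor integrity for the rest of the message.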
to the REFER request. Special consideration is warranted for the authorization policies applied to REFER requests and for the use of message/sipfrag (RFC 3420, see Section 2.8.2) to convey the results of the referenced request.

19.2.2.1.1 Constructing a Refer-To Uniform Resource Identifier

This mechanism relies on providing contact information for the referred-to resource to the party being referred. Care should be taken to provide a suitably restricted Uniform Resource Identifier (URI) if the referred-to resource should be protected.

19.2.2.1.2 Authorization Considerations for REFER

We are clarifying some authorization issues specific to the SIP REFER method here, although SIP authorization in general, as specified in Section 19.5, is also applicable to this request message. As described for the REFER method in Section 2.5, an implementation can receive a REFER request with a Refer-To URI containing an arbitrary scheme. For instance, a user could be referred to an online service by using a telnet URI. Customer service could refer a customer to an order-tracking web page using an HTTP URI. The SIP REFER method allows a user agent (UA) to reject a REFER request when it cannot process the referenced scheme. It also requires the UA to obtain authorization from its user before attempting to use the URI. Generally, this could be achieved by prompting the user with the full URI and with a question such as “Do you wish to access this resource (yes/no)?” Of course, URIs can be arbitrarily long and are occasionally constructed with malicious intent, so care should be taken to avoid surprises even in the display of the URI itself (such as partial display or crashing). Furthermore, care should be taken to expose as much information about the reference as possible to the user to mitigate the risk of being misled into a dangerous decision. For instance, the Refer-To header may contain a display name along with the URI. Nothing ensures that any property implied by that display name is shared by the URI. For instance, the display name may contain “secure” or “president” when the URI indicates sip:agent59@telemarketing.example.com. Thus, prompting the user with the display name alone is insufficient.

In some cases, the user can provide authorization for some REFER requests ahead of time by providing policy to the UA. This is appropriate, for instance, for call transfer as described in RFC 5589 (see Section 16.2). Here, a properly authenticated REFER request within an existing SIP dialog to a sip:, sips:, or tel: URI may be accepted through policy without interactively obtaining the user’s authorization. Similarly, it may be appropriate to accept a properly authenticated REFER to an HTTP URI if the referrer is on an explicit list of approved referrers. In the absence of such preprovided authorization, the user must interactively provide authorization to reference the indicated resource. To see the danger of a policy that blindly accepts and acts on an HTTP URI, for example, consider a web server configured to accept requests only from clients behind a small organization’s firewall. As it sits in this soft-creamy-middle environment where the small organization trusts all its members and has little internal security, the web server is frequently behind on maintenance, leaving it vulnerable to attack through maliciously constructed URIs (resulting perhaps in running arbitrary code provided in the URI).

If a SIP UA inside this firewall blindly accepted a reference to an arbitrary HTTP URI, an attacker outside the firewall could compromise the web server. On the other hand, if the UA’s user has to take positive action (such as responding to a prompt) before acting on this URI, the risk is reduced to the same level as the user clicking on the URI in a web browser or e-mail message. The conclusion in the above paragraph generalizes to URIs with an arbitrary scheme. An agent that takes automated action to access a URI with a given scheme risks being used to indirectly attack another host that is vulnerable to some security flaw related to that scheme. This risk and the potential for harm to that other host are heightened when the host and agent reside behind a common policy-enforcement point such as a firewall. Furthermore, this agent increases its exposure to denial-of-service (DoS) attacks through resource exhaustion, especially if each automated action involves opening a new connection. UAs should take care when handing an arbitrary URI to a third-party service such as that provided by some modern operating systems, particularly if the UA is not aware of the scheme and the possible ramifications of using the protocols it indicates. The opportunity for violating the principle of least surprise is very high.

19.2.2.1.3 Considerations for the Use of message/sipfrag

Using message/sipfrag bodies to return the progress and results of a REFER request is extremely powerful. Careless use of that capability can compromise confidentiality and privacy. Here are a couple of simple, somewhat contrived, examples to demonstrate the potential for harm:

◾◾ Circumventing privacy
Suppose Alice has a UA that accepts REFER requests to SIP INVITE URIs, and NOTIFYs the referrer of the progress of the INVITE by copying each response to the INVITE into the body of a NOTIFY. Suppose further that Carol (see Section 2.5 for REFER method) has a reason to avoid Mallory and
has configured her system at her proxy to only accept calls from a certain set of people she trusts (including Alice), so that Mallory does not learn when she is around, or what UA she is actually using. Mallory can send a REFER to Alice, with a Refer-To URI indicating Carol. If Alice can reach Carol, the 200 OK Carol sends gets returned to Mallory in a NOTIFY, letting him know not only that Carol is around, but also the IP address of the agent she is using.

◾◾ Circumventing confidentiality
Suppose Alice, with the same UA as above, is working at a company that is working on the greatest SIP device ever invented: the SIP FOO. The company has been working for months building the device and the marketing materials, carefully keeping the idea, even the name of the idea, secret (since a FOO is one of those things that anybody could do if they just had the idea first). FOO is up and running, and anyone at the company can use it, but it is not available outside the company firewall. Mallory has heard a rumor that Alice’s company is onto something big, and has even managed to get his hands on a URI that he suspects might have something to do with it. He sends a REFER to Alice with the mysterious URI, and as Alice connects to the FOO, Mallory gets NOTIFYs with bodies containing server: FOO/v0.9.7.

◾◾ Limiting the breach
For each of these cases, and in general, returning a carefully selected subset of the information available about the progress of the reference through the NOTIFYs mitigates risk. The minimal implementation for the REFER method described in Section 2.5 exposes the least information about what the agent operating on the REFER request has done, and is least likely to be a useful tool for malicious users.

◾◾ Cut, paste, and replay considerations
The mechanism defined in this specification is not directly susceptible to abuse through copying the message/sipfrag (RFC 3420, see Section 2.8.2) bodies from NOTIFY requests and inserting them, in whole or in part, in future NOTIFY requests associated with the same or a different REFER. Under this specification, the agent replying to the REFER request is in complete control of the content of the bodies of the NOTIFY it sends. There is no mechanism defined here requiring this agent to faithfully forward any information from the referenced party. Thus, saving a body for later replay gives the agent no more ability to affect the mechanism defined in this document at its peer than it has without that body. Similarly, capture of a message/sipfrag body by eavesdroppers will give them no more ability to affect this mechanism than they would have without it.

Future extensions may place additional constraints on the agent responding to REFER to allow using the message/sipfrag body part in a NOTIFY to make statements like “I contacted the party you referred me to, and here is cryptographic proof.” These statements might be used to affect the behavior of the receiving UA. This kind of extension will need to define additional mechanisms to protect itself from copy-based attacks.

19.2.2.2 P-Asserted- and Preferred-Services Identification Security

We have discussed the P-Asserted-Services and P-Preferred-Services Identification header fields in Section 12.3 (RFC 6050). The mechanism provides a partial consideration of the problem of service identification in SIP. For example, these mechanisms provide no means by which end users can securely share service information end-to-end without a trusted service provider. This information is secured by transitive trust, which is only as reliable as the weakest link in the chain of trust. The Trust Domain provides a set of servers where the characteristics of the service are agreed for that service identifier value, and where the calling user is entitled to use that service. RFC 5897 (see Sections 12.1 through 12.3) identifies the impact of allowing such service identifier values to leak outside of the Trust Domain, including implications on interoperability, and stifling of service innovation.

In addition to the interoperability and service-innovation issues, declarative service identification can lead to fraud. If a provider uses the service identifier for billing and accounting purposes, or for authorization purposes, it opens an avenue for attack. The user can construct the signaling message so that its actual effect (which is the service the user will receive) is what the user desires, but the user places a service identifier into the request (which is what is used for billing and authorization) that identifies a cheaper service, or one that the user is not authorized to receive. In such a case, the user will receive service, and not be billed properly for it. If, however, the domain administrator derived the service identifier from the signaling itself (derived service identification), the user cannot lie. If they did lie, they would not get the desired service. Consider the example of Internet Protocol television (IPTV) versus multimedia conferencing. If multimedia conferencing is cheaper, the user could send an INVITE for an IPTV session, but include a service identifier that indicates multimedia conferencing.

The user gets the service associated with IPTV, but at the cost of multimedia conferencing. This same principle shows up in other places, for example, in the identification of an emergency services call (see Section 16.11). It is desirable to give emergency services calls special treatment, such as being free and authorized even when the user cannot otherwise make calls, and to give them priority. If emergency calls were
indicated through something other than the target of the call being an emergency services Uniform Resource Name (URN) (RFC 5031, see Section 16.11.2), it would open an avenue for fraud. The user could place any desired URI in the request-URI, and indicate separately, through a declarative identifier, that the call is an emergency services call. This would then get special treatment but of course would get routed to the target URI. The only way to prevent this fraud is to consider an emergency call as any call whose target is an emergency services URN. Thus, the service identification here is based on the target of the request. When the target is an emergency services URN, the request can get special treatment. The user cannot lie, since there is no way to separately indicate that this is an emergency call, besides targeting it to an emergency URN.

RFC 5876 (see Section 10.4.4) raises the following additional security considerations. When adding a P-Asserted-Identity header field to a message, an entity must have authenticated the source of the message by some means. One means is to challenge the sender of a message to provide SIP Digest authentication. Responses cannot be challenged, and ACK and CANCEL requests cannot be challenged either. Therefore, this document limits the use of P-Asserted-Identity to requests other than ACK and CANCEL. When sending a request containing the P-Asserted-Identity header field and also the Privacy header field with value “id” to a node within the Trust Domain, special considerations apply if that node does not support this specification. Section 10.5.2 makes a special provision for this case.

When receiving a request containing a P-Asserted-Identity header field, a proxy will trust the assertion only if the source is known to be within the Trust Domain and behaves in accordance with a Spec(T), which defines the security requirements. This applies regardless of the nature of the resource (UA or proxy). One example where a trusted source might be a UA is a public switched telephone network (PSTN) gateway. In this case, the UA can assert an identity received from the PSTN, with the proxy itself having no means to authenticate such an identity. A SIP entity must not trust an identity asserted by a source outside the Trust Domain. Typically, a UA under the control of an individual user (such as a desk phone or mobile phone) should not be considered part of a Trust Domain. When receiving a response from a node outside the Trust Domain, a proxy has no standardized SIP means to authenticate the source of the response. For this reason, this document does not specify the use of P-Asserted-Identity or P-Preferred-Identity in responses.

party. Once this information is given out, any use may be made of it, including relaying to a third party as in this specification. A REFER-Issuer must not create or guess feature tags. Instead, a feature tag included in a REFER should be discovered in an authenticated and secure manner (such as from an OPTIONS response or from a remote target URI in a dialog) directly from the REFER-Target. It is recommended that the REFER-Issuer include in the Refer-To header field all feature tags that were listed in the most recent Contact header field of the REFER-Target.

A feature tag provided by a REFER-Issuer cannot be authenticated or certified directly from the REFER request. As such, the REFER-Recipient must treat the information as a hint. If the REFER-Recipient application logic or the user’s action depends on the presence of the expressed feature, the feature tag can be verified. For example, in order to do so, the REFER-Recipient can directly send an OPTIONS query to the REFER-Target over a secure (e.g., mutually authenticated and integrity-protected) connection. This protects the REFER-Recipient against the sending of incorrect or malicious feature tags.

19.2.2.4 Offer–Answer Security

There are numerous attacks possible if an attacker can modify the offers or answers described in Section 3.8.4 in transit. Generally, these include diversion of media streams (enabling eavesdropping), disabling of calls, and injection of unwanted media streams. If a passive listener can construct fake offers, and inject those into an exchange, similar attacks are possible. Even if an attacker can simply observe offers and answers, they can inject media streams into an existing conversation. Offer–answer relies on transport within an application signaling protocol, such as SIP. It also relies on that protocol for security capabilities. Because of the attacks described above, that protocol must provide a means for end-to-end authentication and integrity protection of offers and answers. It should offer encryption of bodies to prevent eavesdropping. However, media injection attacks can alternatively be resolved through authenticated media exchange, and therefore the encryption requirement is a “should” instead of a “must.” Replay attacks are also problematic. An attacker can replay an old offer, perhaps one that had put media on hold, and thus disable media streams in a conversation. Therefore, the application protocol must provide a secure way to sequence offers and answers, and to detect and reject old offers or answers, and SIP meets all of these requirements.
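The sequencing requirement for offers and answers can be sketched as follows. In SIP, requests within a dialog carry a CSeq number that increases with each new request, so a receiver can refuse a replayed offer whose CSeq is not higher than the last one it accepted in that dialog. The class and method names below are hypothetical, and the sketch assumes that each message has already passed authentication and integrity checks, because an unprotected CSeq value could simply be forged by the attacker.

```python
class DialogOfferTracker:
    """Tracks the highest CSeq accepted per dialog (hypothetical helper)."""

    def __init__(self):
        # Maps a dialog identifier (for example, Call-ID plus both tags)
        # to the highest CSeq number accepted so far in that dialog.
        self._last_cseq = {}

    def accept_offer(self, dialog_id: str, cseq: int) -> bool:
        """Return True if the offer is new, False if it looks replayed."""
        last = self._last_cseq.get(dialog_id)
        if last is not None and cseq <= last:
            # An old or repeated sequence number: treat it as stale and
            # leave the recorded state untouched.
            return False
        self._last_cseq[dialog_id] = cseq
        return True


if __name__ == "__main__":
    tracker = DialogOfferTracker()
    dialog = "call-1;from-tag=a;to-tag=b"  # illustrative identifier only
    print(tracker.accept_offer(dialog, 1))  # first offer in the dialog
    print(tracker.accept_offer(dialog, 2))  # newer offer, e.g., a hold
    print(tracker.accept_offer(dialog, 1))  # replay of the old offer
```

Keeping the state per dialog means that a sequence number captured in one call cannot be used to judge, or to poison, another call. Retransmissions of the same request are normally absorbed at the transaction layer before this kind of check would run, which is why the sketch treats an equal CSeq as stale.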
Internet. Rather, the intention is to state requirements for a mechanism to be used within a community of devices that are known to obey the specification of the mechanism (Spec(T)) and between which there are secure connections. Such a community is known here as a Trust Domain. The requirements on the mechanisms used for security and to initially derive the NAI must be part of the specification Spec(T). The requirements also support the transfer of information from a node within the Trust Domain, via a secure connection, to a node outside the Trust Domain. Use of this mechanism in any other context has serious security shortcomings, namely that there is absolutely no guarantee that the information has not been modified, or was even correct in the first place.

19.2.2.6 Location Information Security

RFC 6442 (see Section 9.10), which specifies the target’s location information creation and distribution, describes the security of the target. Conveyance of the physical location of a UA raises privacy concerns (see Section 20.2), and depending on use, there probably will be authentication and integrity concerns. RFC 6442 (see Section 9.10) calls for conveyance to be accomplished through secure mechanisms, like Secure/Multipurpose Internet Mail Extension (S/MIME) (see Section 19.6) encrypting message bodies (although this is not widely deployed), or TLS protecting the overall signaling, or conveyance of location-by-reference and requiring all entities that dereference location to authenticate themselves. In location-based routing cases, encrypting the location payload with an end-to-end mechanism such as S/MIME is problematic because one or more proxies on the path need the ability to read the location information to retarget the message to the appropriate new destination UA server (UAS). Data can only be encrypted to a particular, anticipated target, and thus if multiple recipients need to inspect a piece of data, and those recipients cannot be predicted by the sender of data, encryption is not a very feasible choice. Securing the location hop-by-hop, using TLS, protects the message from eavesdropping and modification in transit, but exposes the information to all proxies on the path as well as the end point.

In most cases, the UA has no trust relationship with the proxy or proxies providing location-based routing services, so such end-to-middle solutions might not be appropriate either. When location information is conveyed by reference, however, one can properly authenticate and authorize each entity that wishes to inspect location information. This does not require that the sender of data anticipate who will receive data, and it does permit multiple entities to receive it securely; however, it is not, effectively unauthenticated access to location information must be permitted. In this case, choosing pseudorandom URIs for location-by-reference, coupled with path encryption like SIP Secure (SIPS), can help ensure that only entities on the SIP signaling path learn the URI. Thus, it can restore rough parity with sending location-by-value. Location information is especially sensitive when the identity of its target is obvious. Note that there is the ability, according to RFC 3693 (see Sections 2.8 and 9.10.1), to have an anonymous identity for the target’s location. This is accomplished by the use of an unlinkable pseudonym in the entity= attribute of the <presence> element defined in RFC 4479. However, this can be problematic for routing messages based on location covered in RFC 4479.

Moreover, anyone fishing for information would correlate the identity at the SIP layer with that of the location information referenced by SIP signaling. When a UA inserts location, the UA sets the policy on whether to reveal its location along the signaling path, as discussed earlier, as well as flags in the PIDF-LO defined in RFC 4119 (see Section 2.8). UA client (UAC) implementations must make such capabilities conditional on explicit user permission, and must alert the user that location is being conveyed. This SIP extension offers the default ability to require permission to process location while the SIP request is in transit. The default for this is set to no. There is an error explicitly describing how an intermediary asks for permission to view the target’s location, plus a rule stating the user has to be made aware of this permission request. There is no end-to-end integrity on any locationValue or locationErrorValue header field parameter (or middle-to-end if the value was inserted by an intermediary), so recipients of either header field need to implicitly trust the header field contents, and take whatever precautions each entity deems appropriate given this situation.

19.2.2.7 Early-Session Option Tags Security

We have discussed the Early-Session option tags in Section 11.4 in the context of the early-media solution model with the Disposition-Type SIP message header. The security implications of using early-session bodies in SIP are the same as when using session bodies; they are part of the offer–answer model. SIP uses the offer–answer model to establish early sessions in both the gateway and the application server models. UAs generate a session description, which contains the transport address (i.e., IP address plus port) where they want to receive media, and send it to their peer in a SIP message. When media packets arrive at this transport address, the UA assumes that they come from the receiver of the SIP message carrying the session description. Nevertheless, attackers may
it does not obviate the need for preassociation between the attempt to gain access to the contents of the SIP message and
sender of data and any prospective recipients. Obviously, in send packets to the transport address contained in the session
some contexts, this preassociation cannot be presumed; when description.
To prevent this situation, UAs should encrypt their session descriptions (e.g., using S/MIME). Still, even if a UA encrypts its session descriptions, an attacker may try to guess the transport address used by the UA and send media packets to that address. Guessing such a transport address is sometimes easier than it may seem because many UAs always pick up the same initial media port. To prevent this situation, UAs should use media-level authentication mechanisms (e.g., SRTP; see Section 7.3). In addition, UAs that wish to keep their communications confidential should use media-level encryption mechanisms (e.g., SRTP).
Attackers may attempt to make a UA send media to a victim as part of a DoS attack. This can be done by sending a session description with the victim’s transport address to the UA. To prevent this attack, the UA should engage in a handshake with the owner of the transport address received in a session description (just verifying willingness to receive media) before sending a large amount of data to the transport address. This check can be performed by using a connection-oriented transport protocol, by using STUN (see Section 14.3.2.1) in an end-to-end fashion, or by the key exchange in SRTP (see Section 7.3). In any event, note that the previous security considerations are not early-media specific, but apply to the usage of the offer–answer model in SIP to establish sessions in general.
Additionally, an early-media-specific risk (roughly speaking, an equivalent to forms of toll fraud in the PSTN) attempts to exploit the different charging policies some operators apply to early and to regular media. When UAs are allowed to exchange early media for free, but are required to pay for regular media sessions, rogue UAs may try to establish a bidirectional early-media session and never send a 2xx response for the INVITE. On the other hand, some application servers (e.g., interactive voice response systems) use bidirectional early media to obtain information from the callers (e.g., the personal identification number [PIN] code of a calling card). Thus, we do not recommend that operators disallow bidirectional early media. Instead, operators should consider a remedy of charging early-media exchanges that last too long, or stopping them at the media level (according to the operator’s policy).
19.2.2.8 Early-Media Security
We have described early media in SIP in Chapter 11. The use of this P-Early-Media header field is only applicable inside a Trust Domain, as defined in RFC 3324 (see Section 10.5). As stated earlier, this header does not offer a general early-media authorization model suitable for interdomain use or use in the Internet at large. No confidentiality concerns are associated with the P-Early-Media header field. It is desirable to maintain the integrity of the direction parameters in the header field across each hop between servers to avoid the potential for unauthorized use of early media. It is assumed that the P-Early-Media header field is used within the context of a given administrative Trust Domain (e.g., Third Generation Partnership Project IP multimedia subsystem [3GPP IMS] network) or a similar Trust Domain, consisting of a collection of SIP servers maintaining pairwise security associations.
Within the Trust Domain of a network, it is only necessary to police the use of the P-Early-Media header field at the boundary to user equipment served by the network and at the boundary to peer networks. It is assumed that boundary servers in the Trust Domain of a network will have local policy for the treatment of the P-Early-Media header field as it is sent to or received from any possible server external to the network. Since boundary servers are free to modify or remove any P-Early-Media header field in SIP messages forwarded across the boundary, the integrity of the P-Early-Media header field can be verified to the extent that the connections to external servers are secured. The authenticity of the P-Early-Media header field can only be assured to the extent that the external servers are trusted to police the authenticity of the header field.
19.2.2.9 Served-User Identification Security
We have described the Served-User Identification in SIP in Section 12.4. The P-Served-User header field defined in this document is to be used in an environment where elements are trusted and where attackers are not supposed to have access to the protocol messages between those elements. Traffic protection between network elements is sometimes achieved by using IPsec, and sometimes by physically protecting the network. In any case, the environment where the P-Served-User header field will be used ensures the integrity and the confidentiality of the contents of this header field. The Spec(T) that defines the Trust Domain for P-Served-User must require that member nodes understand the P-Served-User header extension.
There is a security risk if a P-Served-User header field is allowed to propagate out of the Trust Domain where it was generated. In that case, user-sensitive information would be revealed. To prevent such a breach from occurring, proxies must not insert the header when forwarding requests to a hop that is located outside the Trust Domain. When forwarding the request to a node in the Trust Domain, proxies must not insert the header unless they have sufficient knowledge that the route set includes another proxy in the Trust Domain that understands the header, such as the home proxy. There is no automatic mechanism to learn the support for this specification. Proxies must remove the header when forwarding requests to nodes that are not in the Trust Domain or when the proxy does not have knowledge of any other proxy included in the route set that will remove it before it is routed to any node that is not in the Trust Domain.
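The insert and remove rules for the P-Served-User header field at a Trust Domain boundary can be pictured as two small predicates. The following is only a minimal sketch: the function names, the boolean inputs, and the representation of a request's header fields as a plain dictionary are all assumptions made for illustration and do not come from any SIP stack. How a proxy actually learns whether the next hop is trusted, or whether a later proxy in the route set understands the header, is a matter of configuration, since no automatic discovery mechanism exists.

```python
def may_insert_served_user(next_hop_in_trust_domain: bool,
                           trusted_supporter_in_route_set: bool) -> bool:
    """Hypothetical helper: a proxy may add P-Served-User only toward a hop in
    the Trust Domain, and only if it knows that another proxy in the route set
    (inside the Trust Domain) understands the header."""
    return next_hop_in_trust_domain and trusted_supporter_in_route_set


def headers_to_forward(headers: dict,
                       next_hop_in_trust_domain: bool,
                       trusted_remover_in_route_set: bool) -> dict:
    """Hypothetical helper: return a copy of the header fields with
    P-Served-User stripped unless it is safe to keep it."""
    keep = next_hop_in_trust_domain and trusted_remover_in_route_set
    if keep:
        return dict(headers)
    # SIP header field names are case-insensitive, so compare in lower case.
    return {name: value for name, value in headers.items()
            if name.lower() != "p-served-user"}
```

Both functions default to the conservative outcome: unless both conditions are known to hold, the header is neither inserted nor propagated.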
19.2.2.10 Connection Reuse Security
RFC 5923 (see Section 13.2.8) presents requirements and a mechanism for reusing existing connections easily. Unauthenticated connection reuse would present many opportunities for rampant abuse and hijacking. Authenticating connection aliases is essential to prevent connection hijacking. For example, a program run by a malicious user of a multiuser system could attempt to hijack SIP requests destined for the well-known SIP port from a large relay proxy.
19.2.2.10.1 Authenticating TLS Connections: Client View
When a TLS client establishes a connection with a server, it is presented with the server’s X.509 certificate. Authentication proceeds as described in RFC 5922 (see Section 19.4.6).
19.2.2.10.2 Authenticating TLS Connections: Server View
A TLS server conformant to this specification must ask for a client certificate; if the client possesses a certificate, it will be presented to the server for mutual authentication, and authentication proceeds as described in RFC 5922 (see Section 19.4.6). If the client does not present a certificate, the server must proceed as if the alias header field parameter was not present in the topmost Via header. In this case, the server must not update the alias table.
19.2.2.10.3 Connection Reuse and Virtual Servers
Virtual servers present special considerations for connection reuse. Under the name-based virtual server scheme, one SIP proxy can host many virtual domains using one IP address and port number. If adequate defenses are not put in place, a connection opened to a downstream server on behalf of one domain can be reused to send requests in the backwards direction to a different domain. The Destination Identity column in the alias table has been added to aid in such defenses. Virtual servers must only perform connection reuse for TLS connections; virtual servers must not perform connection reuse for other connection-oriented transports. To understand why this is the case, note that the alias table caches not only which connections go to which destination addresses, but also which connections have authenticated themselves as responsible for which domains. If a message is to be sent in the backwards direction to a new SIP domain that resolves to an address with a cached connection, the cached connection cannot be used because it is not authenticated for the new domain.
As an example, consider a proxy P1 that hosts two virtual domains, example.com and example.net, on the same IP address and port. RFC 3263 (see Section 8.2.4) server resolution is set up such that Domain Name System (DNS) lookups of example.com and example.net both resolve to an {IP-address, port, transport} tuple of {192.0.2.1, 5061, TLS}. A UA in the example.com domain sends a request to P1 causing it to make a downstream connection to its peering proxy, P2, and authenticating itself as a proxy in the example.com domain by sending it an X.509 certificate asserting such an identity. P2’s alias table now looks like the following:
Destination IP Address   Destination Port   Destination Transport   Destination Identity   Alias Descriptor
…
192.0.2.1                5061               TLS                     sip:example.com        18
At some later point in time, a UA in P2’s domain wants to send a request to a UA in the example.net domain. P2 performs an RFC 3263 (see Section 8.2.4) server resolution process on sips:example.net to derive a resolved address tuple {192.0.2.1, 5061, TLS}. It appears that a connection to this network address is already cached in the alias table; however, P2 cannot reuse this connection because the destination identity (sip:example.com) does not match the server identity used for RFC 3263 resolution (sips:example.net). Hence, P2 will open up a new connection to the example.net virtual domain hosted on P1. P2’s alias table will now look like
Destination IP Address   Destination Port   Destination Transport   Destination Identity   Alias Descriptor
…
192.0.2.1                5061               TLS                     sip:example.com        18
192.0.2.1                5061               TLS                     sip:example.net        54
The identities conveyed in an X.509 certificate are associated with a specific TLS connection. With the absence of such a guarantee of an identity tied to a specific connection, a normal TCP or SCTP connection cannot be used to send requests in the backwards direction without a significant risk of inadvertent (or otherwise) connection hijacking. The above discussion details the impact on P2 when connection reuse is desired for virtual servers. There is a subtle but important impact on P1 as well. P1 should keep separate alias tables for the requests served from the UAs in the example.com domain from those served by the UAs in the example.net domain. This is so that the boundary between the two domains is preserved; P1 must not open a connection on behalf of one domain and reuse it to send a new request on behalf of another domain.
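The virtual-server rule (a cached connection may be reused only over TLS, and only when the identity it authenticated for matches the domain being resolved) amounts to a lookup keyed on more than the address tuple. The sketch below is illustrative only: the `AliasEntry` type, the `find_reusable` function, and the choice to store the destination identity as a bare domain name rather than a full SIP URI are assumptions of this example, not structures defined by RFC 5923.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class AliasEntry:
    """One row of a hypothetical alias table."""
    ip: str
    port: int
    transport: str
    identity: str    # domain the peer authenticated for (simplified to a host name)
    descriptor: int  # handle of the cached connection


def find_reusable(table: Iterable[AliasEntry], ip: str, port: int,
                  transport: str, domain: str) -> Optional[int]:
    """Return the descriptor of a cached connection that may be reused, or None."""
    # Virtual servers reuse connections only over TLS.
    if transport != "TLS":
        return None
    for entry in table:
        same_address = (entry.ip, entry.port, entry.transport) == (ip, port, transport)
        # The address tuple alone is not enough: the cached connection must
        # also have authenticated for the domain we are resolving.
        if same_address and entry.identity == domain:
            return entry.descriptor
    return None


# P2's table after P1 connected on behalf of example.com.
table = [AliasEntry("192.0.2.1", 5061, "TLS", "example.com", 18)]
# Same address tuple, different domain: no reuse, so P2 opens a new connection.
reuse = find_reusable(table, "192.0.2.1", 5061, "TLS", "example.net")
```

In this sketch `reuse` is `None`, which corresponds to P2 opening a second connection and recording it under a new alias descriptor.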
19.2.2.11 Loss-Based Overload Control Security
We have described the overload control in SIP networks in Section 13.3. Overload control mechanisms can be used by an attacker to conduct a DoS attack on a SIP entity if the attacker can pretend that the SIP entity is overloaded. When such a forged overload indication is received by an upstream SIP client, it will stop sending traffic to the victim. Thus, the victim is subject to a DoS attack. To better understand the threat model, consider the diagram in Figure 19.1.
[Figure 19.1: upstream clients (Pa, ...) connect over link L1 to proxy P1, which connects over link L2 to downstream servers (Pb, ...). Requests flow downstream, from left to right; responses flow upstream.]
Here, requests travel downstream from the left-hand side, through proxy P1, toward the right-hand side; responses travel upstream from the right-hand side, through P1, toward the left-hand side. Proxies Pa, Pb, and P1 support overload control. L1 and L2 are labels for the links connecting P1 to the upstream clients and downstream servers. If an attacker is able to modify traffic between Pa and P1 on link L1, it can cause a DoS attack on P1 by having Pa not send any traffic to P1. Such an attack can proceed by the attacker modifying the response from P1 to Pa such that Pa’s Via header is changed to indicate that all requests destined toward P1 should be dropped. Conversely, the attacker can simply remove any oc, oc-validity, and oc-seq markings added by P1 in a response to Pa. In such a case, the attacker will force P1 into overload by denying request quenching at Pa even though Pa is capable of performing overload control.
Similarly, if an attacker is able to modify traffic between P1 and Pb on link L2, it can change the Via header associated with P1 in a response from Pb to P1 such that all subsequent requests destined toward Pb from P1 are dropped. In essence, the attacker mounts a DoS attack on Pb by indicating false overload control. Note that it is immaterial whether Pb supports overload control or not; the attack will succeed as long as the attacker is able to control L2. Conversely, an attacker can suppress a genuine overload condition at Pb by simply removing any oc, oc-validity, and oc-seq markings added by Pb in a response to P1. In such a case, the attacker will force P1 into sending requests to Pb even under overload conditions because P1 would not be aware that Pb supports overload control. Attacks that indicate false overload control are best mitigated by using TLS in conjunction with applying BCP 38 (RFC 2827). Attacks that are mounted to suppress genuine overload conditions can be similarly avoided by using TLS on the connection. Generally, TCP or WebSockets (RFC 6455) in conjunction with BCP 38 makes it more difficult for an attacker to insert or modify messages, but may still prove inadequate against an adversary that controls links L1 and L2. TLS provides the best protection from an attacker with access to the network links.
Another way to conduct an attack is to send a message containing a high overload feedback value through a proxy that does not support this extension. If this feedback is added to the second Via header (or all Via headers), it will reach the next upstream proxy. If the attacker can make the recipient believe that the overload status was created by its direct downstream neighbor (and not by the attacker further downstream), the recipient stops sending traffic to the victim. A precondition for this attack is that the victim proxy does not support this extension since it would not pass through overload control feedback otherwise. A malicious SIP entity could gain an advantage by pretending to support this specification but never reducing the amount of traffic it forwards to the downstream neighbor. If its downstream neighbor receives traffic from multiple sources that correctly implement overload control, the malicious SIP entity would benefit since all other sources to its downstream neighbor would reduce load. Note that the solution to this problem depends on the overload control method. With rate-based, window-based, and other similar overload control algorithms that promise to produce no more than a specified number of requests per unit time, the overloaded server can regulate the traffic arriving to it.
However, when using loss-based overload control, such policing is not always obvious since the load forwarded depends on the load received by the client. To prevent such attacks, servers should monitor client behavior to determine whether they are complying with overload control policies. If a client is not conforming to such policies, then the server should treat it as a nonsupporting client described earlier.
Finally, a distributed DoS (DDoS) attack could cause an honest server to start signaling an overload condition. Such a DDoS attack could be mounted without controlling the communications links since the attack simply depends on
the attacker injecting a large volume of packets on the communication links. If the honest server attacked by a DDoS attack has a long oc-validity interval and the attacker can guess this interval, the attacker can keep the server overloaded by synchronizing the DDoS traffic with the validity period. While such an attack may be relatively easy to spot, mechanisms for combating it are outside the scope of this document and, of course, since attackers can invent new variations, the appropriate mechanisms are likely to change over time.
19.2.2.12 Session Border Controller Security
We have described the session border controller (SBC) in SIP networks in Section 14.2. Many of the functions of the SBC have important security and privacy implications. One major security problem is that many functions implemented by SBCs (e.g., topology hiding and media traffic management) modify SIP messages and their bodies without the UAs’ consent. The result is that the UAs may interpret the actions taken by an SBC as a man-in-the-middle (MITM) attack. SBCs modify SIP messages because it allows them to, for example, protect elements in the inner network from direct attacks. SBCs that place themselves (or another entity) on the media path can be used to eavesdrop on conversations. Since, often, UAs cannot distinguish between the actions of an attacker and those of an SBC, users cannot know whether they are being eavesdropped on or if an SBC on the path is performing some other function. SBCs place themselves on the media path because it allows them to, for example, perform legal interception.
On a general level, SBCs prevent the use of end-to-end authentication. This is because SBCs need to be able to perform actions that look like MITM attacks, and in order for UAs to communicate, they must allow those types of attacks. In other words, UAs cannot use end-to-end security. This is especially harmful because other network elements, besides SBCs, are then able to do similar attacks. However, in some cases, UAs can establish encrypted media connections between one another. One example is a scenario where an SBC is used for enabling media monitoring but not for interception.
An SBC is a single point of failure from the architectural point of view. This makes it an attractive target for DoS attacks. The fact that some functions of SBCs require those SBCs to maintain session-specific information makes the situation even worse. If the SBC crashes (or is brought down by an attacker), ongoing sessions experience undetermined behavior. If the Internet Engineering Task Force (IETF) decides to develop standard mechanisms to address the requirements presented in Section 14.2.4, the security- and privacy-related aspects of those mechanisms will, of course, need to be taken into consideration.
19.2.2.13 Resource-Priority Security
19.2.2.13.1 Background
We have described resource priority in SIP in Section 15.2. Any resource-priority mechanism can be abused to obtain resources and thus deny service to other users. An adversary may be able to take over a particular PSTN gateway, cause additional congestion during emergencies affecting the PSTN, or deny service to legitimate users. In SIP end systems, such as IP phones, this mechanism could inappropriately terminate existing sessions and calls. Thus, while the indication itself does not have to provide separate authentication, SIP requests containing this header are very likely to have higher authentication requirements than those without. These authentication and authorization requirements extend to users within the administrative domain, as later interconnection with other administrative domains may invalidate earlier assumptions on the trustworthiness of users. Below, we describe authentication and authorization aspects, confidentiality and privacy requirements, protection against DoS attacks, and anonymity requirements. Naturally, the general discussion in RFC 3261 (see Section 19.9) applies. All UAs and proxy servers that support this extension must implement SIP over TLS specified in RFC 3546, the sips URI scheme as described in RFC 3261 (see Section 19.12.2), and Digest Authentication (see Section 19.4.5) as described in RFC 3261. In addition, UAs that support this extension should also implement S/MIME (see Section 19.6) as described in RFC 3261 to allow for signing and verification of signatures over requests that use this extension.
19.2.2.13.2 Authentication and Authorization
Prioritized access to network and end-system resources imposes particularly stringent requirements on authentication and authorization mechanisms, since access to prioritized resources may influence overall system stability and performance and not just result in theft of, say, a single phone call. Under certain emergency conditions, the network infrastructure, including its authentication and authorization mechanism, may be under attack. Given the urgency during emergency events, normal statistical fraud detection may be less effective, thus placing a premium on reliable authentication. Common requirements for authentication mechanisms apply, such as resistance to replay, cut-and-paste, and bid-down attacks.
Authentication may be SIP based or use other mechanisms. Use of Digest authentication or S/MIME is recommended for UAS authentication. Digest authentication requires that the parties share a common secret, thus limiting its use across administrative domains. SIP systems employing resource priority should implement S/MIME at least for integrity, as described in RFC 3261 (see Section 19.6).
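To see why Digest authentication depends on a shared secret, it helps to look at how a response value is computed. The sketch below follows the basic HTTP Digest calculation with MD5 and no qop parameter, which is the simplest form of the scheme; the function names and the sample credentials are assumptions made for illustration, and a deployed UA would normally also handle qop, nonce counts, and other algorithms.

```python
import hashlib
import hmac


def _md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username: str, realm: str, password: str,
                    method: str, uri: str, nonce: str) -> str:
    """Compute a Digest response (MD5, no qop) from the shared password."""
    ha1 = _md5_hex(f"{username}:{realm}:{password}")  # depends on the secret
    ha2 = _md5_hex(f"{method}:{uri}")                 # binds method and request URI
    return _md5_hex(f"{ha1}:{nonce}:{ha2}")


def verify(stored_password: str, username: str, realm: str,
           method: str, uri: str, nonce: str, received: str) -> bool:
    """The verifier can only check the response if it knows the same password."""
    expected = digest_response(username, realm, stored_password, method, uri, nonce)
    return hmac.compare_digest(expected, received)


# Hypothetical credentials and nonce, for illustration only.
resp = digest_response("alice", "example.com", "s3cret",
                       "INVITE", "sip:bob@example.net", "abc123")
ok = verify("s3cret", "alice", "example.com",
            "INVITE", "sip:bob@example.net", "abc123", resp)
```

Because both sides must hold the same password (or a value derived from it) to compute `ha1`, a server in a different administrative domain, which has no such secret for the caller, cannot validate the response.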
19.2.2.16 Call Transfer Security
The call transfer that is explained in detail in Section 16.2 is implemented using the REFER and Replaces call control primitives in SIP. As such, the security considerations detailed in the REFER and Replaces specifications must be followed. The issue of protecting the address of record (AOR) URI of a Transfer Target is addressed in Sections 16.2.6.2 through 16.2.7. Any REFER request must be appropriately authenticated and authorized using standard SIP mechanisms or else calls may be hijacked. A UA may use local policy or human intervention in deciding whether or not to accept a REFER. In generating NOTIFY responses based on the outcome of the triggered request, care should be taken in constructing the message/sipfrag body to ensure that no private information is leaked. An INVITE containing a Replaces header field should only be accepted if it has been properly authenticated and authorized using standard SIP mechanisms, and the requestor is authorized to perform dialog replacement. Special care is needed if the replaced dialog utilizes additional media streams compared with the original dialog. In this case, the user must authorize the addition of new media streams in a dialog replacement. For example, the same mechanism used to authorize the addition of a media stream in a re-INVITE could be used.
19.2.2.17 Call Referring to Multiple Resources Security
We have described referring calls to multiple resources in Section 16.5. Given that a REFER-Recipient accepting REFER requests with multiple REFER-targets acts as a URI-list service, implementations of this type of server MUST follow the security-related rules that include opt-in lists and mandatory authentication and authorization of clients. Additionally, REFER-Recipients should only accept REFER requests within the context of an application that the REFER-Recipient understands (e.g., a conferencing application). This implies that REFER-Recipients must not accept REFER requests for methods they do not understand. The idea behind these two rules is that REFER-Recipients are not used as dumb servers whose only function is to fan out random messages they do not understand.
19.2.2.18 Content-Indirection Security
We have articulated the call services with content indirection in SIP in Section 16.6. Any content-indirection mechanism introduces additional security concerns. By its nature, content indirection requires an extra processing step and information transfer. There are a number of potential abuses of a content indirection mechanism:
◾ Content indirection allows the initiator to choose an alternative protocol with weaker security or known vulnerabilities for the content transfer (e.g., asking the recipient to issue an HTTP request that results in a basic authentication challenge).
◾ Content indirection allows the initiator to ask the recipient to consume additional resources in the information transfer and content processing, potentially creating an avenue for DoS attacks (e.g., an active File Transfer Protocol Uniform Resource Locator [URL] consuming two connections for every indirect-content message).
◾ Content indirection could be used as a form of port-scanning attack where the indirect-content URL is actually a bogus URL pointing to an internal resource of the recipient. The response to the content indirection request could reveal information about open (and vulnerable) ports on these internal resources.
◾ A content-indirection URL can disclose sensitive information about the initiator such as an internal user name (as part of an HTTP URL) or possibly geolocation information.
Fortunately, all of these potential threats can be mitigated through careful screening of both the indirect-content URIs that are received and those that are sent. Integrity and confidentiality protection of the indirect-content URI can prevent additional attacks as well. For confidentiality, integrity, and authentication, this content-indirection mechanism relies on the security mechanisms outlined in RFC 3261 (see Section 19.6). In particular, the usage of S/MIME (see Section 19.6) provides the necessary mechanism to ensure integrity, protection, and confidentiality of the indirect-content URI and associated parameters. Securing the transfer of the indirect content is the responsibility of the underlying protocol used for this transfer. If HTTP is used, applications implementing this content-indirection method should support the HTTPS URI scheme for secure transfer of content and must support the upgrading of connections to TLS (see Section 3.13), by using starttls.
Note that a failure to complete HTTPS or starttls (e.g., due to certificate or encryption mismatch) after having accepted the indirect content in the SIP request is not the same as rejecting the SIP request, and it may require additional user–user communication for correction. Note that this document does not advocate the use of transitive trust. That is, just because the UAS receives a URI from a UAC that the UAS trusts, the UAS should not implicitly trust the object referred to by the URI without establishing its own trust relationship with the URI provider. Access control to
the content referenced by the URI is not defined by this specification. Access control mechanisms may be defined by the protocol for the scheme of the indirect-content URI. If the UAC knows the content in advance, the UAC should include a hash parameter in the content indirection.
The hash parameter is a hexadecimal-encoded SHA-1 (RFC 3174) hash of the indirect content. If a hash value is included, the recipient MUST check the indirect content against that hash and indicate any mismatch to the user. In addition, if the hash parameter is included and the target URI involves setting up a security context using certificates, the UAS must ignore the results of the certificate validation procedure, and instead verify that the hash of the (canonicalized) content received matches the hash presented in the content-indirection hash parameter. If the hash parameter is not included, the sender should use only schemes that offer message integrity (such as https:). When the hash parameter is not included and security using certificates is used, the UAS must verify any server certificates, by using the UAS’s list of trusted top-level certificate authorities. If hashing of indirect content is not used, the content returned to the recipient by exercise of the indirection might have been altered from that intended by the sender.
19.2.2.19 Third-Party Call Control Multiparty Conferencing Security
Multiparty conferencing in SIP using third-party call control (3PCC) mechanisms is described in Section 18.3. We describe the security schemes for it in the following sections.
19.2.2.19.1 Authorization and Authentication
In most uses of SIP INVITE, whether or not a call is accepted is based on a decision made by a human when presented information about the call, such as the identity of the caller. In other cases, automata answer the calls, and whether or not they do so may depend on the particular application to which SIP is applied. For example, if a caller makes a SIP call to a voice portal service, the call may be rejected unless the caller has previously signed up (perhaps via a website). In other cases, call handling policies are made based on automated scripts, such as those described by the Call Processing Language (CPL) in RFC 3880. Frequently, those decisions are also made on the basis of the identity of the caller. These
apply based on the participants in the call? Ideally, the participants would be able to know the identities of both other parties, and have authorization policies be based on those, as appropriate. However, this is not possible using existing mechanisms.
As a result, the next best thing is for the INVITE requests to contain the identity of the third party. Ultimately, this is the user who is requesting communication, and it makes sense for call authorization policies to be based on that identity. This requires, in turn, that the controller authenticate itself as that third party. This can be challenging, and the appropriate mechanism depends on the specific application scenario. In one common scenario, the controller is acting on behalf of one of the participants in the call. A typical example is click-to-dial, where the controller and the customer service representative are run by the same administrative domain. Indeed, for the purposes of identification, the controller can legitimately claim to be the customer service representative. In this scenario, it would be appropriate for the INVITE to the end user to contain a From field identifying the customer service representative, and authenticate the request using S/MIME (see Section 19.6) signed by the key of the customer service representative (which is held by the controller). This requires the controller to actually have credentials with which it can authenticate itself as the customer support representative.
In many other cases, the controller is representing one of the participants, but does not possess their credentials. Unfortunately, there are currently no standardized mechanisms that allow a user to delegate credentials to the controller in a way that limits their usage to specific 3PCC operations. In the absence of such a mechanism, the best that can be done is to use the display name in the From field to indicate the identity of the user on whose behalf the call is being made. It is recommended that the display name be set to “[controller] on behalf of [user],” where user and controller are textual identities of the user and controller, respectively. In this case, the URI in the From field would identify the controller. In other situations, there is no real relationship between the controller and the participants in the call. In these situations, ideally the controller would have a means to assert that the call is from a particular identity (which could be one of the participants, or even a third party, depending on the application), and to validate that assertion with a signature using the key of the controller. The security features described in RFCs 4474 (see Section 19.4.8) and 4916 (see
authorization mechanisms would be applied to normal first- Section 20.4) can be used for authenticated identity manage-
party calls and third-party calls, as these two are indistin- ment and connected identity in 3PCC scenarios, respectively.
guishable. As a result, it is important for these authorization
policies to continue to operate correctly for third-party calls.
Of course, third-party calls introduce a new party—the one
19.2.2.19.2 End-to-End Encryption and Integrity
initiating the third-party call. Do the authorizations policies With 3PCC, the controller is actually one of the participants
apply based on the identity of that third party, or do they as far as the SIP dialog is concerned. Therefore, encryption
670 ◾ Handbook on Session Initiation Protocol
for example, by using TLS (see Section 3.13) transport, then the INFO request and its contents will be vulnerable as well. Even with SIP/TLS, any SIP hop along the path from UAC to UAS can view, modify, or intercept INFO requests, as they can with any SIP request. This means some applications may require end-to-end encryption of the INFO payload, beyond, for example, hop-by-hop protection of the SIP signaling itself. Since the application dictates the level of security required, individual Info Packages have to enumerate these requirements. In any event, the Info Package mechanism described by this document provides the tools for such secure, end-to-end transport of application data. One interesting property of Info Package usage is that one can reuse the same digest-challenge mechanism used for INVITE-based authentication for the INFO request. For example, one could use a quality-of-protection (qop) value of authentication with integrity (auth-int), to challenge the request and its body, and prevent intermediate devices from modifying the body. However, this assumes the device that knows the credentials in order to perform the INVITE challenge is still in the path for the INFO request, or that the far-end UAS knows such credentials.

19.2.2.23 User-to-User Information Security

The user-to-user information (UUI) transfer service is discussed in Section 16.9. UUI data can potentially carry sensitive information that might require confidentiality protection for privacy or integrity protection from third parties that may wish to read or modify the UUI data. The three security models described in RFC 6567 (see Section 16.9) may be applicable for the UUI mechanism. One model treats the SIP layer as untrusted and requires end-to-end integrity protection or encryption. This model can be achieved by providing these security services at a layer above SIP. In this case, applications are encouraged to use their own integrity or encryption mechanisms before passing it to the SIP layer.
The second approach is for the application to pass the UUI without any protection to the SIP layer and require the SIP layer to provide this security. This approach is possible in theory, although its practical use would be extremely limited. To preserve multihop or end-to-end confidentiality and integrity of UUI data, approaches using S/MIME (see Section 19.6) or IPsec can be used. However, the lack of deployment of these mechanisms means that applications cannot in general rely on them being present. The third model utilizes a Trust Domain and relies on perimeter security at the SIP layer. This is the security model of the PSTN and ISDN where UUI is commonly used today. This approach uses hop-by-hop security mechanisms and relies on border elements for filtering and application of policy. Standard deployed SIP security mechanisms such as TLS transport offer privacy and integrity protection properties on a hop-by-hop basis at the SIP layer.
If the UUI data was included by the UA originator of the SIP request or response, normal SIP mechanisms can be used to determine the identity of the inserter of the UUI data. If the UUI data was included by a UA that was not the originator of the request, a History-Info header field can be used to determine the identity of the inserter of the UUI data. UAs can apply policy based on the origin of the UUI data using this information. In short, the UUI data included in an INVITE can be trusted as much as the INVITE itself. Note that it is possible that this mechanism could be used as a covert communication channel between UAs, conveying information unknown to the SIP network.

19.2.2.24 VoiceXML Media Server Security

We have described SIP interfaces to VoiceXML media server in Section 17.2. Exposing a VoiceXML media service with a well-known address may enhance the possibility of exploitation (e.g., an invoked network service may trigger a billing event). The VoiceXML media server is recommended to use standard SIP mechanisms of RFC 3261 to authenticate (see Section 19.4) requesting end points and authorize per local policy. Some applications may choose to transfer confidential information to or from the VoiceXML media server. To provide data confidentiality, the VoiceXML media server must implement the sips: and https: schemes in addition to S/MIME message-body encoding as described in RFC 3261 (see Section 19.6). The VoiceXML media server must support SRTP (see Section 7.3) to provide confidentiality, authentication, and replay protection for RTP media streams (including RTCP control traffic). To mitigate the possibility of DoS attacks, the VoiceXML media server is recommended (in addition to authenticating and authorizing end points described in Section 17.2) to provide mechanisms for implementing local policies such as the time limiting of VoiceXML application execution.

19.2.2.25 Security for Multiple Telephone AOR Registration

RFC 6140 (see Section 3.3.5), which describes the multiple telephone AOR registration, takes the unprecedented step of extending the previously defined REGISTER method to apply to more than one AOR. In general, this kind of change has the potential to cause problems at intermediaries, such as proxies, that are party to the REGISTER transaction. In particular, such intermediaries may attempt to apply policy to the user indicated in the To header field (i.e., the SIP–PBX’s identity), without any knowledge of the multiple AORs that are being implicitly registered. The mechanism defined by this document solves this issue by adding an option tag to a Proxy-Require header field in such REGISTER requests. Proxies that are unaware of this mechanism will not process
the requests, preventing them from misapplying policy. Proxies that process requests with this option tag are clearly aware of the nature of the REGISTER request and can make reasonable policy decisions. As noted in RFC 6140 (see Section 3.3.5), intermediaries need to take care if they use a policy token in the path and service route mechanisms, as doing so will cause them to apply the same policy to all users serviced by the same SIP Private Branch Exchange (SIP–PBX). This may frequently be the correct behavior, but circumstances can arise in which differentiation of user policy is required.
RFC 6140 (see Section 3.3.5) also notes that these techniques use a token or cookie in the Path or Service-Route header values, and that this value will be shared among all AORs associated with a single registration. Because this information will be visible to UAs under certain conditions, proxy designers using this mechanism in conjunction with the techniques described in this document need to take care that doing so does not leak sensitive information. One of the key properties of the outbound client connection mechanism discussed in RFC 6140 (see Section 3.3.5) is the assurance that a single connection is associated with a single user and cannot be hijacked by other users. With the mechanism defined in this document, such connections necessarily become shared between users. However, the only entity in a position to hijack calls as a consequence is the SIP–PBX itself. Because the SIP–PBX acts as a registrar for all the potentially affected users, it already has the ability to redirect any such communications as it sees fit. In other words, the SIP–PBX must be trusted to handle calls in an appropriate fashion, and the use of the outbound connection mechanism introduces no additional vulnerabilities.
The ability to learn the identity and registration state of every user on the PBX (RFC 6140, see Section 3.3.5) is invaluable for diagnostic and administrative purposes. For example, this allows the SIP–PBX to determine whether all its extensions are properly registered with the SSP. However, this information can also be highly sensitive, as many organizations may not wish to make their entire list of phone numbers available to external entities. Consequently, SSP servers are advised to use explicit (i.e., whitelist) and configurable policies regarding who can access this information, with very conservative defaults (e.g., an empty access list or an access list consisting only of the PBX itself). The procedure for the generation of temporary Globally Routable UA URIs (GRUUs) requires the use of a Hashed Message Authentication Code (HMAC) to detect any tampering that external entities may attempt to perform on the contents of a temporary GRUU. The mention of HMAC-SHA256-80 in RFC 6140 (see Section 3.3.5) is intended solely as an example of a suitable HMAC algorithm. Since all HMACs used in this document are generated and consumed by the same entity, the choice of an actual HMAC algorithm is entirely up to an implementation, provided that the cryptographic properties are sufficient to prevent third parties from spoofing GRUU-related information. The procedure for the generation of temporary GRUUs also requires the use of RSA keys. The selection of the proper key length for such keys requires careful analysis, taking into consideration the current and foreseeable speed of processing for the period of time during which GRUUs must remain anonymous, as well as emerging cryptographic analysis methods.
The most recent guidance from RSA Laboratories [1] suggests a key length of 2048 bits for data that needs protection through the year 2030, and a length of 3072 bits thereafter. Similarly, implementers are warned to take precautionary measures to prevent unauthorized disclosure of the private key used in GRUU generation. Any such disclosure would result in the ability to correlate temporary GRUUs to each other and, potentially, to their associated PBXs. Furthermore, the use of RSA decryption when processing GRUUs received from arbitrary parties can be exploited by DoS attackers to amplify the impact of an attack: because of the presence of a cryptographic operation in the processing of such messages, the CPU load may be marginally higher when the attacker uses (valid or invalid) temporary GRUUs in the messages employed by such an attack. Normal DoS mitigation techniques, such as rate-limiting processing of received messages, should help reduce the impact of this issue as well. Finally, good security practices should be followed regarding the duration an RSA key is used. For implementers, this means that systems must include an easy way to update the public key provided to the SIP–PBX. To avoid immediately invalidating all currently issued temporary GRUUs, the SSP servers should keep the retired RSA key around for a grace period before discarding it. If decryption fails based on the new RSA key, then the SSP server can attempt to use the retired key instead. By contrast, the SIP–PBXs must discard the retired public key immediately and exclusively use the new public key.

19.2.3 Media-Level Security

SIP media-level security is a large area that deserves a separate discussion of its own. We have briefly discussed SRTP and ZRTP, which deal with real-time audio-video media streams, in Sections 7.3 and 7.4, respectively. We have also described TLS and DTLS, which provide security at the transport layer (see Sections 3.1.2.1, 4.2.3, 19.1, 19.4.6, 19.4.11, 19.10, and 19.12.4.3). It is believed that non-real-time data applications that, unlike real-time audio-video media, do not need synchronization will be protected by TLS and DTLS appropriately. Here, however, we describe the SDP security features that allow negotiating the cryptographic mechanisms to be used for media security during SIP session establishment.
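To give a feel for what such negotiation carries, the following Python sketch builds one SDP crypto attribute line with freshly generated inline keying material. It is only an illustration under stated assumptions: it uses the AES_CM_128_HMAC_SHA1_80 suite with the 16-byte master key and 14-byte master salt sizes that RFC 4568 gives for that suite, the function name make_crypto_line is hypothetical, and nothing here is a complete SDP implementation.

```python
import base64
import secrets

# Sizes for AES_CM_128_HMAC_SHA1_80 per RFC 4568: a 128-bit master key
# followed by a 112-bit master salt (an assumption for this one suite only).
MASTER_KEY_LEN = 16
MASTER_SALT_LEN = 14


def make_crypto_line(tag: int, suite: str = "AES_CM_128_HMAC_SHA1_80") -> str:
    """Return an a=crypto line with random inline keying material (hypothetical helper)."""
    keying = secrets.token_bytes(MASTER_KEY_LEN + MASTER_SALT_LEN)
    key_info = base64.b64encode(keying).decode("ascii")
    return f"a=crypto:{tag} {suite} inline:{key_info}"


line = make_crypto_line(1)
print(line)
```

Because the keying material travels in the SDP body itself, a line like this is only as confidential as the signaling that carries it, which is why hop-by-hop or end-to-end protection of the SIP message matters.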
Security Mechanisms in SIP ◾ 673
19.2.3.1 SDP Media Security Description

The SDP security attribute descriptions defined in RFCs 4568, 5029, and 5939 enable the conferencing parties in two-party unicast communication to exchange security parameters and keys that can be used to set up SRTP (RFC 3711, see Section 7.3) and ZRTP (RFC 6189, see Section 7.4) cryptographic contexts. A new media-level SDP attribute called crypto describes the cryptographic suite, key parameters, and session parameters for the preceding unicast media line. The crypto attribute must only appear at the SDP media level (not at the session level). The crypto attributes, known as the SDP security description (SDES) specified in RFC 4568 (see Sections 7.7.3 and 7.7.4), that are used for media encryption are expressed with the following structure:

a=crypto:<tag> <crypto-suite> <key-params> [<session-params>]

The fields tag, crypto-suite, key-params, and session-params are described as follows:

◾◾ Tag: The <tag> is unique among all crypto attributes for a given media line in SDP. It is used with the SDP offer–answer model specified in RFC 3264 (see Section 3.8.4) to determine which of several offered crypto attributes were chosen by the answerer. In the offer–answer model, the tag is a negotiated parameter.
◾◾ Crypto-Suite: The <crypto-suite> field is an identifier that describes the encryption and authentication algorithms (e.g., AES_CM_128_HMAC_SHA1_80) for the transport in question. The possible values for the crypto-suite parameter are defined within the context of the transport, that is, each transport defines a separate namespace for the set of crypto-suites. For example, the crypto-suite AES_CM_128_HMAC_SHA1_80 defined within the context of the RTP/SAVP transport applies to secure RTP only; the string may be reused for another transport (e.g., RTP/SAVPF), but a separate definition would be needed. In the SDP offer–answer model, the crypto-suite is a negotiated parameter.
◾◾ Key parameters: The <key-params> field provides one or more sets of keying material for the crypto-suite in question. The field consists of a method indicator followed by a colon, and the actual keying information as shown below:

   key-params=<key-method> ":" <key-info>

   Keying material might be provided by different means from that for key-params; however, the detailed discussion of SDP keying parameters is beyond the scope of this book. Only one method is defined in this document, namely, inline, which indicates that the actual keying material is provided in the key-info field itself. There is a single namespace for the key method, that is, the key method is transport independent. New key methods (e.g., use of a URL) may be defined in a Standards Track RFC in the future. Although the key method itself may be generic, the accompanying key-info definition is specific not only to the key method but also to the transport in question. Key-info encodes keying material for a crypto-suite, which defines that keying material. New key methods must be registered with the IANA.
   <Key-info> is defined as a general octet string; further transport and key method-specific syntax and semantics must be provided in a Standards Track RFC for each combination of transport and key method that uses it. Note that such definitions are provided within the context of both a particular transport (e.g., RTP/SAVP) and a specific key method (e.g., inline). The Internet Assigned Numbers Authority (IANA) will register the list of supported key methods for each transport.
   When multiple keys are included in the key parameters, it must be possible to determine which of the keys is being used in a given media packet by a simple inspection of the media packet received; a trial-and-error approach between the possible keys must not be performed. For SRTP, this could be achieved by use of Master Key Identifiers (RFC 3711, see Section 7.3). Use of <From, To> values is not supported in SRTP security descriptions. In the SDP offer–answer model, the key parameter is a declarative parameter.
◾◾ Session parameters: Session parameters are specific to a given transport and their use is optional in the security descriptions framework, where they are just defined as general character strings. If session parameters are to be used for a given transport, then transport-specific syntax and semantics must be provided in a Standards Track RFC.
   In the SDP offer–answer model, session parameters may be either negotiated or declarative; the definition of specific session parameters must indicate whether they are negotiated or declarative. Negotiated parameters apply to data sent in both directions, whereas declarative parameters apply only to media sent by the entity that generated the SDP. Thus, a declarative parameter in an offer applies to media sent by the offerer, whereas a declarative parameter in an answer applies to media sent by the answerer.

Multimedia streams such as audio and video are carried via RTP, which does not offer media security. SRTP (see Section 7.3) and ZRTP (see Section 7.4) are designed to extend RTP to offer media security. However, media security needs a cryptographic suite and parameters that are
required to be agreed upon between the conferencing parties through negotiations. The SIP session is established and the media to be sent are negotiated using INVITE and the SDP offer–answer model. The cryptographic algorithms and parameters that will be used for encryption of each medium are negotiated during the session using the SDES parameters of SDP described above for using SRTP or ZRTP.
Once the security parameters are agreed upon during negotiations using the SDP offer–answer model, the SRTP or ZRTP cryptographic algorithms are used for media encryption. The encrypted media payloads are then sent on an end-to-end basis between the end points. The SRTP and ZRTP protocol specifications describe at length how media security is provided for audio, video, and data applications. The detailed description of media security provided by these two protocols is beyond the scope of this chapter and needs to be addressed separately.

security precondition defined in RFC 5027 that is described here can be used to delay session establishment or modification until media stream security for a secure media stream has been negotiated successfully.

19.2.3.2.2 Security Precondition Definition

The semantics for a security precondition defined in RFC 5027 are that the relevant cryptographic parameters (cipher, key, etc.) for a secure media stream are known to have been negotiated in the direction(s) required. If the security precondition is used with a nonsecure media stream, the security precondition is, by definition, satisfied. A secure media stream is here defined as a media stream that uses some kind of security service (e.g., message integrity, confidentiality, or both), regardless of the cryptographic strength of the mechanisms being used.
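The a=crypto structure of Section 19.2.3.1, which carries the parameters whose negotiation these preconditions wait for, can be made concrete with a small parser. The Python sketch below is illustrative only: the class and function names are hypothetical, the key strings are placeholders rather than real keying material, and picking the first offered attribute whose suite is supported is an assumed answerer policy rather than one stated in this chapter.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Tuple


@dataclass
class CryptoAttribute:
    tag: int
    crypto_suite: str
    key_params: List[Tuple[str, str]]  # (key-method, key-info) pairs
    session_params: List[str] = field(default_factory=list)


def parse_crypto(line: str) -> CryptoAttribute:
    """Split 'a=crypto:<tag> <crypto-suite> <key-params> [<session-params>]'."""
    prefix = "a=crypto:"
    if not line.startswith(prefix):
        raise ValueError("not a crypto attribute line")
    fields = line[len(prefix):].split()
    if len(fields) < 3:
        raise ValueError("expected <tag> <crypto-suite> <key-params>")
    key_params = []
    for item in fields[2].split(";"):  # one or more sets of keying material
        method, sep, info = item.partition(":")
        if not sep or not info:
            raise ValueError("key-params must be <key-method>:<key-info>")
        key_params.append((method, info))
    return CryptoAttribute(int(fields[0]), fields[1], key_params, fields[3:])


def choose_crypto(offered: Iterable[str], supported: Iterable[str]) -> Optional[CryptoAttribute]:
    """Assumed policy: take the first offered attribute with a supported suite."""
    supported = set(supported)
    for line in offered:
        attr = parse_crypto(line)
        if attr.crypto_suite in supported:
            return attr
    return None


offer = [
    "a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:PLACEHOLDERKEYONE|2^20|1:4",
    "a=crypto:2 F8_128_HMAC_SHA1_80 inline:PLACEHOLDERKEYTWO FEC_ORDER=FEC_SRTP",
]
chosen = choose_crypto(offer, supported=["F8_128_HMAC_SHA1_80"])
print(chosen.tag, chosen.key_params[0][0], chosen.session_params)
```

The answerer echoes the chosen tag in its own crypto attribute, which is how the offerer learns which of the alternatives was selected.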
necessary parameters to establish the secure media stream (keying material, for example), the offered media stream must be rejected as described in RFC 3312. The delay of session establishment defined here implies that alerting of the called party must not occur and media for which security is being negotiated must not be exchanged until the precondition has been satisfied. In cases where secure media and other nonmedia data is multiplexed on a media stream, for example, when Interactive Connectivity Establishment (ICE) (RFC 5245, see Section 14.3) is being used, the nonmedia data is allowed to be exchanged before the security precondition is satisfied.
When a security precondition with a strength tag of optional is received in an offer, the answerer must generate its answer SDP as soon as possible. Since session progress is not delayed in this case, the answerer does not know when the offerer is able to process secure media stream packets, and hence clipping may occur. If the answerer wants to avoid clipping and delay session progress until it knows the offerer has received the answer, the answerer must increase the strength of the security precondition by using a strength tag of mandatory in the answer. Note that use of a mandatory precondition in an offer requires the presence of a SIP Require header field containing the option tag precondition: any SIP UA that does not support a mandatory precondition will consequently reject such requests, which also has unintended ramifications for SIP forking described in RFC 5393 (see Section 19.9). To get around this, an optional security precondition and the SIP Supported header field containing the option tag precondition can be used instead.
When a security precondition with a strength tag of none is received, processing continues as usual. The none strength tag merely indicates that the offerer supports the following security precondition; the answerer may upgrade the strength tag in the answer as described in RFC 3312. The direction tags defined in RFC 3312 (see Section 15.4) are interpreted as follows:

◾◾ send: Media stream security negotiation is at a stage where it is possible to send media packets to the other party, and the other party will be able to process them correctly from a security point of view, that is, decrypt or integrity check them as necessary. The definition of media packets includes all packets that make up the media stream. In the case of secure RTP, for example, it includes SRTP as well as SRTCP. When media and nonmedia packets are multiplexed on a given media stream (e.g., when ICE is being used), the requirement applies to the media packets only.
◾◾ recv: Media stream security negotiation is at a stage where it is possible to receive and correctly process media stream packets sent by the other party from a security point of view.

The precise criteria for determining when the other party is able to correctly process media stream packets from a security point of view depend on the secure media stream protocol being used as well as the mechanism by which the required cryptographic parameters are negotiated.
We here provide details for SRTP negotiated through SDP security descriptions as defined in RFC 4568:

◾◾ When the offerer requests the send security precondition, it needs to receive the answer before the security precondition is satisfied. The reason for this is twofold. First, the offerer needs to know where to send the media. Secondly, in the case where alternative cryptographic parameters are offered, the offerer needs to know which set was selected. The answerer does not know when the answer is actually received by the offerer (which, in turn, will satisfy the precondition), and hence the answerer needs to use the confirm-status attribute (RFC 3312, see Section 15.4). This will make the offerer generate a new offer showing the updated status of the precondition.
◾◾ When the offerer requests the recv security precondition, it also needs to receive the answer before the security precondition is satisfied. The reason for this is straightforward: the answer contains the cryptographic parameters that will be used by the answerer for sending media to the offerer; before receipt of these cryptographic parameters, the offerer is unable to authenticate or decrypt such media.

When security preconditions are used with the Key Management Extensions for the SDP (RFC 4567), the details depend on the actual key management protocol being used. After an initial offer–answer exchange in which the security precondition is requested, any subsequent offer–answer sequence for the purpose of updating the status of the precondition for a secure media stream should use the same key material as the initial offer–answer exchange. This means that the key-mgmt attribute lines (RFC 4567), or crypto attribute lines (RFC 4568) in SDP offers, that are sent in response to SDP answers containing a confirm-status field (RFC 3312, see Section 15.4) should repeat the same data as that sent in the previous SDP offer. If applicable to the key management protocol or SDP security description, the SDP answers to these SDP offers should repeat the same data in the key-mgmt attribute lines (RFC 4567) or crypto attribute lines (RFC 4568) as that sent in the previous SDP answer.
Of course, this duplication of key exchange during precondition establishment is not to be interpreted as a replay attack. This issue may be solved if, for example, the SDP implementation recognizes that the key management protocol data is identical in the second offer–answer exchange and avoids forwarding the information to the security layer for
further processing. Offers with security preconditions in re-INVITEs or UPDATEs follow the rules given in RFC 3312 (see Section 15.4), that is, “Both UAs should continue using the old session parameters until all the mandatory preconditions are met. At that moment, the UAs can begin using the new session parameters.” At that moment, we furthermore require that UAs start using the new session parameters for media packets being sent. The UAs should be prepared to process media packets received with either the old or the new session parameters for a short period of time to accommodate media packets in transit. Note that this may involve iterative security processing of the received media packets during that period of time. RFC 3264 (see Section 3.8.4) lists several techniques to help alleviate the problem of determining whether a received media packet was generated according to the old or new offer–answer exchange.

19.2.3.2.3 SDP Security Description Examples

The call flow of Figure 19.2 shows a basic session establishment using the SIP and SDP security descriptions (RFC 4568) with security descriptions for the secure media stream (SRTP in this case).

[Figure 19.2: call flow between A and B; labels recovered from the figure: F1 INVITE (SDP 1), F3 PRACK (SDP 3), F5 180 Ringing]
Figure 19.2 Security preconditions with SDP security description example. (Copyright IETF. Reproduced with permission.)

The SDP descriptions of this example are shown below; we have omitted the details of the SDP security descriptions as well as any SIP details for clarity of the security precondition described here:

SDP 1: A includes a mandatory end-to-end security precondition for both the send and receive direction in the initial offer as well as a crypto attribute (RFC 4568, see Sections 7.7.3 and 7.7.4), which includes keying material that can be used by A to generate media packets. Since B does not know any of the security parameters yet, the current status (RFC 3312, see Section 15.4) is set to none. A’s local status table (see RFC 3312) for the security precondition is as follows:

   Direction   Current   Desired Strength   Confirm
   send        no        mandatory          no
   recv        no        mandatory          no

   and the resulting offer SDP is

   m=audio 20000 RTP/SAVP 0
   c=IN IP4 192.0.2.1
   a=curr:sec e2e none
   a=des:sec mandatory e2e sendrecv
   a=crypto:foo...

SDP 2: When B receives the offer and generates an answer, B knows the (send and recv) security parameters of both A and B. From a security perspective, B is now able to receive media from A, so B’s recv security precondition is yes. However, A does not know any of B’s SDP information, so B’s send security precondition is no. B’s local status table therefore looks as follows:

   Direction   Current   Desired Strength   Confirm
   send        no        mandatory          no
   recv        yes       mandatory          no

   B requests A to confirm when A knows the security parameters used in the send and receive direction (it
   would suffice for B to ask for confirmation of A’s send direction only) and hence the resulting answer SDP becomes

   m=audio 30000 RTP/SAVP 0
   c=IN IP4 192.0.2.4
   a=curr:sec e2e recv
   a=des:sec mandatory e2e sendrecv
   a=conf:sec e2e sendrecv
   a=crypto:bar...

SDP 3: When A receives the answer, A updates its local status table based on the rules in RFC 3312 (see Section 15.4). A knows the security parameters of both the send and receive direction and hence A’s local status table is updated as follows:

   Direction   Current   Desired Strength   Confirm
   send        yes       mandatory          yes
   recv        yes       mandatory          yes

   Since B requested confirmation of the send and recv security preconditions, and both are now satisfied, A immediately sends an updated offer (F3) to B showing that the security preconditions are satisfied:

   m=audio 20000 RTP/SAVP 0
   c=IN IP4 192.0.2.1
   a=curr:sec e2e sendrecv
   a=des:sec mandatory e2e sendrecv
   a=crypto:foo...

   Note that we here use PRACK (RFC 3262, see Sections 2.5, 2.8.2, and 2.10) instead of UPDATE (RFC 3311, see Section 3.8.3) since the precondition is satisfied immediately, and the original offer–answer exchange is complete.

SDP 4: Upon receiving the updated offer, B updates its local status table based on the rules in RFC 3312 (see Section 15.4), which yields the following:

   Direction   Current   Desired Strength   Confirm
   send        yes       mandatory          no
   recv        yes       mandatory          no

   B responds with an answer (F4) that contains the current status of the security precondition (i.e., sendrecv) from B’s point of view:

   m=audio 30000 RTP/SAVP 0
   c=IN IP4 192.0.2.4
   a=curr:sec e2e sendrecv
   a=des:sec mandatory e2e sendrecv
   a=crypto:bar...

   B’s local status table indicates that all mandatory preconditions have been satisfied, and hence session establishment resumes; B returns a 180 Ringing response (F5) to indicate alerting.

19.2.3.2.4 Key Management Extension for SDP Example

The call flow of Figure 19.3 shows a basic session establishment using the SIP and Key Management Extensions for SDP (RFC 4567) with security descriptions for the secure media stream (SRTP in this case).
[Call flow between A and B. Messages recoverable from the figure: F1 INVITE (SDP 1), F3 PRACK (SDP 3), F5 180 Ringing.]
Figure 19.3 Security preconditions with key management extensions for SDP example. (Copyright IETF. Reproduced with permission.)
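The status-table bookkeeping used in these security precondition examples can be sketched in a few lines of Python. This is an illustrative sketch only, not code from RFC 3312: the names status_table and mandatory_met are hypothetical, only the sec precondition type is handled, and direction tags are read from the point of view of the party that generated the SDP, as in the examples. A conf line is recorded as a confirmation requested from the peer, which is not the same as the Confirm column of the sender's own table.

```python
# Hypothetical helper names; a sketch of the bookkeeping, not a full RFC 3312 implementation.
DIRECTIONS = {"none": (), "send": ("send",), "recv": ("recv",),
              "sendrecv": ("send", "recv")}

def status_table(sdp):
    """Build a per-direction table from the a=curr/des/conf lines of one SDP."""
    table = {d: {"current": False, "strength": "none", "confirm_requested": False}
             for d in ("send", "recv")}
    for raw in sdp.splitlines():
        line = raw.strip()
        if not line.startswith("a="):
            continue
        name, _, value = line[2:].partition(":")
        fields = value.split()
        # Only the security ("sec") precondition attributes are of interest here.
        if name not in ("curr", "des", "conf") or not fields or fields[0] != "sec":
            continue
        for direction in DIRECTIONS.get(fields[-1], ()):
            if name == "curr":
                table[direction]["current"] = True
            elif name == "des":
                table[direction]["strength"] = fields[1]  # e.g. "mandatory"
            else:
                table[direction]["confirm_requested"] = True
    return table

def mandatory_met(table):
    """True when every direction with a mandatory strength tag is currently satisfied."""
    return all(row["current"] for row in table.values()
               if row["strength"] == "mandatory")

# B's first answer from the example: only the recv direction is satisfied.
B_ANSWER = """m=audio 30000 RTP/SAVP 0
c=IN IP4 192.0.2.4
a=curr:sec e2e recv
a=des:sec mandatory e2e sendrecv
a=conf:sec e2e sendrecv
a=crypto:bar..."""

print(mandatory_met(status_table(B_ANSWER)))  # False
```

With the later answer that carries a=curr:sec e2e sendrecv, mandatory_met returns True, which corresponds to the point where session establishment resumes.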
The SDP descriptions of this example are shown below. We show an example use of Multimedia Internet KEYing (MIKEY) (RFC 3830) with the Key Management Extensions; however, we have omitted the details of the MIKEY parameters as well as any SIP details for clarity of the security precondition described here:
SDP 1: A includes a mandatory end-to-end security precondition for both the send and receive direction in the initial offer as well as a key-mgmt attribute (RFC 4567), which includes keying material that can be used by A to generate media packets. Since B does not know any of the security parameters yet, the current status (see RFC 3312, Section 15.4) is set to none. A’s local status table (see RFC 3312) for the security precondition is as follows:
Direction  Current  Desired Strength  Confirm
send       no       mandatory         no
recv       no       mandatory         no
and the resulting offer SDP is
m=audio 20000 RTP/SAVP 0
c=IN IP4 192.0.2.1
a=curr:sec e2e none
a=des:sec mandatory e2e sendrecv
a=key-mgmt:mikey AQAFgM0X...
SDP 2: When B receives the offer and generates an answer, B knows the (send and recv) security parameters of both A and B. B generates keying material for sending media to A; however, A does not know B’s keying material, so the current status of B’s send security precondition is no. B does know A’s SDP information, so B’s recv security precondition is yes. B’s local status table therefore looks as follows:
Direction  Current  Desired Strength  Confirm
send       no       mandatory         no
recv       yes      mandatory         no
B requests A to confirm when A knows the security parameters used in the send and receive direction and hence the resulting answer SDP becomes
m=audio 30000 RTP/SAVP 0
c=IN IP4 192.0.2.4
a=curr:sec e2e recv
a=des:sec mandatory e2e sendrecv
a=conf:sec e2e sendrecv
a=key-mgmt:mikey AQAFgM0X...
Note that the actual MIKEY data in the answer differs from that in the offer; however, we have only shown the initial and common part of the MIKEY value above.
SDP 3: When A receives the answer, A updates its local status table based on the rules in RFC 3312 (see Section 15.4). A now knows all the security parameters of both the send and receive direction and hence A’s local status table is updated as follows:
Direction  Current  Desired Strength  Confirm
send       yes      mandatory         yes
recv       yes      mandatory         yes
Since B requested confirmation of the send and recv security preconditions, and both are now satisfied, A immediately sends an updated offer (F3) to B showing that the security preconditions are satisfied:
m=audio 20000 RTP/SAVP 0
c=IN IP4 192.0.2.1
a=curr:sec e2e sendrecv
a=des:sec mandatory e2e sendrecv
a=key-mgmt:mikey AQAFgM0X...
SDP 4: Upon receiving the updated offer, B updates its local status table based on the rules in RFC 3312, which yields the following:
Direction  Current  Desired Strength  Confirm
send       yes      mandatory         no
recv       yes      mandatory         no
B responds with an answer (F4) that contains the current status of the security precondition (i.e., sendrecv) from B’s point of view:
m=audio 30000 RTP/SAVP 0
c=IN IP4 192.0.2.4
a=curr:sec e2e sendrecv
a=des:sec mandatory e2e sendrecv
a=key-mgmt:mikey AQAFgM0X...
B’s local status table indicates that all mandatory preconditions have been satisfied, and hence session establishment resumes; B returns a 180 Ringing response (F5) to indicate alerting.
19.2.3.2.5 Security Considerations
In addition to the general security considerations for preconditions provided in RFC 3312 (see Section 15.4), the following security issues should be considered. Security preconditions
delay session establishment until cryptographic parameters required to send or receive media for a media stream have been negotiated. Negotiation of such parameters can fail for a variety of reasons, including policy preventing use of certain cryptographic algorithms, keys, and other security parameters. If an attacker can remove security preconditions or downgrade the strength tag from an offer–answer exchange, the attacker can thereby cause user alerting for a session that may have no functioning media. This is likely to cause inconvenience to both the offerer and the answerer.
Similarly, security preconditions can be used to prevent clipping due to race conditions between offer–answer exchanges and secure media stream packets based on that offer–answer exchange. If an attacker can remove or downgrade the strength tag of security preconditions from an offer–answer exchange, the attacker can cause clipping to occur in the associated secure media stream. Conversely, an attacker might add security preconditions to offers that do not contain them or increase their strength tag. This, in turn, may lead to session failure (e.g., if the answerer does not support it), heterogeneous error response forking problems, or a delay in session establishment that was not desired. Use of signaling integrity mechanisms can prevent all of the above problems. Where intermediaries on the signaling path (e.g., SIP proxies) are trusted, it is sufficient to use only hop-by-hop integrity protection of signaling, for example, IPsec or TLS. In all other cases, end-to-end integrity protection of signaling (e.g., S/MIME, see Section 19.6) must be used. Note that the end-to-end integrity protection must cover not only the message body, which contains the security preconditions, but also the SIP Supported and Require headers, which may contain the precondition option tag. If only the message body were integrity protected, removal of the precondition option tag could lead to clipping (when a security precondition was otherwise to be used), whereas addition of the option tag could lead to session failure (if the other side does not support preconditions).
However, security preconditions do not guarantee that an established media stream will be secure. They merely guarantee that the recipient of the media stream packets will be able to perform any relevant decryption and integrity checking on those media stream packets. Current SDP (RFC 4566, see Section 7.7) and associated offer–answer procedures (RFC 3264, see Section 3.8.4) allow only a single type of transport protocol to be negotiated for a given media stream in an offer–answer exchange.
Negotiation of alternative transport protocols (e.g., plain and secure RTP) is currently not defined. Thus, if the transport protocol offered (e.g., secure RTP) is not supported, the offered media stream will simply be rejected. There is, however, work in progress to address that. For example, the SDP Capability Negotiation framework (RFC 5939) defines a method for negotiating the use of a secure or a nonsecure transport protocol by use of SDP and the offer–answer model with various extensions. Such a mechanism introduces a number of security considerations in general; however, use of SDP Security Preconditions with such a mechanism introduces the following security-precondition-specific security considerations.
A basic premise of negotiating secure and nonsecure media streams as alternatives is that the offerer’s security policy allows for nonsecure media. If the offer were to include secure and nonsecure media streams as alternative offers, and media for either alternative may be received before the answer, then the offerer may not know if the answerer accepted the secure alternative. An active attacker thus may be able to inject malicious media stream packets until the answer (indicating the chosen secure alternative) is received. From a security point of view, it is important to note that use of security preconditions (even with a mandatory strength tag) would not address this vulnerability since security preconditions would effectively apply only to the secure media stream alternatives. If the nonsecure media stream alternative was selected by the answerer, the security precondition would be satisfied by definition, the session could progress, and (nonsecure) media could be received before the answer is received.
19.3 Security Mechanisms Negotiation
SIP uses HTTP authentication and other security mechanisms where multiple alternative methods and algorithms are available to choose from. Not all security mechanisms are suitable under all circumstances and security threats. For example, RFC 2617 (see Section 19.4.5) describes that some of those security features are vulnerable to MITM attacks. It is also difficult, or sometimes even impossible, to know whether a specific security mechanism is truly unavailable to a SIP peer entity, or if in fact an MITM attack is in action. In certain small networks, these issues are not very relevant, as the administrators of such networks can deploy appropriate software versions and set up policies for using exactly the right type of security. However, SIP is also expected to be deployed to hundreds of millions of small devices with little or no possibilities for coordinated security policies, let alone software upgrades, which requires the negotiation functionality to be available from the very beginning of deployment. RFC 3329 described here provides negotiation capabilities for the security mechanisms between a SIP UA and its first-hop SIP entity. The security negotiations need to abide by the following objectives:
◾◾ The entities involved in the security agreement process need to find out exactly which security mechanisms to apply, preferably without excessive additional roundtrips.
◾◾ The selection of security mechanisms itself needs to be secure. Traditionally, all security protocols use a secure form of negotiation. For instance, after establishing mutual keys through Diffie–Hellman (DH), the Internet Key Exchange (IKE) protocol sends hashes of the previously sent data including the offered crypto mechanisms (RFC 4306). This allows the peers to detect if the initial, unprotected offers were tampered with.
◾◾ The entities involved in the security agreement process need to be able to indicate success or failure of the security agreement process.
19.3.1 Security Mechanisms Negotiation
19.3.1.1 Operation
Figure 19.4 shows a hypothetical security mechanisms negotiation message flow as described in RFC 3329. The message flow illustrates how the mechanism defined in this document works:
◾◾ Step 1: Clients wishing to use this specification can send a list of their supported security mechanisms along with the first request to the server.
◾◾ Step 2: Servers wishing to use this specification can challenge the client to perform the security agreement procedure. The security mechanisms and parameters supported by the server are sent along in this challenge.
◾◾ Step 3: The client then proceeds to select the highest-preference security mechanism they have in common and to turn on the selected security.
◾◾ Step 4: The client contacts the server again, now using the selected security mechanism. The server’s list of supported security mechanisms is returned as a response to the challenge.
◾◾ Step 5: The server verifies its own list of security mechanisms in order to ensure that the original list had not been modified.
Client Server
F1. Client list
F2. Server list
F3. Turn on security
F4. Server list
F5. OK or error
Figure 19.4 Security mechanisms negotiation message flow. (Copyright IETF. Reproduced with permission.)
This procedure is stateless for servers (unless the used security mechanisms require the server to keep some state). The client and the server lists are both static (i.e., they do not and cannot change based on the input from the other side). Nodes may, however, maintain several static lists, one for each interface, for example. Between steps 1 and 2, the server may set up a non-self-describing security mechanism if necessary. Note that with this type of security mechanism, the server is necessarily stateful. The client would set up the non-self-describing security mechanism between steps 2 and 4.
19.3.1.2 Syntax for SIP Security Headers
We define three new SIP header fields, namely Security-Client, Security-Server, and Security-Verify. The notation used in the augmented Backus–Naur Form (ABNF) definitions for the syntax elements in this section is repeated here as used in SIP (see Section 2.4.1) for convenience, and any elements not defined in this section are as defined in SIP and the documents to which it refers:
security-client = "Security-Client" HCOLON
                  sec-mechanism *(COMMA sec-mechanism)
security-server = "Security-Server" HCOLON
                  sec-mechanism *(COMMA sec-mechanism)
security-verify = "Security-Verify" HCOLON
                  sec-mechanism *(COMMA sec-mechanism)
sec-mechanism = mechanism-name *(SEMI mech-parameters)
mechanism-name = ("digest"/"tls"/"ipsec-ike"/"ipsec-man"/token)
mech-parameters = (preference/digest-algorithm/
                  digest-qop/digest-verify/extension)
preference = "q" EQUAL qvalue
qvalue = ("0" ["." 0*3DIGIT])
         /("1" ["." 0*3("0")])
digest-algorithm = "d-alg" EQUAL token
digest-qop = "d-qop" EQUAL token
digest-verify = "d-ver" EQUAL LDQUOT 32LHEX RDQUOT
extension = generic-param
Note that qvalue is already defined in the SIP ABNF (RFC 3261, see Section 2.4.1). We have copied its definitions
here for completeness. The parameters described by the ABNF above have the following semantics:
protect the negotiation of security mechanisms between SIP entities.
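The header syntax defined by the ABNF above can be illustrated with a small parser. The Python sketch below is a simplification under stated assumptions, not a conforming SIP parser: the function name parse_security_header is hypothetical, the header values are made-up examples, mechanism and parameter names are lowercased on the assumption that they compare case-insensitively, and values are split naively on commas and semicolons, which would break on quoted strings that contain those characters.

```python
def parse_security_header(value):
    """Split a Security-Client/-Server/-Verify header value into an ordered
    list of (mechanism-name, {parameter: value}) pairs."""
    mechanisms = []
    for item in value.split(","):            # sec-mechanism *(COMMA sec-mechanism)
        name, *params = [part.strip() for part in item.split(";")]
        parsed = {}
        for param in params:                 # mechanism-name *(SEMI mech-parameters)
            key, _, val = param.partition("=")
            parsed[key.strip().lower()] = val.strip().strip('"')
        mechanisms.append((name.lower(), parsed))
    return mechanisms

# Illustrative header values (not taken from the text).
server_list = parse_security_header("ipsec-ike;q=0.1, tls;q=0.2")
digest_item = parse_security_header(
    'digest;d-alg=MD5;d-qop=auth;d-ver="0123456789abcdef0123456789abcdef"')

print(server_list)
print(digest_item[0][1]["d-ver"])
```

Keeping the result as an ordered list rather than a dictionary preserves the order of the mechanisms, which matters when two lists are later compared for equality.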
be carried out without involving any SIP message exchange (e.g., establishing a TLS connection). If an attacker modified the Security-Client header field in the request, the server may not include in its response the information needed to establish the common security mechanism with the highest preference value (e.g., the Proxy-Authenticate header field is missing). A client detecting such a lack of information in the response must consider the current security agreement process aborted, and may try to start it again by sending a new request with a Security-Client header field as described above.
All the subsequent SIP requests sent by the client to that server should make use of the security mechanism initiated in the previous step. These requests must contain a Security-Verify header field that mirrors the server’s list received previously in the Security-Server header field. These requests must also have both Require and Proxy-Require header fields with the value sec-agree. The server must check that the security mechanisms listed in the Security-Verify header field of incoming requests correspond to its static list of supported security mechanisms. Note that, following the standard SIP header field comparison rules, both lists have to contain the same security mechanisms in the same order to be considered equivalent. In addition, for each particular security mechanism, its parameters in both lists need to have the same values. The server can proceed with processing a particular request if, and only if, the list was not modified. If modification of the list is detected, the server must respond to the client with a 494 Security Agreement Required response. This response must include the server’s unmodified list of supported security mechanisms.
If the list was not modified, and the server is a proxy, it must remove the sec-agree value from both the Require and Proxy-Require header fields, and then remove the header fields if no values remain. Once the security has been negotiated between two SIP entities, the same SIP entities may use the same security when communicating with each other in different SIP roles. For example, if a UAC and its outbound proxy negotiate some security, they may try to use the same security for incoming requests (i.e., the UA will be acting as a UAS). The user of a UA should be informed about the results of the security mechanism agreement. The user may decline to accept a particular security mechanism, and abort further SIP communications with the peer.
19.3.1.3.2 Server Initiated
A server decides to use the security agreement described in this document based on local policy. If a server receives a request from the network interface that is configured to use this mechanism, it must check that the request has only one Via entry. If there are several Via entries, the server is not the first-hop SIP entity, and it must not use this mechanism. For such a request, the server must return a 502 Bad Gateway response. A server that decides to use this agreement mechanism must challenge unprotected requests with one Via entry regardless of the presence or the absence of any Require, Proxy-Require, or Supported header fields in incoming requests.
A server that by policy requires the use of this specification and receives a request that does not have the sec-agree option tag in a Require, Proxy-Require, or Supported header field must return a 421 Extension Required response. If the request had the sec-agree option tag in a Supported header field, it must return a 494 Security Agreement Required response. In both situations, the server must also include in the response a Security-Server header field listing its capabilities and a Require header field with an option tag sec-agree in it. The server must also add necessary information so that the client can initiate the preferred security mechanism (e.g., a Proxy-Authenticate header field for HTTP Digest). Clients that support the extension defined in this document should add a Supported header field with a value of sec-agree.
19.3.1.4 Security Mechanism Initiation
Once the client chooses a security mechanism from the list received in the Security-Server header field from the server, it initiates that mechanism. Different mechanisms require different initiation procedures. If tls is chosen, the client uses the procedures to determine the URI to be used as an input to the DNS procedures of RFC 3263 (see Section 8.2.4). However, if the URI is a SIP URI, it must treat the scheme as if it were sips, not sip. If the URI scheme is not sip, the request must be sent using TLS. If digest is chosen, the 494 Security Agreement Required response will contain an HTTP Digest authentication challenge. The client must use the algorithm and qop parameters in the Security-Server header field to replace the same parameters in the HTTP Digest challenge. The client must also use the digest-verify parameter in the Security-Verify header field to protect the Security-Server header field as specified above.
To use ipsec-ike, the client attempts to establish an IKE connection to the host part of the Request-URI in the first request to the server. If the IKE connection attempt fails, the agreement procedure must be considered to have failed, and must be terminated. Note that ipsec-man will only work if the communicating SIP entities know which keys and other parameters to use. It is outside the scope of this specification to describe how this information can be made known to the peers. All rules for minimum implementations, such as mandatory-to-implement algorithms, apply as defined in RFCs 4301–4303. In both IPsec-based mechanisms, it is expected that appropriate policy entries for protecting SIP have been configured or will be created before attempting to use the security agreement procedure, and that SIP communications use port numbers and addresses according to these
policy entries. It is outside the scope of this specification to describe how this information can be made known to the peers, but it would typically be configured at the same time as the IKE credentials or manual security associations have been entered.
19.3.1.5 Duration of Security Associations
Once a security mechanism has been negotiated, both the server and the client need to know until when it can be used. All the mechanisms described in this document have a different way of signaling the end of a security association. When TLS is used, the termination of the connection indicates that a new negotiation is needed. IKE negotiates the duration of a security association. If the credentials provided by a client using digest are no longer valid, the server will rechallenge the client. It is assumed that when ipsec-man is used, the same out-of-band mechanism used to distribute keys is used to define the duration of the security association.
19.3.1.6 Header-Field Use
The three header fields (Security-Client, Security-Server, and Security-Verify) may be used to negotiate the security mechanisms between a UAC and other SIP entities, including UAS, proxy, and registrar. Information about the use of headers in relation to SIP methods and proxy processing is summarized in Section 2.8.
19.3.2 Backwards Compatibility
The use of this extension in a network interface is a matter of local policy. Different network interfaces may follow different policies, and consequently the use of this extension may be situational by nature. UA and server implementations must be configurable to operate with or without this extension. A server that is configured to use this mechanism may also accept requests from clients that use TLS based on the rules defined in RFC 3263 (see Section 8.2.4). Requests from clients that do not support this extension, and do not support TLS, cannot be accepted. This obviously breaks interoperability with some SIP clients. Therefore, this extension should be used in environments where it is somehow ensured that every client implements this extension or is able to use TLS. This extension may also be used in environments where insecure communication is not acceptable if the option of not being able to communicate is also accepted.
19.3.3 Security Algorithms Negotiation Example
19.3.3.1 Client Initiated
A UA negotiates (Figure 19.5) the security mechanism to be used with its outbound proxy without knowing beforehand which mechanisms the proxy supports. The OPTIONS method can be used here to request the security capabilities of the proxy. In this way, the security can be initiated even before the first INVITE is sent via the proxy.
The UAC sends an OPTIONS request to its outbound proxy, indicating at the same time that it is able to negotiate security mechanisms and that it supports TLS and HTTP Digest (F1). The outbound proxy responds to the UAC with its own list of security mechanisms: IPsec and TLS (F2). The only common security mechanism is TLS, so they establish a TLS connection between them. When the connection is
[Message flow. Labels recoverable from the figure: F1 OPTIONS, TLS connection setup, F3 INVITE, F4 INVITE, F5 200 OK, F6 200 OK, F7 ACK, F8 ACK.]
Figure 19.5 Negotiation initiated by the client. (Copyright IETF. Reproduced with permission.)
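The mechanism selection in this client-initiated example can be sketched as follows. The Python code is illustrative only: the function name select_mechanism is hypothetical, the q values are invented for the example (the text does not give any), and the sketch assumes that preference is taken from the q values in the server's list.

```python
def select_mechanism(client_supported, server_list):
    """Return the common mechanism with the highest q value, or None.

    client_supported: set of mechanism names the client implements.
    server_list: ordered (mechanism-name, q) pairs as offered by the server.
    """
    common = [(q, name) for name, q in server_list if name in client_supported]
    if not common:
        return None  # no common mechanism: the agreement cannot proceed
    return max(common)[1]

# The UAC supports TLS and HTTP Digest; the proxy offers IPsec and TLS.
client = {"tls", "digest"}
server = [("ipsec-ike", 0.1), ("tls", 0.2)]  # q values are assumptions

print(select_mechanism(client, server))  # tls
```

Because TLS is the only mechanism both sides share, the result does not depend on the assumed q values in this particular example.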
[Message flow. Labels recoverable from the figure: F1 OPTIONS, F3 ACK, IKE setup, F4 INVITE, F5 INVITE, F6 200 OK, F7 200 OK, F8 ACK, F9 ACK.]
Figure 19.6 Server-initiated security negotiation. (Copyright IETF. Reproduced with permission.)
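The server-side decision described for the server-initiated case in Section 19.3.1.3.2 can be summarized in a short sketch. The Python code below is a simplified illustration for a server whose policy requires the security agreement and that has received an unprotected request; the function name challenge_status is hypothetical. The text names only the Supported header field explicitly for the 494 case, so returning 494 when sec-agree appears in Require or Proxy-Require is an assumption of this sketch.

```python
SEC_AGREE = "sec-agree"

def challenge_status(via_count, require=(), proxy_require=(), supported=()):
    """Pick the response code for an unprotected request.

    via_count: number of Via entries in the request.
    require, proxy_require, supported: option tags found in those header fields.
    """
    if via_count > 1:
        # Several Via entries: this server is not the first-hop SIP entity.
        return 502  # Bad Gateway
    tags = set(require) | set(proxy_require) | set(supported)
    if SEC_AGREE not in tags:
        return 421  # Extension Required
    # Assumption: Require/Proxy-Require are treated like Supported here.
    return 494  # Security Agreement Required

print(challenge_status(1, supported=["sec-agree"]))  # 494
```

In both the 421 and 494 cases the response would also carry the Security-Server header field and a Require header field with the sec-agree option tag, which this sketch does not model.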
modification in all equipment. The method presented in this specification is secure only if the weakest proposed mechanism offers at least integrity and replay protection for the Security-Verify header field. The security implications of this are subtle, but do have a fundamental importance in building large networks that change over time. Given that the hashes are produced also using algorithms agreed upon in the first unprotected messages, one could ask what the difference in security really is. Assuming integrity protection is mandatory and only secure algorithms are used, we still need to prevent MITM attackers from modifying other parameters, such as whether encryption is provided or not. Let us first assume two peers capable of using both strong and weak security.
If the initial offers are not protected in any way, any attacker can easily downgrade the offers by removing the strong options. This would force the two peers to use weak security between them. However, if the offers are protected in some way (such as by hashing or repeating them later when the selected security is really on), the situation is different. It would not be sufficient for the attacker to modify a single message. Instead, the attacker would have to modify both the offer message and the message that contains the hash/repetition. More important, the attacker would have to forge the weak security that is present in the second message, and would have to do so in real time between the sent offers and the later messages. Otherwise, the peers would notice that the hash is incorrect. If the attacker is able to break the weak security, the security method or the algorithm should not be used.
In conclusion, the security difference is between making a trivial attack possible and requiring the attacker to break algorithms. An example of where this has a serious consequence is when a network is first deployed with integrity protection (such as HTTP Digest), and then later new devices are added that also support encryption (such as TLS). In this situation, an insecure negotiation procedure allows attackers to trivially force even new devices to use only integrity protection. Possible attacks against the security agreement include the following:
◾◾ Attackers could try to modify the server’s list of security mechanisms in the first response. This would be revealed to the server when the client returns the received list using the security.
◾◾ Attackers could also try to modify the repeated list in the second request from the client. However, if the selected security mechanism uses encryption, this may not be possible; if it uses integrity protection, any modifications will be detected by the server.
◾◾ Attackers could try to modify the client’s list of security mechanisms in the first message. The client selects the security mechanism based on its own knowledge of its own capabilities and the server’s list; hence, the client’s choice would be unaffected by any such modification. However, the server’s choice could still be affected as described below:
– If the modification affected the server’s choice, the server and client would end up choosing different security mechanisms in step 3 or 4 of Figure 19.4. Since they would be unable to communicate with each other, this would be detected as a potential attack. The client would either retry or give up in this situation.
– If the modification did not affect the server’s choice, there is no effect.
◾◾ Finally, attackers may also try to replay old security agreement messages. Each security mechanism must provide replay protection. In particular, HTTP Digest implementations should carefully utilize existing replay protection options such as including a time stamp in the nonce parameter, and using nonce counters (RFC 2617).
All clients that implement this specification must select HTTP Digest, TLS, IPsec, or any stronger method for the protection of the second request.
19.3.5 Syntax of IPsec–3GPP Security Headers
The 3GPP document [3] extends the security agreement framework described in this document with a new security mechanism: ipsec-3gpp. This security mechanism and its associated parameters are used in the 3GPP IMS. The ABNF definitions below follow the syntax of SIP (RFC 3261, see Section 2.4.1):
mechanism-name = ("ipsec-3gpp")
mech-parameters = (algorithm/protocol/mode/
                  encrypt-algorithm/spi/port1/port2)
algorithm = "alg" EQUAL ("hmac-md5-96"/"hmac-sha-1-96")
protocol = "prot" EQUAL ("ah"/"esp")
mode = "mod" EQUAL ("trans"/"tun")
encrypt-algorithm = "ealg" EQUAL ("des-ede3-cbc"/"null")
spi = "spi" EQUAL spivalue
spivalue = 10DIGIT; 0 to 4294967295
port1 = "port1" EQUAL port
port2 = "port2" EQUAL port
port = 1*DIGIT
The parameters described by the ABNF above have the following semantics:
◾◾ Algorithm: This parameter defines the used authentication algorithm. It may have a value of hmac-md5-96 for HMAC-MD5-96 (RFC 2403), or hmac-sha-1-96
[Two panels. Labels recoverable from panel (a): SIP proxy A, SIP proxy B, INVITE/SDP, IP network. Labels recoverable from panel (b): Outgoing SIP proxy A/authentication server, Incoming SIP proxy B, INVITE, INVITE (with token), IP network.]
Figure 19.7 SIP call flows: (a) SIP trapezoid and (b) SIP authenticated identities.
playing the role of the authentication server. The core idea of the authentication service can be described as follows:
◾◾ Alice sends a SIP INVITE message through outbound proxy A that is playing the role of the SIP authentication server. The outgoing proxy authenticates the user using a set of security policies that are used in this administrative domain.
◾◾ Alice is authenticated and authorized by proxy A creating and cryptographically signing an authentication token for the user. In this way, proxy A will assert the SIP UA’s identity, known as the asserted identity (see Chapter 10), and that UA A (Alice) was authenticated, by adding a digitally signed token (shown as token in Figure 19.7b) to the SIP message. The digital signature is computed over a number of additional fields of the SIP message discussed later in order to protect their integrity and that of the overall message.
◾◾ Proxy A then shares this identity of SIP UA A with others, including proxy B, as necessary. The SIP message along with the asserted identity is sent to proxy B. Proxy B/SIP UA B (Bob) inspects the content of the token as necessary. Of course, the processing steps will involve verification of the digital signature along with other functions.
The key advantage for the incoming proxy and the called party is that they only need to know, or discover, the certificates of the authentication services they interact with. The incoming proxy (e.g., proxy B) and the called party (e.g., SIP UA B, Bob) do not need the certificates of each individual user, which lets the authentication and authorization services scale, especially in large SIP networks. Although this security service provides scalability, it exposes another important security problem, user privacy. Proxy B, which is in another administrative domain, and the called party (e.g., SIP UA B, Bob) will be able to learn the identity of the calling party (e.g., SIP UA A, Alice). This calls for privacy and anonymity services for SIP users. We have devoted a separate chapter (see Chapter 20) to SIP privacy and anonymity services. It also shows that security in SIP needs to be addressed in an integrated manner that includes all functions: authentication, authorization, integrity, confidentiality, privacy/anonymity, and nonrepudiation.
Note that SIP provides a stateless, challenge-based mechanism for authentication (RFC 3261) that is based on authentication in HTTP (RFC 2617), as described here. Any time that a proxy server or UA receives a request (with the exceptions given in Section 19.4.2), it may challenge the initiator of the request to provide assurance of its identity. Once the originator has been identified, the recipient of the request should ascertain whether or not this user is authorized to make the request in question. No authorization systems are recommended or discussed in this document. The Digest authentication mechanism described in this section provides message authentication and replay protection only, without message integrity or confidentiality. Protective measures above and beyond those provided by Digest need to be taken to prevent active attackers from modifying SIP requests and responses. Note that due to its weak security, the usage of Basic authentication has been deprecated. Servers must not accept credentials using the Basic authorization scheme, and servers also must not challenge with Basic. This is a change from RFC 2543 (obsoleted by RFC 3261).
19.4.2 Framework
The framework for SIP authentication closely parallels that of HTTP defined in RFC 2617. In particular, the ABNF for auth-scheme, auth-param, challenge, realm, realm-value, and credentials is identical (although the usage of Basic as a scheme is not permitted). In SIP, a UAS uses the 401 Unauthorized response to challenge the identity of a UAC. Additionally, registrars and redirect servers may make use of 401 Unauthorized responses for authentication, but proxies must not, and instead may use the 407 Proxy Authentication Required response. The requirements for inclusion of the Proxy-Authenticate, Proxy-Authorization, WWW-Authenticate, and Authorization in the various messages are identical to those described in RFC 2617.
Since SIP does not have the concept of a canonical root URL, the notion of protection spaces is interpreted differently in SIP. The realm string alone defines the protection domain. This is a change from RFC 2543 (obsoleted by RFC 3261), in which the Request-URI and the realm together defined the protection domain. This previous definition of protection domain caused some amount of confusion since the Request-URI sent by the UAC and the Request-URI received by the challenging server might be different, and indeed the final form of the Request-URI might not be known to the UAC. Also, the previous definition depended on the presence of a SIP URI in the Request-URI and seemed to rule out alternative URI schemes (e.g., the tel URL). Operators of UAs or proxy servers that will authenticate received requests must adhere to the following guidelines for creation of a realm string for their server:
◾◾ Realm strings must be globally unique. It is recommended that a realm string contain a host name or domain name, following the recommendation in Section 3.2.1 of RFC 2617.
◾◾ Realm strings should present a human-readable identi-
tity. Once the originator has been identified, the recipient fier that can be rendered to a user.
For example:

    INVITE sip:bob@biloxi.com SIP/2.0
    Authorization: Digest realm="biloxi.com", <...>

Generally, SIP authentication is meaningful for a specific realm, a protection domain. Thus, for Digest authentication, each such protection domain has its own set of user names and passwords. If a server does not require authentication for a particular request, it may accept a default user name, anonymous, which has no password (password of “”). Similarly, UACs representing many users, such as PSTN gateways, may have their own device-specific user name and password, rather than accounts for particular users, for their realm.

While a server can legitimately challenge most SIP requests, there are two requests defined in RFC 3261 that require special handling for authentication: ACK and CANCEL. Under an authentication scheme that uses responses to carry values used to compute nonces (such as Digest), some problems come up for any requests that take no response, including ACK. For this reason, any credentials in the INVITE that were accepted by a server MUST be accepted by that server for the ACK. UACs creating an ACK message will duplicate all of the Authorization and Proxy-Authorization header field values that appeared in the INVITE to which the ACK corresponds. Servers MUST NOT attempt to challenge an ACK.

Although the CANCEL method does take a response (a 2xx), servers must not attempt to challenge CANCEL requests since these requests cannot be resubmitted. Generally, a CANCEL request should be accepted by a server if it comes from the same hop that sent the request being canceled (provided that some sort of transport or network layer security association, as described in Section 19.12.2.1, is in place).

When a UAC receives a challenge, it should render to the user the contents of the realm parameter in the challenge (which appears in either a WWW-Authenticate header field or Proxy-Authenticate header field) if the UAC device does not already know of a credential for the realm in question. A service provider that preconfigures UAs with credentials for its realm should be aware that users will not have the opportunity to present their own credentials for this realm when challenged at a preconfigured device.

Finally, note that even if a UAC can locate credentials that are associated with the proper realm, the potential exists that these credentials may no longer be valid or that the challenging server will not accept these credentials for whatever reason (especially when anonymous with no password is submitted). In this instance, a server may repeat its challenge, or it may respond with a 403 Forbidden. A UAC must not re-attempt requests with the credentials that have just been rejected (though the request may be retried if the nonce was stale).

19.4.3 User-to-User Authentication

When a UAS receives a request from a UAC, the UAS may authenticate the originator before the request is processed. If no credentials (in the Authorization header field) are provided in the request, the UAS can challenge the originator to provide credentials by rejecting the request with a 401 Unauthorized status code. The WWW-Authenticate response-header field must be included in 401 Unauthorized response messages. The field value consists of at least one challenge that indicates the authentication scheme(s) and parameters applicable to the realm. An example of the WWW-Authenticate header field in a 401 challenge is as follows:

    WWW-Authenticate: Digest
        realm="biloxi.com",
        qop="auth,auth-int",
        nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
        opaque="5ccc069c403ebaf9f0171e9517f40e41"

When the originating UAC receives the 401 Unauthorized, it should, if it is able, reoriginate the request with the proper credentials. The UAC may require input from the originating user before proceeding. Once authentication credentials have been supplied (either directly by the user, or discovered in an internal keyring), UAs should cache the credentials for a given value of the To header field and realm, and attempt to reuse these values on the next request for that destination. UAs may cache credentials in any way they would like.

If no credentials for a realm can be located, UACs may attempt to retry the request with a user name of anonymous and no password (a password of “”). Once credentials have been located, any UA that wishes to authenticate itself with a UAS or registrar (usually, but not necessarily, after receiving a 401 Unauthorized response) may do so by including an Authorization header field with the request. The Authorization field value consists of credentials containing the authentication information of the UA for the realm of the resource being requested, as well as parameters required in support of authentication and replay protection. An example of the Authorization header field is

    Authorization: Digest username="bob",
        realm="biloxi.com",
        nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
        uri="sip:bob@biloxi.com",
        qop=auth,
        nc=00000001,
        cnonce="0a4f113b",
        response="6629fae49393a05397450978507c4ef1",
        opaque="5ccc069c403ebaf9f0171e9517f40e41"
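As a rough illustration of how the response parameter of an Authorization header field might be produced, the following Python sketch applies the RFC 2617 Digest calculation with the default MD5 algorithm. The function names (md5_hex, digest_response, authorization_header) and the password zanzibar are hypothetical and chosen only for illustration; a real UA would also need to track the nonce count, generate a fresh cnonce, and handle stale nonces.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the lowercase hex MD5 digest of a string (H() in RFC 2617)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce,
                    qop=None, nc=None, cnonce=None, body=""):
    """Compute the Digest request-digest for the given challenge values."""
    # A1 binds the user name and password to the realm (protection domain).
    ha1 = md5_hex(f"{username}:{realm}:{password}")
    if qop == "auth-int":
        # With auth-int, A2 also covers the hash of the message body;
        # an empty body hashes like the empty string.
        ha2 = md5_hex(f"{method}:{uri}:{md5_hex(body)}")
    else:
        ha2 = md5_hex(f"{method}:{uri}")
    if qop in ("auth", "auth-int"):
        return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")
    # No qop in the challenge: fall back to the RFC 2069 style calculation.
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


def authorization_header(username, realm, password, method, uri, nonce,
                         opaque, nc="00000001", cnonce="0a4f113b", qop="auth"):
    """Build an Authorization header field line from a Digest challenge."""
    response = digest_response(username, realm, password, method, uri, nonce,
                               qop=qop, nc=nc, cnonce=cnonce)
    fields = [
        f'username="{username}"',
        f'realm="{realm}"',
        f'nonce="{nonce}"',
        f'uri="{uri}"',          # the uri value is quoted in SIP
        f"qop={qop}",
        f"nc={nc}",
        f'cnonce="{cnonce}"',
        f'response="{response}"',
        f'opaque="{opaque}"',
    ]
    return "Authorization: Digest " + ", ".join(fields)


if __name__ == "__main__":
    # "zanzibar" is a made-up password used only for this sketch.
    print(authorization_header(
        "bob", "biloxi.com", "zanzibar", "INVITE", "sip:bob@biloxi.com",
        nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
        opaque="5ccc069c403ebaf9f0171e9517f40e41"))
```

The realm, nonce, and opaque values are copied from the challenge, while the method and uri come from the request being retried, so the same helper could serve both Authorization and Proxy-Authorization header fields if the header name were made a parameter.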
When a UAC resubmits a request with its credentials after receiving a 401 Unauthorized or 407 Proxy Authentication Required response, it MUST increment the CSeq header field value as it would normally when sending an updated request.

19.4.4 Proxy-to-User Authentication

Similarly, when a UAC sends a request to a proxy server, the proxy server may authenticate the originator before the request is processed. If no credentials (in the Proxy-Authorization header field) are provided in the request, the proxy can challenge the originator to provide credentials by rejecting the request with a 407 Proxy Authentication Required status code. The proxy must populate the 407 Proxy Authentication Required message with a Proxy-Authenticate header field value applicable to the proxy for the requested resource.

The use of Proxy-Authenticate and Proxy-Authorization parallels that described in RFC 2617, with one difference. Proxies must not add values to the Proxy-Authorization header field. All 407 Proxy Authentication Required responses must be forwarded upstream toward the UAC following the procedures for any other response. It is the UAC’s responsibility to add the Proxy-Authorization header field value containing credentials for the realm of the proxy that has asked for authentication.

If a proxy were to resubmit a request adding a Proxy-Authorization header field value, it would need to increment the CSeq in the new request. However, this would cause the UAC that submitted the original request to discard a response from the UAS, as the CSeq value would be different. When the originating UAC receives the 407 Proxy Authentication Required, it should, if it is able, reoriginate the request with the proper credentials. It should follow the same procedures for the display of the realm parameter that are given above for responding to 401. If no credentials for a realm can be located, UACs may attempt to retry the request with a user name of anonymous and no password (a password of “”). The UAC should also cache the credentials used in the reoriginated request. The following rule is recommended for proxy credential caching.

If a UA receives a Proxy-Authenticate header field value in a 401/407 response to a request with a particular Call-ID, it should incorporate credentials for that realm in all subsequent requests that contain the same Call-ID. These credentials must not be cached across dialogs; however, if a UA is configured with the realm of its local outbound proxy, when one exists, then the UA may cache credentials for that realm across dialogs. Note that this does mean a future request in a dialog could contain credentials that are not needed by any proxy along the Route header path. Any UA that wishes to authenticate itself to a proxy server (usually, but not necessarily, after receiving a 407 Proxy Authentication Required response) may do so by including a Proxy-Authorization header field value with the request. The Proxy-Authorization request-header field allows the client to identify itself (or its user) to a proxy that requires authentication. The Proxy-Authorization header field value consists of credentials containing the authentication information of the UA for the proxy or realm of the resource being requested.

A Proxy-Authorization header field value applies only to the proxy whose realm is identified in the realm parameter (this proxy may previously have demanded authentication using the Proxy-Authenticate field). When multiple proxies are used in a chain, a Proxy-Authorization header field value must not be consumed by any proxy whose realm does not match the realm parameter specified in that value. Note that if an authentication scheme that does not support realms is used in the Proxy-Authorization header field, a proxy server must attempt to parse all Proxy-Authorization header field values to determine whether one of them has what the proxy server considers to be valid credentials. Because this is potentially very time consuming in large networks, proxy servers should use an authentication scheme that supports realms in the Proxy-Authorization header field.

If a request is forked, as described in RFC 3261 (see Section 3.11.7), various proxy servers or UAs may wish to challenge the UAC. In this case, the forking proxy server is responsible for aggregating these challenges into a single response. Each WWW-Authenticate and Proxy-Authenticate value received in responses to the forked request must be placed into the single response that is sent by the forking proxy to the UA; the ordering of these header field values is not significant. When a proxy server issues a challenge in response to a request, it will not proxy the request until the UAC has retried the request with valid credentials. A forking proxy may forward a request simultaneously to multiple proxy servers that require authentication, each of which, in turn, will not forward the request until the originating UAC has authenticated itself in their respective realm. If the UAC does not provide credentials for each challenge, the proxy servers that issued the challenges will not forward requests to the UA where the destination user might be located, and therefore, the virtues of forking are largely lost. When resubmitting its request in response to a 401 Unauthorized or 407 Proxy Authentication Required that contains multiple challenges, a UAC may include an Authorization value for each WWW-Authenticate value and a Proxy-Authorization value for each Proxy-Authenticate value for which the UAC wishes to supply a credential.

As noted above, multiple credentials in a request should be differentiated by the realm parameter. It is possible for multiple challenges associated with the same realm to appear in the same 401 Unauthorized or 407 Proxy Authentication Required. This can occur, for example, when multiple proxies within the same administrative domain, which use a common realm, are reached by a forking request. When it retries
a request, a UAC may therefore supply multiple credentials in Authorization or Proxy-Authorization header fields with the same realm parameter value. The same credentials should be used for the same realm.

19.4.5 Digest Authentication Scheme

This section describes the modifications and clarifications required to apply the HTTP Digest authentication scheme to SIP. The SIP scheme usage is almost completely identical to that for HTTP (RFC 2617). Since RFC 2543 (obsoleted by RFC 3261) is based on HTTP Digest as defined in RFC 2069, SIP servers supporting RFC 2617 must ensure they are backwards compatible with RFC 2069. Procedures for this backwards compatibility are specified in RFC 2617. Note, however, that SIP servers must not accept or request Basic authentication. The rules for Digest authentication follow those defined in RFC 2617, with HTTP/1.1 replaced by SIP/2.0 in addition to the following differences:

◾◾ The URI included in the challenge has the following ABNF: URI = SIP-URI / SIPS-URI.
◾◾ The ABNF in RFC 2617 has an error in that the uri parameter of the Authorization header field for HTTP Digest authentication is not enclosed in quotation marks. (The example in Section 3.5 of RFC 2617 is correct.) For SIP, the uri must be enclosed in quotation marks.
◾◾ The ABNF for digest-uri-value is digest-uri-value = Request-URI; as defined in Section 2.4.1.
◾◾ The example procedure for choosing a nonce based on Etag does not work for SIP.
◾◾ The text in RFC 2617 regarding cache operation does not apply to SIP.
◾◾ RFC 2617 requires a server check that the URI in the request line and the URI included in the Authorization header field point to the same resource. In a SIP context, these two URIs may refer to different users, due to forwarding at some proxy. Therefore, in SIP, a server may check that the Request-URI in the Authorization header field value corresponds to a user for whom the server is willing to accept forwarded or direct requests, but it is not necessarily a failure if the two fields are not equivalent.
◾◾ As a clarification to the calculation of the A2 value for message integrity assurance in the Digest authentication scheme, implementers should assume, when the entity-body is empty (i.e., when SIP messages have no body), that the hash of the entity-body resolves to the MD5 hash of an empty string, or: H(entity-body) = MD5("") = "d41d8cd98f00b204e9800998ecf8427e".
◾◾ RFC 2617 notes that a cnonce value must not be sent in an Authorization (and by extension Proxy-Authorization) header field if no qop directive has been sent. Therefore, any algorithms that have a dependency on the cnonce (including MD5-Sess) require that the qop directive be sent. Use of the qop parameter is optional in RFC 2617 for the purposes of backwards compatibility with RFC 2069 (note: RFC 2069 was obsoleted by RFC 2617); since RFC 2543 (obsoleted by RFC 3261) was based on RFC 2069, the qop parameter must unfortunately remain optional for clients and servers to receive. However, servers must always send a qop parameter in WWW-Authenticate and Proxy-Authenticate header field values. If a client receives a qop parameter in a challenge header field, it must send the qop parameter in any resulting authorization header field.

RFC 2543 did not allow usage of the Authentication-Info header field (it effectively used RFC 2069). However, usage of this header field is now allowed, since it provides integrity checks over the bodies and provides mutual authentication. RFC 2617 defines mechanisms for backwards compatibility using the qop attribute in the request. These mechanisms must be used by a server to determine if the client supports the new mechanisms in RFC 2617 that were not specified in RFC 2069 (obsoleted by RFC 2617).

19.4.6 Domain Certificates over TLS for Authentication in SIP

19.4.6.1 Background

The TLS (RFC 5246) protocol is available in an increasing number of SIP implementations. To use the authentication capabilities of TLS, certificates as defined by the Internet X.509 Public Key Infrastructure specified in RFC 5280 are required. Existing SIP specifications do not sufficiently specify how to use certificates for domain (as opposed to host) authentication.

RFC 5922, described here, provides guidance to ensure interoperability and uniform conventions for the construction and interpretation of certificates used to identify their holders as being authoritative for the domain. The description in RFC 5922 is pertinent to an X.509 PKIX-compliant certificate used for a TLS connection. More specifically, this specification describes how to encode and extract the identity of a SIP domain in a certificate and how to use that identity for SIP domain authentication. As such, this specification is relevant both to implementers of SIP and to issuers of certificates.

19.4.6.2 Problem Statement

TLS uses X.509 Public Key Infrastructure (RFC 5280) to bind an identity or a set of identities to the subject of an X.509 certificate. While RFC 3261 provides adequate guidance on the
use of X.509 certificates for S/MIME (see Section 19.6), it is relatively silent on the use of such certificates for TLS. With respect to certificates for TLS, Section 26.3.1 of RFC 3261 says,

    Proxy servers, redirect servers, and registrars should possess a site certificate whose subject corresponds to their canonical host name.

The security properties of TLS and S/MIME as used in SIP are different: X.509 certificates for S/MIME are generally used for end-to-end authentication and encryption; thus, they serve to bind the identity of a user to the certificate, and RFC 3261 is sufficiently clear that in certificates used for S/MIME, the subjectAltName field will contain the appropriate identity.

On the other hand, X.509 certificates used for TLS serve to bind the identities of the per-hop domain sending or receiving the SIP messages. However, the lack of guidelines in RFC 3261 on exactly where in an X.509 certificate to put identities, whether in the subjectAltName field or as a Common Name (CN) in the Subject field, created ambiguities. Following the accepted practice of the time, legacy X.509 certificates were allowed to store the identity in the CN field of the certificate instead of the currently specified subjectAltName extension. The lack of further guidelines on how to interpret the identities, which identity to choose if more than one identity is present in the certificate, the behavior when multiple identities with different schemes are present in the certificate, and so on, led to ambiguities when attempting to interpret the certificate in a uniform manner for TLS use.

We now describe how the certificates are to be used for mutual authentication when both the client and server possess appropriate certificates, and the normative behavior for matching the DNS query string with an identity stored in the X.509 certificate. Furthermore, a certificate can contain multiple identities for the subject in the subjectAltName extension (the subject of a certificate identifies the entity associated with the public key stored in the public key field). As such, RFC 5922 specifies appropriate matching rules to encompass various subject identity representation options. Finally, we also provide guidelines to service providers for assigning certificates to SIP servers.

19.4.6.3 SIP Domain to Host Resolution

Routing in SIP is performed by having the client execute RFC 3263 (see Section 8.2.4) procedures on a URI, called the Application Unique String (AUS). These procedures take as input a SIP AUS (the SIP URI), extract the domain portion of that URI for use as a lookup key, and query the DNS (see Section 8.2) to obtain an ordered set of one or more IP addresses with a port number and transport corresponding to each IP address in the set (the Expected Output). If the transport indicates the use of TLS, then a TLS connection is opened to the server on a specific IP address and port. The server presents an X.509 certificate to the client for verification as part of the initial TLS handshake. The client extracts identifiers from the Subject and any subjectAltName extension in the certificate and compares these values to the domain part extracted from the original SIP URI (the AUS). If any identifier match is found, the server is considered to be authenticated and subsequent signaling can now proceed over the TLS connection. Matching rules for X.509 certificates and the normative behavior for clients are specified here.

As an example, consider a request that is to be routed to the SIP address sips:alice@example.com. This address requires a secure connection to the SIP domain example.com (the sips scheme mandates a secure connection). Through a series of DNS manipulations, the domain name is mapped to a set of host addresses and transports. The entity attempting to create the connection selects an address appropriate for use with TLS from this set. When the connection is established to that server, the server presents a certificate asserting the identity sip:example.com. Since the domain part of the SIP AUS matches the subject of the certificate, the server is authenticated. SIPS (see Section 4.2.1) borrows this pattern of server certificate matching from HTTPS. However, RFC 2818 prefers that the identity be conveyed as a subjectAltName extension of type dNSName rather than the common practice of conveying the identity in the CN field of the Subject field.

Similarly, RFC 5922 recommends that the SIP domain identity be conveyed as a subjectAltName extension of type uniformResourceIdentifier (see Section 19.4.11). A domain name in an X.509 certificate is properly interpreted only as a sequence of octets to be compared; no inference can be drawn from the DNS name hierarchy. For example, a valid certificate for example.com does not imply that the owner of that certificate has any relationship at all to subname.example.com.

19.4.6.4 Need for Mutual Interdomain Authentication

Let us consider the SIP interdomain communications as depicted in Figure 19.8. A user, alice@example.com, invites bob@example.net for a multimedia communication session. Alice’s outbound proxy, proxy-A.example.com, uses normal RFC 3263 (see Section 8.2.4) resolution rules to find a proxy proxy-B.example.net in the example.net domain that uses TLS.

Proxy A actively establishes an interdomain TLS connection with proxy B, and each presents a certificate to authenticate that connection. In accordance with RFC 3261 (see Section 3.13), when a TLS connection is created between two proxies, each side of the connection should verify and inspect the certificate of the other, noting the domain name
[Figure 19.8: Domain A (example.com) with SIP proxy A and Alice's SIP UA (party A, alice@example.com); Domain B (example.net) with SIP proxy B and Bob's SIP UA (party B, bob@example.net); the two domains are connected over the IP/SIP network.]
Figure 19.8 SIP trapezoid with interdomain communications. (Copyright IETF. Reproduced with permission.)
that appears in the certificate for comparison with the header fields of SIP messages. However, RFC 3261 is silent on whether to use the subjectAltName or CN of the certificate to obtain the domain name, and which takes precedence when there are multiple names identifying the holder of the certificate. The authentication problem for proxy A is straightforward: in the certificate that proxy A receives from proxy B, proxy A looks for an identity that is a SIP URI (sip:example.net) or a DNS name (example.net) that asserts proxy B’s authority over the example.net domain. The normative behavior for a TLS client like proxy A is specified here.

The problem for proxy B is slightly more complex since it accepts the TLS request passively. Thus, proxy B does not possess an equivalent AUS that it can use as an anchor in matching identities from proxy A’s certificate. RFC 3261 (see Section 19.12.3.2.2) only tells proxy B to “compare the domain asserted by the certificate with the ‘domainname’ portion of the From header field in the INVITE request.” The difficulty with that instruction is that the domainname in the From header field is not always that of the domain from which the request is received. The normative behavior for a TLS server like proxy B that passively accepts a TLS connection and requires authentication of the sending peer domain is provided here.

19.4.6.5 Certificate Usage by SIP Service Provider

It is possible for service providers to continue the practice of using existing certificates for SIP usage with the identity conveyed only in the Subject field; however, they should carefully consider the following advantages of conveying identity in the subjectAltName extension field:

◾◾ The subjectAltName extension can hold multiple values, so the same certificate can identify multiple servers or SIP domains.
◾◾ There is no fixed syntax specified for the Subject field, so issuers vary in how the field content is set. This forces a recipient to use heuristics to extract the identity, again increasing opportunities for misinterpretation.

Because of these advantages, service providers are strongly encouraged to obtain certificates that contain the identity or identities in the subjectAltName extension field. When assigning certificates to authoritative servers, a SIP service provider must ensure that the SIP domain used to reach the server appears as an identity in the subjectAltName field or, for compatibility with existing certificates, the Subject field of the certificate. In practice, this means that a service provider distributes to its users SIP URIs whose domain portion corresponds to an identity for which the service provider has been issued a certificate.

19.4.6.6 Behavior of SIP Entities

This section normatively specifies the behavior of SIP entities when using X.509 certificates to determine an authenticated SIP domain identity. The first two subsections apply to all SIP implementations that use TLS to authenticate the peer; the next section describes how to extract a set of SIP identities from the certificate obtained from a TLS peer, and Sections 4.2.3
and 19.4.6 specifies how to compare SIP identities. The remain- The above procedure yields a set containing zero or more
ing subsections provide context for how and when these rules identities from the certificate. A client uses these identities to
are to be applied by entities in different SIP roles. authenticate a server, and a server uses them to authenticate
a client.
19.4.6.6.1 Finding SIP Identities in a Certificate
19.4.6.6.2 Comparing SIP Identities
Implementations (both clients and server) must determine
the validity of a certificate by following the procedures When an implementation (either client or server) com-
described in RFC 5280. As specified by Section 4.2.1.12 of pares two values as SIP domain identities: Implementations
RFC 5280, implementations must check for restrictions on must compare only the DNS name component of each SIP
certificate usage declared by any extendedKeyUsage exten- domain identifier; an implementation must not use any
sions in the certificate. The SIP Extended Key Usage (EKU) scheme or parameters in the comparison. Implementations
document (RFC 5924, see Section 19.4.6.8) defines an must compare the values as DNS names, which means that
extendedKeyUsage for SIP. Given an X.509 certificate that the comparison is case insensitive as specified by RFC 4343.
the above checks have found to be acceptable, the follow- Implementations MUST handle Internationalized Domain
ing describes how to determine what SIP domain identity Names (IDNs) in accordance with Section 7.2 of RFC 5280.
or identities the certificate contains. A single certificate can Implementations MUST match the values in their entirety:
serve more than one purpose—that is, the certificate might Implementations must not match suffixes. For example, foo
contain identities not acceptable as SIP domain identities, or .example.com does not match example.com.
might contain one or more identities that are acceptable for Implementations must not match any form of wildcard,
use as SIP domain identities. such as a leading “.” or “*.” with any other DNS label or sequence
of labels. For example, *.example.com matches only *.exam-
◾◾ Examine each value in the subjectAltName field. The ple
.com but not foo.example.com. Similarly, .example.com
subjectAltName field and the constraints on its values are defined in Section 4.2.1.6 of RFC 5280. The subjectAltName field can be absent or can contain one or more values. Each value in the subjectAltName has a type; the only types acceptable for encoding a SIP domain identity shall be as follows: if the scheme of the URI is not sip, then the implementation must not accept the value as a SIP domain identity. If the scheme of the URI value is sip, and the URI value contains a userpart (there is an “@”), the implementation must not accept the value as a SIP domain identity (a value with a userpart identifies an individual user, not a domain). If the scheme of the URI value is sip, and there is no userinfo component in the URI (there is no “@”), then the implementation must accept the hostpart as a SIP domain identity. Note that URI scheme tokens are always case insensitive. An implementation must accept a DNS identifier as a SIP domain identity if and only if no other identity is found that matches the sip URI type described above.

◾◾ If and only if the subjectAltName does not appear in the certificate, the implementation may examine the CN field of the certificate. If a valid DNS name is found there, the implementation may accept this value as a SIP domain identity. Accepting a DNS name in the CN value is allowed for backwards compatibility; however, when constructing new certificates, consider the advantages of using the subjectAltName extension field described earlier.

matches only .example.com, and does not match foo.example.com. RFC 2818 (HTTP over TLS) allows the dNSName component to contain a wildcard; for example, DNS:*.example.com. RFC 5280, while not disallowing this explicitly, leaves the interpretation of wildcards to the individual specification. RFC 3261 does not provide any guidelines on the presence of wildcards in certificates. Through the rule above, this document prohibits such wildcards in certificates for SIP domains.

19.4.6.6.3 Client Behavior

A client uses the domain portion of the SIP AUS to query a (possibly untrusted) DNS to obtain a result set, which is one or more Service (SRV) and A records identifying the server for the domain. The SIP server, when establishing a TLS connection, presents its certificate to the client for authentication. The client must determine the SIP domain identities in the server certificate using the procedure described earlier. Then, the client must compare the original domain portion of the SIP AUS used as input to the RFC 3263 (see Section 8.2.4) server location procedures to the SIP domain identities obtained from the certificate.

◾◾ If there were no identities found in the server certificate, the server is not authenticated.
◾◾ If the domain extracted from the AUS matches any SIP domain identity obtained from the certificate when compared as described above, the server is authenticated for the domain. If the server is not authenticated, the client MUST close the connection immediately.
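The identity extraction and client-side comparison rules described in this section can be sketched in Python as follows. This is a minimal illustration, not an implementation taken from RFC 5922: the function names, the (type, value) tuple representation of subjectAltName entries, and the simplified URI parsing (no IPv6 literals, no escaping) are all assumptions made for the example. Certificate path validation per RFC 5280 is assumed to have happened elsewhere.

```python
def extract_sip_domain_identities(san_entries, common_name=None):
    """Return the set of SIP domain identities found in a certificate.

    san_entries is a hypothetical, pre-parsed list of (type, value) tuples
    such as ("URI", "sip:example.com") or ("DNS", "example.com").
    common_name is the subject CN, consulted only when no SAN is present.
    """
    if not san_entries:
        # subjectAltName absent: the CN may be used for backwards compatibility.
        return {common_name.lower()} if common_name else set()

    uri_identities = set()
    dns_identities = set()
    for kind, value in san_entries:
        if kind == "URI":
            scheme, sep, rest = value.partition(":")
            if not sep or scheme.lower() != "sip":
                continue  # non-sip scheme: never a SIP domain identity
            if "@" in rest:
                continue  # userpart present: identifies a user, not a domain
            # Simplified hostpart extraction: drop parameters, headers, port.
            host = rest.split(";", 1)[0].split("?", 1)[0]
            host = host.rsplit(":", 1)[0] if ":" in host else host
            if host:
                uri_identities.add(host.lower())
        elif kind == "DNS":
            dns_identities.add(value.lower())

    # DNS identifiers count only if no sip URI identity was found.
    return uri_identities if uri_identities else dns_identities


def server_is_authenticated(aus_domain, identities):
    """Exact, case-insensitive match; no wildcard or suffix matching."""
    return aus_domain.lower() in identities
```

Because the comparison is an exact string match, a certificate value such as *.example.com can only ever match the literal string *.example.com, which is how the wildcard prohibition falls out of the sketch. A client following the text would close the TLS connection whenever server_is_authenticated returns False.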
19.4.6.6.4 Server Behavior

When a server accepts a TLS connection, the server presents its own X.509 certificate to the client. Servers that wish to authenticate the client will ask the client for a certificate. If the client possesses a certificate, that certificate is presented to the server. If the client does not present a certificate, the client must not be considered authenticated. Whether or not to close a connection if the client does not present a certificate is a matter of local policy, and depends on the authentication needs of the server for the connection. Some currently deployed servers use Digest authentication to authenticate individual requests on the connection, and choose to treat the connection as authenticated by those requests for some purposes.

If the local server policy requires client authentication for some local purpose, then one element of such a local policy might be to allow the connection only if the client is authenticated. For example, if the server is an inbound proxy that has peering relationships with the outbound proxies of other specific domains, the server might allow only connections authenticated as coming from those domains. When authenticating the client, the server must obtain the set of SIP domain identities from the client certificate as described earlier.

Because the server accepted the TLS connection passively, unlike a client, the server does not possess an AUS for comparison. Nonetheless, server policies can use the set of SIP domain identities gathered from the certificate described above to make authorization decisions. For example, a very open policy could be to accept an X.509 certificate and validate the certificate using the procedures in RFC 5280. If the certificate is valid, the identity set is logged. Alternatively, the server could have a list of all SIP domains the server is allowed to accept connections from; when a client presents its certificate, for each identity in the client certificate, the server searches for the identity in the list of acceptable domains to decide whether or not to accept the connection. Other policies that make finer distinctions are possible. The decision of whether or not the authenticated connection to the client is appropriate for use to route new requests to the client domain is independent of whether or not the connection is authenticated; the connect-reuse RFC 4474 (see Sections 2.8 and 19.4.8) discusses this aspect in more detail.

19.4.6.6.5 Proxy Behavior

A proxy must use the procedures defined earlier for a UAS when authenticating a connection from a client. A proxy must use the procedures defined for a UAC earlier when requesting an authenticated connection to a UAS. If a proxy adds a Record-Route when forwarding a request with the expectation that the route is to use secure connections, the proxy must insert into the Record-Route header a URI that corresponds to an identity for which the proxy has a certificate; if the proxy does not insert such a URI, then creation of a secure connection using the value from the Record-Route as the AUS will be impossible.

19.4.6.6.6 Registrar Behavior

A SIP registrar, acting as a server, follows the normative behavior described above. When the SIP registrar accepts a TLS connection from the client, the SIP registrar presents its certificate. Depending on the registrar policies, the SIP registrar can challenge the client with HTTP Digest.

19.4.6.6.7 Redirect Server Behavior

A SIP redirect server follows the normative behavior of a UAS as specified above.

19.4.6.6.8 Virtual SIP Servers and Certificate Content

In the virtual hosting cases where multiple domains are managed by a single application, a certificate can contain multiple subjects by having distinct identities in the subjectAltName field as specified in RFC 4474 (see Section 2.8). Clients seeking to authenticate a server on such a virtual host can still follow the directions described earlier to find the identity matching the SIP AUS used to query DNS. Alternatively, if the TLS client hello server_name extension as defined in RFC 6066 is supported, the client should use that extension to request a certificate corresponding to the specific domain (from the SIP AUS) with which the client is seeking to establish a connection.

19.4.6.6.9 Security Considerations

The goals of TLS (when used with X.509 certificates) include the following security guarantees at the transport layer:

◾◾ Confidentiality: packets tunneled through TLS can be read only by the sender and receiver.
◾◾ Integrity: packets tunneled through TLS cannot be undetectably modified on the connection between the sender and receiver.
◾◾ Authentication: each principal is authenticated to the other as possessing a private key for which a certificate has been issued. Moreover, this certificate has not been revoked, and is verifiable by a certificate chain leading to a (locally configured) trust anchor.

We expect appropriate processing of domain certificates to provide the following security guarantees at the application level:

◾◾ Confidentiality: SIPS messages from alice@example.com to [email protected] can be read only by alice@example.com, [email protected], and SIP proxies issued with domain certificates for example.com or example.net.
◾◾ Integrity: SIPS messages from [email protected] to [email protected] cannot be undetectably modified on the links between [email protected], bob@example.net, and SIP proxies issued with domain certificates for example.com or example.net.
◾◾ Authentication: [email protected] and proxy.example.com are mutually authenticated; moreover, proxy.example.com is authenticated to [email protected] as an authoritative proxy for domain example.com. Similar mutual authentication guarantees are given between proxy.example.com and proxy.example.net, and between proxy.example.net and bob@example.net. As a result, [email protected] is transitively mutually authenticated to [email protected] (assuming trust in the authoritative proxies for example.com and example.net).

19.4.6.6.10 Connection Authentication Using Digest

the Security Considerations section of the new document. All normative references from this document can be carried forward to its successor.

◾◾ Changes
The following paragraph describes changes in specific sections of RFC 3261 that need to be modified in the successor document to align them with the content of this document. In each of the following, the token <domain-authentication> is a reference to the section added as described above in “Additions.”
◾◾ Changes to Section 26.3.1 of RFC 3261
The current text says: “Proxy servers, redirect servers, and registrars should possess a site certificate whose subject corresponds to their canonical host name.” The suggested replacement for the above is: “Proxy servers, redirect servers, registrars, and any other server that is authoritative for some SIP purpose in a given domain should possess a certificate whose subjects include the name of that SIP domain.”
might be administered independently and hosted separately, it is desirable that a certificate be able to bind the DNS name to its usage as a SIP domain name without creating the implication that the entity presenting the certificate is also authoritative for some other purpose. A mechanism is needed to allow the certificate issued to a proxy to be restricted such that the subject name(s) that the certificate contains are valid only for use in SIP. In our example, proxy B possesses a certificate making proxy B authoritative as a SIP server for the domain example.net; furthermore, proxy B has a policy that requires the client’s SIP domain to be authenticated through a similar certificate. Proxy A is authoritative as a SIP server for the domain example.com; when proxy A makes a TLS connection to proxy B, the latter accepts the connection based on its policy.

19.4.6.8.1 Restricting Usage to SIP

This memo defines a certificate profile for restricting the usage of a domain name binding to usage as a SIP domain name. Section 4.2.1.12 of RFC 5280 defines a mechanism for this purpose: an Extended Key Usage (EKU) attribute, where the purpose of the EKU extension is described as follows: if the extension is present, then the certificate must only be used for one of the purposes indicated. If multiple purposes are indicated, the application need not recognize all purposes indicated, as long as the intended purpose is present. Certificate-using applications may require that the EKU extension be present and that a particular purpose be indicated in order for the certificate to be acceptable to that application. A Certificate Authority issuing a certificate whose purpose is to bind a SIP domain identity without binding other non-SIP identities MUST include an id-kp-sipDomain attribute in the EKU extension value shown in the next section.

19.4.6.8.2 EKU Values for SIP Domains

RFC 5280 specifies the EKU X.509 certificate extension for use in the Internet. The extension indicates one or more purposes for which the certified public key is valid. The EKU extension can be used in conjunction with the key usage extension, which indicates how the public key in the certificate is used, in a more basic cryptographic way.

The EKU extension syntax is repeated here for convenience:

    ExtKeyUsageSyntax ::= SEQUENCE SIZE (1..MAX) OF KeyPurposeId
    KeyPurposeId ::= OBJECT IDENTIFIER

This specification defines the KeyPurposeId id-kp-sipDomain. Inclusion of this KeyPurposeId in a certificate indicates that the use of any Subject names in the certificate is restricted to use by a SIP service (along with any usages allowed by other EKU values).

    id-kp OBJECT IDENTIFIER ::=
        { iso(1) identified-organization(3) dod(6) internet(1)
          security(5) mechanisms(5) pkix(7) 3 }
    id-kp-sipDomain OBJECT IDENTIFIER ::= { id-kp 20 }

19.4.6.8.3 Using the SIP EKU in a Certificate

The specification of domain certificates in SIP (RFC 5922, see Section 19.4.6) contains the steps for finding an identity (or a set of identities) in an X.509 certificate for SIP. To determine whether the usage of a certificate is restricted to serve as a SIP certificate only, implementations must perform the steps given below as a part of the certificate validation.

The implementation must examine the EKU value(s):

◾◾ If the certificate does not contain any EKU values (the EKU extension does not exist), it is a matter of local policy whether or not to accept the certificate for use as a SIP certificate. Note that since certificates not following this specification will not have the id-kp-sipDomain EKU value, and many do not have any EKU values, the more interoperable local policy would be to accept the certificate.
◾◾ If the certificate contains the id-kp-sipDomain EKU extension, then implementations of this specification must consider the certificate acceptable for use as a SIP certificate.
◾◾ If the certificate does not contain the id-kp-sipDomain EKU value, but does contain the id-kp-anyExtendedKeyUsage EKU value, it is a matter of local policy whether or not to consider the certificate acceptable for use as a SIP certificate.
◾◾ If the EKU extension exists, but does not contain any of the id-kp-sipDomain or id-kp-anyExtendedKeyUsage EKU values, then the certificate must not be accepted as valid for use as a SIP certificate.

19.4.6.8.4 Implications for a Certification Authority

The procedures and practices employed by a certification authority MUST ensure that the correct values for the EKU extension and subjectAltName are inserted in each certificate that is issued. For certificates that indicate authority over a SIP domain, but not over services other than SIP, certificate authorities must include the id-kp-sipDomain EKU extension.
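The four EKU checks listed in Section 19.4.6.8.3 amount to a small decision function. The Python sketch below is illustrative only: the function name, the two policy flags, and the representation of the EKU extension as a collection of dotted OID strings (or None when the extension is absent) are assumptions for this example. The id-kp-sipDomain OID follows from the ASN.1 definition in Section 19.4.6.8.2; the anyExtendedKeyUsage OID is the one assigned in RFC 5280.

```python
# id-kp is 1.3.6.1.5.5.7.3, so { id-kp 20 } is:
ID_KP_SIP_DOMAIN = "1.3.6.1.5.5.7.3.20"
# anyExtendedKeyUsage as assigned in RFC 5280:
ANY_EXTENDED_KEY_USAGE = "2.5.29.37.0"


def acceptable_as_sip_certificate(eku_oids,
                                  accept_without_eku=True,
                                  accept_any_eku=False):
    """Decide whether a certificate may be used as a SIP certificate.

    eku_oids: collection of dotted OID strings from the EKU extension,
              or None if the extension is absent (hypothetical input form).
    accept_without_eku / accept_any_eku: local-policy switches; the
              defaults here are illustrative choices, not mandated values.
    """
    if eku_oids is None:
        # No EKU extension: local policy decides (accepting is the more
        # interoperable choice according to the text).
        return accept_without_eku
    if ID_KP_SIP_DOMAIN in eku_oids:
        # id-kp-sipDomain present: must be considered acceptable.
        return True
    if ANY_EXTENDED_KEY_USAGE in eku_oids:
        # anyExtendedKeyUsage without id-kp-sipDomain: local policy decides.
        return accept_any_eku
    # EKU present but neither value found: must not be accepted.
    return False
```

For example, a certificate carrying only the TLS server-authentication purpose (1.3.6.1.5.5.7.3.1) is rejected regardless of the policy flags, while a certificate with no EKU extension is accepted under the default policy chosen here.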
19.4.6.8.5 Security Considerations

This memo defines an EKU X.509 certificate extension that restricts the usage of a certificate to a SIP service belonging to an autonomous domain. Relying parties can execute applicable policies (such as those related to billing) on receiving a certificate with the id-kp-sipDomain EKU value. An id-kp-sipDomain EKU value does not introduce any new security or privacy concerns.

19.4.6.9 Certificate and Credential Management Service for SIP

RFC 3261, as amended by RFC 3853, provides a mechanism for end-to-end encryption and integrity using S/MIME (RFC 5751, see Section 19.6). Several security properties of RFC 3261 depend on S/MIME, and yet it has not been widely deployed. One reason is the complexity of providing a reasonable certificate distribution infrastructure. RFC 6072 specifies a way to address discovery, retrieval, and management of certificates for SIP deployments. Combined with the SIP Identity (RFC 4474, see Section 19.4.8) specification, this specification allows users to have certificates that are not signed by any well-known certification authority while still strongly binding the user’s identity to the certificate. In addition, this specification provides a mechanism that allows SIP UAs such as IP phones to enroll and get their credentials without any more configuration information than they commonly have today. The end user expends no extra effort. The definitions of certificate and credentials are provided in Section 2.2.

The general approach is to provide a new SIP service referred to as a credential service that allows SIP UAs to subscribe to other users’ certificates using a new SIP event package (RFC 6665, see Section 5.2). The certificate is delivered to the subscribing UA in a corresponding SIP NOTIFY request. An authentication service as described in the SIP Identity (RFC 4474, see Section 19.4.8) specification can be used to vouch for the identity of the sender of the certificate by using the sender’s proxy domain certificate to sign the NOTIFY request. The authentication service is vouching that the sender is allowed to populate the SIP From header field value. The sender of the message is vouching that this is an appropriate certificate for the user identified in the SIP From header field value. The credential service can manage public certificates as well as the user’s private keys. Users can update their credentials, as stored on the credential service, using a SIP PUBLISH (RFC 3903, see Section 5.2) request. The UA authenticates to the credential service using a shared secret when a UA is updating a credential. Typically, the shared secret will be the same one that is used by the UA to authenticate a REGISTER request with the Registrar for the domain (usually with SIP Digest Authentication). The details of RFC 6072 for certificate and credential management of the SIP event package are not described here for the sake of brevity.

19.4.7 Authenticated Identity Body Format in SIP

19.4.7.1 Overview

RFC 3261 describes an integrity mechanism that relies on signing tunneled message/sip MIME bodies (see Section 19.6) within SIP requests. The purpose of this mechanism is to replicate the headers of a SIP request within a body carried in that request in order to provide a digital signature over these headers. The signature on this body also provides authentication. The core requirement that motivates the tunneled message/sip mechanism is the problem of providing a cryptographically verifiable identity within a SIP request. The baseline SIP protocol allows a UA to express the identity of its user in any of a number of headers. The primary place for identity information asserted by the sender of a request is the From header. The From header field contains a URI (like sip:[email protected]) and an optional display-name (like Alice) that identifies the originator of the request. A user may have many identities that are used in different contexts.

Typically, this URI is an AOR that can be dereferenced in order to contact the originator of the request; specifically, it is usually the same AOR under which a user registers their devices in order to receive incoming requests. This AOR is assigned and maintained by the administrator of the SIP service in the domain identified by the host portion of the AOR. However, the From field of a request can usually be set arbitrarily by the user of a SIP UA; the From header of a message provides no internal assurance that the originating user can legitimately claim the given identity. Nevertheless, many SIP UAs will obligingly display the contents of the From field as the identity of the originator of a received request (as a sort of caller identification function), much as e-mail implementations display the From field as the sender’s identity.

To provide the recipient of a SIP message with greater assurance of the identity of the sender, a cryptographic signature can be provided over the headers of the SIP request, which allows the signer to assert a verifiable identity. Unfortunately, a signature over the From header alone is insufficient because it could be cut-and-pasted into a replay or forwarding attack, and more headers are therefore needed to correlate a signature with a request. RFC 3261 therefore recommends copying all of the headers from the request into a signed MIME body; however, SIP messages can be large, and many of the headers in a SIP message would not be relevant in determining the identity of the sender or assuring reference integrity with the request; moreover, some headers may change in transit for perfectly valid reasons.
Thus, this large tunneled message/sip body will almost necessarily be at variance with the headers in a request when it is received by the UAS, and the burden falls on the UAS to determine which header changes were legitimate, and which were security violations. It is therefore desirable to find a happy medium: a way of signing just enough headers that the identity of the sender can be ascertained and correlated with the request. The message/sipfrag MIME type defined in RFC 3420 (see Section 2.8.2) provides a way for a subset of SIP headers to be included in a MIME body. RFC 3893 provides a more specific mechanism to derive integrity and authentication properties from an authenticated identity body, a digitally signed SIP message, or message fragment. A standard format for such bodies, known as Authenticated Identity Bodies (AIBs), is given in the next section. The AIB format described here is based on message/sipfrag.

For reasons of end-to-end privacy, it may also be desirable to encrypt AIBs. The procedures for this encryption are given here as well. This document proposes that the AIB format should be used instead of the existing tunneled message/sip mechanism described in RFC 3261 (see Section 19.6.4), in order to provide the identity of the caller; if integrity over other, unrelated headers is required, then the message/sip mechanism should be used.

19.4.7.2 AIB Format

As a way of sharing authenticated identity among parties in the network, a special type of MIME body format, the AIB format, is defined in this section. AIBs allow a party in a SIP transaction to cryptographically sign the headers that assert the identity of the originator of a message, and provide some other headers necessary for reference integrity. An AIB is a MIME body of type message/sipfrag. For more information on constructing sipfrags, including examples, see Section 2.8.2. This MIME body must have a Content-Disposition (see Section 2.8.2) disposition-type of aib, a new value defined in this document specifically for authenticated identity bodies. The Content-Disposition header should also contain a handling parameter indicating that this MIME body is optional; that is, if this mechanism is not supported by the UAS, it can still attempt to process the request.

AIBs using the message/sipfrag MIME type must contain the following headers when providing identity for an INVITE request: From, Date, Call-ID, and Contact; they should also contain the To and CSeq headers. The security properties of these headers, and circumstances in which they should be used, are also described. AIBs may contain any other headers that help uniquely identify the transaction or provide related reference integrity. An example of the AIB format for an INVITE is

    Content-Type: message/sipfrag
    Content-Disposition: aib; handling=optional
    From: Alice <sip:[email protected]>
    To: Bob <sip:[email protected]>
    Contact: <sip:[email protected]>
    Date: Thu, 21 Feb 2002 13:02:03 GMT
    Call-ID: a84b4c76e66710
    CSeq: 314159 INVITE

Unsigned AIBs MUST be treated by any recipients according to the rules set out in Section 2.3 for AIBs that do not validate. After the AIB has been signed, it should be added to existing MIME bodies in the request (such as SDP), if necessary by transitioning the outermost MIME body to a multipart/mixed format.

19.4.7.3 Example of a Request with AIB

The following shows a full SIP INVITE request with an AIB:

    INVITE sip:[email protected] SIP/2.0
    Via: SIP/2.0/UDP pc33.example.com;branch=z9hG4bKnashds8
    To: Bob <sip:[email protected]>
    From: Alice <sip:[email protected]>;tag=1928301774
    Call-ID: a84b4c76e66710
    CSeq: 314159 INVITE
    Max-Forwards: 70
    Date: Thu, 21 Feb 2002 13:02:03 GMT
    Contact: <sip:[email protected]>
    Content-Type: multipart/mixed; boundary=unique-boundary-1

    --unique-boundary-1

    Content-Type: application/sdp
    Content-Length: 147
    v=0
    o=UserA 2890844526 2890844526 IN IP4 example.com
    s=Session SDP
    c=IN IP4 pc33.example.com
    t=0 0
    m=audio 49172 RTP/AVP 0
    a=rtpmap:0 PCMU/8000
    --unique-boundary-1
    Content-Type: multipart/signed;
      protocol="application/pkcs7-signature";
      micalg=sha1; boundary=boundary42
    Content-Length: 608
    --boundary42

    Content-Type: message/sipfrag
    Content-Disposition: aib; handling=optional
    From: Alice <sip:[email protected]>
    To: Bob <sip:[email protected]>
    Contact: <sip:[email protected]>
    Date: Thu, 21 Feb 2002 13:02:03 GMT
    Call-ID: a84b4c76e66710
    CSeq: 314159 INVITE

    --boundary42
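As a rough illustration of the AIB format of Section 19.4.7.2, the Python sketch below assembles the unsigned message/sipfrag part from a dictionary of request headers and enforces the required header set (From, Date, Call-ID, Contact). The function name, the dictionary input, and the sample addresses are assumptions made for this example; signing the part with S/MIME, which is what actually makes the body an authenticated identity body, is deliberately left out.

```python
# Headers that an AIB for an INVITE must carry, and those it should carry.
REQUIRED_AIB_HEADERS = ("From", "Date", "Call-ID", "Contact")
RECOMMENDED_AIB_HEADERS = ("To", "CSeq")
# Output order used by this sketch (mirrors the example in the text).
AIB_HEADER_ORDER = ("From", "To", "Contact", "Date", "Call-ID", "CSeq")


def build_unsigned_aib(request_headers):
    """Return the text of an unsigned AIB MIME part (hypothetical helper).

    request_headers maps header names to values taken from the request.
    Raises ValueError if a required header is missing.
    """
    missing = [h for h in REQUIRED_AIB_HEADERS if h not in request_headers]
    if missing:
        raise ValueError("AIB is missing required headers: " + ", ".join(missing))

    lines = [
        "Content-Type: message/sipfrag",
        # handling=optional lets a UAS without AIB support still process
        # the request.
        "Content-Disposition: aib; handling=optional",
        "",  # blank line separating MIME part headers from the sipfrag
    ]
    for name in AIB_HEADER_ORDER:
        if name in request_headers:
            lines.append(f"{name}: {request_headers[name]}")
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # Illustrative values only.
    print(build_unsigned_aib({
        "From": "Alice <sip:alice@example.com>",
        "To": "Bob <sip:bob@example.net>",
        "Contact": "<sip:alice@pc33.example.com>",
        "Date": "Thu, 21 Feb 2002 13:02:03 GMT",
        "Call-ID": "a84b4c76e66710",
        "CSeq": "314159 INVITE",
    }))
```

The To and CSeq headers are copied when present but not enforced, matching the "should" level given to them in the text. The resulting part would then be wrapped in multipart/signed and attached alongside any existing body such as SDP.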
19.4.7.4 AIBs for Identifying Third Parties

There are special-case uses of the INVITE method in which some SIP messages are exchanged with a third party before an INVITE is sent, and in which the identity of the third party needs to be carried in the subsequent INVITE. The details of addressing identity in such contexts are outside the scope of this document. At a high level, it is possible that identity information for a third party might be carried in a supplemental AIB. The presence of a supplemental AIB within a message would not preclude the appearance of a regular AIB as specified in this document. Example cases in which supplemental AIBs might appear include the following: The use of the REFER method, for example, has a requirement for the recipient of an INVITE to ascertain the identity of the referrer who caused the INVITE to be sent. Third-party call control specified in RFC 3725 (see Section 18.3) has an even more complicated identity problem. A central controller INVITEs one party, gathers identity information (and session context) from that party, and then uses this information to INVITE another party. Ideally, the controller would also have a way to share a cryptographic identity signature given by the first party invited by the controller to the second party invited by the controller. In both of these cases, the Call-ID and CSeq of the original request (3PCC INVITE or REFER) would not correspond with those of the subsequent INVITE, nor would the To or From.

In both the REFER case and the 3PCC case, the Call-ID and CSeq cannot be used to guarantee reference integrity, and it is therefore much harder to correlate an AIB to a subsequent INVITE request. Thus, in these cases, some other headers might be used to provide reference integrity between the headers in a supplemental AIB and the headers of a 3PCC or REFER-generated INVITE; however, this usage is outside of the scope of this document. In order for AIBs to be used in these third-party contexts, further specification

19.4.7.6 Identity in Responses

Many of the practices described in the preceding sections can be applied to responses as well as requests. Note that a new set of headers must be generated to populate the AIB in a response. The From header field of the AIB in the response to an INVITE must correspond to the AOR of the responder, not to the From header field received in the request. The To header field of the request must not be included. A new Date header field and Contact header field should be generated for the AIB in a response. The Call-ID and CSeq should, however, be copied from the request.

Generally, the To header field of the request will correspond to the AOR of the responder. In some architectures where retargeting is used, however, this need not be the case. Some recipients of response AIBs may consider it a cause for security concern if the To header field of the request is not the same as the AOR in the From header field of the AIB in a response.

19.4.7.7 Receiving an AIB

When a UA receives a request containing an AIB, it must verify the signature, including validating the certificate of the signer, and compare the identity of the signer (the subjectAltName) with, in the INVITE case, the domain portion of the URI in the From header field of the request (for non-INVITE requests, other headers may be subject to this comparison). The two should correspond exactly; if they do not, the UA must report this condition to its user before proceeding. UAs may distinguish between plausibly minor variations (the difference between example.com and sip.example.com) and major variations (example.com versus example.org) when reporting these discrepancies in order to give the user some idea of how to handle this situation.

Analysis and comparison of the Date, Call-ID, and Contact header fields, as explained later for providing security, must also be performed. Any discrepancies or violations
Accordingly, if an AIB is replayed within the Date interval, receivers will recognize that it is invalid because of a Call-ID duplication; if an AIB is replayed after the Date interval, receivers will recognize that it is invalid because the Date is stale. The Contact header field is included to tie the AIB to a particular device instance that generated the request. Were an active attacker to intercept a request containing an AIB, and cut-and-paste the AIB into their own request (reusing the From, Contact, Date, and Call-ID fields that appear in the AIB), they would not be eligible to receive SIP requests from the called UA, since those requests are routed to the URI identified in the Contact header field.

The To and CSeq header fields provide properties that are generally useful, but not for all possible applications of AIBs. If a new AIB is issued each time a new SIP transaction is initiated in a dialog, the CSeq header field provides a valuable property (replay protection for this particular transaction). If, however, one AIB is used for an entire dialog, subsequent transactions in the dialog would use the same AIB that appeared in the INVITE transaction. Using a single AIB for an entire dialog reduces the load on the generator of the AIB. The To header field usually designates the original URI that the caller intended to reach, and therefore it may vary from the Request-URI if retargeting occurs at some point in the network. Accordingly, including the To header field in the AIB helps identify cut-and-paste attacks in which an AIB sent to a particular destination is reused to impersonate the sender to a different destination. However, the inclusion of the To header field probably would not make sense for many third-party AIB cases, nor is its inclusion necessary for responses.

19.4.8 Cryptographic Authentication Scheme

In SIP, an identity is usually defined as a SIP URI, commonly a canonical AOR employed to reach a user (such as sip:alice@atlanta.example.com). RFC 3261 stipulates several places within a SIP request where a user can express an identity for themselves, notably the user-populated From header field. However, the recipient of a SIP request has no way to verify that the From header field has been populated appropriately, in the absence of some sort of cryptographic authentication mechanism. RFC 3261 also specifies a number of security mechanisms that can be employed by SIP UAs, including Digest, TLS, and S/MIME (implementations may support other security schemes as well). However, few SIP UAs today support the end-user certificates necessary to authenticate themselves (via S/MIME, e.g., see Section 19.6), and furthermore Digest authentication is limited by the fact that the originator and destination must share a prearranged secret. It is desirable for SIP UAs to be able to send requests to destinations with which they have no previous association; one can receive a call from someone with whom one has no previous association, and still have a reasonable assurance that the person’s displayed Caller-ID is accurate. A cryptographic approach can probably provide a much stronger and less-spoofable assurance of identity than the S/MIME mechanism (see Section 19.6) provides for SIP networks. Cryptographically assuring the identity of the end users that originate SIP requests is an essential need, especially in an interdomain context. RFC 4474 enhances authenticated identity management by specifying a mechanism for securely identifying originators of SIP messages. It does so by defining two new SIP header fields: Identity, for conveying a signature used for validating the identity, and Identity-Info, for conveying a reference to the certificate of the signer. The mechanism satisfies the following critical requirements:

◾◾ The mechanism allows a UAC or a proxy server to provide a strong cryptographic identity assurance in a request that can be verified by a proxy server or UAS.
◾◾ UAs that receive identity assurances are able to validate these assurances without performing any network lookup.
◾◾ UAs that hold certificates on behalf of their user are capable of adding this identity assurance to requests.
◾◾ Proxy servers that hold certificates on behalf of their domain are capable of adding this identity assurance to requests. However, a UAC is not required to support this mechanism in order for an identity assurance to be added to a request in this fashion.
◾◾ The mechanism prevents replay of the identity assurance by an attacker.
◾◾ The mechanism is capable of protecting the integrity of SIP message bodies (to ensure that media offers and answers are linked to the signaling identity) and thereby provides full replay protection.
◾◾ It is also possible for a user to have multiple AORs (i.e., accounts or aliases) that it is authorized to use within a domain, and for the UAC to assert one identity while authenticating itself as another, related, identity, as permitted by the local policy of the domain.

19.4.8.1 Background

The usage of many SIP applications and services is governed by authorization policies as described earlier (see Section 19.1) and in Section 19.5. These policies may be automated, or they may be applied manually by humans. An example of the latter would be an Internet telephone application that displays the Caller-ID of a caller, which a human may review before answering a call. An example of the former would be a presence service that compares the identity of potential subscribers to a whitelist before determining whether it should accept or reject the subscription. In both of these cases,
attackers might attempt to circumvent these authorization policies through impersonation. Since the primary identifier of the sender of a SIP request, the From header field, can be populated arbitrarily by the controller of a UA, impersonation is very simple today.
The mechanism (RFC 4474) that is described here aspires to provide a strong identity system for SIP in which authorization policies cannot be circumvented by impersonation. All RFC 3261-compliant UAs support Digest authentication, which utilizes a shared secret, as a means of authenticating themselves to a SIP registrar. Registration allows a UA to express that it is an appropriate entity to which requests should be sent for a particular SIP AOR URI (e.g., sip:alice@atlanta.example.com). By the definition of identity used here, registration is a proof of the identity of the user to a registrar. However, the credentials with which a UA proves its identity to a registrar cannot be validated by just any UA or proxy server; these credentials are shared only between the UA and its domain administrator. Thus, this shared secret does not immediately help a user to authenticate to a wide range of recipients.
Recipients require a means of determining whether or not the return address identity of a non-REGISTER request (i.e., the From header field value) has legitimately been asserted. The AOR URI used for registration is also the URI with which a UA commonly populates the From header field of requests in order to provide a return address identity to recipients. From an authorization perspective, if you can prove you are eligible to register in a domain under a particular AOR, you can prove you can legitimately receive requests for that AOR, and accordingly, when you place that AOR in the From header field of a SIP request other than a registration (like an INVITE), you are providing a return address where you can legitimately be reached. In other words, if you are authorized to receive requests for that return address, logically, it follows that you are also authorized to assert that return address in your From header field. This is of course only one manner in which a domain might determine how a particular user is authorized to populate the From header field; as an aside, for other sorts of URIs in the From header field (like anonymous URIs), other authorization policies would apply.
Ideally, then, SIP UAs should have some way of proving to recipients of SIP requests that their local domain has authenticated them and authorized the population of the From header field. This document proposes a mediated authentication architecture for SIP in which requests are sent to a server in the user’s local domain, which authenticates such requests (using the same practices by which the domain would authenticate REGISTER requests). Once a message has been authenticated, the local domain then needs some way to communicate to other SIP entities that the sending user has been authenticated and its use of the From header field has been authorized. This document addresses how that imprimatur of authentication can be shared.
RFC 3261 already describes an architecture very similar to this (see Sections 19.4.1 through 19.4.5), in which a UA authenticates itself to a local proxy server, which in turn authenticates itself to a remote proxy server via mutual TLS, creating a two-link chain of transitive authentication between the originator and the remote domain. While this works well in some architectures, there are a few respects in which this is impractical. For one, transitive trust is inherently weaker than an assertion that can be validated end-to-end. It is possible for SIP requests to cross multiple intermediaries in separate administrative domains, in which case transitive trust becomes even less compelling.
One solution to this problem is to use trusted SIP intermediaries that assert an identity for users in the form of a privileged SIP header. A mechanism for doing so (with the P-Asserted-Identity header) is given in RFC 3325 (see Sections 2.8 and 20.3). However, this solution allows only hop-by-hop trust between intermediaries, not end-to-end cryptographic authentication, and it assumes a managed network of nodes with strict mutual trust relationships, an assumption that is incompatible with widespread Internet deployment. Accordingly, this document specifies a means of sharing a cryptographic assurance of end-user SIP Identity in an interdomain or intradomain context that is based on the concept of an authentication service and a new SIP header, the Identity header. Note that the scope of this cryptographic authentication mechanism is limited to providing this identity assurance for SIP requests; solving this problem for SIP responses is more complicated and is a subject for future research.
This specification allows either a UA or a proxy server to provide identity services and to verify identities. To maximize end-to-end security, it is obviously preferable for end users to acquire their own certificates and corresponding private keys; if they do, they can act as an authentication service. However, end-user certificates may be neither practical nor affordable, given the difficulties of establishing a Public Key Infrastructure (PKI) that extends to end users, and moreover, given the potentially large number of SIP UAs (phones, PCs, laptops, PDAs, gaming devices) that may be employed by a single user. In such environments, synchronizing keying material across multiple devices may be very complex and requires a good deal of additional end-point behavior. Managing several certificates for the various devices is also quite problematic and unpopular with users. Accordingly, in the initial use of this mechanism, it is likely that intermediaries will instantiate the authentication service role.
19.4.8.2 Cryptographic Operations
A high-level informative overview of the mechanisms is described here. Imagine the case where Alice, who has the home proxy of example.com and the AOR sip:alice@example.com, wants to communicate with sip:bob@example.org. Alice generates an INVITE and places her identity in the From header field of the request. She then sends the INVITE over TLS to an authentication service proxy for her domain. The authentication service authenticates Alice (possibly by sending a Digest authentication challenge) and validates that she is authorized to assert the identity that is populated in the From header field. This value may be Alice’s AOR, or it may be some other value that the policy of the proxy server permits her to use.
The authentication service then computes a hash over some particular headers, including the From header field and the bodies in the message. This hash is signed with the private key of the certificate for the domain (example.com, in Alice’s case) and inserted in a new header field in the SIP message, the Identity header. The proxy, as the holder of the private key of its domain, is asserting that the originator of this request has been authenticated and that she is authorized to claim the identity (the SIP AOR) that appears in the From header field. The proxy also inserts a companion header field, Identity-Info, that tells Bob how to acquire its certificate, if he does not already have it. When Bob’s domain receives the request, it verifies the signature provided in the Identity header, and thus can validate that the domain indicated by the host portion of the AOR in the From header field authenticated the user, and permitted the user to assert that From header field value. This same validation operation may be performed by Bob’s UAS.
19.4.8.3 Authentication Service Behavior
RFC 4474 defines a new role for SIP entities called an authentication service. The authentication service role can be instantiated by a proxy server or a UA. Any entity that instantiates the authentication service role must possess the private key of a domain certificate. Intermediaries that instantiate this role must be capable of authenticating one or more SIP users that can register in that domain. Commonly, this role will be instantiated by a proxy server, since these entities are more likely to have a static host name, hold a corresponding certificate, and have access to SIP registrar capabilities that allow them to authenticate users in their domain. It is also possible that the authentication service role might be instantiated by an entity that acts as a redirect server; however, that is left as a topic for future work.
SIP entities that act as an authentication service must add a Date header field to SIP requests if one is not already present (see Syntax in Section 16.2.7, for information on how the Date header field assists verifiers). Similarly, authentication services must add a Content-Length header field to SIP requests if one is not already present; this can help verifiers to double-check that they are hashing exactly as many bytes of message body as the authentication service when they verify the message. Entities instantiating the authentication service role perform the following steps, in order, to generate an Identity header for a SIP request.
Step 1:
The authentication service must extract the identity of the sender from the request. The authentication service takes this value from the From header field; this AOR will be referred to here as the identity field. If the identity field contains a SIP or SIP Secure (SIPS) URI, the authentication service must extract the host name portion of the identity field and compare it to the domain(s) for which it is responsible (following the procedures in RFC 3261, Chapter 9, used by a proxy server to determine the domain(s) for which it is responsible). If the identity field uses the TEL URI scheme, the policy of the authentication service determines whether or not it is responsible for this identity. If the authentication service is not responsible for the identity in question, it should process and forward the request normally, but it must not add an Identity header; see below for more information on authentication service handling of an existing Identity header.
Step 2:
The authentication service must determine whether or not the sender of the request is authorized to claim the identity given in the identity field. In order to do so, the authentication service must authenticate the sender of the message. Some possible ways in which this authentication might be performed include the following:
If the authentication service is instantiated by a SIP intermediary (proxy server), it may challenge the request with a 407 response code using the Digest authentication scheme (or view a Proxy-Authorization header sent in the request, which was sent in anticipation of a challenge using cached credentials, as described in RFC 3261, Section 19.4.3). Note that if that proxy server is maintaining a TLS connection with the client over which the client had previously authenticated itself using Digest authentication, the identity value obtained from that previous authentication step can be reused without an additional Digest challenge.
If the authentication service is instantiated by a SIP UA, a UA can be said to authenticate its user on the grounds that the user can provision the UA with the private key of the domain, or preferably by providing a password that unlocks said private key.
Authorization of the use of a particular user name in the From header field is a matter of local policy for the authentication service, one that depends greatly on the manner in which authentication is performed. For example, one policy might be as follows: the user name given in the username parameter of the Proxy-Authorization
header must correspond exactly to the user name in the From header field of the SIP message. However, there are many cases in which this is too limiting or inappropriate; a realm might use username parameters in Proxy-Authorization that do not correspond to the user portion of SIP From headers, or a user might manage multiple accounts in the same administrative domain.
In this latter case, a domain might maintain a mapping between the values in the username parameter of Proxy-Authorization and a set of one or more SIP URIs that might legitimately be asserted for that user name. For example, the user name can correspond to the private identity as defined in 3GPP, in which case the From header field can contain any one of the public identities associated with this private identity. In this instance, another policy might be as follows: the URI in the From header field must correspond exactly to one of the mapped URIs associated with the username given in the Proxy-Authorization header. Various exceptions to such policies might arise for cases like anonymity; if the AOR asserted in the From header field uses a form like sip:anonymous@example.com, then the example.com proxy should authenticate that the user is a valid user in the domain and insert the signature over the From header field as usual.
Note that this check is performed on the addr-spec in the From header field (e.g., the URI of the sender, like sip:alice@atlanta.example.com); it does not cover the display-name portion of the From header field (e.g., Alice Atlanta). Authentication services may check and validate the display name as well, and compare it to a list of acceptable display names that may be used by the sender; if the display name does not meet policy constraints, the authentication service must return a 403 response code. The reason phrase should indicate the nature of the problem; for example, Inappropriate Display Name. However, the display name is not always present, and in many environments the requisite operational procedures for display-name validation may not exist.
Step 3:
The authentication service should ensure that any preexisting Date header in the request is accurate. Local policy can dictate precisely how accurate the Date must be; a recommended maximum discrepancy of 10 minutes will ensure that the request is unlikely to upset any verifiers. If the Date header contains a time different by more than 10 minutes from the current time noted by the authentication service, the authentication service should reject the request. This behavior is not mandatory because a UAC could only exploit the Date header in order to cause a request to fail verification; the Identity header is not intended to provide a source of nonrepudiation or a perfect record of when messages are processed. Finally, the authentication service must verify that the Date header falls within the validity period of its certificate. For more information on the security properties associated with the Date header field value, see Section 19.4.8.7.
Step 4:
The authentication service must form the identity signature and add an Identity header to the request containing this signature. After the Identity header has been added to the request, the authentication service must also add an Identity-Info header. The Identity-Info header contains a URI from which its certificate can be acquired. Finally, the authentication service must forward the message normally.
19.4.8.3.1 Identity within a Dialog and Retargeting
Retargeting is broadly defined as the alteration of the Request-URI by intermediaries. More specifically, retargeting supplants the original target URI with one that corresponds to a different user, a user that is not authorized to register under the original target URI. By this definition, retargeting does not include translation of the Request-URI to a contact address of an end point that has registered under the original target URI, for example. When a dialog-forming request is retargeted, this can cause a few wrinkles for the Identity mechanism when it is applied to requests sent in the backwards direction within a dialog. This section provides some nonnormative considerations related to this case. When a request is retargeted, it may reach a SIP end point whose user is not identified by the URI designated in the To header field value.
The value in the To header field of a dialog-forming request is used as the From header field of requests sent in the backwards direction during the dialog, and is accordingly the header that would be signed by an authentication service for requests sent in the backwards direction. In retargeting cases, if the URI in the From header does not identify the sender of the request in the backwards direction, then clearly it would be inappropriate to provide an Identity signature over that From header. As specified above, if the authentication service is not responsible for the domain in the From header field of the request, it must not add an Identity header to the request, and it should process and forward the request normally.
Any means of anticipating retargeting, and so on, is outside the scope of this mechanism, and likely to have equal applicability to response identity as it does to requests in the backwards direction within a dialog. Consequently, no
special guidance is given for implementers here regarding the connected party problem; authentication service behavior is unchanged if retargeting has occurred for a dialog-forming request. Ultimately, the authentication service provides an Identity header for requests in the backwards direction of a dialog when the user is authorized to assert the identity given in the From header field, and if they are not, an Identity header is not provided. Although retargeting has some benefits from a communications efficiency point of view, it may introduce several security problems: service hijacking, insecure responses, confidentiality problems, circumvention of blacklists, and rampant transitivity. However, these security problems of retargeted response identity can be solved by meeting the following requirements:
◾◾ In an ideal world, it would be possible for a UAC to have a strong assurance that intermediaries were behaving properly, and furthermore to have the capability to differentiate between properly behaving intermediaries and attackers.
◾◾ It must be possible for a UAC to detect when a request has been retargeted.
◾◾ A domain that changes the target of a request must be capable of informing the UAC of the new target(s).
◾◾ The mechanism must allow simple intradomain retargeting in cases where persistent TLS connections are used as a network address translation (NAT) traversal mechanism.
◾◾ It must be possible for a domain that changes the target of a request to inform the UAC of the new target(s) before contacting any of the new target(s). There must furthermore be a way for intermediaries to determine when UACs require prior information about new targets.
◾◾ It must be possible to preserve the privacy of targets and potential targets of requests.
◾◾ It must be possible to preserve the ordering of a target set desired by the domain that changes the target of a request.
19.4.8.4 Verifier Behavior
RFC 4474 introduces a new logical role for SIP entities called a verifier. When a verifier receives a SIP message containing an Identity header, it may inspect the signature to verify the identity of the sender of the message. Typically, the results of a verification are provided as input to an authorization process that is outside the scope of this document. If an Identity header is not present in a request, and one is required by local policy (e.g., based on a per-sending-domain policy, or a per-sending-user policy), then a 428 Use Identity Header response must be sent. To verify the identity of the sender of a message, an entity acting as a verifier must perform the following steps, in the order here specified.
Step 1:
The verifier must acquire the certificate for the signing domain. Implementations supporting this specification should have some means of retaining domain certificates (in accordance with normal practices for certificate lifetimes and revocation) in order to prevent themselves from needlessly downloading the same certificate every time a request from the same domain is received. Certificates cached in this manner should be indexed by the URI given in the Identity-Info header field value. If the domain certificate used to sign this message is not already known to the verifier, SIP entities should discover this certificate by dereferencing the Identity-Info header, unless they have some more efficient implementation-specific way of acquiring certificates for that domain. If the URI scheme in the Identity-Info header cannot be dereferenced, then a 436 Bad Identity-Info response must be returned. The verifier processes this certificate in the usual ways, including checking that it has not expired, that the chain is valid back to a trusted certification authority (CA), and that it does not appear on revocation lists. Once the certificate is acquired, it must be validated following the procedures in RFC 3280. If the certificate cannot be validated (it is self-signed and untrusted, or signed by an untrusted or unknown certificate authority, expired, or revoked), the verifier must send a 437 Unsupported Certificate response.
Step 2:
The verifier must follow the process described in Section 19.4.8.11.4 to determine if the signer is authoritative for the URI in the From header field.
Step 3:
The verifier must verify the signature in the Identity header field, following the procedures for generating the hashed digest-string described in Section 19.4.8.7. If a verifier determines that the signature on the message does not correspond to the reconstructed digest-string, then a 438 Invalid Identity Header response must be returned.
Step 4:
The verifier must validate the Date, Contact, and Call-ID headers in the manner described in Section 19.4.8.11.1; recipients that wish to verify Identity signatures must support all of the operations described there. It must furthermore ensure that the value of the Date header falls within the validity period of the certificate whose corresponding private key was used to sign the Identity header.
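The four verifier steps, together with the response codes named in this section, can be sketched as a simple decision function. This is a minimal illustration in Python rather than an implementation of RFC 4474: the function name verify_identity, the dictionary-based header and certificate representations, and the four callables standing in for certificate retrieval, chain validation, the authority check, and signature verification are all assumptions made for this example. This section gives no response code for a failure in Step 2 or for a Date value outside the certificate validity period, so the sketch reports those cases as plain rejections; the Contact and Call-ID checks of Section 19.4.8.11.1 are omitted.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Callable, Mapping, Optional


def verify_identity(
    headers: Mapping[str, str],
    identity_required: bool,
    fetch_cert: Callable[[str], Optional[dict]],
    cert_is_trusted: Callable[[dict], bool],
    signer_is_authoritative: Callable[[dict, str], bool],
    signature_matches: Callable[[dict, Mapping[str, str]], bool],
) -> str:
    """Return 'accept' or a rejection string for one SIP request (sketch)."""
    if "Identity" not in headers:
        # Local policy decides whether an unsigned request is acceptable.
        return "428 Use Identity Header" if identity_required else "accept"

    # Step 1: acquire the signing domain's certificate, then validate it.
    # fetch_cert is assumed to return None when the URI cannot be dereferenced.
    cert = fetch_cert(headers.get("Identity-Info", ""))
    if cert is None:
        return "436 Bad Identity-Info"
    if not cert_is_trusted(cert):
        return "437 Unsupported Certificate"

    # Step 2: the signer must be authoritative for the URI in From.
    if not signer_is_authoritative(cert, headers.get("From", "")):
        return "reject: signer not authoritative for From"

    # Step 3: compare the signature with the reconstructed digest-string.
    if not signature_matches(cert, headers):
        return "438 Invalid Identity Header"

    # Step 4 (Date part only): Date must fall within the certificate validity.
    raw_date = headers.get("Date")
    if raw_date is None:
        return "reject: missing Date header"
    date = parsedate_to_datetime(raw_date)
    if date.tzinfo is None:  # treat a zone-less value as UTC
        date = date.replace(tzinfo=timezone.utc)
    if not cert["not_before"] <= date <= cert["not_after"]:
        return "reject: Date outside certificate validity period"
    return "accept"


if __name__ == "__main__":
    # Hypothetical certificate metadata and request headers for illustration.
    cert = {
        "not_before": datetime(2005, 1, 1, tzinfo=timezone.utc),
        "not_after": datetime(2006, 1, 1, tzinfo=timezone.utc),
    }
    request = {
        "From": "sip:alice@atlanta.example.com",
        "Date": "Fri, 15 Jul 2005 12:00:00 GMT",
        "Identity": "(base64 signature)",
        "Identity-Info": "https://atlanta.example.com/atlanta.cer",
    }
    print(verify_identity(request, True, lambda uri: cert,
                          lambda c: True, lambda c, f: True, lambda c, h: True))
```

Passing the checks in as callables keeps the ordering of the steps visible while leaving certificate handling and RSA verification to whatever library a real verifier would use.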
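Step 3 of the verifier procedure depends on rebuilding the digest-string of Section 19.4.8.7, which joins the From addr-spec, the To addr-spec, the Call-ID, the CSeq number and method, the Date, the optional Contact addr-spec, and the message body with "|" separators. The following Python sketch shows one way to assemble that string and compute its SHA-1 hash. The function names build_digest_string and sha1_hex and the sample header values are assumptions made for this illustration; the RSA signing and base64 encoding steps are left out because they need a cryptographic library outside the Python standard library.

```python
import hashlib


def build_digest_string(
    from_uri: str,
    to_uri: str,
    call_id: str,
    cseq_number: int,
    method: str,
    sip_date: str,
    contact_uri: str = "",
    body: bytes = b"",
) -> bytes:
    """Assemble the canonical string; Contact and body may be empty."""
    head = "|".join(
        [from_uri, to_uri, call_id, f"{cseq_number} {method}", sip_date, contact_uri]
    )
    # The body is appended byte-for-byte after the final "|".
    return head.encode("utf-8") + b"|" + body


def sha1_hex(digest_string: bytes) -> str:
    """Hexadecimal SHA-1 of the canonical string (the hash that gets signed)."""
    return hashlib.sha1(digest_string).hexdigest()


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the reference archive.
    canonical = build_digest_string(
        "sip:alice@atlanta.example.com",
        "sip:bob@biloxi.example.org",
        "a84b4c76e66710",
        314159,
        "INVITE",
        "Thu, 21 Feb 2002 13:02:03 GMT",
        "sip:alice@pc33.atlanta.example.com",
        b"v=0\r\n",
    )
    print(canonical)
    print(sha1_hex(canonical))
```

With no Contact header and no body, the string ends in two consecutive "|" characters, matching the rule that empty fields leave nothing between or after their separators.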
◾◾ The addr-spec component of the Contact header field value. If the request does not contain a Contact header, this field must be empty (i.e., there will be no white space between the fifth and sixth “|” characters in the canonical string).
◾◾ The body content of the message with the bits exactly as they are in the message (in the ABNF for SIP, the message body). This includes all components of multipart message bodies. Note that the message body does not include the CRLF separating the SIP headers from the message body, but does include everything that follows that CRLF. If the message has no body, then message body will be empty, and the final “|” will not be followed by any additional characters.
For more information on the security properties of these headers, and why their inclusion mitigates replay attacks, see Section 19.4.8.11 and RFC 3893 (see Sections 19.4.6.9 and 19.4.7). The precise formulation of this digest-string is, therefore (following the ABNF in RFC 3261):
    digest-string = addr-spec "|" addr-spec "|" callid "|"
                    1*DIGIT SP Method "|" SIP-date "|" [ addr-spec ] "|"
                    message-body
Note again that the first addr-spec must be taken from the From header field value, the second addr-spec must be taken from the To header field value, and the third addr-spec must be taken from the Contact header field value, provided the Contact header is present in the request. After the digest-string is formed, it must be hashed and signed with the private key of the certificate for the domain. The hashing and signing algorithm is specified by the alg parameter of the Identity-Info header. This document defines only one value for the alg parameter: rsa-sha1; further values must be defined in a Standards Track RFC. All implementations of this specification must support rsa-sha1. When the rsa-sha1 algorithm is specified in the alg parameter of Identity-Info, the hash and signature must be generated as follows: compute the results of signing this string with sha1WithRSAEncryption as described in RFC 3370 and base64 encode the results as specified in RFC 4648. A 1024-bit or longer RSA key must be used. The result is placed in the Identity header field. For detailed examples of the usage of this algorithm, see Section 19.4.8.8.
The absoluteURI portion of the Identity-Info header must contain a URI that dereferences to a resource containing the certificate of the authentication service. All implementations of this specification must support the use of HTTP and HTTPS URIs in the Identity-Info header. Such HTTP and HTTPS URIs must follow the conventions of RFC 2585, and for those URIs the indicated resource must be of the form application/pkix-cert described in that specification. Note that this introduces key life-cycle management concerns; were a domain to change the key available at the Identity-Info URI before a verifier evaluates a request signed by an authentication service, this would cause obvious verifier failures. When a rollover occurs, authentication services should thus provide new Identity-Info URIs for each new certificate, and should continue to make older key acquisition URIs available for a duration longer than the plausible lifetime of a SIP message (an hour would most likely suffice).
The Identity-Info header field must contain an alg parameter. No other parameters are defined for the Identity-Info header in this document. Future Standards Track RFCs may define additional Identity-Info header parameters.
RFC 4474 adds new entries for the Identity and Identity-Info header fields to Table 2 of RFC 3261 (see Table 2.5, Section 2.8). Note that this mechanism does not protect the CANCEL method as indicated in Table 2.5, Section 2.8. The CANCEL method cannot be challenged, because it is hop-by-hop, and accordingly the authentication service behavior for CANCEL would be significantly limited. Also, note that the REGISTER method uses Contact header fields in very unusual ways that complicate its applicability to this mechanism, and the use of Identity with REGISTER is consequently a subject for future study, although it is left as optional here for forward-compatibility reasons. The Identity and Identity-Info headers must not appear in CANCEL.
19.4.8.8 Compliance Tests and Examples
The examples in this section illustrate the use of the Identity header in the context of a SIP transaction. Implementers are advised to verify their compliance with the specification against the following criteria:
◾◾ Implementations of the authentication service role must generate identical base64 identity strings to the ones shown in the Identity headers in these examples when presented with the source message and utilizing the appropriate supplied private key for the domain in question.
◾◾ Implementations of the verifier role must correctly validate the given messages containing the Identity header when utilizing the supplied certificates (with the caveat about self-signed certificates below).
Note that the following examples use self-signed certificates, rather than certificates issued by a recognized certificate authority. The use of self-signed certificates for this mechanism is not recommended, and it appears here only for illustrative purposes. Therefore, in compliance testing, implementations of verifiers should generate appropriate warnings about the use of self-signed certificates. Also, the example certificates
in this section have placed their domain name subject in the subjectAltName field; in practice, certificate authorities may place domain names in other locations in the certificate (see Section 19.4.8.11.4 for more information). Note that all examples use the rsa-sha1 algorithm (RFCs 3110, 3370, 4359, and 4648). Bit-exact reference files for these messages and their various transformations are supplied here.
Bit-Exact Archive of Examples of Messages:
The following text block is an encoded, gzip-compressed TAR archive of files that represent the transformations performed on the examples of messages discussed in Section 2.6. It includes for each example:
◾◾ (foo).message: the original message
◾◾ (foo).canonical: the canonical string constructed from that message
◾◾ (foo).sha1: the SHA1 hash of the canonical string (hexadecimal)
◾◾ (foo).signed: the RSA-signed SHA1 hash of the canonical string (binary)
◾◾ (foo).signed.enc: the base64 encoding of the RSA-signed SHA1 hash of the canonical string as it would appear in the request
◾◾ (foo).identity: the original message with the Identity and Identity-Info headers added
Also included in the archive are two public key/certificate pairs, for atlanta.example.com and biloxi.example.org, respectively, including
◾◾ (foo).cer: the certificate of the domain
◾◾ (foo).privkey: the private key of the domain
◾◾ (foo).pubkey: the public key of the domain, extracted from the cert file for convenience
To recover the compressed archive file intact, the text of this document may be passed as input to the following Perl script (the output should be redirected to a file or piped to tar -xzvf -):
    #!/usr/bin/perl
    use strict;
    my $bdata = "";
    use MIME::Base64;
    while (<>) {
        if (/-- BEGIN MESSAGE ARCHIVE --/ .. /-- END MESSAGE ARCHIVE --/) {
            if (m/^\s*[^\s]+\s*$/) {
                $bdata = $bdata . $_;
            }
        }
    }
    print decode_base64($bdata);
Alternatively, the base64 encoded block can be edited by hand to remove document structure lines and fed as input to any base64 decoding utility.
Encoded Reference Files:
-- BEGIN MESSAGE ARCHIVE --
H4sICFfaz0QCA25ld2lkZW50LnRhcgDsW0us5NhZ7gUS
wqiF2CAhFikiIQhFt992+U46it+u8qPK5Uc9WPlVfj/Kd
pXtomEDCxaAhFggISE2WSHCIoIFioQQC8gqAhRAQQTY8J
JAbMgGIYTv7b7T09PT0xNl+mqS3F8qVd3jY/uc85//+87
/nXOLoIv9oGjBB2/PIAiDSBwfv1GERInxG8EwAh6/37UH
MIQRKIljCI4+gGCUGKtP8Ad3YKemderJ5EFSBW1QN2Xxm
np5GtblqXqUPfIffBdZcet/p82conUee0H9sfsfhiACw17
nfwQa y+Dra+MkQGFkrI+TOPJgAt37/63bo2tjeHGuTVh
+bc6FOUub/E0poM7nLGqyLJ06Id3NGTocPxytMWF6jNJY
pDqIoXVLoDlmr+pNx+o7ztZ1ke8WtnXhFUClU5GGLZ6lO
3YN8T3P0Usm1GyG9lQGEiBXFE6+yPecSSvPykuV4TPB5n
e9xNEO8KxQVXnk3cqn/TaK3C3T7A08cRGokyJPUzmrV7k
5pHK7i5bQyOambNcDLxUmH9zMD2sl8FGa+WGtBG6bGe5n
HafvFnK5n0dnT6N1nmF0mgt3EK3OxQVdiuMzZrNOhPxNOF
37W7w4LmsLOA0Mpeqt7RTKTrDX1CztZgezbM7rLlvQeBn
hWzWOV5qDZEdMahLZTo8Wq0oZOL4XFgkgMhY4pNBdU53sH
VvlaIX5TjqH0+JkYXAXmmzgSI7H9N3RvHingrIOAUIzCp
h4GhsdHGDwET+WCO5SuDtwxXKNvneGYrWiQ5WhaTEJXb0
LXb6Trgd2DS0ZZscLWm6Bau3aO48HZK4GEWgzN2oRTuBaG/
vLXA+aZKh8kDBYyJj7bHWREXgjMWxIgFQrxPyxb3eUc3E
EH6iEptuYL1zFRCpr22rPXujFs9EPx0s+o67pbhzRa/eO
jvEZX+wjt1hHgKpDHdvdXJA5er1Y22tRXXed+KwyxzFad
FtZyW1st4E7V7ROO4Rqw5Cnx6ncXb/Z5ztdUOmx34dX3C
k8cydPc76+a5uO4XLTMI9Q3iIwDJBOloNbUahd5OK7FnQ
u637tL/cQdlSHel5tRVjh84Jfhl7pDfV2zZyPeEVs3D
3t8XoKAVzDo3YAad6sp4r8nCUbUmxUUWAL9lRiS848gHA
m+nZNcQF78RIY2lk6qq6DnFO30Q4B2JaLG2WTkcZ2uVx7
ezqGS4vqngA30c5r3KsI8ODevsvtFf6v6vicBsMd8j+ME
+Qt/0PjAnCsT5AQes//d8z/a4OerNZze4z+iczvXqwBtv
rI+7TMhDq3WqlMK9nlKt3a0z2RHGGlCQ8jMtubakAY2zoc
FupKgghFgbyFoS8BZx7Yl3mZXDZt5ZwYcj5kezmjEwY/YC
O4rk+lFQc+26mK7GYb+rhviUDaVKy2X5DZUvOAOd8VeYQ
UtOfJ6QxVKtCW0DakDRBDOb3cIk3hF7toGs5wBFldupDk
xU1TXS7dnKN1mgFumFWGNmhb8AJH0omt08VC23Jtj1O0A
9snZMFvA6KMp8s6FYZmkbj7RdcoudzWYdsCq+3SmrVIvq
9iqJOxaIu1+6ho406UU2vFohHFJNVUDOr4sEIxeK0O6nJ
KHFZhclxeLK4DpvUqSdSqG1+eerx35ELXrPfF5gzqBWs4
joD2qSUehFTp8aXsremUp0mrLxp+tnVMFALaFWhZHg6HW
orIohz2um5KZcV4QUcNh4BdC9HZV8ikckSn5WM83neiON
KavbQlS4MlANoplaQn67JbMLQ2XSPumQa1OD9iBLYPiyD
judXR4en9xuHQdHmIDGp6VsjyyBvTE85DwIJMty65T2PD
tkJqa4GzVa/KPcjRF8i38qUytVhdmrEUb1rqHDnx7lFyG
d+2RC1FCYwFOMErfKO3oymKyceFn8Q7oyfs1eqMEFsqJw
1oOfhmaoQNCmJluerLmeSox20+g1idmdZA7zKolVXLMvK
YTpCp3KwzlSHYhjpmBCGHXZEp1CnlI0nalZdxHPxtUDLs
EFlNGfqGBRCgY9CCd97wYpuQ4HlY8Kyus6wBZ3LIb0tNX
x2XmpOdd9EwqPv1VlB8Dgvdbr2S4dNWBnZVirLpQbgsh
0MSKJ646reXI3K8nKSLaHL9nlrRQdVtsbWRviDVDwyrTz
D+n9yPGf7fhP8j5kO3+I/AN/k/gZHYPf7fMf6vLEaZs++
FfvGg0pDIGkfRmLsj2PLX6R5NY6JGcywT6x9OCcDrOOGj
UgLwOk74qJQAvJYT3o3O93f6e3b958ZZ2cdvQ/55s/6D
vEf/QbBr/YeAifv4/yToP3DCsnQyfZP+s32j/mOO6Tp3ub
75uf6TLipXpDDH5DWVbp7VCzvesGxrnfDuWEEErgvprjN2
eda4aFS9PzVXGWzLmTSsmvSgcTQyfgYtK6/LkOsy4D2Fn
X15k4AAm6p+k9Y/FxD2LOBs+nMgph+o/YgXev+u9pM/74
6BZ4EotJ7YZ0qunQHXZJni8v5B4wWaXjKJTnfhLmWvRYM
zIXYbFjI5jFzInZwlZZR0gmoAGoi39e6ENYEk HsO0UyJ
7umXRkl/i+LGOLxE6zD3bkFOqoJYZrS3Mo5bYjjSc16cL
jwvABjZ3TbgwEIHu51MYjruBLihkPUwjBwTDKJjJ0MqZL
pQpjMVG40i2HhaHDtNTcH08ZDpASGdmVh2T7DzUC/SINb
E6epSnaWfJNGP36oT2b+QcHeOFULeg/XStYOQGpFdc6+EM
cDBKfXviBR7sukN3IxIljBR2fkm/UvlF3SHaEOu9Kng98
MJNO5PObPM9s20E9IU2zrbVNVXduLbrRP35fLmVfYCXdZ
9mrHGr+yzi5y5+n7CIsCNRdBx901oTYGirG/vMgJcPmP/
XeqHOxIMszduZuT2I2qEqFtsYT9j4suzz3WwHhFkxa4eV4
ATDkcJN0Tub7Obil4xiVww3PVTrTb0F53O84Qlbcl16TB
nsXHb33UWn26oCVojgnBJk1lLYPuAkDTkfL8mhkBJ2iWCp
iC5OB8ScQXFWUTvJ47o+sYS6nRFWkbHTIfaBwTGDU7PBx
RN5hsMn97rPvb3K/29B/nmz/kOit/wPI+NaYFz/49j9/s8
nR/8Jb/UfFixdZqes1VXSpDV93CxjcUVb/RwFc6SNybjHP
OfImvRJ2OKeEoQ6QBb58aQspcM86u350UQOEGHRULYsEc
0uDzIlkqqZ2q6txQOdKTuL4xNyu1G4OXtA95ICEEINTlm
B7GqdqrH0TG7jhdyXvs2yPshFrEmJ1dTmymAmDflxuQHl
pgjqeJi/pP8syEMjzOWtnCabMJmljbhsIwM1CpjqVwY78
D7TH/gcWSUkqF0uQRaDK2/pxB6UAouR+r3iqCEHiQ/mogx
SvcX05ukQ6jt7cTwPEr9uiHq7BWMT2xU51cIUhPOxTu0r
qannADguEKwdDeu1GNJz6bxXbOVynFKywvH7qaS7J1ZZb
IUp4WYQ7+LMtf5DoESp0loF6Q4K5LsNryOnNhebXZ9ujc
PAuPDMZJcd2w5Q4TNrBLsMy4WAaO7eoGbKZSo6CB4d5mIH
LiQZKDjKXfKzmXWj/zBro/IxNzemOTZbgzDarnmDbqXj4
GtxsYVSA1xHnVSTeSqZFpqCKiD0etuj2BwV5Yuz79UCog
lCNqgzaEh+IUyD1Y2YIgak3kTDfnaKW2XV7jkvYzcRL0v
Akdal3OL3Z0tAbEmp3VOqKMtQsmpJcxDMmytnzEcHh7Wto
B1yzTsNZhfJCYJ1Ap3SS+ACJj3MV5mGRp0y1Zos25ebOT
47nU8kSB8RD/UuR8cWGddFYbKR2F0op5BLi2jaLdE8Big
UVLYbE/b8eGdXOeNJ3M1I51WYCsm035/wcEMbO/yUnKcCq
66gTedIeGQW29O0lQNgtUB9ZL7Yy71YZETcymuNFIN1RK
0MGUr3Y5osBHZ9bhaYVlYvEewnVwN6Bf8/fvnnW9N/yBv9
B8Wge/z/jtB/Xk8JwOs44aNSAvA6TviolAC8lhPu9Z9X4n
8IHntOURax52R3G//jAvD5+S8MxbGb9R8K38f/nVgTV1du
6X7+OfwHvZNXWfC4rMOn15ecLPaCz9/uDdxe9cr8qTPDX
MwjiYAgRtx+iqDwhNnxT83o9DMTBJ4IgTtBRkdPYOwKpq
5weCKq5tOn9wnXJzn+b37F7cdM/2/M/2AUe3H+E7vZ/0
eg+/2fO7ExZicvAr3yUPTxB0T7xJivQOQx9BCwY+fq9i/QV
IwJTI2/HiOPsXfc2im86MmFikTMlQunifwGHm9Rnf6RUN
adU/vN1YQcS4S6zK8mTOlOPvt6/PncO60TPnEIb4Z7h4e
AWV5N6OtGPrvntcD07LaxVTMUgkkSewhwThtcTT4UmB4Cr
JNlj+bc1eRlXBsvGMHxavIc3h4C8+chcjX5dHPGWbOEcPl
YGXkrtajv8fEShNmNaezbQkRjewoX+alWtjYo5e2gGaTS
1iHlZ326uZQPgckLCyzSJ5f2TOoC0+RK10bj1szDVccKi
cPn6sDPUZ80Bg2BB40rEX4NLs9h20HKCfeaefXSw6rVcRn
Cp23hXyRXJPM1sc4oprAi6XSw126Fw2qBdlB4sJonn37Rp
0fz4jCO8mejtq2aKxB81Sfv2SX63DtOFj6pG+dREznwOE
5l0Y6PeaQERdhGV5Nx6O7R9TsM//OgaZwwuOP9Pwh7cf57
hH7i5vw3gd/j/z3+fyz4/1Gh/XsSwV6K/2skfwvveFP8Qy
Rxm/9hY43r+Efg+/Ofd2KGRMM/9VLu/5knkwM5IyjUP6A
4jPuI5wfUGEw4jsEocX2ghnQdGMbgA3bP8N9l8R+HReDf
efwj7/7/H0ZCOPHs/A95H/93YV/6P0b7Veqnf3f9W3/5n9/
42+/75f/65g/4f3X4+p/9w0/8wt8Mv/97f/jX/zt88Stf+/
Ljv/unb379+OvZvw3aN/7jn59+6vt/Q7n6sU3/RS36oT/
5cS+a/8pXGLL7gy+ReY1dET/8qa/+8Q9Wf/HlP6r/9DNf+
J9f+8Wf/c3f/vs/z4p/Eb8Q/PePfu2Xfu53rB/59381fv
IfH05+Xr6PwE9c/D8OCu9u4/+F/nt9BOBG/yXuz//djf77
bYoYwLcrXADfilhxv+B4a/EfF+e4fTtbQG+Kfxy6Pv+D4S
iMosTN+V9yzAnu4/9O4v9DN3k+ZHfoffs/6JgQ4NRkrtl
z84N2gdArCLmC0JtdoDfrDU/PT8bsu3xiNUFN/3875/PaN
BiH8Yt6CBS0Q2SDYcYEkSl9k75Nmkmn7ebWde2WLm3646
Jp2q7FtU2btq496EGcKMgu4sH5a4dN8NccCMLYP6AMwcv
+Bg/e1NMuZimTdlvXyWxx4/s5pQ0N5SXPk/d9nrclaSuHr
BhbaKb6cHiUHOYxWe8SBkK1CTFVTWbSpDDAGwjZ1vATeR
vaWPWnbFIhmsyQmKNYmhz38Sa7yG+ckGy5vJKSlF5E8v0
ev8mq3bwHPCTYqv9mVEAN9//p+Z+mf9qCMMvqv/+k4fnf
EiqCJbcJfVPnuyR/9XS0YxBorSR4jTK/zWywKUlfjUftl
EvWa4qqzKsSE0pyvrf629Ubir6awigcGnVEnP0IiZ5wjr4e
zjNiqr/IZ9IBl2eo6PU5BrITiUwg5p5yxcsOWqKUKXvOL
E7kHEhQBbtU0/Ek4+p4NDnGZ7zh0FiJvpETJxKFhKx6Is6
AXxicGmYUJmvxjXmDTk+qzBSuZMxq0aUKTszlE6WhdM3F
BkU5XZLCPT2l8UlHKOT1ubOBsqtnREzwI5G436TkSgkxz
YVkxr9bYbTDCFT/r0y9yshXUrRhlxRFG0sprxm2SY0q2/
NYCrMGwkDAo6GZ/t+MCqhh/4/MVf2Pvv7DDMz/wP8Pg/+
DyQEHyP+bUQE23P+JqD/zfxpZ9P5fewv8vwXo/d/W7Oec
jaRZhGWaZq04LtGUjCPIwkUQkrUXmI1xEstIUQmbOVD/
IdN/EyrAPfZ/Ff2z+v5P7RD03wpit+2TyoevQvtisv3jf
Jz48e1pxN3xs+1I74vpO89MxqurnY/XnlxeLFx702lcIj
vurZ8ods/MHQtevPD+bbBr+dR5amnN25XtflV+/fCLPbs
62/fO+OD7yqzx9EzqbtfLk4GznxZurp+JHZ0+7l5+tPr8
vtj2OfXr0sLKnHgrqM6DAv9H/f/bCnCP/Z+ufzOm9Pyfh
fVfS9hvJkXsN4ci/iZ7gtkGAAAAAAAAAAAAAAAAAAAAAA
AAAABAPX4DY+BfEQB4AAA=
-- END MESSAGE ARCHIVE --

19.4.8.8.1 Identity-Info with a Singlepart MIME body

Consider the following private key and certificate pair assigned to atlanta.example.com (rendered in OpenSSL format).

-----BEGIN RSA PRIVATE KEY-----
MIICXQIBAAKBgQDPPMBtHVoPkXV+Z6jq1LsgfTELVWpy2
BVUffJMPH06LL0cJSQOaIeVzIojzWtpauB7IylZKlAjB5
f429tRuoUiedCwMLKblWAqZt6eHWpCNZJ7lONcIEwnmh2n
AccKk83Lp/VH3tgAS/43DQoX2sndnYh+g8522Pzwg7EGWs
pzzwIDAQABAoGBAK0W3tnEFD7AjVQAnJNXDtx59Aa1Vu2
JEXe6oi+OrkFysJjbZJwsLmKtrgttPXOU8t2mZpi0wK4h
X4tZhntiwGKkUPC3h9Bjp+GerifP341RMyMO+6fPgjqOz
UDw+rPjjMpwD7AkcEcqDgbTrZnWv/QnCSaaF3xkUGfFkLx
5OKcRAkEA7UxnsE8XaT30tP/UUc51gNk2KGKgxQQTHopBc
ew9yfeCRFhvdL7jpaGatEi5iZwGGQQDVOVHUN1H0YLpHQ
jRowJBAN+R2bvA/Nimq464ZgnelEDPqaEAZWaD3kOfhS9+
vL7oqES+u5E0J7kXb7ZkiSVUg9XU/8PxMKx/DAz0dUm
OL+UCQH8C9ETUMI2uEbqHbBdVUGNk364CDFcndSxVh+34
KqJdjiYSx6VPPv26X9m7S0OydTkSgs3/4ooPxo8HaMqXm
80CQB+rxbB3UlpOohcBwFK9mTrlMB6Cs9ql66KgwnlL9u
kEhHHYozGatdXeoBCyhUsogdSU6/aSAFcvWEGtj7/vyJE
CQQCCS1lKgEXoNQPqONalvYhyyMZRXFLdD4gbwRPK1uXKY
pk3CkfFzOyfjeLcGPxXzq2qzuHzGTDxZ9PAepwX4RSk
-----END RSA PRIVATE KEY-----

-----BEGIN CERTIFICATE-----
MIIC3TCCAkagAwIBAgIBADANBgkqhkiG9w0BAQUFADBZ
MQswCQYDVQQGEwJVUzELMAkGA1UECAwCR0ExEDAOBgNVB
AcMB0F0bGFudGExDTALBgNVBAoMBElFVEYxHDAaBgNVBA
MME2F0bGFudGEuZXhhbXBsZS5jb20wHhcNMDUxMDI0MDY
zNjA2WhcNMDYxMDI0MDYzNjA2WjBZMQswCQYDVQQGEwJV
UzELMAkGA1UECAwCR0ExEDAOBgNVBAcMB0F0bGFudGExD
TALBgNVBAoMBElFVEYxHDAaBgNVBAMME2F0bGFudGEuZX
hhbXBsZS5jb20wgZ8wDQYJKoZIhvcNAQEBBQADgY0AMIGJ
AoGBAM88wG0dWg+RdX5nqOrUuyB9MQtVanLYFVR98kw8fT
osvRwlJA5oh5XMiiPNa2lq4HsjKVkqUCMHl/jb21G6hSJ
50LAwspuVYCpm3p4dakI1knuU41wgTCeaHacBxwqTzcu
n9Ufe2ABL/jcNChfayd2diH6DznbY/PCDsQZaynPPAgMB
AAGjgbQwgbEwHQYDVR0OBBYEFNmU/MrbVYcEKDr/20WIS
from the same headers of the SIP request. Using this canonical string, the signed digest in the Identity header, and the certificate discovered by dereferencing the Identity-Info header, Bob can verify that the given set of headers and the message body have not been modified.

19.4.8.8.2 Identity for a Request with No MIME Body or Contact

Consider the following private key and certificate pair assigned to biloxi.example.org.

-----BEGIN RSA PRIVATE KEY-----
MIICXgIBAAKBgQC/obBYLRMPjskrAqWOiGPAUxI3/m2t
i7ix4caqCTAuFX5cLegQ7nmquLOHfIhxVIqT2f06UA0lO
o2NVofK9G7MTkVbVNiyAlLYUDEj7XWLDICf3ZHL6Fr/+C
F7wrQ9r4kv7XiJKxodVCCd/DhCT9Gp+VDoe8HymqOW/Ks
neriyIwIDAQABAoGBAJ7fsFIKXKkjWgj8ksGOthS3Sn19
xPSCyEdBxfEm2Pj7/Nzzeli/PcOaic0kJALBcnqN2fHEe
IGK/9xUBxTufgQYVJqvyHERs6rXX/iT4Ynm9t1905EiQ9
ZpHsrI/AMMUYA1QrGgAIHvZLVLzq+9KLDEZ+HQbuCLJXF
+6bl0Eb5BAkEA636oMANp0Qa3mYWEQ2utmGsYxkXSfyBb
18TCOwCty0ndBR24zyOJF2NbZS98Lz+Ga25hfIGw/JHKn
D9bOE88UwJBANBRSpd4bmS+m48R/13tRESAtHqydNinX0
kS/RhwHr7mkHTU3k/MFxQtx34I3GKzaZxMn0A66KS9v/S
HdnF+ePECQQCGe7QshyZ8uitLPtZDclCWhEKHqAQHmUEZ
vUF2VHLrbukLLOgHUrHNa24cILv4d3yaCVUetymNcuyTw
hKj24wFAkAOz/jx1EplN3hwL+NsllZoWI58uvu7/Aq2c3
czqaVGBbb317sHCYgKk0bAG3kwO3mi93/LXWT1cdiYVpm
BcHDBAkEAmpgkFj+xZu5gWASY5ujv+FCMP0WwaH5hTnXu
+tKePJ3d2IJZKxGnl6itKRN7GeRh9PSK0kZSqGFeVrvsJ
4Nopg==
-----END RSA PRIVATE KEY-----

-----BEGIN CERTIFICATE-----
MIIC1jCCAj+gAwIBAgIBADANBgkqhkiG9w0BAQUFADBXM
QswCQYDVQQGEwJVUzELMAkGA1UECAwCTVMxDzANBgNVBA
cMBkJpbG94aTENMAsGA1UECgwESUVURjEbMBkGA1UEAww
SYmlsb3hpLmV4YW1wbGUuY29tMB4XDTA1MTAyNDA2NDAy
NloXDTA2MTAyNDA2NDAyNlowVzELMAkGA1UEBhMCVVMxC
zAJBgNVBAgMAk1TMQ8wDQYDVQQHDAZCaWxveGkxDTALBg
NVBAoMBElFVEYxGzAZBgNVBAMMEmJpbG94aS5leGFtcGx
lLmNvbTCBnzANBgkqhkiG9w0BAQEFAAOBjQAwgYkCgYEA
v6GwWC0TD47JKwKljohjwFMSN/5trYu4seHGqgkwLhV+X
C3oEO55qrizh3yIcVSKk9n9OlANJTqNjVaHyvRuzE5FW1
TYsgJS2FAxI+11iwyAn92Ry+ha//ghe8K0Pa+JL+14iSs
aHVQgnfw4Qk/RqflQ6HvB8pqjlvyrJ3q4siMCAwEAAaOBs
TCBrjAdBgNVHQ4EFgQU0Z+RL47W/APDtc5BfSoQXuEFE/
wwfwYDVR0jBHgwdoAU0Z+RL47W/APDtc5BfSoQXuEFE/y
hW6RZMFcxCzAJBgNVBAYTAlVTMQswCQYDVQQIDAJNUzEPM
A0GA1UEBwwGQmlsb3hpMQ0wCwYDVQQKDARJRVRGMRswGQ
YDVQQDDBJiaWxveGkuZXhhbXBsZS5jb22CAQAwDAYDVR0
TBAUwAwEB/zANBgkqhkiG9w0BAQUFAAOBgQBiyKHIt8TXf
GNfpnJXi5jCizOxmY8Ygln8tyPFaeyq95TGcvTCWzdoBL
VpBD+fpRWrX/II5sE6VHbbAPjjVmKbZwzQAtppP2Fauj28
t94ZeDHN2vqzjfnHjCO24kG3Juf2T80ilp9YHcDwxjUFr
t86UnlC+yidyaTeusW5Gu7v1g==
-----END CERTIFICATE-----

Bob ([email protected]) now wants to send a BYE request to Alice at the end of the dialog initiated in the previous example. He therefore creates the following BYE request, which he forwards to the biloxi.example.org proxy server that instantiates the authentication service role:

BYE sip:[email protected] SIP/2.0
Via: SIP/2.0/TLS 192.0.2.4;branch=z9hG4bKnashds10
Max-Forwards: 70
From: Bob <sip:[email protected]>;tag=a6c85cf
To: Alice <sip:[email protected]>;tag=1928301774
Call-ID: a84b4c76e66710
CSeq: 231 BYE
Content-Length: 0

When the authentication service receives the BYE, it authenticates Bob by sending a 407 response. As a result, Bob adds an Authorization header to his request, and resends to the biloxi.example.org authentication service. Now that the service is sure of Bob’s identity, it prepares to calculate an Identity header for the request. Note that this request does not have a Date header field. Accordingly, biloxi.example.org will add a Date header to the request before calculating the identity signature. If the Content-Length header were not present, the authentication service would add it as well. The baseline message is thus

BYE sip:[email protected] SIP/2.0
Via: SIP/2.0/TLS 192.0.2.4;branch=z9hG4bKnashds10
Max-Forwards: 70
From: Bob <sip:[email protected]>;tag=a6c85cf
To: Alice <sip:[email protected]>;tag=1928301774
Date: Thu, 21 Feb 2002 14:19:51 GMT
Call-ID: a84b4c76e66710
CSeq: 231 BYE
Content-Length: 0

Also note that this request contains no Contact header field. Accordingly, biloxi.example.org will place no value in the canonical string for the addr-spec of the Contact address. Also note that there is no message body, and accordingly, the signature string will terminate, in this case, with two vertical bars. The canonical string over which the identity signature will be generated is the following (note that the first line wraps because of RFC editorial conventions):

sip:[email protected]|sip:alice@atlanta.example.com|
a84b4c76e66710|231 BYE|Thu, 21 Feb 2002 14:19:51 GMT||
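As a rough illustration of how a canonical string of this shape can be assembled, the following Python sketch joins the From and To addr-specs, the Call-ID, CSeq, and Date values, the optional Contact addr-spec, and the body with vertical bars, and then computes the SHA-1 digest that would subsequently be RSA-signed. The helper names (addr_spec, digest_string) and the header dictionary are hypothetical, Bob's address is an assumed value because it is obscured in the text, and the RSA signing step itself is left out.

```python
import hashlib

def addr_spec(header_value):
    """Extract the URI from a From/To/Contact value (hypothetical helper)."""
    value = header_value.strip()
    if "<" in value and ">" in value:
        # name-addr form: take what sits between the angle brackets
        return value[value.index("<") + 1:value.index(">")]
    # bare addr-spec form: drop header parameters such as ;tag=
    return value.split(";", 1)[0].strip()

def digest_string(headers, body=""):
    """Join the signed fields with '|' in the order used by the example."""
    contact = headers.get("Contact")
    fields = [
        addr_spec(headers["From"]),
        addr_spec(headers["To"]),
        headers["Call-ID"].strip(),
        " ".join(headers["CSeq"].split()),      # "<number> <METHOD>"
        headers["Date"].strip(),
        addr_spec(contact) if contact else "",  # empty when there is no Contact
        body,                                   # empty when there is no body
    ]
    return "|".join(fields)

# Assumed example values; Bob's address is a placeholder for illustration.
bye_headers = {
    "From": "Bob <sip:bob@biloxi.example.org>;tag=a6c85cf",
    "To": "Alice <sip:alice@atlanta.example.com>;tag=1928301774",
    "Call-ID": "a84b4c76e66710",
    "CSeq": "231 BYE",
    "Date": "Thu, 21 Feb 2002 14:19:51 GMT",
}

canonical = digest_string(bye_headers)
sha1_hex = hashlib.sha1(canonical.encode("utf-8")).hexdigest()
print(canonical)
print(sha1_hex)
```

With no Contact header and no body, the last two fields are empty, which is why the string ends in two vertical bars.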
The resulting signature (sha1WithRsaEncryption) using the private RSA key given above for biloxi.example.org, with base64 encoding, is the following:

sv5CTo05KqpSmtHt3dcEiO/1CWTSZtnG3iV+1nmurLXV/
HmtyNS7Ltrg9dlxkWzoeU7d7OV8HweTTDobV3itTmgPwC
FjaEmMyEI3d7SyN21yNDo2ER/Ovgtw0Lu5csIppPqOg1u
XndzHbG7mR6Rl9BnUhHufVRbp51Mn3w0gfUs=

Accordingly, the biloxi.example.org authentication service will create an Identity header containing that base64 signature string. It will also add an HTTPS URL where its certificate is made available. With those two headers added, the message looks like the following:

BYE sip:[email protected] SIP/2.0
Via: SIP/2.0/TLS 192.0.2.4;branch=z9hG4bKnashds10
Max-Forwards: 70
From: Bob <sip:[email protected]>;tag=a6c85cf
To: Alice <sip:[email protected]>;tag=1928301774
Date: Thu, 21 Feb 2002 14:19:51 GMT
Call-ID: a84b4c76e66710
CSeq: 231 BYE

this form of the TEL URI is more common for the To header field and Request-URI in SIP than in the From header field, since the UAC has no option but to provide a TEL URI alone when the remote domain to which a request is sent is unknown.

The local domain, however, is usually known by the UAC, and accordingly it can form a proper From header field containing a SIP URI with a user name in TEL URI form. Implementations that intend to send their requests through an authentication service SHOULD put telephone numbers in the From header field into SIP or SIPS URIs whenever possible. If the local domain is unknown to a UAC formulating a request, it most likely will not be able to locate an authentication service for its request, and therefore the question of providing identity in these cases is somewhat moot. However, an authentication service MAY sign a request containing a TEL URI in the From header field. This is permitted in this specification strictly for forward compatibility purposes. In the longer-term, it is possible that ENUM (see Section 8.3) may provide a way to determine which administrative domain is responsible for a telephone number, and this may aid in the signing and verification of SIP identities that contain telephone numbers. This is a subject for future work.
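To make the recommendation about telephone numbers in the From header field concrete, here is a minimal Python sketch of how a UAC that knows its local domain might wrap a TEL URI in a SIP or SIPS URI. The function name tel_to_sip_uri and the example number and domain are hypothetical; the user=phone parameter follows the general TEL-to-SIP conversion convention of RFC 3261 and is an assumption about what a given deployment would emit.

```python
def tel_to_sip_uri(tel_uri, local_domain, secure=False):
    """Place the telephone-subscriber part of a TEL URI into the user part
    of a SIP (or SIPS) URI under the caller's own domain (illustrative only)."""
    if not tel_uri.lower().startswith("tel:"):
        raise ValueError("expected a TEL URI such as tel:+14155551212")
    subscriber = tel_uri[4:]            # everything after "tel:"
    if not subscriber:
        raise ValueError("TEL URI has no telephone-subscriber part")
    scheme = "sips" if secure else "sip"
    # user=phone marks the user part as a telephone number
    return f"{scheme}:{subscriber}@{local_domain};user=phone"

# Hypothetical number and domain, used only to show the shape of the result.
from_uri = tel_to_sip_uri("tel:+14155551212", "biloxi.example.org")
print(from_uri)
```

A URI of this form gives the authentication service a domain to vouch for, which a bare TEL URI does not.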
without a resulting integrity violation. RFC 3325 (see Sections 2.8, 10.4, and 20.3) defines the id priv-value token, which is specific to the P-Asserted-Identity header. The sort of assertion provided by the P-Asserted-Identity header is very different from the Identity header presented in this section. It contains additional information about the sender of a message that may go beyond what appears in the From header field; P-Asserted-Identity holds a definitive identity for the sender that is somehow known to a closed network of intermediaries, which will presumably use this identity for billing or security purposes. The danger of this network-specific information leaking outside of the closed network motivated the id priv-value token. The id priv-value token has no implications for the Identity header, and privacy services must not remove the Identity header when a priv-value of id appears in a Privacy header. Finally, note that unlike RFC 3325 (see Sections 2.8, 10.4, and 20.3), the mechanism described in this specification adds no information to SIP requests that have privacy implications.

19.4.8.11 Security Considerations

19.4.8.11.1 Handling of Digest-String Elements

RFC 4474 describes a mechanism that provides a signature over the Contact, Date, Call-ID, CSeq, To, and From header fields of SIP requests. While a signature over the From header field would be sufficient to secure a URI alone, the additional headers provide replay protection and reference integrity necessary to make sure that the Identity header will not be used in cut-and-paste attacks. In general, the considerations related to the security of these headers are the same as those given in RFC 3261 for including headers in tunneled message/sip MIME bodies (see Section 19.6.3 in particular). The following section details the individual security properties obtained by including each of these header fields within the signature; collectively, this set of header fields provides the necessary properties to prevent impersonation. The From header field indicates the identity of the sender of the message, and the SIP AOR URI in the From header field is the identity of a SIP user, for the purposes of this document.

The To header field provides the identity of the SIP user that this request targets. Providing the To header field in the Identity signature serves two purposes: first, it prevents cut-and-paste attacks in which an Identity header from a legitimate request for one user is cut-and-pasted into a request for a different user; second, it preserves the starting URI scheme of the request, which helps prevent downgrade attacks against the use of SIPS. The Date and Contact headers provide reference integrity and replay protection, as described in RFC 3261 (see Section 19.6.4.2). Implementations of this specification must not deem valid a request with an outdated Date header field (the recommended interval is that the Date header must indicate a time within 3600 seconds of the receipt of a message). Implementations must also record Call-IDs received in valid requests containing an Identity header, and must remember those Call-IDs for at least the duration of a single Date interval (i.e., commonly 3600 seconds). Because a SIP-compliant UA never generates the same Call-ID twice, verifiers can use the Call-ID to recognize cut-and-paste attacks; the Call-ID serves as a nonce.

The result of this is that if an Identity header is replayed within the Date interval, verifiers will recognize that it is invalid because of a Call-ID duplication; if an Identity header is replayed after the Date interval, verifiers will recognize that it is invalid because the Date is stale. The CSeq header field contains a numbered identifier for the transaction, and the name of the method of the request; without this information, an INVITE request could be cut-and-pasted by an attacker and transformed into a BYE request without changing any fields covered by the Identity header, and, moreover, requests within a certain transaction could be replayed in potentially confusing or malicious ways. The Contact header field is included to tie the Identity header to a particular UA instance that generated the request. Were an active attacker to intercept a request containing an Identity header, and cut-and-paste the Identity header field into its own request (reusing the From, To, Contact, Date, and Call-ID fields that appear in the original message), the attacker would not be eligible to receive SIP requests from the called UA, since those requests are routed to the URI identified in the Contact header field.

However, the Contact header is only included in dialog-forming requests, so it does not provide this protection in all cases. It might seem attractive to provide a signature over some of the information present in the Via header field value(s). For example, without a signature over the sent-by field of the topmost Via header, an attacker could remove that Via header and insert its own in a cut-and-paste attack, which would cause all responses to the request to be routed to a host of the attacker’s choosing. However, a signature over the topmost Via header does not prevent attacks of this nature, since the attacker could leave the topmost Via intact and merely insert a new Via header field directly after it, which would cause responses to be routed to the attacker’s host on their way to the valid host, which has exactly the same end result.

Although it is possible that an intermediary-based authentication service could guarantee that no Via hops are inserted between the sending UA and the authentication service, it could not prevent an attacker from adding a Via hop after the authentication service, and thereby preempting responses. It is necessary for the proper operation of SIP for subsequent intermediaries to be capable of inserting such Via header fields, and thus it cannot be prevented.
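The Date-interval and Call-ID checks described in this section can be sketched roughly as follows. This is a minimal Python illustration, not a complete verifier: the class name ReplayGuard, its return strings, and the in-memory dictionary are assumptions made for the example, and signature verification is assumed to have already succeeded before check is called.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

DATE_INTERVAL = timedelta(seconds=3600)  # recommended interval from the text

class ReplayGuard:
    """Tracks Call-IDs seen in valid Identity-bearing requests (hypothetical)."""

    def __init__(self):
        self._seen = {}  # Call-ID -> time the request was received

    def check(self, date_header, call_id, now=None):
        now = now or datetime.now(timezone.utc)
        sent = parsedate_to_datetime(date_header)
        if sent.tzinfo is None:
            # treat a zone-less date as GMT, which SIP dates always use
            sent = sent.replace(tzinfo=timezone.utc)
        # forget Call-IDs older than one Date interval; older replays are
        # caught by the stale-Date test instead
        self._seen = {cid: t for cid, t in self._seen.items()
                      if now - t <= DATE_INTERVAL}
        if abs(now - sent) > DATE_INTERVAL:
            return "stale-date"
        if call_id in self._seen:
            return "duplicate-call-id"
        self._seen[call_id] = now
        return "ok"

guard = ReplayGuard()
received = datetime(2002, 2, 21, 14, 30, 0, tzinfo=timezone.utc)
first = guard.check("Thu, 21 Feb 2002 14:19:51 GMT", "a84b4c76e66710", received)
replay = guard.check("Thu, 21 Feb 2002 14:19:51 GMT", "a84b4c76e66710",
                     received + timedelta(minutes=5))
late = guard.check("Thu, 21 Feb 2002 14:19:51 GMT", "a84b4c76e66710",
                   received + timedelta(hours=2))
print(first, replay, late)
```

A replay inside the interval is rejected on the Call-ID, and a replay after the interval is rejected on the Date, so the cache never needs to hold entries for longer than one interval.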
As such, though it is desirable, securing Via is not possible through the sort of identity mechanism described in this document; the best known practice for securing Via is the use of SIPS.

This mechanism also provides a signature over the bodies of SIP requests. The most important reason for doing so is to protect SDP bodies carried in SIP requests. There is little purpose in establishing the identity of the user that originated a SIP request if this assurance is not coupled with a comparable assurance over the media descriptors. Note, however, that this is not perfect end-to-end security. The authentication service itself, when instantiated at an intermediary, could conceivably change the SDP (and SIP headers, for that matter) before providing a signature. Thus, while this mechanism reduces the chance that a replayer or MITM will modify SDP, it does not eliminate it entirely. Since it is a foundational assumption of this mechanism that the users trust their local domain to vouch for their security, they must also trust the service not to violate the integrity of their message without good reason. Note that RFC 3261 (see Section 3.11.6) states that SIP proxy servers must not add to, modify, or remove the message body.

In the final analysis, the Identity and Identity-Info headers cannot protect themselves. Any attacker could remove these headers from a SIP request, and modify the request arbitrarily afterwards. However, this mechanism is not intended to protect requests from MITMs who interfere with SIP messages; it is intended only to provide a way that SIP users can prove definitively that they are who they claim to be. At best, by stripping identity information from a request, an MITM could make it impossible to distinguish any illegitimate messages he would like to send from those messages sent by an authorized user. However, it requires a considerably greater amount of energy to mount such an attack than it does to mount trivial impersonations by just copying someone else’s From header field. This mechanism provides a way that an authorized user can provide a definitive assurance of his identity that an unauthorized user, an impersonator, cannot.

One additional respect in which the Identity-Info header cannot protect itself is the alg parameter. The alg parameter is not included in the digest-string, and accordingly, an MITM might attempt to modify the alg parameter. However, it is important to note that preventing MITMs is not the primary impetus for this mechanism. Moreover, changing the alg would at worst result in some sort of bid-down attack, and at best cause a failure in the verifier. Note that only one valid alg parameter is defined in this document and that thus there is currently no weaker algorithm to which the mechanism can be bid down. alg has been incorporated into this mechanism for forward compatibility reasons in case the current algorithm exhibits weaknesses, and requires swift replacement, in the future.

19.4.8.11.2 Display Names and Identity

As a matter of interface design, SIP UAs might render the display-name portion of the From header field of a caller as the identity of the caller; there is a significant precedent in e-mail user interfaces for this practice. As such, it might seem that the lack of a signature over the display-name is a significant omission. However, there are several important senses in which a signature over the display-name does not prevent impersonation. In the first place, a particular display-name, like Jon Peterson, is not unique in the world; many users in different administrative domains might legitimately claim that name. Furthermore, enrollment practices for SIP-based services might have a difficult time discerning the legitimate display-name for a user; it is safe to assume that impersonators will be capable of creating SIP accounts with arbitrary display-names. The same situation prevails in e-mail today. Note that an impersonator who attempted to replay a message with an Identity header, changing only the display-name in the From header field, would be detected by the other replay protection mechanisms described in Section 19.4.8.11.1.

Of course, an authentication service can enforce policies about the display-name even if the display-name is not signed. The exact mechanics for creating and operationalizing such policies are outside the scope of this document. The effect of this policy would not be to prevent impersonation of a particular unique identifier like a SIP URI (since display-names are not unique identifiers), but to allow a domain to manage the claims made by its users. If such policies are enforced, users would not be free to claim any display-name of their choosing. In the absence of a signature, MITM attackers could conceivably alter the display-names in a request with impunity. Note that the scope of this specification is impersonation attacks, however, and that an MITM might also strip the Identity and Identity-Info headers from a message.

There are many environments in which policies regarding the display-name are not feasible. Distributing bit-exact and internationalizable display-names to end-users as part of the enrollment or registration process would require mechanisms that are not explored in this document. In the absence of policy enforcement regarding domain names, there are conceivably attacks that an adversary could mount against SIP systems that rely too heavily on the display-name in their user interface; however, this argues for intelligent interface design, not changes to the mechanisms. Relying on a nonunique identifier for identity would ultimately result in a weak mechanism.

19.4.8.11.3 Securing the Connection to the Authentication Service

The assurance provided by this mechanism is strongest when a UA forms a direct connection, preferably one secured by
TLS, to an intermediary-based authentication service. The reasons for this are twofold:

◾◾ If a user does not receive a certificate from the authentication service over this TLS connection that corresponds to the expected domain (especially when the user receives a challenge via a mechanism such as Digest), then it is possible that a rogue server is attempting to pose as an authentication service for a domain that it does not control, possibly in an attempt to collect shared secrets for that domain.
◾◾ Without TLS, the various header field values and the body of the request will not have integrity protection when the request arrives at an authentication service. Accordingly, a prior legitimate or illegitimate intermediary could modify the message arbitrarily.

Of these two concerns, the first is most material to the intended scope of this mechanism. This mechanism is intended to prevent impersonation attacks, not MITM attacks; integrity over the header and bodies is provided by this mechanism only to prevent replay attacks. However, it is possible that applications relying on the presence of the Identity header could leverage this integrity protection, especially body integrity, for services other than replay protection.

Accordingly, direct TLS connections should be used between the UAC and the authentication service whenever possible. The opportunistic nature of this mechanism, however, makes it very difficult to constrain UAC behavior, and, moreover, there will be some deployment architectures where a direct connection is simply infeasible and the UAC cannot act as an authentication service itself. Accordingly, when a direct connection and TLS are not possible, a UAC should use the SIPS mechanism, Digest auth-int for body integrity, or both when it can. The ultimate decision to add an Identity header to a request lies with the authentication service, of course; domain policy must identify those cases where the UAC’s security association with the authentication service is too weak.

19.4.8.11.4 Domain Names and Subordination

When a verifier processes a request containing an Identity-Info header, it must compare the domain portion of the URI in the From header field of the request with the domain name that is the subject of the certificate acquired from the Identity-Info header. While it might seem that this should be a straightforward process, it is complicated by two deployment realities. In the first place, certificates have varying ways of describing their subjects, and may indeed have multiple subjects, especially in virtual hosting cases where multiple domains are managed by a single application. Secondly, some SIP services may delegate SIP functions to a subordinate domain and utilize the procedures in RFC 3263 (see Section 8.2.4) that allow requests for, say, example.com to be routed to sip.example.com. As a result, a user with the AOR sip:[email protected] may process its requests through a host like sip.example.com, and it may be the latter host that acts as an authentication service. To meet the second of these problems, a domain that deploys an authentication service on a subordinate host MUST be willing to supply that host with the private keying material associated with a certificate whose subject is a domain name that corresponds to the domain portion of the AORs that the domain distributes to users.

Note that this corresponds to the comparable case of routing inbound SIP requests to a domain. When the Naming Authority Pointer (NAPTR) and SRV procedures of RFC 3263 (see Section 8.2.4) are used to direct requests to a domain name other than the domain in the original Request-URI (e.g., for sip:[email protected], the corresponding SRV records point to the service sip1.example.org), the client expects that the certificate passed back in any TLS exchange with that host will correspond exactly with the domain of the original Request-URI, not the domain name of the host. Consequently, to make inbound routing to such SIP services work, a domain administrator must similarly be willing to share the domain’s private key with the service. This design decision was made to compensate for the insecurity of the DNS, and it makes certain potential approaches to DNS-based virtual hosting unsecurable for SIP in environments where domain administrators are unwilling to share keys with hosting services. A verifier MUST evaluate the correspondence between the user’s identity and the signing certificate by following the procedures defined in RFC 2818 (Section 3.1). While RFC 2818 deals with the use of HTTP in TLS, the procedures described are applicable to verifying identity if one substitutes the host name of the server in HTTP for the domain portion of the user’s identity in the From header field of a SIP request with an Identity header.

Because the domain certificates that can be used by authentication services need to assert only the host name of the authentication service, existing certificate authorities can provide adequate certificates for this mechanism. However, not all proxy servers and UAs will be able to support the root certificates of all certificate authorities, and, moreover, there are some significant differences in the policies by which certificate authorities issue their certificates. This document makes no recommendations for the usage of particular certificate authorities, nor does it describe any particular policies that certificate authorities should follow; however, it is anticipated that operational experience will create de facto standards for authentication services. Some federations of service providers, for example, might only trust certificates that have been provided by a certificate authority operated by the federation. It is strongly recommended that self-signed
domain certificates should not be trusted by verifiers, unless some previous key exchange has justified such trust. For further information on certificate security and practices, see RFC 5280. The Security Considerations of RFC 5280 are applicable to this specification (RFC 4474).
19.4.8.11.5 Authorization and Transitional Strategies
Ultimately, the worth of an assurance provided by an Identity header is limited by the security practices of the domain that issues the assurance. Relying on an Identity header generated by a remote administrative domain assumes that the issuing domain used its administrative practices to authenticate its users. However, it is possible that some domains will implement policies that effectively make users unaccountable (e.g., ones that accept unauthenticated registrations from arbitrary users). The value of an Identity header from such domains is questionable. While there is no magic way for a verifier to distinguish good from bad domains by inspecting a SIP request, it is expected that further work in authorization practices could be built on top of this identity solution; without such an identity solution, many promising approaches to authorization policy are impossible. That much said, it is recommended that authentication services based on proxy servers employ strong authentication practices such as token-based identifiers.
One cannot expect the Identity and Identity-Info headers to be supported by every SIP entity overnight. This leaves the verifier in a compromising position; when it receives a request from a given SIP user, how can it know whether or not the sender’s domain supports Identity? In the absence of ubiquitous support for identity, some transitional strategies are necessary. A verifier could remember when it receives a request from a domain that uses Identity, and in the future, view messages received from that domain without Identity headers with skepticism. A verifier could query the domain through some sort of callback system to determine whether or not it is running an authentication service. There are a number of potential ways in which this could be implemented; use of the SIP OPTIONS method is one possibility. This is left as a subject for future work. In the long term, some sort of identity mechanism, either the one documented in this specification or a successor, must become mandatory-to-use for the SIP protocol; that is the only way to guarantee that this protection can always be expected by verifiers. Finally, it is worth noting that the presence or absence of the Identity headers cannot be the sole factor in making an authorization decision. Permissions might be granted to a message on the basis of the specific verified Identity or really on any other aspect of a SIP request. Authorization policies are outside the scope of this specification; however, this specification advises any future authorization work not to assume that messages with valid Identity headers are always good.
19.4.9 HTTP Digest Authentication Using AKA in SIP
19.4.9.1 Background
The HTTP Authentication Framework, described in RFC 2617 (see Sections 19.4.5 and 19.12.2.3), includes two authentication schemes: Basic and Digest. Both schemes employ a shared-secret-based mechanism for access authentication. The Basic scheme is inherently insecure in that it transmits user credentials in plain text. The Digest scheme improves security by hiding user credentials with cryptographic hashes, and additionally by providing limited message integrity. The Authentication and Key Agreement (AKA) [2] mechanism performs authentication and session key distribution in Universal Mobile Telecommunications System (UMTS) networks. AKA is a challenge–response-based mechanism that uses symmetric cryptography. AKA is typically run in a UMTS IM Services Identity Module (ISIM), which resides on a smart card-like device that also provides tamper-resistant storage of shared secrets.
RFC 3310, described here, specifies a mapping of AKA parameters onto HTTP Digest authentication. In essence, this mapping enables the usage of AKA as a one-time password generation mechanism for Digest authentication. As the SIP Authentication Framework closely follows the HTTP Authentication Framework (see Sections 19.4.5 and 19.12.2.3), Digest AKA is directly applicable to SIP as well as any other embodiment of HTTP Digest. The following terms are defined in RFC 3310:
◾◾ AKA: authentication and key agreement.
◾◾ AuC: authentication center. The network element in mobile networks that can authorize users either in GSM or in UMTS networks.
◾◾ AUTN: authentication token. A 128-bit value generated by the AuC, which together with the RAND parameter authenticates the server to the client.
◾◾ AUTS: authentication token. A 112-bit value generated by the client upon experiencing an SQN synchronization failure.
◾◾ CK: cipher key. An AKA session key for encryption.
◾◾ IK: integrity key. An AKA session key for integrity check.
◾◾ ISIM: IP Multimedia Services Identity Module.
◾◾ PIN: personal identification number. Commonly assigned passcodes for use with automatic cash machines, smart cards, etc.
◾◾ RAND: random challenge. Generated by the AuC using the SQN.
◾◾ RES: authentication response. Generated by the ISIM.
◾◾ SIM: subscriber identity module. GSM counterpart for ISIM.
◾◾ SQN: sequence number. Both AuC and ISIM maintain the value of the SQN.
◾◾ UMTS: Universal Mobile Telecommunications System.
◾◾ XRES: expected authentication response. In a successful authentication, this is equal to RES.
19.4.9.3 Specification of Digest AKA
In general, the Digest AKA operation is identical to the Digest operation in RFC 2617 (see Sections 19.4.5 and 19.12.2.3). This section specifies the parts in which Digest AKA extends the Digest operation.
[Figure layout: a 32-bit-wide field diagram showing, in order, RAND, AUTN, and Server data …]
Figure 19.9 Generating the nonce value. (Copyright IETF. Reproduced with permission.)
nonce:
A parameter that is populated with the base64 (RFC 2045) encoding of the concatenation of the AKA authentication challenge RAND, the AKA AUTN token, and optionally some server specific data, as in Figure 19.9.
Example:
nonce = "MzQ0a2xrbGtmbGtsZm9wb2tsc2tqaHJzZXNy9uQyMzMzMzQK="
If the server receives a client authentication containing the auts parameter defined in the next section that includes a valid AKA AUTS parameter, the server MUST use it to generate a new challenge to the client. Note that when the AUTS is present, the included response parameter is calculated using an empty password (password of “”), instead of a RES.
19.4.9.3.3 Client Authentication
When a client receives a Digest AKA authentication challenge, it extracts the RAND and AUTN from the nonce parameter, and assesses the AUTN token provided by the server. If the client successfully authenticates the server with the AUTN, and determines that the SQN used in generating the challenge is within expected range, the AKA algorithms are run with the RAND challenge and shared secret K. The resulting AKA RES parameter is treated as a password when calculating the response directive of RFC 2617 (see Sections 19.4.5 and 19.12.2.3).
auts:
A string carrying a base64-encoded AKA AUTS parameter. This directive is used to resynchronize the server side SQN. If the directive is present, the client does not use any password when calculating its credentials. Instead, the client MUST calculate its credentials using an empty password (password of “”).
Example:
auts = "CjkyMzRfOiwg5CfkJ2UK="
Upon receiving the auts parameter, the server will check the validity of the parameter value using the shared secret K. A valid AUTS parameter is used to resynchronize the SQN in the AuC. The synchronized SQN is then used to generate a fresh authentication vector AV, with which the client is then rechallenged.
19.4.9.3.5 Server Authentication
Even though AKA provides inherent mutual authentication with the AKA AUTN token, mutual authentication mechanisms provided by Digest may still be useful in order to provide message integrity. In Digest AKA, the server uses the AKA XRES parameter as password when calculating the response-auth of the Authentication-Info header defined in RFC 2617.
19.4.9.4 Example Digest AKA Operation
Figure 19.10 shows a message flow describing a Digest AKA process of authenticating a SIP request, namely the SIP REGISTER request.
[Figure layout: message flow between Client and Server, beginning with F1 REGISTER.]
Figure 19.10 Message flow representing a successful authentication. (Copyright IETF. Reproduced with permission.)
[Figure layout: message flow between Client and Server, beginning with F1 REGISTER. Annotation: Client runs AKA algorithms on ISIM, verifies AUTN, but discovers that it contains an invalid sequence number. The client then generates an AUTS token.]
Figure 19.11 Message flow representing an authentication synchronization failure. (Copyright IETF. Reproduced with permission.)
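The nonce construction and the response calculation described in Section 19.4.9.3 can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, the 16-byte RAND and AUTN lengths reflect the 128-bit AKA values, and the response formula is the RFC 2617 Digest calculation for qop="auth" with MD5, with the binary RES (or an empty password during resynchronization) used as the password.

```python
import base64
import hashlib
from typing import Tuple

RAND_LEN = 16  # AKA RAND challenge: 128 bits
AUTN_LEN = 16  # AKA AUTN token: 128 bits


def build_nonce(rand: bytes, autn: bytes, server_data: bytes = b"") -> str:
    """Server side: nonce = base64(RAND || AUTN || optional server data)."""
    if len(rand) != RAND_LEN or len(autn) != AUTN_LEN:
        raise ValueError("RAND and AUTN must each be 16 bytes")
    return base64.b64encode(rand + autn + server_data).decode("ascii")


def parse_nonce(nonce: str) -> Tuple[bytes, bytes, bytes]:
    """Client side: recover (RAND, AUTN, server data) from the nonce."""
    raw = base64.b64decode(nonce)
    if len(raw) < RAND_LEN + AUTN_LEN:
        raise ValueError("nonce too short to carry RAND and AUTN")
    split = RAND_LEN + AUTN_LEN
    return raw[:RAND_LEN], raw[RAND_LEN:split], raw[split:]


def _md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


def digest_response(username: str, realm: str, password: bytes,
                    method: str, uri: str, nonce: str,
                    nc: str, cnonce: str, qop: str = "auth") -> str:
    """RFC 2617 response for qop=auth.

    password is the AKA RES in the normal case, or b"" when the
    client reports a synchronization failure with the auts directive.
    """
    ha1 = _md5_hex(f"{username}:{realm}:".encode("utf-8") + password)
    ha2 = _md5_hex(f"{method}:{uri}".encode("utf-8"))
    return _md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}".encode("utf-8"))
```

In this sketch the AKA algorithms themselves (verifying AUTN and deriving RES from RAND and K) are assumed to run on the ISIM and are not modeled; only the Digest wrapping around them is shown.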
        cnonce="0a4f113b",
        response="4429ffe49393c02397450934607c4ef1",
        opaque="5ccc069c403ebaf9f0171e9517f40e41",
        auts="5PYxMuX2NOT2NeQ="
F4: Response containing a new challenge
SIP/2.0 401 Unauthorized
WWW-Authenticate: Digest
        realm="[email protected]",
        qop="auth,auth-int",
        nonce="9uQzNPbk9jM05Pbl5Pbl5DIz9uTl9uTl9jM0NTHk9uXk==",
        opaque="dcd98b7102dd2f0e8b11d0f600bfb0c093",
        algorithm=AKAv1-MD5
19.4.9.5 Security Considerations
In general, Digest AKA is vulnerable to the same security threats as HTTP authentication (RFC 2617), described in Sections 19.4.5 and 19.12.2.3. However, some relevant exceptions are described here in the context of Digest AKA.
19.4.9.5.1 Authentication of Clients Using Digest AKA
AKA is typically, though this is not a theoretical limitation, run on an ISIM application that usually resides in a tamper-resistant smart card. Interfaces to the ISIM exist, which enable the host device to request authentication to be performed on the card. However, these interfaces do not allow access to the long-term secret outside the ISIM, and the authentication can only be performed if the device accessing the ISIM has knowledge of a PIN code, shared between the user and the ISIM. Such PIN codes are typically obtained from user input, and are usually required when the device is powered on. The use of tamper-resistant cards with secure interfaces implies that Digest AKA is typically more secure than regular Digest implementations, as neither possession of the host device nor Trojan horses in the software give access to the long-term secret. Where a PIN scheme is used, the user is also authenticated when the device is powered on. However, there may be a difference in the resulting security of Digest AKA, compared with traditional Digest implementations, depending of course on whether those implementations cache/store passwords that are received from the user.
19.4.9.5.2 Limited Use of Nonce Values
The Digest scheme uses server-specified nonce values to seed the generation of the request-digest value. The server is free to construct the nonce in such a way that it may only be used from a particular client, for a particular resource, for a limited period of time or number of uses, or any other restrictions. Doing so strengthens the protection provided against, for example, replay attacks. Digest AKA limits the applicability
of a nonce value to a particular ISIM. Typically, the ISIM is accessible only to one client device at a time. However, the nonce values are strong and secure even though limited to a particular ISIM. Additionally, this requires that the server is provided with the client identity before an authentication challenge can be generated. If a client identity is not available, an additional round trip is needed to acquire it. Such a case is analogous to an AKA synchronization failure. A server may allow each nonce value to be used only once by sending a next-nonce directive in the Authentication-Info header field of every response. However, this may cause a synchronization failure, and consequently some additional round trips in AKA, if the same SQN space is also used for other access schemes at the same time.
19.4.9.5.3 Multiple Authentication Schemes and Algorithms
In HTTP authentication, a UA MUST choose the strongest authentication scheme it understands and request credentials from the user, based on that challenge. In general, using passwords generated by Digest AKA with other HTTP authentication schemes is not recommended even though the realm values or protection domains would coincide. In these cases, a password should be requested from the end user instead. Digest AKA passwords must not be reused with such HTTP authentication schemes, which send the password in clear. In particular, AKA passwords must not be reused with HTTP Basic. The same principle must be applied within a scheme if several algorithms are supported. A client receiving an HTTP Digest challenge with several available algorithms MUST choose the strongest algorithm it understands. For example, Digest with AKAv1-MD5 would be stronger than Digest with MD5.
19.4.9.5.4 Online Dictionary Attacks
Since user-selected passwords are typically quite simple, it has been proposed that servers should not accept passwords for HTTP Digest that appear in a dictionary, as discussed in RFC 2617. This potential threat does not exist in HTTP Digest AKA because the algorithm will use ISIM-originated passwords. However, the end user must still be careful with PIN codes. Even though HTTP Digest AKA password requests are never displayed to the end user, the end user will be authenticated to the ISIM via a PIN code. Commonly known initial PIN codes are typically installed on the ISIM during manufacturing and if the end users do not change them, there is a danger that an unauthorized user may be able to use the device. Naturally, this requires that the unauthorized user has access to the physical device, and that the end user has not changed the initial PIN code. For this reason, end users are strongly encouraged to change their PIN codes when they receive an ISIM.
19.4.9.5.5 Session Protection
Digest AKA is able to generate additional session keys for integrity (IK) and confidentiality (CK) protection. Even though this document does not specify the use of these additional keys, they may be used for creating additional security within HTTP authentication or some other security mechanism.
19.4.9.5.6 Replay Protection
AKA allows sequence numbers to be tracked for each authentication, with the SQN parameter. This allows authentications to be replay protected even if the RAND parameter happened to be the same for two authentication requests. More important, this offers additional protection for the case where an attacker replays an old authentication request sent by the network. The client will be able to detect that the request is old, and refuse authentication. This proves liveness of the authentication request even in the case where an MITM attacker tries to trick the client into providing an authentication response, and then replaces parts of the message with something else. In other words, a client challenged by Digest AKA is not vulnerable to chosen plaintext attacks. Finally, frequent sequence number errors would reveal an attack where the tamper-resistant card has been cloned and is being used in multiple devices. The downside of sequence number tracking is that servers must hold more information for each user than just their long-term secret, namely the current SQN value. However, this information is typically not stored in the SIP nodes, but in dedicated authentication servers instead.
19.4.9.5.7 Improvements to AKA Security
Even though AKA is perceived as a secure mechanism, Digest AKA is able to improve it. More specifically, the AKA parameters carried between the client and the server during authentication may be protected along with other parts of the message by using Digest AKA. This is not possible with plain AKA.
19.4.10 Key-Derivation Authentication Scheme in SIP
19.4.10.1 Background
SIP uses the Digest Authentication schemes (see Sections 19.4.5 and 19.12.2.3) with the general framework for access control and authentication, which is used by a server to challenge a client request and by a client to provide authentication information. The challenge–response framework relies on passwords chosen by users that usually have low entropy
and weak randomness, and as a result cannot be used as cryptographic keys. While they cannot be used directly as cryptographic keys, the passwords can still be used to derive cryptographic keys, by using a Key Derivation Function (KDF).
A Standards Track IETF draft [3] defines a key-derivation authentication scheme based on the KDF that could be used with the challenge–response authentication framework used by SIP to authenticate the user. The scheme allows two parties to establish a mutually authenticated communication channel based on a shared password, without ever sending the password on the wire. That is, the Key-Derivation scheme ensures that the password is never sent on the wire, and allows for more secure storage of passwords, as it significantly increases the amount of computation needed to derive a key from a password in a dictionary attack. The Key-Derivation scheme creates a master-key that is derived from the password, which has much better entropy than the password, to calculate a proof-of-possession (pop) for the shared password.
19.4.10.2 Operations
When an account is created, the server uses a KDF, a salt, a key length, and an iteration count to create a master-key based on the user’s password, as defined in Ref. [4]. The server then stores the following information in the database: user name, iteration count, salt, and master-key. Figure 19.12 describes the flow of messages at a high level based on the challenge–response framework.
With the challenge–response framework, the initial request from the client is sent without providing any credentials. When the server receives the initial request from the client, the server fetches the master-key associated with the user name provided in the request. The server then uses the master-key to create a pop using an HMAC-Hash function with the digest-string and nonce from the challenge. The digest-string, as defined in RFC 4474 (see Sections 2.8 and 19.4.8), is a list of SIP headers that must be hashed to create the pop defined in this document. The server then challenges the request and includes the Key-Derivation scheme with a kdf, a salt, a key size, an iteration count, a nonce, and pop. To be able to provide credentials to the server, the client must create the master-key as was done by the server when the account was initially created, as described above, using the parameters provided by the server in the challenge.
The client will then verify the pop sent by the server using its master-key, the digest-string of the incoming request, and the nonce provided in the challenge. The client then creates an initial request (F1) with a pop using an HMAC-Hash function and the master-key using the digest-string from the response concatenated with the nonce to be sent to the server. A valid response from the client will contain the Key-Derivation scheme, a nonce, and the pop parameter. When the server receives the response, it verifies the pop, and if that is valid, it sends a confirmation. At the end of the above process, the client and the server would have established a communication channel after completing a mutual authentication using the same
[Figure layout: message flow between Client and Server, beginning with F1 Initial-Request.]
Figure 19.12 Message flows for key-derivation authentication scheme. (Copyright IETF. Reproduced with permission.)
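The master-key derivation and proof-of-possession check described in Section 19.4.10.2 can be sketched as follows. The text does not name a concrete KDF or hash here, so this example assumes PBKDF2 with HMAC-SHA256 as the KDF and HMAC-SHA256 as the HMAC-Hash function; the function names and the form of the digest-string are hypothetical and chosen only for illustration.

```python
import hashlib
import hmac


def derive_master_key(password: str, salt: bytes,
                      iterations: int, key_len: int) -> bytes:
    """Derive the master-key from the password (assumed KDF: PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                               salt, iterations, dklen=key_len)


def make_pop(master_key: bytes, digest_string: str, nonce: str) -> str:
    """Proof-of-possession: HMAC over the digest-string concatenated with the nonce."""
    message = (digest_string + nonce).encode("utf-8")
    return hmac.new(master_key, message, hashlib.sha256).hexdigest()


def verify_pop(master_key: bytes, digest_string: str,
               nonce: str, received_pop: str) -> bool:
    """Recompute the pop and compare in constant time."""
    expected = make_pop(master_key, digest_string, nonce)
    return hmac.compare_digest(expected, received_pop)
```

Both sides run derive_master_key with the same salt, iteration count, and key length, so the password itself never travels on the wire; each side then checks the other's pop with verify_pop. A larger iteration count raises the cost of a dictionary attack at the price of slower account creation and login.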
for DANE and the different ways a TLSA record can be constructed are described in RFC 6698.
port 5042, then the corresponding TLSA record will be found using the name _5042._tcp.sip01.example.com.
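The TLSA owner name in the example above follows the RFC 6698 convention of prefixing the host name with the port and transport labels. A minimal sketch of that construction, with a hypothetical helper name, is:

```python
def tlsa_owner_name(host: str, port: int, transport: str = "tcp") -> str:
    """Build the DNS name under which a TLSA record is looked up (RFC 6698 style)."""
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    # Labels are "_<port>" and "_<transport>", followed by the host name.
    return f"_{port}._{transport.lower()}.{host.rstrip('.')}"
```

For the host sip01.example.com reached over TCP on port 5042, this yields the name quoted in the text.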
trusted by both the UAC and the UAS that wish to share a session. For that reason, the authorization services described in this document are most applicable to clients either in a single domain or in federated domains that have agreed to trust one another’s authorization services. This could be common in academic environments, or business partnerships that wish to share attributes of principals with one another. Some trait-based authorization architectures have been proposed to provide single sign-on services across multiple providers. Although trait-based identity offers an alternative to traditional identity architectures, this effort should be considered complementary to the end-to-end cryptographic SIP Identity effort specified in RFC 4474 (see Sections 2.8 and 19.4.8). An authentication service might also act as an authorization service, generating some sort of trait assertion token instead of an authenticated identity body.
19.5.1.2 Trait-Based Authorization Framework
A trait-based authorization architecture entails the existence of an authorization service. Devices must send requests to an authorization service in order to receive an assertion that can be used in the context of a given network request. Different network request types will often necessitate different or additional attributes in assertions from the authorization service. For the purposes of SIP, SIP requests might be supplied to an authorization service to provide the basis for an assertion. It could be the case that a UA will take a particular SIP request, such as an INVITE, for which it wishes to acquire an assertion and forward this to the authorization service (in a manner similar to the way that an authenticated identity body is requested in RFC 4474; see Sections 2.8 and 19.4.8). UAs might also use a separate protocol to request an assertion. In either case, the client will need to authenticate itself to an authorization service before it receives an assertion. This authentication could use any of the standard mechanisms described in RFC 3261, or use some other means of authentication. Once a SIP UA has an assertion, it will need some way to carry an assertion within a SIP request. It is possible that this assertion could be provided by reference or by value. For example, a SIP UA could include a MIME body within a SIP request that contains the assertion; this would be inclusion by value. Alternatively, content indirection specified in RFC 4483 (see Section 16.6), or some new header, could be used to provide a URI (perhaps an HTTP URL) where interested parties could acquire the assertion; this is inclusion by reference. The basic model is shown in Figure 19.13.
[Figure layout: the entity requesting authorization assertion sends a Request to the assertion-granting entity and receives an Assertion in return.]
Figure 19.13 Basic trait-based authorization model. (Copyright IETF. Reproduced with permission.)
The entity requesting authorization assertions (or the entity that gets some assertions granted) and the entity using these authorization assertions might be colocated in the same host or domain, or they might be entities in different domains that share a federation with one another. The same is true for the entity that grants these assertions to a particular entity and the entity that verifies these assertions. From a protocol point of view, it is worth noting that the process of obtaining some assertions might occur sometime before the usage of these assertions. Furthermore, different protocols might be used and the assertions may have a lifetime that might allow these assertions to be presented to the verifying entity multiple times (during the lifetime of the assertion). Some important design decisions are associated with carrying assertions in a SIP request. If an assertion is carried by value, or uses a MIME-based content-indirection system, then proxy servers will be unable to inspect the assertion themselves. If the assertion were referenced in a header, however, it might be possible for the proxy to acquire and
inspect the assertion itself. There are certainly architectures in which it would be meaningful for proxy servers to apply admission controls based on assertions.
It is also the case that carrying assertions by reference allows versatile access controls to be applied to the assertion itself. For instance, an HTTP URL where an assertion could be acquired could indicate a web server that challenged requests, and only allowed certain authorized sources to inspect the assertion, or that provided different versions of the assertion depending on who is asking. When a SIP UA initiates a request with privacy controls described in RFC 4474 (see Sections 2.8 and 19.4.8), a web server might provide only trait information (faculty, student, or staff) to most queries, but provide more detailed information, including the identity of the originator of the SIP request, to certain privileged askers. The end users that make requests should have some way to inform authorization services of the attributes that should be shared with particular destinations.
Assertions themselves might be scoped to a particular SIP transaction or SIP dialog, or they might have a longer lifetime. The recipient of an assertion associated with a SIP request needs to have some way to verify that the authorization service intended that this assertion could be used for the request in question. However, the format of assertions is not specified by these requirements. Trait assertions for responses to SIP requests are outside the scope of these requirements; it is not clear if there is any need for the recipient of a request to provide authorization data to the requestor. Trait-based authorization has significant applicability to SIP. There are numerous instances in which it is valuable to assert particular facts about a principal other than the principal’s identity to aid the recipient of a request in making an authorization policy decision. For example, a telephony service provider might assert that a particular user is a customer as a trait. An emergency services network might indicate that a particular user has a privileged status as a caller.
19.5.1.3 Example Use Cases
The following use cases are by no means exhaustive, but provide a few high-level examples of the sorts of services that trait-based authorization might provide. All of the cases below consider interdomain usage of authorization assertions.
19.5.1.3.1 Settlement for Services
When end points in two domains share real-time communications services, sometimes there is a need for the domains to exchange accounting and settlement information in real time. The operators of valuable resources (e.g., PSTN trunking, conference bridges, or the like) in the called domain may wish to settle with the calling domain (either with the operators of the domain or a particular user), and some accounting operations might need to complete before a call is terminated. For example, a caller in one domain might want to access a conference bridge in another domain, and the called domain might wish to settle for the usage of the bridge with the calling domain. Or in a wireless context, a roaming user might want to use services in a visited network, and the visited network might need to understand how to settle with the user’s home network for these services.
Assuming that the calling domain constitutes some sort of commercial service capable of exchanging accounting information, the called domain may want to verify that the remote user has a billable account in good standing before allowing a remote user access to valuable resources. Moreover, the called domain may need to discover the network address of an accounting server and some basic information about how to settle with it. An authorization assertion created by the calling domain could provide the called domain with an assurance that a user’s account can settle for a particular service. In some cases, no further information may be required to process a transaction; however, if more specific accounting data is needed, traits could also communicate the network address of an accounting server, the settlement protocol that should be used, and so on.
19.5.1.3.2 Associating Gateways with Providers
Imagine a case where a particular telephone service provider has deployed numerous PSTN–SIP gateways. When calls come in from the PSTN, they are eventually proxied to various SIP UAs. Each SIP UAS is interested to know the identity of the PSTN caller, of course, which could be given within SIP messages in any number of ways (in SIP headers, bodies, etc.). However, in order for the recipient to be able to trust the identity (in this instance, the calling party’s telephone number) stated in the call, they must first trust that the call originated from the gateway and that the gateway is operated by a known (and trusted) provider.
There are a number of ways that a service provider might try to address this problem. One possibility would be routing all calls from gateways through a recognizable edge proxy server (say, sip.example.com). Accordingly, any SIP entity that received a request via the edge proxy server (assuming the use of hop-by-hop mutual cryptographic authentication) would know the service provider from whom the call originated. However, it is possible that requests from the originating service provider’s edge proxy might be proxied again before reaching the destination UAS, and thus in many cases the originating service provider’s identity would be known only transitively. Moreover, in many architectures, requests that did not originate from PSTN gateways could be sent through the edge proxy server. In the end analysis, the recipient of the request is less interested in knowing which carrier
the request came from than in knowing that the request came from a gateway.
Another possible solution is to issue certificates to every gateway corresponding to the host name of the gateway (gateway1.example.com). Gateways could therefore sign SIP requests directly, and this property could be preserved end-to-end. However, depending on the public key infrastructure, this could become costly for large numbers of gateways, and, moreover, a UAS that receives the request has no direct assurance from a typical certificate that the host is in fact a gateway just because it happens to be named gateway1. Trait-based authorization would enable the trait “is a gateway” to be associated with an assertion that is generated by the service provider (i.e., signed by example.com). Since these assertions would travel end-to-end from the originating service provider to the destination UAS, SIP requests that carry them can pass through any number of intermediaries without discarding cryptographic authentication information. This mechanism also does not rely on host-name conventions to identify what constitutes a gateway and what does not; it relies on an explicit and unambiguous attribute in an assertion.
on the identity of the user, with the presumption that prioritized calls will be granted preferential treatment when network resources are scarce. Different domains might have different criteria for assigning priority, and it is unlikely that a domain would correlate the identity of a nonlocal user with the need for priority, even in situations where domains would like to respect one another’s prioritization policies. Existing proposals have focused largely on adding a new header field to SIP that might carry a priority indicator. This use case does not challenge this strategy, but merely shows by way of example how this requirement might be met with a trait-based authorization system. As such, the limitations of the header field approach will not be contrasted here with a hypothetical trait-based system. An assertion created by a domain for a particular request might have an associated priority attribute. Recipients of the request could inspect and verify the signature associated with the assertion to determine which domain had authenticated the user and made the priority assessment. If the assertion’s creator is trusted by the evaluator, the given priority could be factored into any relevant request processing.
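The evaluation step described above (check who issued the assertion, verify its signature, and only then honor the priority trait) can be sketched as follows. The assertion format is not specified by these requirements, so the Assertion structure, the trait name "priority", and the JSON canonical form are all hypothetical. The sketch also substitutes an HMAC over a per-issuer shared key for the signature check purely to stay self-contained; a real deployment would more plausibly verify a public-key signature made by the issuing domain.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Assertion:
    issuer: str              # domain that authenticated the user
    traits: Dict[str, str]   # e.g. {"priority": "high"}
    signature: str           # stand-in signature over issuer and traits


def _canonical(issuer: str, traits: Dict[str, str]) -> bytes:
    # Deterministic byte form of the signed content (illustrative only).
    return json.dumps({"issuer": issuer, "traits": traits},
                      sort_keys=True).encode("utf-8")


def sign_assertion(issuer: str, traits: Dict[str, str], key: bytes) -> Assertion:
    sig = hmac.new(key, _canonical(issuer, traits), hashlib.sha256).hexdigest()
    return Assertion(issuer, dict(traits), sig)


def accepted_priority(assertion: Assertion,
                      trusted_keys: Dict[str, bytes]) -> Optional[str]:
    """Return the priority trait only if the issuer is trusted and the signature verifies."""
    key = trusted_keys.get(assertion.issuer)
    if key is None:
        return None  # evaluator does not trust this domain
    expected = hmac.new(key, _canonical(assertion.issuer, assertion.traits),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, assertion.signature):
        return None  # assertion was altered or not made by the issuer
    return assertion.traits.get("priority")
```

Returning None for an untrusted issuer or a failed check means the request is simply processed without preferential treatment, which matches the idea that the priority is one input to request processing rather than a mandate.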
the federation that would honor the assertion generated to authorize the SIP signaling would similarly honor the use of the assertion in the context of QOS. Upon the initial generation of the assertion by an authorization server, traits could be added that specify the desired level of quality that should be granted to the media associated with a SIP session.

19.5.1.4 Trait-Based Authorization Requirements

The following are the constraints and requirements for trait-based authorization in SIP:

◾◾ The mechanism must support a way for SIP UAs to embed an authorization assertion in SIP requests. Assertions can be carried either by reference or by value.
◾◾ The mechanism must allow SIP UACs to deliver to an authorization service those SIP requests that need to carry an assertion. The mechanism should also provide a way for SIP intermediaries to recognize that an assertion will be needed, and either forward requests to an authorization service themselves or notify the UAC of the need to do so.
◾◾ Authorization services must be capable of delivering an assertion to a SIP UAC, either by reference or by value. It may also be possible for an authorization service to add assertions to requests itself, if the user profile permits this, for example, through the use of content indirection as described in RFC 4483 (see Section 16.6).
◾◾ Authorization services must have a way to authenticate a SIP UAC.
◾◾ The assertions generated by authorization services must be capable of providing a set of values for a particular trait that a principal is entitled to claim.
◾◾ The mechanism must provide a way for authorized SIP intermediaries (e.g., authorized proxy servers) to inspect assertions.
◾◾ The mechanism must have a single baseline mandatory-to-implement authorization assertion scheme. The mechanism must also allow support of other assertion schemes, which would be optional to implement. One example of an assertion scheme is Security Assertion Markup Language (SAML) [7] and another is RFC 3281 X.509 Attribute Certificates.
◾◾ The mechanism must ensure reference integrity between a SIP request and assertion. Reference integrity refers to the relationship between a SIP message and the assertion authorizing the message. For example, a reference integrity check would compare the sender of the message (as expressed in the SIP request, e.g., in the From header field value) with the identity provided by the assertion. Reference integrity is necessary to prevent various sorts of relay and impersonation attacks. Note that reference integrity may apply on a per-message, per-transaction, or per-dialog basis.
◾◾ The assertion schemes used for this mechanism must be capable of asserting attributes or traits associated with the identity of the principal originating a SIP request. No specific traits or attributes are required by this specification.
◾◾ The mechanism must support a means for end users to specify policies to an authorization service for the distribution of their traits or attributes to various destinations.
◾◾ The mechanism must provide a way of preventing unauthorized parties (either intermediaries or end points) from viewing the contents of assertions.
◾◾ The assertion schemes must provide a way of selectively sharing the traits or attributes of the principal in question. In other words, it must be possible to show only some of the attributes of a given principal to particular recipients, based on the cryptographically assured identity of the recipient.
◾◾ It must be possible to provide an assertion that contains no identity; that is, to present only attributes or traits of the principal making a request, rather than the identity of the principal.
◾◾ The manner in which an assertion is distributed MUST permit cryptographic authentication and integrity properties to be applied to the assertion by the authorization service.
◾◾ It must be possible for a UAS or proxy server to reject a request that lacks a present and valid authorization assertion, and to inform the sending UAC that it must acquire such an assertion in order to complete the request.
◾◾ The recipient of a request containing an assertion must be able to ascertain which authorization service generated the assertion.
◾◾ It must be possible for a UAS or proxy server to reject a request containing an assertion that does not provide any attributes or traits that are known to the recipient or that are relevant to the request in question.
◾◾ It should be possible for a UAC to attach multiple assertions to a single SIP request, in cases where multiple authorization services must provide assertions in order for a request to complete.

19.5.1.5 SAML Assertion for Role/Trait-Based Authorization in SIP

Earlier in introducing the authentication services in SIP, we described how basic identity information could be asserted (see Section 19.2). However, additional identity information
can also be asserted in the same manner for authorization once authentication is provided. The requirements for asserting additional identity information, referred to as roles or traits, are described above; however, RFC 4484 (see Section 19.5.1) offers only some scenarios for role- or trait-based authorization that will offer scalability, but not any actual solutions in SIP. In this regard, we describe a method [7] for using SAML in collaboration with SIP. This scheme defines the SAML assertions in SIP messages for trait-based authorization.

19.5.1.5.1 SIP Authorization System with SAML Assertion

We are considering a SIP authorization system as depicted in Figure 16.31, Section 16.6.1. SIP user profiles that are relevant to the trait-based authentication can be encoded in SAML (e.g., asserting party is making the statement, “Alice has these profile attributes and her domain’s certificate is available over there, and I’m making this statement, and here’s who I am.”). Figure 19.14 shows the call flows of the above SIP authorization system.

In this example, Alice wants to call Bob. The outgoing SIP proxy that acts as the virtual SIP authentication service for asserting identity authenticates Alice (F1–F4 SIP messages) after challenge and response with authentication credentials, and then forwards Alice’s SIP INVITE message onto Bob’s inbound SIP proxy that serves as the relaying party. This SIP message includes Alice’s identity information as blessed by her outgoing proxy, along with a reference to a SAML assertion, which asserts various traits of Alice and points to Alice’s domain certificate. If the assertion and domain certificate pass verification by Bob’s inbound proxy (relaying party), then the call setup continues.
[Figure 19.14 call-flow diagram; recoverable message labels: F1 SIP INVITE, F3 SIP ACK, F9 SIP 200 OK, F10 SIP 200 OK, F11 SIP 200 OK.]
Figure 19.14 Role/trait-based authorization in SIP using SAML. (Copyright IETF. Reproduced with permission.)
(i.e., the caller’s) being asserted by the caller domain’s outbound SIP proxy/authentication service.

◾◾ The assertion does not contain an authentication statement. It means that no party faithfully implementing the cryptographic authentication scheme specified in RFC 4474 (see Sections 2.8 and 19.4.8) should be relying on SAML assertions as sufficient to allow access to resources.
◾◾ The assertion identifies the targeted relaying party and the assertion user. Owing to these properties, an entity receiving such an assertion is able to ascertain whether the assertion was targeted to them, as well as who originated it, and thus make an informed decision whether to proceed with SIP session establishment.
◾◾ The assertion explicitly stipulates its validity period. This is simply the well-known and oft-used technique of having security tokens explicitly reflect the time period within which they may be relied upon.
◾◾ The assertion contains or refers to the originating user’s domain’s public key certificate. In addition to all of the above, with this property the assertion refers to (or actually contains) the user’s domain public key certificate.

The key is that the above linkage is protected by the signature on the assertion, and the reference to the assertion in the SIP message’s Identity-Info header field (as well as several other SIP header fields) is protected via signature. We have seen that there is a verifiable chain from the SIP message to the user’s domain public key certificate described above. It is quite possible that any of the links in the chain explained above may not verify in certain circumstances. If this occurs, a relaying party shall not continue with SIP session establishment in view of maintaining the security of the session.

19.5.1.6 Role/Trait-Based Authorization Benefits in SIP

An authorization system for SIP that is not predicated on the distribution of end-users’ identities, but rather shares traits of the users, is described here (RFC 4484). The distribution of authorization assertions requires numerous security properties. An authorization service must be able to sign assertions, or provide some similar cryptographic assurance that can provide nonrepudiation for assertions as detailed in RFC 3323 (see Section 20.2). We have described an implementation of a role/trait-based authorization scheme using the SAML assertion profile in conjunction with SIP for secured SIP session setups satisfying requirements along the lines articulated here (RFC 4484). The SAML assertion is extremely flexible and allows for the encoding not only of identity information about the user, but also generic authentication and authorization attributes that provide much richer authorization mechanisms and enable trait-based authorization. The SAML assertion profile based on roles/traits along with identity allows authorization decisions that enhance security in SIP. The protection of user privacy in relation to the user’s identity, including the asserted identity, is described in RFC 3323 (see Section 20.2).

19.5.2 Authorization through Dialog Identification in SIP

19.5.2.1 Overview

We have defined authorization earlier (see Section 19.1). In addition, authorization is defined in the context of cryptographic authentication (see Section 19.4.8). Authorization is closely related to authentication, and there can be many kinds of policies for providing authorization. However, we solely discuss authorization through dialog identification as specified in RFC 4538. RFC 4538, described here, defines the Target-Dialog header field for SIP, and the corresponding option tag, tdialog. This header field is used in requests that create SIP dialogs. It indicates to the recipient that the sender is aware of an existing dialog with the recipient, either because the sender is on the other side of that dialog, or because it has access to the dialog identifiers. The recipient can then authorize the request based on this awareness.

SIP defines the concept of a dialog as a persistent relationship between a pair of UAs. Dialogs provide context, including sequence numbers, proxy routes, and dialog identifiers. Dialogs are established through the transmission of SIP requests with particular methods. Specifically, the INVITE, REFER, and SUBSCRIBE requests all create dialogs. When a UA receives a request that creates a dialog, it needs to decide whether to authorize that request. For some requests, authorization is a function of the identity of the sender, the request method, and so on. However, many situations have been identified in which a UA’s authorization decision depends on whether the sender of the request is currently in a dialog with that UA, or whether the sender of the request is aware of a dialog the UA has with another entity.

One such example is call transfer, accomplished through REFER. If UAs A and B are in an INVITE dialog, and UA A wishes to transfer UA B to UA C, UA A needs to send a REFER request to UA B, asking UA B to send an INVITE request to UA C. UA B needs to authorize this REFER. The proper authorization decision is that UA B should accept the request if it came from a user with whom B currently has an INVITE dialog relationship. Current implementations deal with this by sending the REFER on the same dialog as the one in place between UAs A and B. However, this approach has numerous problems as specified in RFC 5057 (see Sections 3.6.5 and 16.2). These problems include
difficulties in determining the life cycle of the dialog and its usages, and in determining which messages are associated with each application usage. Instead, a better approach is for UA A to send the REFER request to UA B outside of the dialog. In that case, a means is needed for UA B to authorize the REFER.

Another example is the SIP application interaction framework specified in RFC 5629. In that framework, proxy servers on the path of a SIP INVITE request can place user interface components on the UA that generated or received the request. To do this, the proxy server needs to send a REFER request to the UA, targeted to its GRUU (see Section 4.3), asking the UA to fetch an HTTP resource containing the user interface component. In such a case, a means is needed for the UA to authorize the REFER.

The application interaction framework recommends that the request be authorized if it was sent from an entity on the path of the original dialog. This can be done by including the dialog identifiers in the REFER, which prove that the UA that sent the REFER is aware of those dialog identifiers (this needs to be secured against eavesdroppers through the sips mechanism, of course). Another example is if two UAs share an INVITE dialog, and an element on the path of the INVITE request wishes to track the state of the INVITE. In such a case, it sends a SUBSCRIBE request to the GRUU of the UA, asking for a subscription to the dialog event package. If the SUBSCRIBE request came from an element on the INVITE request path, it should be authorized.

19.5.2.2 Operation

Figure 19.16 shows the basic model of operation. UA A sends an INVITE to UA B, traversing two servers, server A and server B. Both servers act as proxies for this transaction. User B sends a 200 OK response to the INVITE. This 200 OK includes a Supported header field indicating support for this specification (through the presence of the tdialog option tag).

The 200 OK response establishes a dialog between the two UAs. Next, an entity that was present along the request path (e.g., server A) wishes to send a dialog-forming request (such as REFER) to UA A or B (e.g., user B). Thus, the entity acts as a UA and sends the request to UA B. This request is addressed to the URI of UA B, which server A learned from inspecting the Contact header field in the 200 OK of the INVITE request. If this URI has the GRUU (see Section 4.3) property (it can be used by any element on the Internet, such as server A, to reach the specific UA instance that generated that 200 OK to the INVITE), then the mechanism will work across NAT boundaries.

[Figure 19.16: diagram not recoverable; labels include INVITE, REFER, Server A, and Server B.]

The request generated by server A will contain a Target-Dialog header field. This header field contains the dialog identifiers for the INVITE dialog between UAs A and B, composed of the Call-ID, local tag, and remote tag. Server A knew to include the Target-Dialog header field in the REFER request because it knows that UA B supports it. When the request arrives at UA B, it needs to make an authorization decision. Because the INVITE dialog was established using a sips URI, and because the dialog identifiers are cryptographically random (RFC 3261, see Section 3.6), no entity except for UA A or the proxies on the path of the initial INVITE request can know the dialog identifiers. Thus, because the request contains those dialog identifiers, UA B can be certain that the request came from UA A, the two proxies, or an entity to whom the UA or proxies gave the dialog identifiers. As such, it authorizes the request and performs the requested actions.

19.5.2.3 UAC Behavior

A UAC should include a Target-Dialog header field in a request if the following conditions are all true:

◾◾ The request is to be sent outside of any existing dialog.
◾◾ The UAC believes that the request may not be authorized by the UAS unless the UAC can prove that it is aware of the dialog identifiers for some other dialog. Call this dialog the target dialog.
◾◾ The request does not otherwise contain information that indicates that the UAC is aware of those dialog identifiers.
◾◾ The UAC knows that the UAS supports the Target-Dialog header field. It can know this if it has seen a request or response from the UAS within the target dialog that contained a Supported header field that included the tdialog option tag.

If the fourth condition is not met, the UAC should not use this specification. Instead, if it is currently within a dialog
that indicates awareness of the target dialog. RFC 5629 also mandates that the REFER be sent only if the UA indicates support for the target dialog specification.
◾◾ User A is in separate calls with users B and C. User A decides to start a three-way call, and so morphs into a focus (RFC 4353). User B would like to learn the other participants in the conference. Thus, it sends a SUBSCRIBE request to user A (who is now acting as the focus) for the conference event package (RFC 4575). It is sent outside of the existing dialog between user B and the focus, and it would be authorized by A if user B could prove that it knows the dialog identifiers for its existing dialog with the focus. Thus, the Target-Dialog header field would be included in the SUBSCRIBE.

The following are examples of use cases in which these conditions are not met:

◾◾ A server acting as a proxy is a participant in an INVITE dialog that establishes a session. The server would like to use the Keypad Markup Language (KPML) event package (RFC 4730) to find out about keypresses from the originating UA. To do this, it sends a SUBSCRIBE request. However, the Event header field of this SUBSCRIBE contains event parameters that indicate the target dialog of the subscription. As such, the request can be authorized without additional information.
◾◾ A server acting as a proxy is a participant in an INVITE dialog that establishes a session. The server would like to use the dialog event package (RFC 4235) to find out about dialogs at the originating UA. To do this, it sends a SUBSCRIBE request. However, the Event header field of this SUBSCRIBE contains event parameters that indicate the target dialog of the subscription. As such, the request can be authorized without additional information.

Specifications that intend to make use of the Target-Dialog header field should discuss specific conditions in which it is to be included. Assuming it is to be included, the value of the callid production in the Target-Dialog header field must be equal to the Call-ID of the target dialog. The remote-tag header field parameter must be present and must contain the tag that would be viewed as the remote tag from the perspective of the recipient of the new request. The local-tag header field parameter must be present and must contain the tag that would be viewed as the local tag from the perspective of the recipient of the new request. The request sent by the UAC should include a Require header field that includes the tdialog option tag. This request should, in principle, never fail with a 420 Bad Extension response, because the UAC would not have sent the request unless it believed the UAS supported the extension. If a Require header field was not included, and the UAS did not support the extension, it would normally reject the request because it was unauthorized, probably with a 403 Forbidden. However, without the Require header field, the UAC would not be able to differentiate between the following:

◾◾ A 403 Forbidden that arrived because the UAS did not actually understand the Target-Dialog header field (in which case the client should send the request within the target dialog if it can)
◾◾ A 403 Forbidden that arrived because the UAS understood the Target-Dialog header field, but elected not to authorize the request although the UAC proved its awareness of the target dialog (in which case the client should not resend the request within the target dialog, even if it could)

19.5.2.4 UAS Behavior

If a UAS receives a dialog-creating request and wishes to authorize the request, and if that authorization depends on whether or not the sender has knowledge of an existing dialog with the UAS, and information outside of the Target-Dialog header field does not provide proof of this knowledge, the UAS should check the request for the existence of the Target-Dialog header field. If this header field is not present, the UAS may still authorize the request by other means. If the header field is present, and the value of the callid production, the remote-tag, and local-tag values match the Call-ID, remote tag, and local tag of an existing dialog, and the dialog that they match was established using a sips URI, the UAS should authorize the request if it would authorize any entity on the path of the request that created that dialog, or any entity trusted by an entity on the path of the request that created that dialog.

If the dialog identifiers match, but they match a dialog not created with a sips URI, the UAS may authorize the request if it would authorize any entity on the path of the request that created that dialog, or any entity trusted by an entity on the path of the request that created that dialog. However, in this case, any eavesdropper on the original dialog path would have access to the dialog identifiers, and thus the authorization is optional. If the dialog identifiers do not match, or if they do not contain both a remote-tag and local-tag parameter, the header field must be ignored, and authorization may be determined by other means.
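
The UAS behavior described in Section 19.5.2.4 can be sketched in Python as follows. This is a minimal illustration rather than a SIP stack: the `Dialog` class, its field names, and the three return labels are assumptions made for the example, and the final decision still depends on local policy (whether the UAS would authorize entities on the path of the original request).

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Dialog:
    """Dialog state as held by the UAS (field names are illustrative)."""
    call_id: str
    local_tag: str    # tag contributed by the UAS itself
    remote_tag: str   # tag contributed by the peer
    secure: bool      # True if the dialog was established using a sips URI


def parse_target_dialog(value: str) -> Optional[dict]:
    """Split 'callid;local-tag=x;remote-tag=y' into its parts.

    Returns None when either tag parameter is missing, in which case the
    header field is to be ignored.
    """
    parts = [p.strip() for p in value.split(";")]
    call_id, params = parts[0], {}
    for p in parts[1:]:
        name, _, val = p.partition("=")
        params[name.strip().lower()] = val.strip()
    if not call_id or not params.get("remote-tag") or not params.get("local-tag"):
        return None
    return {
        "call_id": call_id,
        "remote_tag": params["remote-tag"],
        "local_tag": params["local-tag"],
    }


def target_dialog_decision(header_value: Optional[str],
                           dialogs: Iterable[Dialog]) -> str:
    """Classify a dialog-creating request from the UAS point of view.

    'should'      -> identifiers match a dialog established with a sips URI
    'may'         -> identifiers match a dialog not created with a sips URI
    'other-means' -> header absent, malformed, or no matching dialog
    """
    if header_value is None:
        return "other-means"
    td = parse_target_dialog(header_value)
    if td is None:
        return "other-means"
    for d in dialogs:
        # Tags are compared from the recipient's (UAS) perspective.
        if (d.call_id, d.remote_tag, d.local_tag) == (
            td["call_id"], td["remote_tag"], td["local_tag"]
        ):
            return "should" if d.secure else "may"
    return "other-means"
```

With a stored dialog whose Call-ID is `fa77as7dad8-sd98ajzz@host.example.com`, local tag `kkaz-`, and remote tag `6544`, the header value `fa77as7dad8-sd98ajzz@host.example.com;local-tag=kkaz-;remote-tag=6544` is classified as `should` when the dialog was set up with a sips URI and as `may` otherwise.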
19.5.2.10 Example Call Flow

In this example, UA A and UA B establish an INVITE-initiated dialog through server A and server B, each of which acts as a proxy for the INVITE. Server B would then like to use the application interaction framework (RFC 5629) to request that UA A fetch an HTML user interface component. To do that, it sends a REFER request to A’s URI. The flow for this is shown in Figure 19.17. The conventions of RFC 4475 are used to describe representation of long message lines.

UA A        Server A        Server B        UA B
F1. INVITE
F2. INVITE
F3. INVITE
F4. 200 OK
F5. 200 OK
F6. 200 OK
F7. ACK
F8. REFER
F9. REFER
F10. 200 OK
F11. 200 OK
Figure 19.17 Basic call flows for authorization through dialog identification. (Copyright IETF. Reproduced with permission.)

First, the caller sends an INVITE, as shown in message F1.

INVITE sips:[email protected] SIP/2.0
Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz8
From: Caller <sip:[email protected]>;tag=kkaz-
To: Callee <sip:[email protected]>
Call-ID: [email protected]
CSeq: 1 INVITE
Max-Forwards: 70
Supported: tdialog
Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, REFER
Accept: application/sdp, text/html
<allOneLine>
Contact: <sips:[email protected];gruu;opaque=urn:uuid:f81d4f
ae-7dec-11d0-a765-00a0c91e6bf6;grid=99a>;schemes="http,sip,sips"
</allOneLine>
Content-Length:...
Content-Type: application/sdp

--SDP not shown--

The INVITE indicates that the caller supports GRUU (note its presence in the Contact header field of the INVITE) and the Target-Dialog header field. This INVITE is forwarded to the callee (messages F2 and F3), which generates a 200 OK response that is forwarded back to the caller (messages F4 and F5). Message F5 might look like

SIP/2.0 200 OK
Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz8
From: Caller <sip:[email protected]>;tag=kkaz-
To: Callee <sip:[email protected]>;tag=6544
Call-ID: [email protected]
CSeq: 1 INVITE
Contact: <sips:[email protected]>
Content-Length:...
Content-Type: application/sdp

--SDP not shown--

In this case, the called party does not support GRUU or the Target-Dialog header field. The caller generates an ACK (message F7). Server B then decides to send a REFER to user A:

<allOneLine>
REFER sips:[email protected];gruu;opaque=urn:uuid:f81d4f
ae-7dec-11d0-a765-00a0c91e6bf6;grid=99a SIP/2.0
</allOneLine>
Via: SIP/2.0/TLS serverB.example.org;branch=z9hG4bK9zz10
From: Server B <sip:serverB.example.org>;tag=mreysh
<allOneLine>
To: Caller <sips:[email protected];gruu;opaque=urn:uuid:f81d4f
ae-7dec-11d0-a765-00a0c91e6bf6;grid=99a>
</allOneLine>
Target-Dialog: fa77as7dad8-sd98ajzz@host.example.com
  ;local-tag=kkaz-
  ;remote-tag=6544
Refer-To: http://serverB.example.org/ui-component.html
Call-ID: [email protected].com
CSeq: 1 REFER
Max-Forwards: 70
Require: tdialog
Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, NOTIFY
Contact: <sips:serverB.example.org>
Content-Length: 0

This REFER will be delivered to server A because it was sent to the GRUU. From there, it is forwarded to UA A (message F9) and authorized because of the presence of the Target-Dialog header field.

19.5.3 Media Authorization in SIP

19.5.3.1 Overview

RFC 3313, described here, defines the P-Media-Authorization header (see Section 2.8), a private SIP extension for the media authorization that is needed for QOS (see Section 15.4). The P-Media-Authorization header can be used to integrate QOS admission control with call signaling and help guard against DoS attacks. The use of this header is only applicable in administrative domains, or among federations of administrative domains with previously agreed-upon policies, where both the SIP proxy authorizing the QOS, and the policy control of the underlying network providing the QOS, belong to that administrative domain or federation of domains. Furthermore, the mechanism is generally incompatible with end-to-end encryption of message bodies that describe media sessions. This is in contrast with general Internet principles, which separate data transport from applications.

19.5.3.2 Basic Architecture for Media Authorization

In general, QOS provides preferential treatment (see Sections 15.2, 15.3, and 16.11) of one flow, at the expense of another. Consequently, it is important to have policy control over whether a given flow should have access to QOS. This will not only enable fairness in general, but can also prevent DoS attacks. We are concerned with providing QOS for media streams established via SIP with appropriate security. We assume an architecture that integrates call signaling with media authorization (see Section 19.5), as illustrated in Figure 19.18. The solid lines (A and B) show interfaces, whereas the dotted line (C) illustrates the QOS-enabled media flow.

[Figure 19.18 diagram; recoverable labels: UA, Proxy, Edge router, and lines A, B, and C.]
Figure 19.18 Basic architecture for providing media authorization. (Copyright IETF. Reproduced with permission.)

In this architecture, we assume a SIP UA connected to a QOS-enabled network with an edge router acting as a policy enforcement point (PEP) (RFC 2753). We further assume that a SIP UA that wishes to obtain QOS initiates sessions through a proxy that can interface with the QOS policy control for the data network being used. We will refer to such a proxy as a QOS-enabled proxy. We assume that the SIP UA needs to present an authorization token to the network in order to obtain QOS (C). The SIP UA obtains this authorization token via SIP (A) from the QOS-enabled proxy by means of an extension SIP header, defined in this document. The proxy, in turn, communicates either directly with the edge router or with a Policy Decision Point (PDP), not shown in Figure 19.18, in order to obtain a suitable authorization token for the UA.

A session that needs to obtain QOS for the media streams in accordance with our basic architecture described above goes through the following steps. The SIP UA sends an INVITE to the QOS-enabled proxy, which for each resulting dialog includes one or more media authorization tokens in all unreliable provisional responses (except 100), the first reliable 1xx or 2xx response, and all retransmissions of that reliable response for the dialog. When the UA requests QOS, it includes the media authorization tokens with the resource reservation. A SIP UA may also receive an INVITE from its QOS-enabled proxy, which includes one or more media authorization tokens. In that case, when the UA requests QOS, it includes the media authorization tokens with the resource reservation. The resource reservation mechanism is not part of SIP and is not described here.

19.5.3.3 Media Authorization Header Syntax

The P-Media-Authorization header field (see Section 2.8) contains one or more media authorization tokens that are to be included in subsequent resource reservations for the
media flows associated with the session, that is, passed to an independent resource reservation mechanism, which is not specified here. The media authorization tokens are used for authorizing QOS for the media stream(s). The P-Media-Authorization header field is described by the following ABNF for convenience, although all SIP ABNF syntaxes are provided in Section 2.4.1:

P-Media-Authorization       = "P-Media-Authorization" HCOLON
                              P-Media-Authorization-Token
                              *(COMMA P-Media-Authorization-Token)
P-Media-Authorization-Token = 1*HEXDIG

Note that the P-Media-Authorization header field can be used only in SIP requests or responses that can carry a SIP offer or answer.

19.5.3.4 SIP Procedures for Media Authorization

We are describing SIP procedures for usage in media authorization-compatible systems, from the point of view of the authorizing QOS.

19.5.3.4.1 UAC

The initial SIP INVITE message, mid-call messages that result in network QOS resource changes, and mid-call changes in call destination should be authorized. These SIP messages are sent through the QOS-enabled proxies to receive this authorization. In order to authorize QOS, the QOS-enabled SIP proxy may need to inspect message bodies that describe the media streams (e.g., SDP). Hence, it is recommended that such message bodies not be encrypted end-to-end.

The P-Media-Authorization-Token, which is contained in the P-Media-Authorization header, is included for each dialog in all unreliable provisional responses (except 100), the first reliable 1xx or 2xx response, and all retransmissions of that reliable response for the dialog sent by the QOS-enabled SIP proxy to the UAC. The UAC should use all the P-Media-Authorization-Tokens from the most recent request–response that contained the P-Media-Authorization header when requesting QOS for the associated media stream(s). This applies to both initial and subsequent refresh reservation messages, for example, in a Resource ReSerVation Protocol (RSVP)-based reservation system. A reservation function within the UAC should convert each string of hex digits into binary, and utilize each result as a Policy-Element, as defined in RFC 2750 (excluding Length, but including P-Type, which is included in each token). These Policy-Elements would typically contain the authorizing entity and credentials, and be used in an RSVP request for media data stream QOS resources.

19.5.3.4.2 UAS

The UAS receives the P-Media-Authorization-Token in an INVITE (or other) message from the QOS-enabled SIP proxy. If the response contains a message body that describes media streams for which the UA desires QOS, it is recommended that this message body not be encrypted end-to-end.

The UAS should use all the P-Media-Authorization-Tokens from the most recent request–response that contained the P-Media-Authorization header when requesting QOS for the associated media stream(s). This applies both to initial and subsequent refresh reservation messages (e.g., in an RSVP-based reservation system). A reservation function within the UAS should convert each string of hex digits into binary, and utilize each result as a Policy-Element, as defined in RFC 2750 (excluding Length, but including P-Type, which is included in each token). These Policy-Elements would typically contain the authorizing entity and credentials, and be used in an RSVP request for media data stream QOS resources.

19.5.3.4.3 Originating Proxy

When the originating QOS-enabled proxy (OP) receives an INVITE (or other) message from the UAC, the proxy authenticates the caller, and verifies that the caller is authorized to receive QOS. In cooperation with an originating PDP (PDP-o), the OP obtains or generates one or more media authorization tokens. These contain sufficient information for the UAC to get the authorized QOS for the media streams. Each media authorization token is formatted as a Policy-Element, as defined in RFC 2750 (excluding Length, but including P-Type, which is included in each token), and then converted to a string of hex digits to form a P-Media-Authorization-Token. The proxy’s resource management function may inspect message bodies that describe the media streams (e.g., SDP), in both requests and responses, in order to decide what QOS to authorize. For each dialog that results from the INVITE (or other) message received from the UAC, the originating proxy must add a P-Media-Authorization header with the P-Media-Authorization-Token in all unreliable provisional responses (except 100), the first reliable 1xx or 2xx response, and all retransmissions of that reliable response the proxy sends to the UAC, if that response may result in network QOS changes. A response with an SDP may result in such changes.

19.5.3.4.4 Destination Proxy

The Destination QOS-Enabled Proxy (DP) verifies that the called party is authorized to receive QOS. In cooperation
Security Mechanisms in SIP ◾ 739
Figure 19.19 Media authorization for a UAC with RSVP. (Copyright IETF. Reproduced with permission.) [Call-flow diagram; the recovered portion shows F13 DEC, F14 and F15 RSVP-PATH, F16 RSVP-RESV, F17 REQ, F18 DEC, and F19 and F20 RSVP-RESV. Annotations: "Copies the RSVP policy object from the P-Media-Authorization"; "Using the Auth-Token and Authorized Profile that is set by SIP Proxy, the PDP makes the decision."]
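The token handling shown in Figure 19.19 can be illustrated with a short Python sketch: a policy decision point stores an authorized media description, hands back a hex-encoded token that points to it, and later checks a reservation request carrying that token. This is only a sketch under stated assumptions. The names MediaDescription, PolicyDecisionPoint, and token_to_policy_element, the 8-byte random token, and the "install" and "reject" strings are all hypothetical; a real token is a Policy-Element laid out as in RFC 2750, and the actual exchange uses COPS rather than direct method calls.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class MediaDescription:
    """Authorized media profile handed to the PDP (hypothetical fields)."""
    src: str
    dst: str
    bandwidth_kbps: int


@dataclass
class PolicyDecisionPoint:
    """Toy PDP: maps opaque tokens to stored authorized media descriptions."""
    _store: dict = field(default_factory=dict)

    def authorize(self, profile: MediaDescription) -> str:
        """Store the profile and return a hex token that points to it."""
        token = secrets.token_bytes(8)  # assumption: opaque 8-byte handle
        self._store[token] = profile
        return token.hex()  # hex string, as carried in the SIP header

    def decide(self, policy_element: bytes, requested_kbps: int) -> str:
        """Check a reservation request against the stored description."""
        profile = self._store.get(policy_element)
        if profile is None or requested_kbps > profile.bandwidth_kbps:
            return "reject"
        return "install"


def token_to_policy_element(hex_token: str) -> bytes:
    """UA side: convert the string of hex digits back into binary."""
    return bytes.fromhex(hex_token)


if __name__ == "__main__":
    pdp = PolicyDecisionPoint()
    hex_token = pdp.authorize(MediaDescription("uac", "uas", 64))
    element = token_to_policy_element(hex_token)
    print(pdp.decide(element, 64))   # within the authorized profile
    print(pdp.decide(element, 512))  # exceeds the authorized bandwidth
```

The proxy never needs to interpret the token; it only relays the hex string, which is why the description itself stays in the PDP's local store.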
740 ◾ Handbook on Session Initiation Protocol
UAC collects the dialed digits and sends the initial INVITE message (F1) to the originating SIP proxy. The originating SIP proxy (OP) authenticates the user/UAC and forwards the INVITE message (F2) to the proper SIP proxy. Assuming the call is not forwarded, the terminating end-point sends an 18x response (F3) to the initial INVITE via OP. Included in this response is an indication of the negotiated bandwidth requirement for the connection in the form of an SDP description (see Section 7.7). When OP receives the 18x (F3), it has sufficient information regarding the end points, bandwidth, and characteristics of the media exchange. It initiates a Policy-Setup message to PDP-o, AuthProfile (F4). The PDP-o stores the authorized media description in its local store, generates an authorization token that points to this description, and returns the authorization token to the OP, AuthToken (F5).
The OP includes the authorization token in the P-Media-Authorization header extension of the 18x message (F6). Upon receipt of the 18x message (F6), the UAC stores the media authorization token from the P-Media-Authorization header. Also, the UAC acknowledges the 18x message by sending a PRACK (F7) message, which is responded to with 200 OK (F10). Before sending any media, the UAC requests QOS by sending an RSVP-PATH message (F11), which includes the previously stored P-Media-Authorization-Token as a Policy-Element. ER-o, upon receipt of the RSVP-PATH message (F11), checks the authorization through a PDP-o Common Open Policy Service (COPS) message exchange, REQ (F12). PDP-o checks the authorization using the stored authorized media description that was linked to the authorization token it returned to OP. If authorization is successful, PDP-o returns an install Decision, DEC (F13). ER-o checks the admissibility for the request, and if admission succeeds, it forwards the RSVP-PATH message (F14). Once the UAC receives the RSVP-PATH message (F15) from the UAS, it sends the RSVP-RESV message (F16) to reserve the network resources. ER-o, upon receiving the RSVP-RESV message (F16), checks the authorization through a PDP-o COPS message exchange, REQ (F17). PDP-o checks the authorization using the stored authorized media description that was linked to the authorization token it returned to OP. If authorization is successful, PDP-o returns an install Decision, DEC (F18). ER-o checks the admissibility for the request, and if admission succeeds, it forwards the RSVP-RESV message (F19). Once the RSVP-RESV message (F20) has been received, network resources have been reserved in both directions.
19.5.3.5.2 UAS Side
Figure 19.20 presents a high-level overview of a call flow with media authorization from the viewpoint of the UAS. Some policy interactions have been omitted for brevity. Since the destination SIP proxy (DP) has sufficient information regarding the end points, bandwidth, and characteristics of the media exchange, it initiates a Policy-Setup message to the PDP-t on receipt of the INVITE (F1).
PDP-t stores the authorized media description in its local store, generates an authorization token that points to this description, and returns the authorization token to DP. The token is placed in the INVITE message (F4) and forwarded to the UAS. Assuming that the call is not forwarded, the UAS sends an 18x response (F5) to the initial INVITE message, which is forwarded back to UAC. At the same time, the UAS sends an RSVP-PATH message (F7) that includes the previously stored P-Media-Authorization-Token as a Policy-Element. ER-t, upon receiving the RSVP-PATH message (F7), checks the authorization through a PDP-t COPS message exchange. PDP-t checks the authorization using the stored authorized media description that was linked to the authorization token it returned to DP. If authorization is successful, PDP-t returns an install Decision, DEC (F9). ER-t checks the admissibility for the request, and if admission succeeds, it forwards the RSVP-PATH message (F10). Once the UAS receives the RSVP-PATH message (F11), it sends the RSVP-RESV message (F12) to reserve the network resources. ER-t, upon reception of the RSVP-RESV message (F12), checks the authorization through a PDP-t COPS message exchange. PDP-t checks the authorization using the stored authorized media description that was linked to the authorization token that it returned to DP. If authorization is successful, PDP-t returns an install Decision, DEC (F14). ER-t checks the admissibility for the request, and if admission succeeds, it forwards the RSVP-RESV message (F15). Once the RSVP-RESV message (F16) has been received, network resources have been reserved in both directions. For completeness, we show the PRACK message (F5) for the 18x response (F6) and the resulting 200 OK (F19) response acknowledging the PRACK.
19.5.3.6 Advantages of Media Authorization
The use of media authorization makes it possible to control the usage of network resources. In turn, this makes SIP telephony more robust against DoS attacks and various kinds of service frauds. By using the authorization capability, the number of flows and the amount of network resources reserved can be controlled, thereby making the SIP telephony system dependable in the presence of scarce resources.
19.5.3.7 Security Considerations
To control access to QOS, a QOS-enabled proxy should authenticate the UA before providing it with a media authorization token. Both the method and policy associated with such authentication are outside the scope of this document; however, it could, for example, be done by using standard
Figure 19.20 Media authorization for UAS with RSVP. (Copyright IETF. Reproduced with permission.) [Call-flow diagram; recovered message labels: F1 INVITE; F2 AuthProfile; F3 AuthToken; F4 INVITE; F5 PRACK; F6 18x; F7 RSVP-PATH; F8 REQ; F9 DEC; F10 and F11 RSVP-PATH; F13 REQ; F14 DEC; F15 and F16 RSVP-RESV; F17 and F18 PRACK; F19 and F20 200 OK (PRACK). Annotations: "Proxy authentication and call authorization"; "Copies the RSVP policy object from the P-Media-Authorization"; "Using the Auth-Token and Authorized Profile that is set by SIP proxy, the PDP makes the decision."]
SIP authentication mechanisms, as described in RFC 3261 (see Section 19.4). Media authorization tokens sent in the P-Media-Authorization header from a QOS-enabled proxy to a UA must be protected from eavesdropping and tampering. This can, for example, be done through a mechanism such as IPSec or TLS. However, this will only provide hop-by-hop security. If there are one or more intermediaries (e.g., proxies) between the UA and the QOS-enabled proxy, these intermediaries will have access to the P-Media-Authorization header field value, thereby compromising confidentiality and integrity. This will enable both theft-of-service and DoS attacks against the UA. Consequently, the P-Media-Authorization header field must not be available to any untrusted intermediary in the clear or without integrity protection. There is currently no mechanism defined in SIP that would satisfy these requirements. Until such a mechanism exists, proxies must not send P-Media-Authorization headers through untrusted intermediaries, which might reveal or modify the contents of this header. Note that S/MIME-based encryption (see Section 19.6) in SIP is not available to proxy servers, as proxies are not allowed to add message bodies. QOS-enabled proxies may need to inspect message bodies describing media streams (e.g., SDP). Consequently, such message bodies should not
be encrypted. In turn, this will prevent end-to-end confidentiality of the said message bodies, which lowers the overall security possible.
SIP messages carry MIME bodies (RFC 3261, see Sections 2.4.2.4 and 3.9), and the MIME standard includes mechanisms for securing MIME contents to ensure both integrity and confidentiality (including the multipart/signed and application/pkcs7-mime MIME types). Implementers should note, however, that there may be rare network intermediaries (not typical proxy servers) that rely on viewing or modifying the bodies of SIP messages (especially SDP), and that secure MIME may prevent these sorts of intermediaries from functioning. This applies particularly to certain types of firewalls. The PGP mechanism for encrypting the header fields and bodies of SIP messages described in RFC 2543 (obsoleted by RFC 3261) has been deprecated.
19.5.4 Early-Media Authorization in SIP
Early media (see Chapter 11), which refers to media such as audio and video that is exchanged between the callee and the caller before a particular session is accepted by the called party, may need authorization based on policies in certain circumstances. For example, a 3GPP network has an early-media policy that prohibits the exchange of early media between end users; it is interconnected with other SIP networks that have unknown, untrusted, or different policies regarding early media; and it has the capability to gate (enable/disable) the flow of early media to/from user equipment. Because of the peculiar behavior of early media, RFC 5009 (see Section 11.5) defines a private SIP header field, P-Early-Media, that is used for authorization of early media in SIP. The P-Early-Media header field is used within SIP messages in certain SIP networks to authorize the cut-through of backward or forward early media when permitted by the early-media policies of the networks involved. Within an isolated SIP network, it is possible to gate early media associated with all end points within the network to enforce a desired early-media policy among network end points. However, when a SIP network is interconnected with other SIP networks, only the boundary node connected to the external network can determine which early-media policy to apply to a session established between end points on different sides of the boundary. The P-Early-Media header field provides a means for this boundary node to communicate this early-media policy decision to other nodes within the network.
The use of this extension is only applicable inside a Trust Domain as defined in RFC 3325 (see Sections 2.8, 10.4, and 20.3). Nodes in such a Trust Domain are explicitly trusted by its users and end systems to authorize early-media requests only when allowed by early-media policy within the Trust Domain. This document does not offer a general early-media authorization model suitable for interdomain use or use in the Internet at large. Furthermore, since the early-media requests are not cryptographically certified, they are subject to forgery, replay, and falsification in any architecture that does not meet the requirements of the Trust Domain. An early-media request also lacks an indication of who specifically is making or modifying the request, and so it must be assumed that the Trust Domain is making the request. Therefore, the information is only meaningful when securely received from a node known to be a member of the Trust Domain. Although this extension can be used with parallel forking, it does not improve on the known problems with early media and parallel forking, as described in RFC 3960 (see Section 11.4.8), unless one can assume the use of symmetric RTP. Despite these limitations, there are sufficiently useful specialized deployments that meet the assumptions described above, and can accept the limitations that result, to warrant publication of this mechanism. An example deployment would be a closed network that emulates a traditional circuit switched telephone network.
19.5.5 Framework for Session Setup with Media Authorization
During the session setup (e.g., SIP session establishment), policies may be enforced to ensure that the media streams being requested lie within the bounds of the service profile established for the requesting host. RFC 3521, described here, links session setup to resource reservation through the use of a token that provides capabilities similar to those of a ticket in the push model specified in the authentication, authorization, and accounting framework of RFC 2904. The token is generated by a policy server or a session management server (e.g., SIP proxy) and is transparently relayed through the end host to the edge router where it is used as part of the policy-controlled flow admission process.
In some environments, authorization of media streams can exploit the fact that preestablished relationships exist between elements of the network (e.g., session management servers, edge routers, policy servers, and end hosts). Preestablished relationships assume that the different network elements are configured with the identities of the other network elements and, if necessary, are configured with security keys, and other features required to establish a trust relationship. In other environments, however, such preestablished relationships may not exist either due to the complexity of creating these associations a priori, for example, in a network with many elements, or due to the different business entities involved (e.g., service provider and access provider), or due to the dynamic nature of these associations (e.g., in a mobile environment).
RFC 3521 describes the media authorization concepts using SIP for session signaling, RSVP (RFC 2205) for resource reservation, and COPS (RFCs 2749 and 3084) for interaction with the policy servers. The linkage of a token
established during the SIP session setup to a network-layer entity, such as an IP router that reserves resources for the session and maintains the QOS through a policy server, is articulated in this specification. It also facilitates preventing fraud and ensuring accurate billing, but some linkage is required to verify that the resources being used to provide the requested QOS are in line with the media streams requested and authorized for the session. However, no specific standardization, either Informational or Standards Track, for using the token and related objects extending SIP, RSVP, and COPS has been done in RFC 3521 other than some conceptual scenarios for media authorization linking with the session setup.
RFC 3520 defines a session object known as AUTH_SESSION along with many other detailed functional features that represent a session authorization policy element for supporting policy-based per-session authorization and admission control. The host must obtain an AUTH_SESSION element from an authorizing entity via a session signaling protocol such as SIP. The host then inserts the AUTH_SESSION element into the resource reservation message to allow verification of the network resource request. For brevity, this was not described in detail. However, more work is needed to allow use of all those capabilities in this specification integrating with SIP, RSVP, and COPS for interoperability using common standards.
19.6 Integrity and Confidentiality in SIP
19.6.1 S/MIME Certificates
RFC 3261 specifies the use of S/MIME certificates in SIP. The certificates that are used to identify an end user for the purposes of S/MIME differ from those used by servers in one important respect: rather than asserting that the identity of the holder corresponds to a particular host name, these certificates assert that the holder is identified by an end-user address. This address is composed of the concatenation of the userinfo, @, and domainname portions of a SIP or SIPS URI (in other words, an e-mail address of the form bob@biloxi.com), most commonly corresponding to a user’s AOR. These certificates are also associated with keys that are used to sign or encrypt bodies of SIP messages. Bodies are signed with the private key of the sender (who may include their public key with the message as appropriate); however, bodies are encrypted with the public key of the intended recipient. Obviously, senders must have foreknowledge of the public key of recipients in order to encrypt message bodies. Public keys can be stored within a UA on a virtual keyring.
Each UA that supports S/MIME must contain a keyring specifically for end-user certificates. This keyring should map between AORs and corresponding certificates. Over time, users should use the same certificate when they populate the originating URI of signaling (the From header field) with the same AOR. Any mechanisms depending on the existence of end-user certificates are seriously limited in that there is virtually no consolidated authority today that provides certificates for end-user applications.
However, users should acquire certificates from known public certificate authorities. As an alternative, users may create self-signed certificates. The implications of self-signed certificates are explored further in Section 19.4.8. Implementations may also use preconfigured certificates in deployments in which a previous trust relationship exists between all SIP entities. Above and beyond the problem of acquiring an end-user certificate, there are few well-known centralized directories that distribute end-user certificates. However, the holder of a certificate should publish their certificate in any public directories as appropriate. Similarly, UACs should support a mechanism for importing (manually or automatically) certificates discovered in public directories corresponding to the target URIs of SIP requests.
19.6.2 S/MIME Key Exchange
SIP itself can also be used as a means to distribute public keys in the following manner (RFC 3261). Whenever the CMS SignedData message is used in S/MIME for SIP, it must contain the certificate bearing the public key necessary to verify the signature. When a UAC sends a request containing an S/MIME body that initiates a dialog, or sends a non-INVITE request outside the context of a dialog, the UAC should structure the body as an S/MIME multipart/signed CMS SignedData body. If the desired CMS service is EnvelopedData (and the public key of the target user is known), the UAC should send the EnvelopedData message encapsulated within a SignedData message. When a UAS receives a request containing an S/MIME CMS body that includes a certificate, the UAS should first validate the certificate, if possible, with any available root certificates for certificate authorities. The UAS should also determine the subject of the certificate (for S/MIME, the SubjectAltName will contain the appropriate identity) and compare this value to the From header field of the request. If the certificate cannot be verified, because it is self-signed, or signed by no known authority, or if it is verifiable but its subject does not correspond to the From header field of the request, the UAS must notify its user of the status of the certificate (including the subject of the certificate, its signer, and any key fingerprint information) and request explicit permission before proceeding. If the certificate was successfully verified and the subject of the certificate corresponds to the From header field of the SIP request, or if the user (after notification) explicitly authorizes the use of the certificate, the UAS should add this certificate to a local keyring, indexed by the AOR of the holder of the certificate.
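The UAS-side decision described above (add a verified, matching certificate to the keyring; otherwise notify the user and ask for permission) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not an S/MIME implementation: the Certificate fields, the Action enum, evaluate_certificate, and Keyring are hypothetical names, chain validation against root certificates is assumed to have been done elsewhere and is passed in as a boolean, and AORs are compared as exact strings.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class Action(Enum):
    ADD_TO_KEYRING = "add"
    ASK_USER = "ask"


@dataclass(frozen=True)
class Certificate:
    subject_aor: str      # identity from SubjectAltName, e.g. "bob@biloxi.com"
    signer: str
    fingerprint: str
    chain_verified: bool  # assumption: result of validation done elsewhere


def evaluate_certificate(cert: Certificate, from_aor: str) -> Action:
    """Verified and matching the From header field: accept; otherwise ask."""
    if cert.chain_verified and cert.subject_aor == from_aor:
        return Action.ADD_TO_KEYRING
    return Action.ASK_USER


@dataclass
class Keyring:
    entries: dict = field(default_factory=dict)  # AOR -> list of certificates

    def process(self, cert: Certificate, from_aor: str,
                user_approves: Callable[[Certificate], bool]) -> bool:
        """Return True if the certificate was stored in the keyring."""
        action = evaluate_certificate(cert, from_aor)
        if action is Action.ASK_USER and not user_approves(cert):
            return False  # user declined after being notified
        # Indexed by the AOR of the certificate holder.
        self.entries.setdefault(cert.subject_aor, []).append(cert)
        return True


if __name__ == "__main__":
    ring = Keyring()
    bob = Certificate("bob@biloxi.com", "Example CA", "ab:cd", True)
    print(ring.process(bob, "bob@biloxi.com", lambda c: False))
```

The user callback is invoked only for self-signed, unverifiable, or mismatching certificates, which mirrors the "request explicit permission before proceeding" step in the text.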
When a UAS sends a response containing an S/MIME body that answers the first request in a dialog, or a response to a non-INVITE request outside the context of a dialog, the UAS should structure the body as an S/MIME multipart/signed CMS SignedData body. If the desired CMS service is EnvelopedData, the UAS should send the EnvelopedData message encapsulated within a SignedData message. When a UAC receives a response containing an S/MIME CMS body that includes a certificate, the UAC should first validate the certificate, if possible, with any appropriate root certificate. The UAC should also determine the subject of the certificate and compare this value to the To header field of the response; however, the two may very well be different, and this is not necessarily indicative of a security breach. If the certificate cannot be verified because it is self-signed, or signed by no known authority, the UAC must notify its user of the status of the certificate (including the subject of the certificate, its signer, and any key fingerprint information) and request explicit permission before proceeding. If the certificate was successfully verified, and the subject of the certificate corresponds to the To header field in the response, or if the user (after notification) explicitly authorizes the use of the certificate, the UAC should add this certificate to a local keyring, indexed by the AOR of the holder of the certificate.
If the UAC had not transmitted its own certificate to the UAS in any previous transaction, it should use a CMS SignedData body for its next request or response. On future occasions, when the UA receives requests or responses that contain a From header field corresponding to a value in its keyring, the UA should compare the certificate offered in these messages with the existing certificate in its keyring. If there is a discrepancy, the UA must notify its user of a change of the certificate (preferably in terms that indicate that this is a potential security breach) and acquire the user’s permission before continuing to process the signaling. If the user authorizes this certificate, it should be added to the keyring alongside any previous value(s) for this AOR.
Note, however, that this key exchange mechanism does not guarantee the secure exchange of keys when self-signed certificates, or certificates signed by an obscure authority, are used; it is vulnerable to well-known attacks. In the opinion of the authors of the RFC, however, the security it provides is better than nothing; it is, in fact, comparable to the widely used SSH application. These limitations are explored in greater detail in Section 19.4.8. If a UA receives an S/MIME body that has been encrypted with a public key unknown to the recipient, it must reject the request with a 493 Undecipherable response. This response should contain a valid certificate for the respondent (corresponding, if possible, to any AOR given in the To header field of the rejected request) within a MIME body with a certs-only smime-type parameter. A 493 Undecipherable sent without any certificate indicates that the respondent cannot or will not utilize S/MIME encrypted messages, though they may still support S/MIME signatures. Note that a UA that receives a request containing an S/MIME body that is not optional (with a Content-Disposition header handling parameter of required) must reject the request with a 415 Unsupported Media Type response if the MIME type is not understood. A UA that receives such a response when S/MIME is sent should notify its user that the remote device does not support S/MIME, and it may subsequently resend the request without S/MIME, if appropriate; however, this 415 response may constitute a downgrade attack.
If a UA sends an S/MIME body in a request, but receives a response that contains a MIME body that is not secured, the UAC should notify its user that the session could not be secured. However, if a UA that supports S/MIME receives a request with an unsecured body, it should not respond with a secured body, but if it expects S/MIME from the sender (e.g., because the sender’s From header field value corresponds to an identity on its keychain), the UAS should notify its user that the session could not be secured. A number of conditions that arise in the previous text call for the notification of the user when an anomalous certificate-management event occurs. Users might well ask what they should do under these circumstances.
First and foremost, an unexpected change in a certificate, or an absence of security when security is expected, is a cause for caution but not necessarily an indication that an attack is in progress. Users might abort any connection attempt or refuse a connection request they have received; in telephony parlance, they could hang up and call back. Users may wish to find an alternate means to contact the other party and confirm that their key has legitimately changed. Note that users are sometimes compelled to change their certificates, for example, when they suspect that the secrecy of their private key has been compromised. When their private key is no longer private, users must legitimately generate a new key and reestablish trust with any users that held their old key. Finally, if during the course of a dialog a UA receives a certificate in a CMS SignedData message that does not correspond with the certificates previously exchanged during a dialog, the UA must notify its user of the change, preferably in terms that indicate that this is a potential security breach.
19.6.3 Securing MIME Bodies
Two types of secure MIME bodies are specified for SIP in RFC 3261. The use of these bodies should follow the S/MIME specification (RFC 5751) with a few variations, as follows:
Route, Max-Forwards, and Proxy-Authorization. If these header fields are not intact end-to-end, implementations should not consider this a breach of security. Changes to any other header fields defined in this document constitute an integrity violation; users must be notified of a discrepancy.
19.6.4.1.2 Confidentiality
When messages are encrypted, header fields may be included in the encrypted body that are not present in the outer message. Some header fields must always have a plaintext version because they are required header fields in requests and responses; these include To, From, Call-ID, CSeq, and Contact. While it is probably not useful to provide an encrypted alternative for the Call-ID, CSeq, or Contact, providing an alternative to the information in the outer To or From is permitted. Note that the values in an encrypted body are not used for the purposes of identifying transactions or dialogs; they are merely informational. If the From header field in an encrypted body differs from the value in the outer message, the value within the encrypted body should be displayed to the user, but must not be used in the outer header fields of any future messages. Primarily, a UA will want to encrypt header fields that have an end-to-end semantic, including Subject, Reply-To, Organization, Accept, Accept-Encoding, Accept-Language, Alert-Info, Error-Info, Authentication-Info, Expires, In-Reply-To, Require, Supported, Unsupported, Retry-After, User-Agent, Server, and Warning. If any of these header fields are present in an encrypted body, they should be used instead of any outer header fields, whether this entails displaying the header field values to users or setting internal states in the UA. They should not, however, be used in the outer headers of any future messages. If present, the Date header field must always be the same in the inner and outer headers.
Since MIME bodies are attached to the inner message, implementations will usually encrypt MIME-specific header fields, including MIME-Version, Content-Type, Content-Length, Content-Language, Content-Encoding, and Content-Disposition. The outer message will have the proper MIME header fields for S/MIME bodies. These header fields (and any MIME bodies they preface) should be treated as normal MIME header fields and bodies received in a SIP message. It is not particularly useful to encrypt the following header fields: Min-Expires, Timestamp, Authorization, Priority, and WWW-Authenticate. This category also includes those header fields that can be changed by proxy servers (described in the preceding section). UAs should never include these in an inner message if they are not included in the outer message. UAs that receive any of these header fields in an encrypted body should ignore the encrypted values. Note that extensions to SIP may define additional header fields; the authors of these extensions should describe the integrity and confidentiality properties of such header fields. If a SIP UA encounters an unknown header field with an integrity violation, it must ignore the header field.
19.6.4.2 Tunneling Integrity and Authentication
Tunneling SIP messages within S/MIME bodies can provide integrity for SIP header fields if the header fields that the sender wishes to secure are replicated in a message/sip MIME body signed with a CMS detached signature. Provided that the message/sip body contains at least the fundamental dialog identifiers (To, From, Call-ID, CSeq), a signed MIME body can provide limited authentication. At the very least, if the certificate used to sign the body is unknown to the recipient and cannot be verified, the signature can be used to ascertain that a later request in a dialog was transmitted by the same certificate holder that initiated the dialog. If the recipient of the signed MIME body has some stronger incentive to trust the certificate (they were able to validate it, they acquired it from a trusted repository, or they have used it frequently), then the signature can be taken as a stronger assertion of the identity of the subject of the certificate.
To eliminate possible confusion about the addition or subtraction of entire header fields, senders should replicate all header fields from the request within the signed body. Any message bodies that require integrity protection must be attached to the inner message. If a Date header is present in a message with a signed body, the recipient should compare the header field value with its own internal clock, if applicable. If a significant time discrepancy is detected (on the order of an hour or more), the UA should alert the user to the anomaly, and note that it is a potential security breach.
If an integrity violation in a message is detected by its recipient, the message may be rejected with a 403 Forbidden response if it is a request, or any existing dialog may be terminated. UAs should notify users of this circumstance and request explicit guidance on how to proceed. The following is an example of the use of a tunneled message/sip body:
INVITE sip:[email protected] SIP/2.0
Via: SIP/2.0/UDP pc33.atlanta.com;branch=z9hG4bKnashds8
To: Bob <sip:[email protected]>
From: Alice <sip:[email protected]>;tag=1928301774
Call-ID: a84b4c76e66710
CSeq: 314159 INVITE
Max-Forwards: 70
Date: Thu, 21 Feb 2002 13:02:03 GMT
Contact: <sip:[email protected]>
Content-Type: multipart/signed;
Security Mechanisms in SIP ◾ 747
Content-Type: application/sdp
v=0
o=alice 53655765 2353687637 IN IP4 pc33.atlanta.com
s=Session SDP
t=0 0
c=IN IP4 pc33.atlanta.com
m=audio 3456 RTP/AVP 0 1 3 99
a=rtpmap:0 PCMU/8000
--boundary42
(e.g., the Request-URI is a SIP URI that is associated with a URI list at the server). External URI lists are typically set up using out-of-band mechanisms (e.g., XML Configuration Access Protocol [XCAP], RFC 4825). An example of a URI-list service for SUBSCRIBE requests that uses stored URI lists is described in RFC 4662. The remainder of this document provides requirements and a framework for URI-list services using request-contained URI lists, external URI lists, or both.
Thus, URI-list services must not perform any request explosion for an unauthorized user. URI-list services must authenticate users and check whether they are authorized to request the service before performing any request fan-out. Note that the risk of this attack also exists when a client uses stored URI lists. Application servers must use authentication and authorization mechanisms with equivalent security properties when dealing with stored and request-contained URI lists.
Even though the previous rule keeps unauthorized users from using URI-list services, authorized users may still launch attacks using these services. To prevent these attacks, we introduce the concept of opt-in lists. That is, URI-list services should not allow a client to place a user (identified by his or her URI) in a URI list unless the user has previously agreed to be placed in such a URI list. Therefore, URI-list services must not send a request to a destination that has not agreed to receive requests from the URI-list service beforehand. Users can agree to receive requests from a URI-list service in several ways, such as filling in a web page, sending an e-mail, signing a contract, or using the consent-based communications in SIP specified in RFC 5360 (see Section 19.8), whose requirements are discussed in RFC 4453 (see Section 19.8.1). Additionally, users must be able to further describe the requests they are willing to receive. For example, a user may only want to receive requests from a particular URI-list service on behalf of a particular user. Effectively, these rules make URI lists that are used by URI-list services into opt-in lists. When a URI-list service receives a request with a URI list from a client, the URI-list service checks whether all the destinations have agreed beforehand to receive requests from the service on behalf of this client.
If the URI-list service has permission to send requests to all of the targets in the request, it does so. If not, it does not send any request at all. RFC 5360 (see Section 19.8) specifies a means for the URI-list service to inform the client that some permissions were missing and how to request them. Note that the mechanism used to obtain permissions should not create opportunities to launch DoS amplification attacks. These attacks would be possible if, for instance, the URI-list service automatically contacted the full set of targets for which it did not have permissions in order to request permissions. The URI-list service would be receiving one SIP request and sending out a number of authorization request messages. The consent-based communications (RFC 5360) avoids this type of attack by having the client generate roughly the same amount of traffic toward the URI-list service as the service generates toward the destinations. To have an interoperable way to meet the requirements related to opt-in lists described in this section, URI-list services MUST implement and should use the consent-based communications (RFC 5360).
19.7.4.3 General Issues
URI-list services may have policies that limit the number of URIs in the lists they accept, as a very long list could be used in a DoS attack to place a large burden on the URI-list service to send a large number of SIP requests. A URI-list service generates a set of requests from a URI list. RFC 3261 (see Section 4.2.1.4) provides recommendations that need to be taken into consideration when forming a request from a URI. Naturally, those recommendations apply to all SIP URI-list services. The general requirement states that URI-list services need to authenticate their clients, and the previous rules are applicable to URI-list services in general. In addition, specifications dealing with individual methods must describe the security issues that relate to each particular method.
19.8 Consent-Based Communications for Enhancing Security in SIP
19.8.1 Objective
SIP supports communications for several services, including real-time audio, video, text, instant messaging, and presence. This communication is established by the transmission of various SIP requests (such as INVITE and MESSAGE) from an initiator to the recipient with whom communication is desired. Although a recipient of such a SIP request can reject the request, and therefore decline the session, a network of SIP proxy servers will deliver a SIP request to its recipients without their explicit consent. Receipt of these requests without explicit consent can cause a number of problems. These include amplification and DoS attacks. These problems are described in more detail in a companion requirements document (RFC 4453). The consent-based communications specified in RFC 5360 and described here solve these security problems in SIP by meeting the following requirements specified in RFC 4453:
◾◾ A relay is defined as any SIP server, be it a proxy, B2BUA, or some hybrid, that receives a request and translates the request URI into one or more next-hop URIs to which it then delivers a request.
◾◾ The solution keeps relays from delivering a SIP request to a recipient unless the recipient has explicitly granted permission to the relay using appropriately authenticated messages.
◾◾ The solution prevents relays from generating more than one outbound request in response to an inbound request, unless permission to do so has been granted by the resource to whom the outbound request was to be targeted. This mechanism avoids the consent mechanism itself becoming the focus of DoS attacks.
◾◾ The permissions are capable of specifying that messages from a specific user, identified by a SIP URI that is an AOR, are permitted.
◾◾ Each recipient AOR is able to specify permissions separately for each SIP service that forwards messages to the recipient. For example, Alice may authorize forwarding to her from domain A, but not from domain B.
◾◾ It is possible for a user to revoke permissions at any time.
◾◾ It is not required for a user or UA to store information in order to be able to revoke permissions that were previously granted for a relay resource.
◾◾ The solution works in an interdomain context, without requiring preestablished relationships between domains.
◾◾ The solution works for all current and future SIP methods.
◾◾ The solution is applicable to forking proxies.
◾◾ The solution is applicable to URI-list services, such as resource list servers (RFC 5365, see Sections 6.3.2 and 6.3.3), MESSAGE URI-list services (RFC 5365), and conference servers performing dial-out functions (RFC 5366, see Section 16.7.3).
◾◾ In SIP, URI lists can be stored on the URI-list server or provided in a SIP request. The consent framework works in both cases.
◾◾ The solution allows anonymous communications, as long as the recipient is willing to accept anonymous communications.
◾◾ If the recipient of a request wishes to be anonymous with respect to the original sender, it is possible for the recipient to grant permission for the sender without the original sender learning the recipient’s identity.
◾◾ The solution prevents attacks that seek to undermine the underlying goal of consent. That is, it is not possible to fool the system into delivering a request for which permission was not, in fact, granted.
◾◾ The solution does not require the recipient of the communications to be connected to the network at the time communications are attempted.
◾◾ The solution does not require the sender of a SIP request to be connected at the time that a recipient provides permission.
◾◾ The solution scales to Internet-wide deployment.
Request-URI into one or more next-hop URIs (i.e., recipient URIs), and delivers the request to those URIs.
◾◾ Target URI: The Request-URI of an incoming request that arrives at a relay that will perform a translation operation.
◾◾ Translation logic: The logic that defines a translation operation at a relay. This logic includes the translation’s target and recipient URIs.
◾◾ Translation operation: Operation by which a relay translates the Request-URI of an incoming request (i.e., the target URI) into one or more URIs (i.e., recipient URIs) that are used as the Request-URIs of one or more outgoing requests.
19.8.3 Relays and Translations
Relays play a key role in this framework. A relay is defined as any SIP server, be it a proxy, B2BUA, or some hybrid, that receives a request, translates its Request-URI into one or more next-hop URIs, and delivers the request to those URIs. The Request-URI of the incoming request is referred to as the target URI, and the destination URIs of the outgoing requests are referred to as recipient URIs, as shown in Figure 19.22.
Thus, an essential aspect of a relay is that of translation. When a relay receives a request, it translates the Request-URI (target URI) into one or more additional URIs (recipient URIs). Through this translation operation, the relay can create outgoing requests to one or more additional recipient URIs, thus creating the consent problem. The consent problem is created by two types of translations: translations based on local data and translations that involve amplifications. Translation operations based on local policy or local data (such as registrations) are the vehicle by which a request is delivered directly to an end point, when it would not otherwise be possible to do so.
In other words, if a spammer has the address of a user, sip:[email protected], it cannot deliver a MESSAGE request to the user agent (UA) of that user without having access to the registration data that maps sip:[email protected] to the UA on which that user is present. Thus, it is the usage of this registration data, and more generally, the translation logic, that is expected to be authorized in order to prevent undesired communications. Of course, if the spammer knows the
[…]
Figure 19.22 Relay performing a translation. Diagram labels: UA, Relay, Translation logic, Proxy, B2BUA. (Copyright IETF. Reproduced with permission.)
[Figure: client, relay (translation logic, permissions), permission requests, and store-and-forward servers.]
simply ignored by the relay when performing a translation. In principle, permissions are valid as long as the context where they were granted is valid, or until they are revoked. For example, the permissions obtained by a URI-list SIP service that distributes MESSAGE requests to a set of recipients will be valid as long as the URI-list SIP service exists or until the permissions are revoked. Additionally, if a recipient is removed from a relay’s translation logic, the relay should delete the permissions related to that recipient. For example, if the registration of a contact URI expires or is otherwise terminated, the registrar deletes the permissions related to that contact address.
It is also recommended that relays request recipients to refresh their permissions periodically. If a recipient fails to refresh its permissions for a given period of time, the relay should delete the permissions related to that recipient. This framework does not provide any guidance for the values of the refresh intervals because different applications can have different requirements to set those values. For example, a relay dealing with recipients that do not implement this framework may choose to use longer intervals between refreshes. The refresh process in such recipients has to be performed manually by their users (since the recipients do not implement this framework), and having too short refresh intervals may become too heavy a burden for those users.
19.8.4.2 Consenting Manipulations on a Relay’s Translation Logic
This framework aims to ensure that any particular relay only performs translations toward destinations that have given the relay permission to perform such a translation. Consequently, when the translation logic of a relay is manipulated (e.g., a new recipient URI is added), the relay obtains permission from the new recipient in order to install the new translation logic. Relays ask recipients for permission using MESSAGE requests. For example, the relay hosting the URI-list service at sip:[email protected] performs a translation from that target URI to a set of recipient URIs. When a client (e.g., the administrator of that URI-list service) adds bob@example.org as a new recipient URI, the relay sends a MESSAGE request to sip:[email protected] asking whether or not it is OK to perform the translation from sip:friends@example.com to sip:[email protected].
The MESSAGE request carries in its message body a permission document that describes the translation for which permissions are being requested and a human-readable part that also describes the translation. If the answer is positive, the new translation logic is installed at the relay. That is, the new recipient URI is added. The human-readable part is included so that UAs that do not understand permission documents can still process the request and display it in a sensible way to the user. The mechanism to be used to manipulate the translation logic of a particular relay depends on the relay. Two existing mechanisms to manipulate translation logic are XCAP (RFC 4825) and REGISTER transactions. Later, we describe a URI-list service whose translation logic is manipulated with XCAP as an example of a translation, in order to specify this framework. We also explain how to apply this consent-based framework to registrations, which are a different type of translation. In any case, relays implementing this framework should have a means to indicate that a particular recipient URI is in one of the states specified in RFC 5362 (i.e., pending, waiting, error, denied, or granted).
19.8.4.3 Store-and-Forward Servers
When a MESSAGE request with a permission document arrives at the recipient URI to which it was sent by the relay, the receiving user can grant or deny the permission needed to perform the translation. However, the receiving user may not be available when the MESSAGE request arrives, or it may have expressed preferences to block all incoming requests for a certain time period. In such cases, a store-and-forward server can act as a substitute for the user and buffer the incoming MESSAGE requests, which are subsequently delivered to the user when he or she is available again.
There are several mechanisms to implement store-and-forward message services (e.g., with an instant message to e-mail gateway). Any of these mechanisms can be used between a UA and its store-and-forward server as long as they agree on which mechanism to use. Therefore, this framework does not make any provision for the interface between UAs and their store-and-forward servers. Note that the same store-and-forward message service can handle all incoming MESSAGE requests for a user while they are offline, not only those MESSAGE requests with a permission document in their bodies.
Even though store-and-forward servers perform a useful function and they are expected to be deployed in most domains, some domains will not deploy them from the outset. However, UAs and relays in domains without store-and-forward servers can still use this consent framework. When a relay requests permissions from an offline UA that does not have an associated store-and-forward server, the relay will obtain an error response indicating that its MESSAGE request could not be delivered. The client that attempted to add the offline user to the relay’s translation logic will be notified about the error, for example, using the Pending Additions event package (RFC 5362). This client may attempt to add the same user at a later point, hopefully when the user is online. Clients can discover whether or not a user is online by using a presence service, for instance.
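The consent-gated manipulation of translation logic described in Sections 19.8.4.1 and 19.8.4.2 can be illustrated with a minimal sketch. This is not the RFC 5360 implementation: the `Relay` class, its method names, and the in-memory `outbox` standing in for real MESSAGE requests are all hypothetical, and only three of the RFC 5362 states are modeled.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    # Subset of the recipient states named in the text (RFC 5362).
    PENDING = "pending"
    GRANTED = "granted"
    DENIED = "denied"


@dataclass
class Relay:
    """Toy relay: one target URI translated to recipients that consented."""
    target_uri: str
    recipients: set = field(default_factory=set)   # installed translation logic
    status: dict = field(default_factory=dict)     # recipient URI -> Status
    outbox: list = field(default_factory=list)     # stand-in for sent requests

    def add_recipient(self, uri):
        # Adding a recipient does not install it; the relay first asks for
        # permission with a MESSAGE describing the translation.
        self.status[uri] = Status.PENDING
        self.outbox.append(("MESSAGE", uri, f"{self.target_uri} -> {uri}"))

    def on_answer(self, uri, granted):
        if uri not in self.status:
            return  # no permission request is known for this recipient
        if granted:
            self.status[uri] = Status.GRANTED
            self.recipients.add(uri)
        else:
            # Also covers revocation of a previously granted permission.
            self.status[uri] = Status.DENIED
            self.recipients.discard(uri)

    def remove_recipient(self, uri):
        # Removing a recipient also deletes the permissions related to it.
        self.recipients.discard(uri)
        self.status.pop(uri, None)

    def translate(self):
        # Only recipients that granted permission receive outgoing requests.
        return sorted(self.recipients)


relay = Relay("sip:friends@example.com")
relay.add_recipient("sip:bob@example.org")
print(relay.translate())  # nothing is installed while consent is pending
relay.on_answer("sip:bob@example.org", granted=True)
print(relay.translate())
```

Keeping the status map separate from the installed recipients mirrors the distinction the text draws between requesting a manipulation and the translation logic actually being installed.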
754 ◾ Handbook on Session Initiation Protocol
19.8.4.4 Recipient Grant Permissions
Permission documents generated by a relay include URIs that can be used by the recipient of the document to grant or deny the relay the permission described in the document. Relays always include SIP URIs and can include HTTP (RFCs 7230–7235) URIs for this purpose. Consequently, recipients provide relays with permissions using SIP PUBLISH requests or HTTP GET requests.
19.8.4.5 Entities Implementing This Framework
The goal of this framework is to keep relays from executing translations toward unwilling recipients. Therefore, all relays must implement this framework in order to avoid being used to perform attacks (e.g., amplification attacks). This framework has been designed with backwards compatibility in mind so that legacy UAs (i.e., UAs that do not implement this framework) can act both as clients and recipients with an acceptable level of functionality. However, it is recommended that UAs implement this framework, which includes supporting the Pending Additions event package specified in RFC 5362, the format for permission documents specified in RFC 5361, and the header fields and response code specified in this document, in order to achieve full functionality. The only requirement that this framework places on store-and-forward servers is that they need to be able to deliver encrypted and integrity-protected messages to their UAs, as discussed later. However, this is not a requirement specific to this framework but a general requirement for store-and-forward servers.
19.8.5 Framework Operations
This section specifies this consent framework using an example of the prototypical call flow. The elements described earlier (i.e., relays, translations, and store-and-forward servers) play an essential role in this call flow. Figure 19.24 shows the complete process of adding a recipient URI (sip:B@example.com) to the translation logic of a relay. User A attempts to add sip:[email protected] as a new recipient URI to the translation logic of the relay (F1). User A uses XCAP (RFC 4825) and the XML format for representing resource lists (RFC 4826) to perform this addition. Since the relay does not have permission from sip:[email protected] to perform translations toward that URI, the relay places sip:[email protected] in the pending state, as specified in RFC 5362.
19.8.5.1 Amplification Avoidance
Once sip:[email protected] is in the pending state, the relay needs to ask user B for permission by sending a MESSAGE request to sip:[email protected]. However, the relay needs to ensure that it is not used as an amplifier to launch amplification attacks. In such an attack, the attacker would add a large number of recipient URIs to the translation logic of a relay. The relay would then send a MESSAGE request to each of those recipient URIs. The bandwidth generated by the relay would be much higher than the bandwidth used by the attacker to add those recipient URIs to the translation logic of the relay. This framework uses a credit-based authorization mechanism to avoid the attack just described. It requires users adding new recipient URIs to a translation to generate an amount of bandwidth that is comparable to the bandwidth the relay will generate when sending MESSAGE requests toward those recipient URIs. When XCAP is used, this requirement is met by not allowing clients to add more than one URI per HTTP transaction. When a REGISTER transaction is used, this requirement is met by not allowing clients to register more than one contact per REGISTER transaction.
19.8.5.1.1 Relay’s Behavior
Relays implementing this framework must not allow clients to add more than one recipient URI per transaction. If a client using XCAP attempts to add more than one recipient URI in a single HTTP transaction, the XCAP server should return an HTTP 409 Conflict response. The XCAP server should describe the reason for the refusal in an XML body using the <constraint-failure> element, as described in RFC 4825. If a client attempts to register more than one contact in a single REGISTER transaction, the registrar should return a SIP 403 Forbidden response and explain the reason for the refusal in its reason phrase (e.g., maximum one contact per registration).
19.8.5.2 Subscription to the Permission Status
Clients need a way to be informed about the status of the operations they requested. Otherwise, users can be waiting for an operation to succeed when it has actually already failed. In particular, if the target of the request for consent was not reachable and did not have an associated store-and-forward server, the client needs to know to retry the request later. The Pending Additions SIP event package (RFC 5362) is a way to provide clients with that information. Clients can use the Pending Additions SIP event package to be informed about the status of the operations they requested. That is, the client will be informed when an operation (e.g., the addition of a recipient URI to a relay’s translation logic) is authorized (and thus executed) or rejected. Clients use the target URI of the SIP translation being manipulated to subscribe to the Pending Additions event package. In our example, after receiving the response from the relay (F2), user A subscribes to the Pending Additions event package at the relay (F5). This subscription keeps user A informed about the status of the permissions (e.g., granted or denied) the relay will obtain.
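The one-recipient-per-transaction rule of Section 19.8.5.1.1 can be sketched as two small checks. The function names and the tuple return shape are assumptions made for illustration; only the 409 and 403 status codes and the example reason phrase come from the text, and a real XCAP server would carry its explanation in a `<constraint-failure>` XML body rather than a string.

```python
def check_xcap_addition(new_recipient_uris):
    """Return (status, reason) if the XCAP request must be refused, else None."""
    if len(new_recipient_uris) > 1:
        # The refusal reason would travel in a <constraint-failure> element.
        return (409, "Conflict: only one recipient URI per HTTP transaction")
    return None


def check_register(contacts):
    """Return (status, reason) if the REGISTER must be refused, else None."""
    if len(contacts) > 1:
        return (403, "maximum one contact per registration")
    return None


# One URI per transaction keeps the client's traffic comparable to the
# MESSAGE traffic the relay generates on its behalf.
print(check_xcap_addition(["sip:B@example.com"]))
print(check_register(["sip:a@ws123.example.com", "sip:b@ws456.example.com"]))
```

Returning `None` for an acceptable request leaves the success response to the caller, since the sketch is only concerned with the refusal path.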
Figure 19.24 Prototypical call flow among [email protected], the relay, B’s store-and-forward server, and [email protected]. Message labels: F1 Add recipient sip:[email protected]; F6 200 OK; F7 NOTIFY; F8 200 OK; user B goes online; F9 Request for stored messages; F12 200 OK; F13 NOTIFY; F14 200 OK. (Copyright IETF. Reproduced with permission.)
19.8.5.2.1 Relay’s Behavior
Relays should support the Pending Additions SIP event package specified in RFC 5362.
19.8.5.3 Request for Permission
A relay requests permissions from potential recipients to add them to its translation logic using MESSAGE requests. In our example, on receiving the request to add user B to the translation logic of the relay (F1), the relay generates a MESSAGE request (F3) toward sip:[email protected]. This MESSAGE request carries a permission document, which describes the translation that needs to be authorized, and carries a set of URIs to be used by the recipient to grant or to deny the relay permission to perform that translation. Since user B is offline, the MESSAGE request will be buffered by user B’s store-and-forward server. User B will later go online and authorize the translation by using one of those URIs, as described later. The MESSAGE request also carries a body part that contains the same information as the permission document but in a human-readable format. When user B uses one of the URIs in the permission document to grant or deny permissions, the relay needs to make sure that it was actually user B using that URI, and not an attacker. The relay can use any of the methods described in Section 19.8.5.6 to authenticate the permission document.
19.8.5.3.1 Relay’s Behavior
Relays that implement this framework must obtain permissions from potential recipients before adding them to their translation logic. Relays request permissions from potential recipients using MESSAGE requests. Section 19.8.5.6
for the sender to send requests to the target URI and for a relay receiving those requests to forward them to this URI. This is also called the recipient URI.
URIs to grant permission: URIs that recipients can use to grant the relay permission to perform the translation described in the document. Relays MUST support the use of SIP and SIPS URIs in permission documents and may support the use of HTTP and HTTPS URIs.
URIs to deny permission: URIs that recipients can use to deny the relay permission to perform the translation described in the document. Relays must support the use of SIP and SIPS URIs in permission documents and may support the use of HTTP and HTTPS URIs.
Permission documents can contain wildcards. For example, a permission document can request permission for any relay to forward requests coming from a particular sender to a particular recipient. Such a permission document would apply to any target URI. That is, the field containing the identity of the original recipient would match any URI. However, the recipient URI must not be wildcarded. Entities implementing this framework must support the format for permission documents defined in RFC 5361 and may support other formats. In our example, the permission document in the MESSAGE request (F3) sent by the relay contains the following values:
◾◾ Identity of the sender: any sender
◾◾ Identity of the original recipient: sip:friends@example.com
◾◾ Identity of the final recipient: sip:[email protected]
◾◾ URI to grant permission: sips:grant-1awdch5Fasddfce[email protected]
◾◾ URI to grant permission: https://example.com/grant-1awdch5Fasddfce34
◾◾ URI to deny permission: sips:deny-23rCsdfgvdT5sdf[email protected]
◾◾ URI to deny permission: https://example.com/deny-23rCsdfgvdT5sdfgye
It is expected that the Sender field often contains a wildcard. However, scenarios involving request-contained URI lists, such as the one described in Section 19.8.5.10, can require permission documents that apply to a specific sender. In cases where the identity of the sender matters, relays must authenticate senders.
19.8.5.5 Permission Requested Notification
On receiving the MESSAGE request (F3), user B’s store-and-forward server stores it because user B is offline at that point. When user B goes online, user B fetches all the requests its store-and-forward server has stored (F9).
19.8.5.6 Permission Grant
A recipient gives a relay permission to execute the translation described in a permission document by sending a SIP PUBLISH or an HTTP GET request to one of the URIs to grant permissions contained in the document. Similarly, a recipient denies a relay permission to execute the translation described in a permission document by sending a SIP PUBLISH or an HTTP GET request to one of the URIs to deny permissions contained in the document. Requests to grant or deny permissions contain an empty body. In our example, user B obtains the permission document (F10) that was received earlier by its store-and-forward server in the MESSAGE request (F3). User B authorizes the translation described in the permission document received by sending a PUBLISH request (F11) to the SIP URI to grant permissions contained in the permission document.
19.8.5.7 Relay’s Behavior after Recipient Granting Permission
Relays must ensure that the SIP PUBLISH or the HTTP GET request received was generated by the recipient of the translation and not by an attacker. Relays can use four methods to authenticate those requests: SIP Identity, P-Asserted-Identity (RFC 3325, see Sections 2.8, 10.4, and 20.3), a return-routability test, or SIP Digest. While return-routability tests can be used to authenticate both SIP PUBLISH and HTTP GET requests, SIP Identity, P-Asserted-Identity, and SIP Digest can only be used to authenticate SIP PUBLISH requests. SIP Digest can only be used to authenticate recipients that share a secret with the relay (e.g., recipients that are in the same domain as the relay).
19.8.5.7.1 SIP Identity
The SIP Identity (RFC 4474, see Sections 2.8 and 19.4.8) mechanism can be used to authenticate the sender of a PUBLISH request. The relay MUST check that the originator of the PUBLISH request is the owner of the recipient URI in the permission document. Otherwise, the PUBLISH request should be responded to with a 401 Unauthorized response and must not be processed further.
19.8.5.7.2 P-Asserted-Identity
The P-Asserted-Identity mechanism can also be used to authenticate the sender of a PUBLISH request. However, as discussed in RFC 3325 (see Sections 2.8, 10.4, and 20.3), this mechanism is intended to be used only within networks of trusted SIP servers. That is, the use of this mechanism is only applicable inside an administrative
domain with previously agreed-upon policies. The relay must check that the originator of the PUBLISH request is the owner of the recipient URI in the permission document. Otherwise, the PUBLISH request should be responded to with a 401 Unauthorized response and must not be processed further.
19.8.5.7.4 SIP Digest
The SIP Digest mechanism can be used to authenticate the sender of a PUBLISH request as long as that sender shares a secret with the relay. The relay must check that the originator of the PUBLISH request is the owner of the recipient URI in the permission document. Otherwise, the PUBLISH request should be responded to with a 401 Unauthorized response and must not be processed further.
19.8.5.8 Permission Granted Notification
On receiving the PUBLISH request (F11), the relay sends a NOTIFY request (F13) to inform user A that the permission for the translation has been received and that the translation logic at the relay has been updated. That is, sip:B@example.com has been added as a recipient URI.
Figure 19.25 Permission revocation. Message labels: F7 200 OK; F8 PUBLISH uri-deny; F9 200 OK. (Copyright IETF. Reproduced with permission.)
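The ownership check that the SIP Identity, P-Asserted-Identity, and SIP Digest subsections all require can be sketched as a single comparison. The function name, the dictionary key `recipient_uri`, and the exact-string URI comparison are simplifying assumptions; the sketch also assumes that one of the authentication mechanisms has already produced `authenticated_uri` (or `None` when authentication failed).

```python
def handle_consent_publish(authenticated_uri, permission_doc):
    """Return the SIP status code for a PUBLISH sent to a grant or deny URI.

    authenticated_uri: identity established for the PUBLISH sender, or None.
    permission_doc: hypothetical dict holding the document's recipient URI.
    """
    owner = permission_doc["recipient_uri"]
    if authenticated_uri is None or authenticated_uri != owner:
        # Not the owner of the recipient URI: refuse and process no further.
        return 401
    return 200


doc = {"recipient_uri": "sip:B@example.com"}
print(handle_consent_publish("sip:B@example.com", doc))
print(handle_consent_publish("sip:mallory@example.net", doc))
```

Real SIP URI comparison follows more detailed equivalence rules than string equality, so a deployment would substitute a proper URI comparison here.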
translation (i.e., not only to URI-list services). Registrations are a different type of translation that deserves discussion. Registrations are a special type of translation. The user registering has a trust relationship with the registrar in its home domain. This is not the case when a user gives any type of permissions to a relay in a different domain. Traditionally, REGISTER transactions have performed two operations at the same time: setting up a translation and authorizing the use of that translation. For example, a user registering its current contact URI is giving permission to the registrar to forward traffic sent to the user’s AOR to the registered contact URI. This works fine when the entity registering is the same as the one that will be receiving traffic at a later point (e.g., the entity receives traffic over the same connection used for the registration as described in RFC 5626; see Section 13.2). However, this schema creates some potential attacks that relate to third-party registrations.
An attacker binds, via a registration, his or her AOR with the contact URI of a victim. Now the victim will receive unsolicited traffic that was originally addressed to the attacker. The process of authorizing a registration is shown in Figure 19.27. User A performs a third-party registration (F1) and receives a 202 Accepted response (F2). Since the relay does not have permission from sip:a@ws123.example.com to perform translations toward that recipient URI, the relay places sip:a@ws123.example.com in the pending state. Once sip:a@ws123.example.com is in the Permission Pending state, the registrar needs to ask sip:a@ws123.example.com for permission by sending a MESSAGE request (F3). After receiving the response from the relay (F2), user A subscribes to the Pending Additions event package at the registrar (F5). This subscription keeps the user informed about the status of the permissions (e.g., granted or denied) the registrar will obtain. The rest of the process is similar to the one described later.
Permission documents generated by registrars are typically very general. For example, in one such document, a registrar can ask a recipient for permission to forward any request from any sender to the recipient’s URI. This is the type of granularity that this framework intends to provide for registrations. Users who want to define how incoming requests are treated with a finer granularity (e.g., requests from user A are only accepted between 9:00 and 11:00) will have to use other mechanisms such as CPL (RFC 3880).
Figure 19.27 (message sequence; diagram not reproduced): F1 REGISTER (Contact:
sip:a@ws123.example.com); F2 202 Accepted; F3 MESSAGE sip:a@ws123.example.com (permission
document); F4 200 OK; F5 SUBSCRIBE (Event: pending-additions); F6 200 OK; F7 NOTIFY;
F8 200 OK; F9 PUBLISH uri-up; F10 200 OK; F11 NOTIFY; F12 200 OK.
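
The separation of "setting up a translation" from "authorizing its use" in the registration flow can be sketched in code. The following is a minimal illustration in Python, not an implementation of the framework: the class and method names (ConsentRegistrar, register, record_decision, targets) and the AOR sip:a@example.com are hypothetical, the 200 response for the same-connection case is an assumption, and the MESSAGE, SUBSCRIBE/NOTIFY, and PUBLISH exchanges are reduced to plain method calls.

```python
from dataclasses import dataclass, field
from enum import Enum


class Permission(Enum):
    PENDING = "pending"
    GRANTED = "granted"
    DENIED = "denied"


@dataclass
class ConsentRegistrar:
    """Toy registrar: a third-party binding is unusable until the recipient consents."""

    # Maps (aor, contact) to the permission state of that translation.
    bindings: dict = field(default_factory=dict)

    def register(self, aor: str, contact: str, same_connection: bool = False) -> int:
        """Record a binding and return the status code this sketch would answer with."""
        if same_connection:
            # Assumption: the contact is reached over the registering connection
            # (the RFC 5626 case), so no separate permission is requested.
            self.bindings[(aor, contact)] = Permission.GRANTED
            return 200
        # Third-party registration: accept it, but keep the translation pending
        # until the recipient answers the permission document.
        self.bindings[(aor, contact)] = Permission.PENDING
        return 202

    def record_decision(self, aor: str, contact: str, granted: bool) -> None:
        """Apply the recipient's answer to the permission request."""
        key = (aor, contact)
        if key not in self.bindings:
            raise KeyError(f"no binding for {key}")
        self.bindings[key] = Permission.GRANTED if granted else Permission.DENIED

    def targets(self, aor: str) -> list:
        """Contacts the registrar may forward to: only those with granted permission."""
        return [c for (a, c), state in self.bindings.items()
                if a == aor and state is Permission.GRANTED]


registrar = ConsentRegistrar()
status = registrar.register("sip:a@example.com", "sip:a@ws123.example.com")
print(status, registrar.targets("sip:a@example.com"))
registrar.record_decision("sip:a@example.com", "sip:a@ws123.example.com", granted=True)
print(registrar.targets("sip:a@example.com"))
```

The point of the design is that a pending binding is stored but never returned by targets, so traffic addressed to the AOR cannot reach a contact whose owner has not agreed to receive it.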
Note that, as indicated previously, UAs using the same connection to register and to receive
traffic from the registrar, as described in RFC 5626 (see Section 13.2), do not need to use
the mechanism described in this section. A UA being registered by a third party can be unable
to use the SIP Identity, P-Asserted-Identity, or SIP Digest mechanisms to prove to the
registrar that the UA is the owner of the URI being registered (e.g., sip:[email protected]),
which is the recipient URI of the translation. In this case, return routability must be used.

19.8.5.12 Relays Generating Traffic toward Recipients
Relays generating traffic toward recipients need to make sure that those recipients can revoke
the permissions they gave at any time. The Trigger-Consent header field helps achieve this.

19.8.5.12.1 Relay’s Behavior
A relay executing a translation that involves sending a request to a URI from which
permissions were obtained previously should add a Trigger-Consent header field to the request.
The URI in the Trigger-Consent header field MUST have a target-uri header field parameter
identifying the target URI of the translation. On receiving a PUBLISH request addressed to the
URI that a relay previously placed in a Trigger-Consent header field, the relay should send a
MESSAGE request to the corresponding recipient URI with a permission document. Therefore, the
relay needs to be able to correlate the URI it places in the Trigger-Consent header field with
the recipient URI of the translation.

19.8.5.12.2 Definition of the Trigger-Consent Header Field
The following is the ABNF (RFC 5234) syntax of the Trigger-Consent header field, given here
for convenience; the complete BNF for SIP messages is provided in Section 2.4.1:

Trigger-Consent   = "Trigger-Consent" HCOLON
                    trigger-cons-spec
                    *(COMMA trigger-cons-spec)
trigger-cons-spec = (SIP-URI/SIPS-URI)
                    *(SEMI trigger-param)
trigger-param     = target-uri/generic-param
target-uri        = "target-uri" EQUAL
                    LDQUOT *(qdtext/quoted-pair) RDQUOT

The target-uri header field parameter MUST contain a URI. The following is an example of a
Trigger-Consent header field:

Trigger-Consent: sip:[email protected]
    ;target-uri="sip:friends@relay.example.com"

19.8.6 Security Considerations
Security has been discussed throughout the chapter. However, there are some issues that
deserve special attention. Relays generally implement several security mechanisms that relate
to client authentication and authorization. Clients are typically authenticated before they
can manipulate a relay’s translation logic. Additionally, clients are typically also
authenticated and sometimes need to perform SPAM prevention tasks (RFC 5039) when they send
traffic to a relay. It is important that relays implement these types of security mechanisms.
However, they fall outside the scope of this framework. Even with these mechanisms in place,
there is still a need for relays to implement this framework because the use of these
mechanisms does not prevent authorized clients from adding recipients to a translation without
their consent. Consequently, relays performing translations must implement this framework.
Note that, as indicated previously, UAs using the same connection to register and to receive
traffic from the registrar, as described in RFC 5626 (see Section 13.2), do not need to use
this framework.
Therefore, a registrar that did not accept third-party registrations would not need to
implement this framework. As pointed out earlier, when return-routability tests are used to
authenticate recipients granting or denying permissions, the URIs used to grant or deny
permissions need to be protected from attackers. SIPS URIs provide a good tool to meet this
requirement, as described in RFC 5361. When store-and-forward servers are used, the interface
between a UA and its store-and-forward server is frequently not based on SIP. In such a case,
SIPS cannot be used to secure those URIs. Implementations of store-and-forward servers must
provide a mechanism for delivering encrypted and integrity-protected messages to their UAs.
The information provided by the Pending Additions event package can be sensitive. For this
reason, relays need to use strong means for authentication and information confidentiality.
SIPS URIs are a good mechanism to meet this requirement. Permission documents can reveal
sensitive information. Attackers may attempt to modify them in order to have clients grant or
deny permissions different from the ones they think they are granting or denying. For this
reason, it is recommended that relays use strong means for information integrity protection
and confidentiality when sending permission documents to clients. The mechanism used for
conveying information to clients should ensure the integrity and confidentiality of the
information. To achieve these, an end-to-end SIP encryption mechanism, such as S/MIME (see
Section 19.6), should be used. If strong end-to-end security means (such as above) are not
available, it is recommended that hop-by-hop security based on TLS and SIPS URIs is used.

account on a single server. Consider two proxy/registrar services, P1 and P2, and four AORs,
a@P1, b@P1, a@P2, and b@P2. Using normal REGISTER requests, establish bindings to these AORs
as follows (nonessential details elided):

Figure 19.28 Attack request propagation. (Copyright IETF. Reproduced with permission.)
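
To give a feel for how quickly a forking loop grows when only Max-Forwards bounds it, the following Python sketch counts requests in a deliberately simplified model. Everything in it is an assumption made for illustration: the bindings are elided above, so the sketch simply supposes that every request with Max-Forwards greater than zero is forked into fork_width copies (two by default), each with Max-Forwards decremented by one, and that no loop detection is performed. The function name is hypothetical, and the result indicates an order of growth only; it does not reproduce any measurement.

```python
def requests_without_loop_detection(max_forwards: int, fork_width: int = 2) -> int:
    """Count requests in a toy forking loop limited only by Max-Forwards.

    Hypothetical model: a request arriving with Max-Forwards > 0 is forked into
    `fork_width` copies, each carrying Max-Forwards - 1. A request arriving with
    Max-Forwards == 0 is counted but not forwarded any further.
    """
    if max_forwards < 0 or fork_width < 1:
        raise ValueError("max_forwards must be >= 0 and fork_width >= 1")
    total = 0
    in_flight = 1  # the single request that starts the attack
    for _ in range(max_forwards + 1):
        total += in_flight       # requests seen at this hop depth
        in_flight *= fork_width  # each of them is forked again
    return total


for limit in (5, 20, 70):
    print(limit, requests_without_loop_detection(limit))
```

With a fork width of two, the count is a geometric series, 2^(Max-Forwards + 1) − 1, so raising the limit from 20 to 70 multiplies the number of requests in this model by roughly 2^50.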
This attack was realized in practice during one of the SIP Interoperability Test (SIPit)
sessions. The scenario was extended to include more than two proxies, and the participating
proxies all limited Max-Forwards to be no larger than 20. After a handful of messages to
construct the attack, the participating proxies began bombarding each other. Extrapolating
from the several hours the experiment was allowed to run, the scenario would have completed
in just under 10 days. Had the proxies used the RFC 3261-recommended Max-Forwards value of 70,
and assuming they performed linearly as the state they held increased, it would have taken
3 trillion years to complete the processing of the single INVITE request that initiated the
attack. It is interesting to note that a few proxies rebooted during the scenario and rejoined
the attack when they restarted (as long as they maintained registration state across reboots).
This points out that if this attack were launched on the Internet at large, it might require
coordination among all the affected elements to stop it. Loop detection, as specified in this
document, at any of the proxies in the scenarios described thus far, would have stopped the
attack immediately. (If all the proxies involved implemented this loop detection, the total
number of stimulated messages in the first scenario described would be reduced to 14; in the
variation involving one server, the number of stimulated messages would be reduced to 10.)
However, there is a variant of the attack that uses multiple AORs where loop detection alone
is insufficient protection. In this variation, each participating AOR forks to all the other
participating AORs. For small numbers of participating AORs (e.g., 10), paths through the
resulting tree will not loop until very large numbers of messages have been generated.
Acquiring a sufficient number of AORs to launch such an attack on networks currently available
is quite feasible. In this scenario, requests will often take many hops to complete a loop,
and there are a very large number of different loops that will occur during the attack. In
fact, if N is the number of participating AORs, and provided N is less than or equal to
Max-Forwards, the amount of traffic generated by the attack is greater than N!, even if all
proxies involved are performing loop detection. Suppose we have a set of N AORs, all of which
are set up to fork to the entire set. For clarity, assume AOR 1 is where the attack begins.
Every permutation of the remaining N − 1 AORs will play out, defining (N − 1)! distinct paths,
without repeating any AOR. Then, each of these paths will fork N ways one last time, and a
loop will be detected on each of these branches. These final
branches alone total N! requests ((N − 1)! paths, with N forks at the end of each path; see
Table 19.1).

Table 19.1 Forwarded Requests versus Number of Participating AORs

  N     Requests
  1     1
  2     4
  3     15
  4     64
  5     325
  6     1,956
  7     13,699
  8     109,600
  9     986,409
  10    9,864,100

In a network where all proxies are performing loop detection, an attacker is still afforded
rapidly increasing returns on the number of AORs they are able to leverage. The Max-Breadth
mechanism defined in this document is designed to limit the effectiveness of this variation of
the attack. In all of the scenarios, it is important to notice that at each forking proxy, an
additional branch could be added pointing to a single victim (that might not even be a
SIP-aware element), resulting in a massive amount of traffic being directed toward the victim
from potentially as many sources as there are AORs participating in the attack.

19.9.3 Security Considerations
RFC 5393 is entirely about documenting and addressing a vulnerability in SIP proxies as
defined by RFC 3261 that can lead to an exponentially growing message exchange attack. The
Max-Breadth mechanism defined by RFC 5393 (see Section 9.11.2) does not decrease the aggregate
traffic caused by the forking-loop attack. It only serves to spread the traffic caused by the
attack over a longer period by limiting the number of concurrent branches that are being
processed at the same time. An attacker could pump multiple requests into a network that uses
the Max-Breadth mechanism and gradually build traffic to unreasonable levels. Deployments
should monitor carefully and react to gradual increases in the number of concurrent
outstanding transactions related to a given resource to protect against this possibility.
Operators should anticipate being able to temporarily disable any resources identified as
being used in such an attack. A rapid increase in outstanding concurrent transactions
system-wide may be an indication of the presence of this kind of attack across many resources.
Deployments in which it is feasible for an attacker to obtain a very large number of resources
are particularly at risk. If detecting and intervening in each instance of the attack is
insufficient to reduce the load, overload may occur.
Implementers and operators are encouraged to follow the requirements and the design
considerations for the management of overload in SIP specified in RFCs 5390 (see Section
13.3.9) and 6357, respectively. RFC 7339 (see Section 13.3), in turn, specifies an actual
standard extending RFC 3261 for overload control in SIP. Designers of protocol gateways should
consider the implications of this kind of attack carefully. As an example, if a message
transits from a SIP network into the PSTN and subsequently back into a SIP network, and
information about the history of the request on either side of the protocol translation is
lost, it becomes possible to construct loops that neither Max-Forwards nor loop detection can
protect against.
This, combined with forking amplification on the SIP side of the loop, will result in an
attack as described in this document that the mechanisms here will not abate, not even to the
point of limiting the number of concurrent messages in the attack. These considerations are
particularly important for designers of gateways from SIP to SIP (e.g., as found in B2BUAs).
Many existing B2BUA implementations are under some pressure to hide as much information about
the two sides communicating with them as possible. Implementers of such implementations may be
tempted to remove the data that might be used by the loop-detection, Max-Forwards, or
Max-Breadth mechanisms at other points in the network, taking on the responsibility for
detecting loops (or forms of this attack). However, if two such implementations are involved
in the attack, neither will be able to detect it. In addition, to limit the total number of
concurrent branches caused by a forked SIP request, RFC 5393, described in this section,
provides some guidance for intelligent use of the Via header and enhances the loop detection
algorithm defined in RFC 3261 (see Sections 3.11.3, 3.11.4, and 9.11).

19.10 Nonrepudiation Services in SIP
Nonrepudiation is a major necessity in SIP for both private and government domains because
many essential services are offered using SIP. For example, billing, call/media content
tracing/interception, and many other services related to the real-time multimedia applications
used by users may demand nonrepudiation services for business, government, and legal reasons.
SIP has a complete set of
security mechanisms for both the session and the media level for real-time networked
multimedia applications that consist of one or many media such as audio, video, or data
applications. SIP/SDP/RTP RFCs such as 3261, 3329, 5379, 3323, 4568, 5029, 5939, 3171, 6191,
and others support authentication (see Section 19.4), authorization (see Section 19.5),
integrity (see Section 19.6), confidentiality (see Section 19.6), and privacy/anonymity (see
Chapter 20), as described earlier.
We have also illustrated earlier that SIP RFCs like 3261, 4347, and others recommend TLS and
DTLS for the transport-related security over which SIP signaling and media traffic are
transferred. In addition, S/MIME-based certificates and key exchanges are described earlier in
detail (see Section 19.6). In SIP, nonrepudiation, which is a value-added service, can be
built on top of the basic SIP authentication/authorization, integrity, confidentiality, and
privacy (see Chapter 20) standards. RFC 4740 has described how SIP-based applications can use
the DIAMETER protocol for Authentication, Authorization, and Accounting (AAA) services. Thus,
we have left it to the implementers as an option to create their own nonrepudiation services.

19.11 Call Flows Explaining SIP Security Features
RFC 3261 provides some call flow examples articulating the security features in SIP. We have
included those in the following section, often omitting the message body and the corresponding
Content-Length and Content-Type header fields for brevity.
Let us assume Bob registers on start-up. The message flow is shown in Figure 19.29. Note that
the authentication usually required for registration is not shown for simplicity. Bob
registers his softphone at the registration server (biloxi.com).

Figure 19.29 (diagram not reproduced): Registrar (biloxi.com) and Bob’s softphone; F1 REGISTER.

F1 REGISTER Bob -> Registrar

REGISTER sip:registrar.biloxi.com SIP/2.0
Via: SIP/2.0/UDP bobspc.biloxi.com:5060;branch=z9hG4bKnashds7
Max-Forwards: 70
To: Bob <sip:[email protected]>
From: Bob <sip:[email protected]>;tag=456248
Call-ID: 843817637684230@998sdasdh09
CSeq: 1826 REGISTER
Contact: <sip:[email protected]>
Expires: 7200
Content-Length: 0

The registration expires after 2 hours. The registrar responds with a 200 OK:

F2 200 OK Registrar -> Bob

SIP/2.0 200 OK
Via: SIP/2.0/UDP bobspc.biloxi.com:5060;branch=z9hG4bKnashds7
 ;received=192.0.2.4
To: Bob <sip:[email protected]>;tag=2493k59kd
From: Bob <sip:[email protected]>;tag=456248
Call-ID: 843817637684230@998sdasdh09
CSeq: 1826 REGISTER
Contact: <sip:[email protected]>
Expires: 7200
Content-Length: 0

19.11.2 Session Setup
This example contains the full details of the example session setup in Section 3.7. The
message flow is shown in Figure 3.9 of Section 3.7.1. Note that these flows show the minimum
required set of header fields; some other header fields such as Allow and Supported would
normally be present.

INVITE sip:[email protected] SIP/2.0
Via: SIP/2.0/UDP pc33.atlanta.com;branch=z9hG4bKnashds8
Max-Forwards: 70
To: Bob <sip:[email protected]>
From: Alice <sip:[email protected]>;tag=1928301774
Call-ID: a84b4c76e66710
CSeq: 314159 INVITE
Contact: <sip:[email protected]>
Content-Type: application/sdp
Content-Length: 142

(Alice’s SDP not shown)
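
Messages such as the REGISTER, 200 OK, and INVITE in these flows share one textual layout: a start line, header fields (possibly folded onto continuation lines, as the Via field of the 200 OK is), a blank line, and an optional body. The following Python sketch shows one way to split a message along those lines. It is a simplified illustration rather than a conforming SIP parser: the function name parse_sip_message is hypothetical, compact header forms and multi-valued header fields are not handled, and the addresses in the sample message are placeholders.

```python
def parse_sip_message(raw: str):
    """Split a SIP message into (start_line, headers, body).

    `headers` is a list of (name, value) pairs in message order. Continuation
    lines (lines starting with a space or tab) are unfolded into the header
    field they belong to.
    """
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]

    unfolded = []
    for line in lines[1:]:
        if line[:1] in (" ", "\t") and unfolded:
            # Folded header value: append it to the previous header line.
            unfolded[-1] += " " + line.strip()
        else:
            unfolded.append(line)

    headers = []
    for line in unfolded:
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed header line: {line!r}")
        headers.append((name.strip(), value.strip()))
    return start_line, headers, body


# Placeholder message modeled on the 200 OK layout (addresses are invented).
sample = (
    "SIP/2.0 200 OK\r\n"
    "Via: SIP/2.0/UDP bobspc.biloxi.example:5060;branch=z9hG4bKnashds7\r\n"
    " ;received=192.0.2.4\r\n"
    "CSeq: 1826 REGISTER\r\n"
    "Expires: 7200\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)
start_line, headers, body = parse_sip_message(sample)
print(start_line)
for name, value in headers:
    print(f"{name} -> {value}")
```

Keeping the headers as an ordered list of pairs, rather than a dictionary, preserves repeated fields such as multiple Via values, whose order matters for routing responses.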
Media encryption is beyond the scope of this chapter. The considerations that follow first
examine a set of classic threat models that broadly identify the security needs of SIP. The
set of security services required to address these threats is then detailed, followed by an
explanation of several security mechanisms that can be used to provide these services. Next,
the requirements for implementers of SIP are enumerated, along with exemplary deployments in
which these security mechanisms could be used to improve the security of SIP. Some notes on
privacy conclude this section.

19.12.1 Attacks and Threat Models
This section details some threats that should be common to most deployments of SIP. These
threats have been chosen specifically to illustrate each of the security services that SIP
requires. The following examples by no means provide an exhaustive list of the threats against
SIP; rather, these are classic threats that demonstrate the need for particular security
services that can potentially prevent whole categories of threats. These attacks assume an
environment in which attackers can potentially read any packet on the network; it is
anticipated that SIP will frequently be used on the public Internet. Attackers on the network
may be able to modify packets (perhaps at some compromised intermediary). Attackers may wish
to steal services, eavesdrop on communications, or disrupt sessions.

19.12.1.1 Registration Hijacking
The SIP registration mechanism allows a UA to identify itself to a registrar as a device at
which a user (designated by an AOR) is located. A registrar assesses the identity asserted in
the From header field of a REGISTER message to determine whether this request can modify the
contact addresses associated with the AOR in the To header field. While these two fields are
frequently the same, there are many valid deployments in which a third party may register
contacts on a user’s behalf. The From header field of a SIP request, however, can be modified
arbitrarily by the owner of a UA, and this opens the door to malicious registrations. An
attacker that successfully impersonates a party authorized to change contacts associated with
an AOR could, for example, deregister all existing contacts for a URI and then register their
own device as the appropriate contact address, thereby directing all requests for the affected
user to the attacker’s device.
This threat belongs to a family of threats that rely on the absence of cryptographic assurance
of a request’s originator. Any SIP UAS that represents a valuable service (e.g., a gateway
that interworks SIP requests with traditional telephone calls) might want to control access to
its resources by authenticating requests that it receives. Even end-user UAs, for example, SIP
phones, have an interest in ascertaining the identities of originators of requests. This
threat demonstrates the need for security services that enable SIP entities to authenticate
the originators of requests.

19.12.1.2 Impersonating a Server
The domain to which a request is destined is generally specified in the Request-URI. UAs
commonly contact a server in this domain directly in order to deliver a request. However,
there is always a possibility that an attacker could impersonate the remote server, and that
the UA’s request could be intercepted by some other party. For example, consider a case in
which a redirect server at one domain, chicago.com, impersonates a redirect server at another
domain, biloxi.com. A UA sends a request to biloxi.com, but the redirect server at chicago.com
answers with a forged response that has appropriate SIP header fields for a response from
biloxi.com. The forged contact addresses in the redirection response could direct the
originating UA to inappropriate or insecure resources, or simply prevent requests for
biloxi.com from succeeding.
This family of threats has a vast membership, many of which are critical. As a converse to the
registration hijacking threat, consider the case in which a registration sent to biloxi.com is
intercepted by chicago.com, which replies to the intercepted registration with a forged
301 Moved Permanently response. This response might seem to come from biloxi.com yet designate
chicago.com as the appropriate registrar. All future REGISTER requests from the originating UA
would then go to chicago.com. Prevention of this threat requires a means by which UAs can
authenticate the servers to whom they send requests.

19.12.1.3 Tampering with Message Bodies
As a matter of course, SIP UAs route requests through trusted proxy servers. Regardless of how
that trust is established (authentication of proxies is discussed elsewhere in this section),
a UA may trust a proxy server to route a request, but not to inspect or possibly modify the
bodies contained in that request. Consider a UA that is using SIP message bodies to
communicate session encryption keys for a media session. Although it trusts the proxy server
of the domain it is contacting to deliver signaling properly, it may not want the
administrators of that domain to be capable of decrypting any subsequent media session. Worse
yet, if the proxy server were actively malicious, it could modify the session key, either
acting as an MITM, or perhaps changing the security characteristics requested by the
originating UA.
This family of threats applies not only to session keys, but also to most conceivable forms of
content carried end-to-end in SIP. These might include MIME bodies that should
be rendered to the user, SDP, or encapsulated telephony signals, among others. Attackers might
attempt to modify SDP bodies, for example, in order to point RTP media streams to a
wiretapping device in order to eavesdrop on subsequent voice communications. Also note that
some header fields in SIP are meaningful end-to-end, for example, Subject. UAs might be
protective of these header fields as well as bodies (a malicious intermediary changing the
Subject header field might make an important request appear to be spam, for example). However,
since many header fields are legitimately inspected or altered by proxy servers as a request
is routed, not all header fields should be secured end-to-end. For these reasons, the UA might
want to secure SIP message bodies, and in some limited cases header fields, end-to-end. The
security services required for bodies include confidentiality, integrity, and authentication.
These end-to-end services should be independent of the means used to secure interactions with
intermediaries such as proxy servers.

19.12.1.4 Tearing Down Sessions
Once a dialog has been established by initial messaging, subsequent requests can be sent that
modify the state of the dialog or session. It is critical that principals in a session can be
certain that such requests are not forged by attackers. Consider a case in which a third-party
attacker captures some initial messages in a dialog shared by two parties in order to learn
the parameters of the session (To tag, From tag, and so forth) and then inserts a BYE request
into the session. The attacker could opt to forge the request such that it seemed to come from
either participant. Once the BYE is received by its target, the session will be torn down
prematurely.
Similar midsession threats include the transmission of forged re-INVITEs that alter the
session (possibly to reduce session security or redirect media streams as part of a
wiretapping attack). The most effective countermeasure to this threat is the authentication of
the sender of the BYE. In this instance, the recipient needs only to know that the BYE came
from the same party with whom the corresponding dialog was established (as opposed to
ascertaining the absolute identity of the sender). Also, if the attacker is unable to learn
the parameters of the session due to confidentiality, it would not be possible to forge the
BYE. However, some intermediaries (like proxy servers) will need to inspect those parameters
as the session is established.

19.12.1.5 DoS and Amplification
DoS attacks focus on rendering a particular network element unavailable, usually by directing
an excessive amount of network traffic at its interfaces. A distributed DoS attack allows one
network user to cause multiple network hosts to flood a target host with a large amount of
network traffic. In many architectures, SIP proxy servers face the public Internet in order to
accept requests from worldwide IP end points. SIP creates a number of potential opportunities
for distributed DoS attacks that must be recognized and addressed by the implementers and
operators of SIP systems.
Attackers can create bogus requests that contain a falsified source IP address and a
corresponding Via header field that identify a targeted host as the originator of the request,
and then send this request to a large number of SIP network elements, thereby using hapless
SIP UAs or proxies to generate DoS traffic aimed at the target. Similarly, attackers might use
falsified Route header field values in a request that identify the target host and then send
such messages to forking proxies that will amplify messaging sent to the target. Record-Route
could be used to a similar effect when the attacker is certain that the SIP dialog initiated
by the request will result in numerous transactions originating in the backwards direction.
A number of DoS attacks open up if REGISTER requests are not properly authenticated and
authorized by registrars. Attackers could deregister some or all users in an administrative
domain, thereby preventing these users from being invited to new sessions. An attacker could
also register a large number of contacts designating the same host for a given AOR in order to
use the registrar and any associated proxy servers as amplifiers in a DoS attack. Attackers
might also attempt to deplete available memory and disk resources of a registrar by
registering huge numbers of bindings. The use of multicast to transmit SIP requests can
greatly increase the potential for DoS attacks. These problems demonstrate a general need to
define architectures that minimize the risks of DoS, and the need to be mindful in
recommendations for security mechanisms of this class of attacks.

19.12.2 Security Mechanisms
From the threats described above, we gather that the fundamental security services required
for the SIP protocol are as follows: preserving the confidentiality and integrity of
messaging, preventing replay attacks or message spoofing, providing for the authentication and
privacy of the participants in a session, and preventing DoS attacks. Bodies within SIP
messages separately require the security services of confidentiality, integrity, and
authentication. Rather than defining new security mechanisms specific to SIP, SIP reuses,
wherever possible, existing security models derived from the HTTP and Simple Mail Transfer
Protocol (SMTP) space.
Full encryption of messages provides the best means to preserve the confidentiality of
signaling; it can also guarantee that messages are not modified by any malicious
intermediaries. However, SIP requests and responses cannot be
naively encrypted end-to-end in their entirety because message fields such as the Request-URI,
Route, and Via need to be visible to proxies in most network architectures so that SIP
requests are routed correctly. Note that proxy servers need to modify some features of
messages as well (such as adding Via header field values) in order for SIP to function. Proxy
servers must therefore be trusted, to some degree, by SIP UAs. To this purpose, low-layer
security mechanisms for SIP are recommended, which encrypt the entire SIP requests or
responses on the wire on a hop-by-hop basis, and that allow end points to verify the identity
of proxy servers to whom they send requests.
SIP entities also have a need to identify one another in a secure fashion. When a SIP end
point asserts the identity of its user to a peer UA or to a proxy server, that identity should
in some way be verifiable. A cryptographic authentication mechanism is provided in SIP to
address this requirement. An independent security mechanism for SIP message bodies supplies an
alternative means of end-to-end mutual authentication, as well as provides a limit on the
degree to which UAs must trust intermediaries.

19.12.2.1 Transport- and Network-Layer Security
Transport- or network-layer security encrypts signaling traffic, guaranteeing message
confidentiality and integrity. Oftentimes, certificates are used in the establishment of
lower-layer security, and these certificates can also be used to provide a means of
authentication in many architectures. Two popular alternatives for providing security at the
transport and network layer are, respectively, TLS defined in RFC 4346 and IPSec specified in
RFC 4301. IPSec is a set of network-layer protocol tools that collectively can be used as a
secure replacement for traditional IP. IPSec is most commonly used in architectures in which a
set of hosts or administrative domains have an existing trust relationship with one another.
IPSec is usually implemented at the operating system level in a host, or on a security gateway
that provides confidentiality and integrity for all traffic it receives from a particular
interface as in a virtual private network (VPN) architecture. IPSec can also be used on a
hop-by-hop basis.
In many architectures, IPSec does not require integration with SIP applications; IPSec is
perhaps best suited to deployments in which adding security directly to SIP hosts would be
arduous. UAs that have a preshared keying relationship with their first-hop proxy server are
also good candidates to use IPSec. Any deployment of IPSec for SIP would require an IPSec
profile describing the protocol tools that would be required to secure SIP. No such profile is
given

connection-oriented protocols (for the purposes of this document, TCP); tls (signifying TLS
over TCP) can be specified as the desired transport protocol within a Via header field value
or a SIP-URI. TLS is most suited to architectures in which hop-by-hop security is required
between hosts with no preexisting trust association. For example, Alice trusts her local proxy
server, which, after a certificate exchange, decides to trust Bob’s local proxy server, which
Bob trusts; hence, Bob and Alice can communicate securely.
TLS must be tightly coupled with a SIP application. Note that transport mechanisms are
specified on a hop-by-hop basis in SIP; thus, a UA that sends requests over TLS to a proxy
server has no assurance that TLS will be used end-to-end. The TLS_RSA_WITH_AES_128_CBC_SHA
ciphersuite defined in RFC 5246 must be supported at a minimum by implementers when TLS is
used in a SIP application. For purposes of backwards compatibility, proxy servers, redirect
servers, and registrars should support TLS_RSA_WITH_3DES_EDE_CBC_SHA. Implementers may also
support any other ciphersuite.

19.12.2.2 SIPS URI Scheme
The SIPS URI scheme adheres to the syntax of the SIP URI (see Section 4.2), although the
scheme string is sips rather than sip. The semantics of SIPS are very different from the SIP
URI, however. SIPS allows resources to specify that they should be reached securely. A SIPS
URI can be used as an AOR for a particular user, that is, the URI by which the user is
canonically known (on their business cards, in the From header field of their requests, in the
To header field of REGISTER requests). When used as the Request-URI of a request, the SIPS
scheme signifies that each hop over which the request is forwarded, until the request reaches
the SIP entity responsible for the domain portion of the Request-URI, must be secured with
TLS; once it reaches the domain in question, it is handled in accordance with local security
and routing policy, quite possibly using TLS for any last hop to a UAS. When used by the
originator of a request (as would be the case if they employed a SIPS URI as the AOR of the
target), SIPS dictates that the entire request path to the target domain be so secured. The
SIPS scheme is applicable to many of the other ways in which SIP URIs are used in SIP today in
addition to the Request-URI, including in AORs, contact addresses (the contents of Contact
headers, including those of REGISTER methods), and Route headers.
In each instance, the SIPS URI scheme allows these existing fields to designate secure
resources. The manner in which a SIPS URI is dereferenced in any of these contexts has its own
security properties, which are
in this document. TLS provides transport-layer security over detailed in RFC 3263 (see Section 8.2.4). The use of
SIPS in particular entails that mutual TLS authentication should be employed, as should the ciphersuite TLS_RSA_WITH_AES_128_CBC_SHA. Certificates received in the authentication process should be validated with root certificates held by the client; failure to validate a certificate should result in the failure of the request. Note that in the SIPS URI scheme, transport is independent of TLS, and thus sips:[email protected];transport=tcp and sips:[email protected];transport=sctp are both valid (although note that UDP is not a valid transport for SIPS). The use of transport=tls has consequently been deprecated, partly because it was specific to a single hop of the request. This is a change since RFC 2543 (obsoleted by RFC 3261). Users that distribute a SIPS URI as an AOR may elect to operate devices that refuse requests over insecure transports.

19.12.2.3 HTTP Authentication

SIP provides a challenge capability, based on HTTP authentication, that relies on the 401 and 407 response codes as well as header fields for carrying challenges and credentials. Without significant modification, the reuse of the HTTP Digest authentication scheme in SIP allows for replay protection and one-way authentication. The usage of Digest authentication in SIP is detailed in Section 19.4.

19.12.2.4 S/MIME

As is discussed above, encrypting entire SIP messages end-to-end for the purpose of confidentiality is not appropriate because network intermediaries (like proxy servers) need to view certain header fields in order to route messages correctly, and if these intermediaries are excluded from security associations, then SIP messages will essentially be nonroutable. However, S/MIME allows SIP UAs to encrypt MIME bodies within SIP, securing these bodies end-to-end without affecting message headers. S/MIME can provide end-to-end confidentiality and integrity for message bodies, as well as mutual authentication. It is also possible to use S/MIME to provide a form of integrity and confidentiality for SIP header fields through SIP message tunneling. The usage of S/MIME in SIP is detailed in Section 19.6.

19.12.3 Implementing Security Mechanisms

19.12.3.1 Requirements for Implementers of SIP

Proxy servers, redirect servers, and registrars must implement TLS, and must support both mutual and one-way authentication. It is strongly recommended that UAs be capable of initiating TLS; UAs may also be capable of acting as a TLS server. Proxy servers, redirect servers, and registrars should possess a site certificate whose subject corresponds to their canonical host name. UAs may have certificates of their own for mutual authentication with TLS; however, no provisions are set forth in this document for their use. All SIP elements that support TLS must have a mechanism for validating certificates received during TLS negotiation; this entails possession of one or more root certificates issued by certificate authorities (preferably well-known distributors of site certificates comparable to those that issue root certificates for web browsers). All SIP elements that support TLS must also support the SIPS URI scheme. Proxy servers, redirect servers, registrars, and UAs may also implement IPSec or other lower-layer security protocols. When a UA attempts to contact a proxy server, redirect server, or registrar, the UAC should initiate a TLS connection over which it will send SIP messages.

In some architectures, UASs may receive requests over such TLS connections as well. Proxy servers, redirect servers, registrars, and UAs must implement Digest Authorization, encompassing all related aspects described in Sections 19.5 and 19.6. Proxy servers, redirect servers, and registrars should be configured with at least one Digest realm, and at least one realm string supported by a given server should correspond to the server’s host name or domain name. UAs may support the signing and encrypting of MIME bodies, and transference of credentials with S/MIME as described in Section 19.6. If a UA holds one or more root certificates of certificate authorities in order to validate certificates for TLS or IPSec, it should be capable of reusing these to verify S/MIME certificates, as appropriate. A UA may hold root certificates specifically for validating S/MIME certificates. Note that it is anticipated that future security extensions may upgrade the normative strength associated with S/MIME as S/MIME implementations appear and the problem space becomes better understood.

19.12.3.2 Security Solutions

The operation of these security mechanisms in concert can follow the existing web and e-mail security models to some degree. At a high level, UAs authenticate themselves to servers (proxy servers, redirect servers, and registrars) with a Digest user name and password; servers authenticate themselves to UAs one hop away, or to another server one hop away (and vice versa), with a site certificate delivered by TLS.

On a peer-to-peer level, UAs trust the network to authenticate one another ordinarily; however, S/MIME can also be used to provide direct authentication when the network does not, or if the network itself is not trusted. The following is an illustrative example in which these security mechanisms are used by various UAs and servers to prevent the sorts of threats described in Section 19.12.1. While implementers and network administrators may follow the normative guidelines given in the remainder of this section, these are provided only as example implementations.

to the UA that has just completed registration. Because the UA has already authenticated the server on the other side of the TLS connection, all requests that come over this connection are known to have passed through the proxy server; attackers cannot create spoofed requests that appear to have been sent through that proxy server.
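As a rough illustration of the Digest step mentioned above (UAs authenticating themselves to servers with a Digest user name and password), the following minimal sketch computes a Digest response in the style of RFC 2617 with MD5 and qop=auth. The function names, the user name, password, nonce, and URI are hypothetical values chosen for this example; they are not taken from this chapter, and a real UA would take the realm and nonce from the 401 or 407 challenge it received.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the lowercase hexadecimal MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username: str, realm: str, password: str,
                    method: str, uri: str, nonce: str,
                    nc: str, cnonce: str, qop: str = "auth") -> str:
    """Compute an RFC 2617 style Digest response (MD5, qop=auth).

    Sketch only: other algorithms and qop values are not handled.
    """
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # secret-derived hash
    ha2 = md5_hex(f"{method}:{uri}")                 # request-derived hash
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")


# Hypothetical REGISTER sent after a 401 challenge from the registrar.
response = digest_response(
    username="alice",
    realm="atlanta.com",
    password="example-password",                 # assumption: shared secret
    method="REGISTER",
    uri="sips:atlanta.com",
    nonce="84a4cc6f3082121f32b42a2187831a9e",    # assumption: from challenge
    nc="00000001",
    cnonce="0a4f113b",
)
print(response)  # value to place in the Authorization header's response field
```

Because the password never travels in the clear and the server-chosen nonce is mixed into the hash, a captured response cannot simply be replayed against a fresh challenge, which is the replay protection referred to in Section 19.12.2.3.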
The proxy server at biloxi.com should inspect the certificate of the proxy server at atlanta.com in turn and compare the domain asserted by the certificate with the domainname portion of the From header field in the INVITE request. The biloxi proxy may have a strict security policy that requires it to reject requests that do not match the administrative domain from which they have been proxied. Such security policies could be instituted to prevent the SIP equivalent of SMTP open relays that are frequently exploited to generate spam. This policy, however, only guarantees that the request came from the domain it ascribes to itself; it does not allow biloxi.com to ascertain how atlanta.com authenticated Alice. Only if biloxi.com has some other way of knowing atlanta.com’s authentication policies could it possibly ascertain how Alice proved her identity. biloxi.com might then institute an even stricter policy that forbids requests that come from domains that are not known administratively to share a common authentication policy with biloxi.com. Once the INVITE has been approved by the biloxi proxy, the proxy server should identify the existing TLS channel, if any, associated with the user targeted by this request (in this case [email protected]).

The INVITE should be proxied through this channel to Bob. Since the request is received over a TLS connection that had previously been authenticated as the biloxi proxy, Bob knows that the From header field was not tampered with and that atlanta.com has validated Alice, although not necessarily whether or not to trust Alice’s identity. Before they forward the request, both proxy servers should add a Record-Route header field to the request so that all future requests in this dialog will pass through the proxy servers. The proxy servers can thereby continue to provide security services for the lifetime of this dialog. If the proxy servers do not add themselves to the Record-Route, future messages will pass directly end-to-end between Alice and Bob without any security services (unless the two parties agree on some independent end-to-end security such as S/MIME). In this respect, the SIP trapezoid model can provide a good structure where conventions of agreement between the site proxies can provide a reasonably secure channel between Alice and Bob. An attacker preying on this architecture would, for example, be unable to forge a BYE request and insert it into the signaling stream between Bob and Alice because the attacker has no way of ascertaining the parameters of the session and also because the integrity mechanism transitively protects the traffic between Alice and Bob.

19.12.3.2.3 Peer-to-Peer Requests

Alternatively, consider a UA asserting the identity carol@chicago.com that has no local outbound proxy. When Carol wishes to send an INVITE to [email protected], her UA should initiate a TLS connection with the biloxi proxy directly (using the mechanism described in RFC 3263 [see Section 8.2.4] to determine how best to reach the given Request-URI). When her UA receives a certificate from the biloxi proxy, it should be verified normally before she passes her INVITE across the TLS connection. However, Carol has no means of proving her identity to the biloxi proxy, but she does have a CMS-detached signature over a message/sip body in the INVITE. It is unlikely in this instance that Carol would have any credentials in the biloxi.com realm, since she has no formal association with biloxi.com. The biloxi proxy may also have a strict policy that precludes it from even bothering to challenge requests that do not have biloxi.com in the domainname portion of the From header field; it treats these users as unauthenticated.

The biloxi proxy has a policy for Bob that all nonauthenticated requests should be redirected to the appropriate contact address registered against [email protected], namely <sip:[email protected]>. Carol receives the redirection response over the TLS connection she established with the biloxi proxy, so she trusts the veracity of the contact address. Carol should then establish a TCP connection with the designated address and send a new INVITE with a Request-URI containing the received contact address (recomputing the signature in the body as the request is readied). Bob receives this INVITE on an insecure interface, but his UA inspects and, in this instance, recognizes the From header field of the request and subsequently matches a locally cached certificate with the one presented in the signature of the body of the INVITE. He replies in similar fashion, authenticating himself to Carol, and a secure dialog begins. Sometimes, firewalls or network address translators in an administrative domain could preclude the establishment of a direct TCP connection to a UA. In these cases, proxy servers could also potentially relay requests to UAs in a way that has no trust implications (e.g., forgoing an existing TLS connection and forwarding the request over cleartext TCP) as local policy dictates.

19.12.3.2.4 DoS Protection

To minimize the risk of a DoS attack against architectures using these security solutions, implementers should take note of the following guidelines. When the host on which a SIP proxy server is operating is routable from the public Internet, it should be deployed in an administrative domain with defensive operational policies (blocking source-routed traffic, preferably filtering ping traffic). Both TLS and IPSec can also make use of bastion hosts at the edges of administrative domains that participate in the security associations to aggregate secure tunnels and sockets. These bastion hosts can also take the brunt of DoS attacks, ensuring that SIP hosts within the administrative domain are not encumbered with superfluous messaging.
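The interdomain policy described for the biloxi proxy (comparing the domain asserted by the peer proxy's certificate with the domainname portion of the From header field) can be sketched as follows. This is an illustrative assumption about how such a check might be written, not a procedure defined in this chapter: the function names are hypothetical, the certificate domains are assumed to have been extracted already by the TLS layer, and the URI parsing is deliberately simplified.

```python
import re


def from_header_domain(from_value: str) -> str:
    """Extract the host part of the SIP or SIPS URI in a From header value.

    Simplified parsing for illustration; IPv6 literals and unusual
    display names are not handled.
    """
    match = re.search(r"sips?:(?:[^@>;\s]+@)?([^;>:\s]+)",
                      from_value, re.IGNORECASE)
    if match is None:
        raise ValueError("no SIP or SIPS URI found in From header value")
    return match.group(1).lower()


def accept_proxied_request(cert_domains, from_value: str) -> bool:
    """Strict policy: accept only if the From domain is one the peer
    proxy's certificate asserts (cert_domains is supplied by the caller)."""
    asserted = {domain.lower() for domain in cert_domains}
    return from_header_domain(from_value) in asserted


# Hypothetical requests arriving over a TLS connection whose peer
# certificate asserts atlanta.com.
print(accept_proxied_request(["atlanta.com"],
                             '"Alice" <sips:alice@atlanta.com>;tag=1928301774'))
print(accept_proxied_request(["atlanta.com"],
                             "<sip:mallory@example.org>;tag=42"))
```

As the text notes, passing this check only shows that the request came from the domain it claims; it says nothing about how that domain authenticated the user.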
that reason, it is recommended that TCP should be used as a transport protocol when S/MIME tunneling is employed.

19.12.4.3 TLS

The most commonly voiced concern about TLS is that it cannot run over UDP; TLS requires a connection-oriented underlying transport protocol, which for the purposes of this document means TCP. It may also be arduous for a local outbound proxy server or registrar to maintain many simultaneous long-lived TLS connections with numerous UAs. This introduces some valid scalability concerns, especially for intensive ciphersuites. Maintaining redundancy of long-lived TLS connections, especially when a UA is solely responsible for their establishment, could also be cumbersome. TLS only allows SIP entities to authenticate servers to which they are adjacent; TLS offers strictly hop-by-hop security. Neither TLS, nor any other mechanism specified in this document, allows clients to authenticate proxy servers to whom they cannot form a direct TCP connection.

19.12.4.4 SIPS URIs

Actually using TLS on every segment of a request path entails that the terminating UAS must be reachable over TLS (perhaps registering with a SIPS URI as a contact address). This is the preferred use of SIPS. Many valid architectures, however, use TLS to secure part of the request path, but rely on some other mechanism for the final hop to a UAS, for example. Thus, SIPS cannot guarantee that TLS usage will be truly end-to-end. Note that since many UAs will not accept incoming TLS connections, even those UAs that do support TLS may be required to maintain persistent TLS connections, as described in Section 19.12.4.3, in order to receive requests over TLS as a UAS. Location services are not required to provide a SIPS binding for a SIPS Request-URI. Although location services are commonly populated by user registrations (as described in Section 3.3), various other protocols and interfaces could conceivably supply contact addresses for an AOR, and these tools are free to map SIPS URIs to SIP URIs as appropriate. When queried for bindings, a location service returns its contact addresses without regard for whether it received a request with a SIPS Request-URI. If a redirect server is accessing the location service, it is up to the entity that processes the Contact header field of a redirection to determine the propriety of the contact addresses.

Ensuring that TLS will be used for all of the request segments up to the target domain is somewhat complex. It is possible that cryptographically authenticated proxy servers along the way that are noncompliant or compromised may choose to disregard the forwarding rules associated with SIPS (and the general forwarding rules in Section 3.11.6). Such malicious intermediaries could, for example, retarget a request from a SIPS URI to a SIP URI in an attempt to downgrade security.

Alternatively, an intermediary might legitimately retarget a request from a SIP to a SIPS URI. Recipients of a request whose Request-URI uses the SIPS URI scheme thus cannot assume on the basis of the Request-URI alone that SIPS was used for the entire request path (from the client onwards). To address these concerns, it is recommended that recipients of a request whose Request-URI contains a SIP or SIPS URI inspect the To header field value to see if it contains a SIPS URI (however, note that it does not constitute a breach of security if this URI has the same scheme but is not equivalent to the URI in the To header field). Although clients may choose to populate the Request-URI and To header field of a request differently, when SIPS is used this disparity could be interpreted as a possible security violation, and the request could consequently be rejected by its recipient. Recipients may also inspect the Via header chain in order to double-check whether or not TLS was used for the entire request path until the local administrative domain was reached. S/MIME may also be used by the originating UAC to help ensure that the original form of the To header field is carried end-to-end.

If the UAS has reason to believe that the scheme of the Request-URI has been improperly modified in transit, the UA should notify its user of a potential security breach. As a further measure to prevent downgrade attacks, entities that accept only SIPS requests may also refuse connections on insecure ports. End users will undoubtedly discern the difference between SIPS and SIP URIs, and they may manually edit them in response to stimuli. This can either benefit or degrade security. For example, if an attacker corrupts a DNS cache, inserting a fake record set that effectively removes all SIPS records for a proxy server, then any SIPS requests that traverse this proxy server may fail. When a user, however, sees that repeated calls to a SIPS AOR are failing, they could on some devices manually convert the scheme from SIPS to SIP and retry. Of course, there are some safeguards against this (if the destination UA is truly paranoid, it could refuse all non-SIPS requests), but it is a limitation worth noting. On the bright side, users might also divine that SIPS would be valid even when they are presented only with a SIP URI.

19.13 Summary

We have defined the basic security functions authentication, authorization, integrity, confidentiality, and nonrepudiation. The inherent security capabilities that are available in SIP are discussed in this chapter. First, the functionality for
negotiating the security mechanisms used between a SIP UA and its next-hop SIP entity at the session level is provided. Second, all security mechanisms of SIP are described in detail. Third, SDP security description is used for negotiation of security context for media security at the media level, after session establishment using SIP signaling messages. Fourth, we have explained the SIP session setup that includes security features using a call flows example. Fifth, we have explained possible security threats that are being faced in the context of SIP that is an application-layer protocol. Finally, the means of mitigating security threats using existing security mechanisms in SIP are provided. In this context, how lower-transport-layer security capabilities complement SIP application-layer security features is discussed. However, a separate chapter is devoted to describing privacy and anonymity in SIP.

PROBLEMS

1. What are the definitions of authentication, authorization, integrity, confidentiality, and nonrepudiation? What are the differences between confidentiality and privacy?
2. Why does SIP need security in both the session and the media level, making security in SIP fundamentally different from other applications? Explain how media-level security depends on session-level security in SIP.
3. Describe with examples how the SDP security features help negotiate security mechanisms to be used for media-level security. How does the SDP help obtain security features to be used in SRTP and ZRTP?
4. What are the SDP media stream security preconditions that are needed in SIP? How do they ensure media security in delaying the transfer of media? Explain all security preconditions with detailed call flows.
5. Why does SIP need extension with three headers only for making agreement between SIP entities for security mechanisms as specified in RFC 3329? Describe in detail how the security mechanisms are negotiated between SIP entities for client- and server-initiated negotiation.
6. What are the security limitations in negotiating security capabilities as described in RFC 3329? How can we mitigate those limitations?
7. Describe the procedures for authentication in SIP: user-to-user and proxy-to-user.
8. What is the digest authentication scheme in SIP? Describe in detail using call flows. How does SIP authentication differ from that of HTTP/1.1?
9. What are the benefits provided by obtaining the domain certificate over TLS for authentication in SIP per procedures defined in RFC 5922? How is the host name resolved in the SIP domain? How should this certificate be used by a SIP service provider? Describe the behavior of SIP entities for the domain certificate.
10. What is AIB? How does it enhance integrity and authentication protection in SIP? Explain with message flows using a third-party conference control where the REFER method is used.
11. How does the cryptographic authentication scheme help authentication in SIP as specified in RFC 4474? Describe the behavior of SIP entities in handling cryptographic authentication. What are the security concerns for this? How can these security holes be mitigated?
12. How are the privacy and security concerns handled in SIP for user identity and the TEL URI scheme in the cryptographic authentication scheme?
13. What is the AKA HTTP digest as specified in RFC 3310? Explain with detailed examples the operation of the digest AKA. What are the security concerns for this scheme? How are the security loopholes mitigated in this scheme?
14. Describe in detail the key-derivation authentication scheme in SIP, including challenge, response, and confirmation. Why is it important for authentication?
15. What has been missing in RFC 3261 for DNS-based authentication for TLS sessions in SIP? How does DNSSEC enhance security in SIP for TLS sessions? Describe DANE-capable SIP implementation using examples.
16. How does authorization come into the picture in SIP? What is role/trait-based authorization? Explain using some use cases and its requirements.
17. What is SAML assertion? How can it be applied in SIP to play the role of authorization? What are the pros and cons of using SAML assertion in SIP? How are the security vulnerabilities taken care of using SAML assertion over the SIP network?
18. How does authorization occur through dialog identification in SIP? Explain the UA and proxy behaviors for this using call flows.
19. Explain media authorization in SIP as defined in RFC 3313 along with bandwidth reservation with an RSVP QoS signaling protocol that is used over the IP network along with UA and proxy behaviors. What are the pros and cons of these procedures of media authorization in SIP from a security point of view?
20. Explain how early-media authorization can be offered in SIP. Explain the SIP session setup with a media authorization framework as explained in RFC 3521.
21. How do S/MIME certificates help in session-level security? How does the S/MIME key exchange help in securing the SIP session?
22. Explain how the SDP message body of the SIP signaling message secures MIME bodies.
23. How are SIP header integrity and confidentiality protected using S/MIME tunneling? What are the pros and cons of tunneling SIP headers? Explain using examples.
24. How are the SIP messages’ integrity and authentication protected using S/MIME tunneling? What are the pros and cons of tunneling SIP messages? Explain using examples.
25. How is SIP message encryption done using S/MIME tunneling? What are the pros and cons of tunneling encrypted SIP messages? Explain using examples.
26. How do nonrepudiation services work in SIP? Explain with detailed explanations.
27. Develop a SIP network architecture for providing nonrepudiation services in billing using the DIAMETER protocol that offers authentication, authorization, and accounting services in SIP.
28. What are the vulnerabilities in forking by a SIP proxy? Explain with examples. How will the suggestions provided in RFC 5393 mitigate those vulnerabilities of forking in SIP? Write the possible solutions using existing SIP messages, existing other standards, and/or new extensions.
29. What are the requirements for URI-list services using external lists? What are the requirements for URI-list services using Request-Contained lists? Describe the framework specified by RFC 5363 for carrying and processing of URI lists in SIP. What are the security concerns and general issues for carrying URI lists, and how do you mitigate those security loopholes in SIP?
30. What are the security concerns if session invitation, instant messaging, and other requests are sent to a party without its consent in SIP? What requirements do consent-based communications in SIP, specified in RFC 5360, need to meet? What is the consent-based communications solution architecture in SIP? Explain with a detailed explanation.
31. How does the relay behave in avoidance of amplification, subscription to permission status, and request for permission for consent-based communications in SIP? How does a relay behave in handling SIP Identity, P-Asserted-Identity, Return Routability, and SIP Digest once the recipient grants permission?
32. How is permission revocation performed for consent-based communications in SIP? Explain the relay behavior related to handling request-contained URI lists. Why are the 470 response code and the Permission-Missing and Trigger-Consent header fields needed in SIP for consent-based communications?
33. Explain the registration scheme for consent-based communications in SIP. Explain the relay behavior in generating the traffic toward the recipient.
34. What are the security concerns in consent-based communications in SIP? How are those security loopholes mitigated?
35. How does RFC 5393 update RFC 3261 for securing forking in SIP? Explain in detail the guidance that is provided by RFC 5393 to implementers for securing forking in SIP. Are these extensions of RFC 3261 mandated by RFC 5393 sufficient to secure a SIP forking proxy? What alternative solution did the authors of this RFC suggest that was not accepted by the IETF Working Group? Provide your view on this alternative solution for mitigating forking vulnerabilities in SIP.
36. Explain in detail the attacks and threats in SIP at the registration and session levels: registration hijacking, impersonating a server, tampering with message bodies, tearing down sessions, and DoS and amplification.
37. Explain in detail how each one of these security mechanisms helps mitigate particular attacks and threats: SIPS URI scheme, HTTP authentication, S/MIME, and transport- and network-layer security.
38. What are the requirements for implementation of SIP security? What are the security solutions in meeting the requirements for registration, interdomain requests, peer-to-peer requests, and DoS protection?
39. What are the security limitations of HTTP Digest, S/MIME, TLS, and SIPS URIs? Explain in detail with examples.

References

1. Shekh-Yusef, R., “Key-derivation authentication scheme,” draft-yusef-sipcore-key-derivation-00 (work in progress), October 2014.
2. Tschofenig, H. et al., “Using SAML to protect the Session Initiation Protocol (SIP),” IEEE Network, September/October 2006.
3. 3rd Generation Partnership Project, “Security architecture (Release 4),” TS 33.102, December 2001.
4. “NIST special publication 800-132: Recommendations for password-based key derivations,” December 2010.
5. Finch, T., Miller, M., and Saint-Andre, P., “Using DNS-based Authentication of Named Entities (DANE) TLSA records with SRV records,” draft-ietf-dane-srv-07 (work in progress), July 2014.
6. Johansson, O., “TLS sessions in SIP using DNS-based Authentication of Named Entities (DANE) TLSA records,” draft-johansson-sipcore-dane-sip-07 (work in progress), October 6, 2014.
7. Kaliski, B., “TWIRL and RSA key size,” May 2003.
Chapter 20
speaking in the conference. SIP has mechanisms for providing anonymity under such circumstances.

RFC 3261 describes at great length how SIP user privacy will be affected by SIP signaling message fields. For example, SIP messages frequently contain sensitive information about their senders: not only what they have to say, but also with whom they communicate, when they communicate and for how long, and from where they participate in sessions. Many applications and their users require that this kind of private information be hidden from any parties that do not need to know it.

Note that there are also less direct ways in which private information can be divulged. If a user or service chooses to be reachable at an address that is guessable from the person’s name and organizational affiliation (which describes most addresses of record [AORs]), the traditional method of ensuring privacy by having an unlisted “phone number” is compromised. A user location service can infringe on the privacy of the recipient of a session invitation by divulging their specific whereabouts to the caller; an implementation consequently should be able to restrict, on a per-user basis, what kind of location and availability information is given out to certain classes of callers. This is a whole class of problem that is expected to be studied further in ongoing SIP work.

In some cases, users may want to conceal personal information in header fields that convey identity. This can apply not only to the From and related headers representing the originator of the request, but also the To; it may not be appropriate to convey to the final destination a speed-dialing nickname, or an unexpanded identifier for a group of targets, either of which would be removed from the Request-URI as the request is routed, but not changed in the To header field if the two were initially identical. Thus, it may be desirable for privacy reasons to create a To header field that differs from the Request-URI. However, how the privacy of the SIP user can be provided has subsequently been described in other RFCs updating RFC 3261.

20.2 Privacy Mechanism in SIP

20.2.1 Background

example, the Contact header field contains a SIP URI, one that is commonly as revealing as the AOR in the From header field. In some headers, the originating UA can conceal identity information as a matter of local policy without affecting the operation of the SIP. However, certain headers are used in the routing of subsequent messages in a dialog, and must therefore be populated with functional data.

The privacy problem is further complicated by proxy servers (also referred to in this document as “intermediaries” or “the network”) that add headers of their own, such as the Record-Route and Via headers. Information in these headers could inadvertently reveal something about the originator of a message; for example, a Via header might reveal the service provider through whom the user sends requests, which might, in turn, strongly hint at the user’s identity to some recipients. For these reasons, the participation of intermediaries is also crucial to providing privacy in SIP.

RFC 3323, which is described here, addresses the SIP privacy problems and defines new mechanisms for the SIP in support of privacy. Specifically, UA and privacy service behavior guidelines are provided for the creation of messages that do not divulge personal identity information. In addition, the “privacy service” logical role for intermediaries is defined, meeting privacy requirements that UAs cannot satisfy themselves. Mechanisms are described by which a user can request particular functions from a privacy service. Privacy is defined as the withholding of the identity of a person (and related personal information) from one or more parties in an exchange of communications, specifically a SIP dialog. These parties potentially include the intended destination(s) of messages and/or any intermediaries handling these messages. As identity is defined, withholding the identity of a user will, among other things, render the other parties in the dialog unable to send new SIP requests to the user outside of the context of the current dialog.

Two complementary principles have been used in designing the privacy mechanism in RFC 3323: users are empowered to hide their identity and related personal information when they issue requests; however, intermediaries and designated recipients of requests are entitled to reject requests whose originator cannot be identified. The privacy properties of only those specific headers enumerated in the core SIP specification (RFC 3261), as opposed to headers defined by
In SIP, identity is most commonly carried in the form of a SIP any existing or planned extension, are discussed in this docu-
Uniform Resource Identifier (URI) and an optional display ment—however, the privacy mechanisms described in this
name. A SIP AOR has a form similar to an e-mail address document can be extended to support extensions. There are
with a SIP URI scheme (e.g., sip:[email protected]). A dis- other aspects of the general privacy problem for SIP that are
play name is a string containing a name for the identified user not addressed by RFC 3323. Most significantly, the mecha-
(e.g., “Alice”). SIP identities of this form commonly appear in nisms for managing the confidentiality of SIP headers and
the To and From header fields of SIP requests and responses. bodies, as well the security of session traffic, are not reconsid-
A user may have many identities that they use in different ered here. These problems are sufficiently well addressed in
contexts. There are numerous other places in SIP messages the baseline SIP specification and related documents, and no
in which identity-related information can be revealed. For new mechanisms are required.
20.2.2 Varieties of Privacy

A user may possess many identities that are used in various contexts; generally, identities are AORs that are bound to particular registrars (operated by the administrators of a domain) with whom SIP UAs register. The operators of these domains may be employers, service providers, or unaffiliated users themselves. When a user voluntarily asserts an identity in a request, they are claiming that they can receive requests sent to that identity in that domain. Strictly speaking, privacy entails the restriction of the distribution of a specific identity and related personal information from some particular party or parties that are potentially recipients of the message. In particular, there are scenarios in which a party desiring anonymity may send a message and withhold an identity from the final destination(s) while still communicating an identity to one or more intermediaries; send a message and withhold identity from some or all intermediaries, but still communicate an identity end-to-end to the final destination(s); or withhold identity from both intermediaries and final destination(s).

The result of withholding an identity is that the parties in question would be unable, for example, to attempt to initiate a new dialog with the anonymous party at a later time. However, the anonymous party still must be capable of receiving responses and new requests during the dialog in which it is participating. It may be desirable to restrict identity information on both requests and responses. Initially, it might seem unusual to suggest that a response has privacy concerns—presumably, the originator of the request knows who they were attempting to contact, so the identity of the respondent can hardly be confidential. However, some personal information in responses (such as the contact address at which the respondent is currently registered) is subject to privacy concerns and can be addressed by these mechanisms.

20.2.2.1 User Necessity for Privacy

Users may wish for identity information to be withheld from a given party for any number of reasons; for example, users might want to contact a particular party without revealing their identity to impart information with which they would not like to be associated; users might fear that the exposure of their identity or personal information to some networks or destinations will make them a target for unsolicited advertising, legal censure, or other undesirable consequences; or users might want to withhold from participants in a session the identity by which they are known to network intermediaries for the purposes of billing and accounting.

When a UA decides to send a request through a proxy server, it may be difficult for the originator to anticipate the final destination of that message. For that reason, users are advised not to base their estimation of their privacy needs on where they expect a message will go. For example, if a user sends a request to a telephone number, they may believe that the final destination of the request will be a station in the public switched telephone network (PSTN) that is unable to inspect, say, SIP Contact headers, and therefore assume that it is safe to leave such headers in the clear; however, such a request might very well end up being retargeted by the network to a native SIP end point to which Contact headers are quite legible.

RFC 3323 describes three degrees of privacy—one level of user-provided privacy, and two levels of network-provided privacy (header privacy and session privacy). How much privacy does a user need for any given session? Generally, if a user is seeking privacy, they are going to need as much of it as they can get. However, if a user knows of no privacy service, they must be content with user-provided privacy alone. Similarly, if a user knows of an anonymization service that can provide session privacy but is unable to secure session traffic to prevent the anonymizer from possibly eavesdropping on the session, they might judge the loss of session privacy to be the lesser evil. The user might also be aware of exceptional conditions about the architecture in which the UA is deployed that may obviate one or more privacy concerns.

A user may not always be the best judge of when privacy is required even under ideal circumstances, and thus privacy may in some architectures be applied by intermediaries without the user’s explicit per-message request. By sending a request through intermediaries that can provide a privacy role, the user tacitly permits privacy functions to be invoked as needed. It is also important that users understand that intermediaries may be unable to provide privacy functions requested by users. Requests for privacy may not be honored due to legal constraints, unimplemented or misconfigured features, or other exceptional conditions.

Note that just as it is the prerogative of users to conceal their identity, so it must also be the prerogative of proxy servers and other users to refuse to process requests from users whom they cannot identify. Therefore, users should not just automatically withhold their identity for all requests and responses—inability to ascertain the identity of the originator of the request will frequently be grounds for rejection. Privacy should only be requested when the user has a need for it. Further to this point, withholding some information in signaling might not be necessary for all UAs to ensure privacy. For example, UAs may acquire their IP addresses and host names dynamically, and these dynamic addresses may not reveal any information about the user whatsoever. In these cases, restricting access to host names is unnecessary.

20.2.2.2 User-Provided Privacy

There is a certain amount of privacy that a UA can provide itself. For example, the baseline SIP specification permits
a UA to populate the From header field of a request with an anonymous value. Users can take similar steps to avoid revealing any other unnecessary identity information in related SIP headers. A user may have different privacy needs for a message if it traverses intermediaries rather than going directly end-to-end. A user may attempt to conceal things from intermediaries that are not concealed from the final destination, and vice versa. For example, using baseline SIP mechanisms, a UA can encrypt SIP bodies end-to-end to prevent intermediaries from inspecting them. If a SIP message will not pass through intermediaries, however, this step might not be necessary (i.e., lower-layer security, without the addition of security for SIP bodies, could be sufficient). Also note that if a dialog goes directly end-to-end between participants, it will not be possible to conceal the network addresses of the participants.

20.2.2.3 Network-Provided Privacy

If a user is sending a request through intermediaries, a UA can conceal its identity to only a limited extent without the intermediaries’ cooperation. Also, some information can only be concealed from destination end points if an intermediary is entrusted to remove it. For these reasons, a user must have a way to request privacy from intermediaries, a means that allows users both to signal some indications of the desired privacy services, and to ensure that their call is routed to an intermediary that is capable of providing these services. A user may be aware of a specific third-party anonymizing host, one with which they have a preexisting relationship, or a user may request that their local administrative domain provide privacy services. Intermediaries may also be empowered to apply privacy to a message without any explicit signaling from the originating user, since UAs may not always be cognizant or capable of requesting privacy when it is necessary.

20.2.4 UA Behavior Constructing Private Messages

Privacy starts with the UA. The bulk of the steps that are required to conceal private information about the sender of a message are, appropriately enough, the sender’s responsibility. The following SIP headers, when generated by a UA, can directly or indirectly reveal identity information about the originator of a message: From, Contact, Reply-To, Via, Call-Info, User-Agent, Organization, Server, Subject, Call-ID, In-Reply-To, and Warning. Note that the use of an authentication system, such as the SIP Digest authentication method described in RFC 3261 (see Section 19.4), also usually entails revealing identity to one or more parties.

The first and most obvious step is that UAs should not include any optional headers that might divulge personal information; there is certainly no reason for a user seeking privacy to include a Call-Info. Secondly, the user should populate URIs throughout the message in accordance with the guidelines given below. For example, users should create an anonymous From header field for the request. Finally, users may also need to request certain privacy functions from the network, as described in the subsequent sections. The Call-ID header, which is frequently constructed in a manner that reveals the IP address or host name of the originating client, requires special mention. UAs should substitute for the IP address or host name that is frequently appended to the Call-ID value a suitably long random value (the value used as the “tag” for the From header of the request might even be reused). Note that if the user wants to conceal any of the above headers from intermediaries alone, without withholding them from the final destination of the message, users may also place legitimate values for these headers in encapsulated “message/sip” Secure/Multipurpose Internet Mail Extension (S/MIME) bodies as described in RFC 3261 (see Section 19.6).
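The UA-side steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (random_token, anonymous_from, private_call_id, strip_optional_headers) are hypothetical, the anonymous URI is the form RFC 3323 recommends for the From header, and the set of optional headers to drop is an assumed selection from the list of identity-revealing headers given in the text.

```python
import secrets

# Anonymous URI form recommended by RFC 3323: "anonymous" user, reserved ".invalid" host.
ANONYMOUS_URI = "sip:anonymous@anonymous.invalid"

# Assumed selection of optional headers a privacy-seeking UA can simply omit.
# The text names Call-Info explicitly; the rest come from its list of revealing headers.
OPTIONAL_REVEALING = {"call-info", "user-agent", "organization",
                      "subject", "reply-to", "in-reply-to"}


def random_token(nbytes: int = 16) -> str:
    """Return a hex string built from nbytes cryptographically random bytes."""
    return secrets.token_hex(nbytes)


def anonymous_from(tag: str) -> str:
    """Build an anonymous From header; every From still needs a unique tag."""
    return f'From: "Anonymous" <{ANONYMOUS_URI}>;tag={tag}'


def private_call_id(token: str) -> str:
    """Build a Call-ID that carries no IP address or host name."""
    return f"Call-ID: {token}"


def strip_optional_headers(headers: dict) -> dict:
    """Drop optional headers that could divulge personal information."""
    return {name: value for name, value in headers.items()
            if name.lower() not in OPTIONAL_REVEALING}


tag = random_token(8)
print(anonymous_from(tag))
print(private_call_id(random_token()))
print(strip_optional_headers({"To": "<sip:bob@example.com>", "Subject": "lunch"}))
```

Headers that are needed for routing (Contact, Via) are deliberately left alone here; concealing those requires help from a privacy service.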
(especially in applications such as instant messaging or Internet gaming) the use of such aliases is unlikely to provide a cause for distrust. It is recommended that UAs seeking anonymity use a display name of “Anonymous.”

20.2.4.1.2 URI User Names

The structure of a URI itself can reveal or conceal a considerable amount of personal information. Consider the difference between sip:[email protected] and sip:a0017@anonymous-sip.com. From the former, the full name and employer of the party in question can easily be guessed. From the latter, you learn nothing other than that the party desires anonymity. In some cases, sufficient anonymity can be achieved by selecting an oblique URI. Today, the SIP specification recommends a URI with “anonymous” in the user portion of the From header. In some URIs, such as those that appear in Contact headers, it may also make sense to omit the user name altogether, and provide only a host name, like sip:anonymous-sip.com.

20.2.4.1.3 URI Host Names and IP Addresses

It is assumed by this document that the user that requests privacy wishes to receive future requests and responses within this dialog, but does not wish to reveal an identity that could be used to send new requests to him outside the scope of this dialog. For that reason, a different treatment must be recommended for URIs that are used in the context of routing further requests in the dialog, as opposed to routing new requests outside the context of the dialog. For headers indicating how the user would like to be contacted for future sessions (such as the From header), it might not be immediately obvious why changing the host name would be necessary—if the user name is “anonymous,” requests will not be routable to the anonymous user.

Sometimes, merely changing the user name will not be enough to conceal a user’s identity. A user’s SIP service provider might decisively reveal a user’s identity (if it reflected something like a small company or a personal domain). So in this case, even though the URI in the From header would not dereference to the anonymous user, humans might easily guess the user’s identity and know the proper form of their AOR. For these reasons, the host name value “anonymous.invalid” should be used for anonymous URIs (see RFC 2606 for more information about the reserved “invalid” Domain Name System [DNS] top-level domain). The full recommended form of the From header for anonymity is as follows (note that this From header, like all others, must contain a valid and unique “tag=” parameter):

    From: "Anonymous" <sip:anonymous@anonymous.invalid>;tag=1928301774

For headers indicating how further requests in the current dialog should be routed (namely the Contact header, Via header, and session information in the SDP), there seems to be little that a user can do to disguise the existing URI, because users must provide a value that will allow them to receive further requests. In some cases, disguising or failing to provide the user name, as described above, may create some level of privacy; however, the host name provides a more significant obstacle. Is there much additional privacy in using an IP address rather than a host name? It does prevent someone who casually inspects a message from gathering information that they might see otherwise.

However, reverse-resolving such addresses is generally trivial, and substituting an IP address for a host name could introduce some complications, for example, due to network address translation and firewall traversal concerns. Headers used in routing may also rely on certain DNS practices to provide services that would be lost if an IP address is used in place of a host name. This document thus recommends that the host portion of URIs that are used in the routing of subsequent requests, such as URIs appearing in the Contact header, should not be altered by the UA owing to privacy considerations. If these headers require anonymization, the user requests that service from an intermediary, namely a privacy service. Note that many of the considerations regarding the Contact header above apply equally well to SIP headers in which a host name, rather than a URI, is used for some routing purpose (namely the Via header).

20.2.5 UA Behavior Expressing Privacy Preferences

There are some headers that a UA cannot conceal itself, because they are used in routing, that could be concealed by an intermediary that subsequently takes responsibility for directing messages to and from the anonymous user. The UA must have some way to request such privacy services from the network. For that purpose, RFC 3323 defines a new SIP header (see Section 2.8), Privacy, that can be used to specify privacy handling for requests and responses. The syntax of the Privacy header is as follows:

    Privacy-hdr = "Privacy" HCOLON priv-value *(";" priv-value)
    priv-value  = "header" / "session" / "user" / "none" / "critical" / token

UAs should include a Privacy header when network-provided privacy is required. Note that some intermediaries may also add the Privacy header to messages, including privacy services. However, such intermediaries should only do so if they are operating at a user’s behest, for example, if a user has an administrative arrangement with the operator of the intermediary that it will add such a Privacy header. An intermediary
must not modify the Privacy header in any way if the “none” priv-value is already specified. The values of priv-value today are restricted to the above options, although further options can be defined as appropriate, as described in subsequent sections. Each legitimate priv-value can appear zero or one time in a Privacy header. The current values are as follows:

◾ header: The user requests that a privacy service obscure those headers that cannot be completely expunged of identifying information without the assistance of intermediaries (such as Via and Contact). Also, no unnecessary headers should be added by the service that might reveal personal information about the originator of the request.
◾ session: The user requests that a privacy service provide anonymization for the session(s) (described, e.g., in an SDP used in the SIP message body) initiated by this message. This will mask the IP address from which the session traffic would ordinarily appear to originate. When session privacy is requested, UAs must not encrypt SDP bodies in messages. Note that requesting session privacy in the absence of any end-to-end session encryption raises some serious security concerns.
◾ user: This privacy level is usually set only by intermediaries, in order to communicate that user-level privacy functions must be provided by the network, presumably because the UA is unable to provide them. UAs may, however, set this privacy level for REGISTER requests, but should not set “user”-level privacy for other requests.
◾ none: The user requests that a privacy service apply no privacy functions to this message, regardless of any preprovisioned profile for the user or default behavior of the service. UAs can specify this option when they are forced to route a message through a privacy service that will, if no Privacy header is present, apply some privacy functions that the user does not desire for this message. Intermediaries must not remove or alter a Privacy header whose priv-value is “none.” UAs must not populate any other priv-values (including “critical”) in a Privacy header that contains a value of “none.”
◾ critical: The user asserts that the privacy services requested for this message are critical, and that, therefore, if these privacy services cannot be provided by the network, this request should be rejected. Criticality cannot be managed appropriately for responses.

When a Privacy header is constructed, it must consist of either the value “none,” or one or more of the values “user,” “header,” and “session” (each of which must appear at most once), which may, in turn, be followed by the “critical” indicator. The settings for the Privacy header for different SIP methods are described in Section 2.8.

20.2.6 UA Behavior Routing Requests to Privacy Services

The most obvious way for a UA to invoke the privacy function is to direct a request through an intermediary known to act as a privacy service. Doing so traditionally entails the configuration of preloaded Route headers that designate the privacy service. It is recommended that service providers couple the privacy service function with a local outbound proxy. Users can thereby send their messages that request privacy through their usual outbound route. Users should not assume, however, that the administrative domain that is the destination of the request would be willing and able to perform the privacy service function on their behalf. If the originating user wishes to keep their local administrative domain a secret, then they must use a third-party anonymization service outside of any of the principal administrative domains associated with the session.

It is highly recommended that UAs use network or transport layer security, such as TLS, when contacting a privacy service. Ideally, users should establish a direct (i.e., single preloaded Route header) connection to a privacy service; this will both allow the user to inspect a certificate presented by the privacy service, and will provide confidentiality for requests that will reduce the chances that the information that the privacy service will obscure is revealed before a message arrives at the privacy service. By establishing a direct connection to a privacy service, the user also eliminates the possibility that intermediaries could remove requests for privacy. If a direct connection is impossible, users should use a mechanism like SIP Security (SIPS) to guarantee the use of lower-layer security all the way to the privacy service.

If a UA believes that it is sending a request directly to a privacy service, it should include a Proxy-Require header containing a new option tag, “privacy,” especially when the “critical” priv-value is present in the Privacy header. That way, in the unlikely event that the UA sends a request to an intermediary that does not support the extensions described in this document, the request will fail. Note that because of special privacy service behavior (described in Section 20.2.8), no subsequent intermediaries in the signaling path of the request will also need to support the “privacy” option tag—once the privacy service has fulfilled all the required privacy functions, the “privacy” option tag is removed from the Proxy-Require header.

20.2.7 UA Behavior Routing Responses to Privacy Services

Making sure that responses will go through a privacy service is a little bit trickier. The path traversed by SIP responses is the same as the path over which the request traveled. Thus, the responding UA, for example, cannot force a privacy service to
be injected in the response path after it has received a request. What a responding UA can do, however, is ensure that the path by which requests reach them traverses their privacy service. In some architectures, the privacy service function will be fulfilled by the same server to which requests for the local administrative domain are sent, and hence it will automatically be in the path of incoming requests. However, if this is not the case, the user will have to ensure that requests are directed through a third-party privacy service.

One way to accomplish this is to procure an “anonymous callback” URI from the third-party service and to distribute this as an AOR. A privacy service provider might offer these anonymous callback URIs to users in the same way that an ordinary SIP service provider grants AORs. The user would then register their normal AOR as a contact address with the third-party service. Alternatively, a UA could send REGISTER requests through a privacy service with a request for “user”-level privacy. This will allow the privacy service to insert anonymous Contact header URIs. Requests sent to the user’s conventional AOR would then reach the user’s devices without revealing any usable contact addresses.

Finally, a user might generate a Call Processing Language (CPL) script defined in RFC 3880 that will direct requests to an anonymization service. Users are also advised to use transport or network layer security in the response path. This may involve registering a SIPS URI and/or maintaining persistent TLS connections over which their UA receives requests. Privacy services may, in turn, route requests through other privacy services. This may be necessary if a privacy service does not support a particular privacy function but knows of a peer that does. Privacy services may also cluster themselves into networks that exchange session traffic between one another in order to further disguise the participants in a session, although no specific architecture or method for doing so is described in this document.

a user is sending messages from a legacy client that does not support the Privacy header, or a UA that does not allow the user to configure the values of headers that could reveal personal information.

However, if the Privacy header value of “none” is specified in a message, privacy services must not perform any privacy function and must not remove or modify the Privacy header. Privacy services must implement support for the “none” and “critical” privacy tokens, and may implement any of the other privacy levels described in Section 20.2.5 as well as any extensions that are not detailed in this document. In some cases, the privacy service will not be capable of fulfilling the requested level of privacy. If the “critical” privacy level is present in the Privacy header of a request and the privacy service is incapable of performing all of the levels of privacy specified in the Privacy header, then it MUST fail the request with a 500 Server Error response code.

The reason phrase of the status line of the response should contain appropriate text indicating that there has been a privacy failure as well as an enumeration of the priv-value(s) that were not supported by the privacy service (the reason phrase should also respect any Accept-Language header in the request, if possible). When a privacy service performs one of the functions corresponding to a privacy level listed in the Privacy header, it should remove the corresponding priv-value from the Privacy header—otherwise, any other privacy service involved with routing this message might unnecessarily apply the same function, which in many cases would be undesirable. When the last priv-value (not counting “critical”) has been removed from the Privacy header, the entire Privacy header must be removed from a message. When the privacy service removes the entire Privacy header, if the message is a request, the privacy service must also remove any “privacy” option tag from the Proxy-Require header field of the request.
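The handling rules for the Privacy header at a privacy service can be sketched as follows. This is a minimal illustration, not an implementation of a SIP stack: the function names parse_privacy and process_request, the tuple return shape, and the supported set are all assumptions made for the example, and priv-values are compared case-insensitively as a simplification. Only the decision logic comes from the text ("none" must stand alone and be left untouched, each value appears at most once, "critical" plus an unsupported level yields a 500 response, performed levels are removed, and the "privacy" option tag is dropped once the whole header is gone).

```python
def parse_privacy(value: str) -> list:
    """Split a Privacy header value into priv-values and check the construction rules."""
    vals = [v.strip().lower() for v in value.split(";") if v.strip()]
    if len(vals) != len(set(vals)):
        raise ValueError("a priv-value may appear at most once")
    if "none" in vals and len(vals) > 1:
        raise ValueError('"none" must not be combined with other priv-values')
    return vals


def process_request(privacy_value: str, proxy_require: list, supported: set):
    """Decide what a privacy service does with a request.

    Returns (action, privacy_or_reason, proxy_require), where action is
    "forward" or "reject". `supported` is the set of privacy levels this
    hypothetical service can perform.
    """
    vals = parse_privacy(privacy_value)
    if vals == ["none"]:
        # No privacy function is applied and the header is left untouched.
        return ("forward", privacy_value, proxy_require)

    levels = [v for v in vals if v != "critical"]
    unsupported = [v for v in levels if v not in supported]
    if "critical" in vals and unsupported:
        # The reason phrase enumerates the priv-values that could not be honored.
        reason = "500 Privacy failure: " + ", ".join(unsupported)
        return ("reject", reason, proxy_require)

    # Levels that were performed are removed; unsupported ones stay for a later service.
    if not unsupported:
        # Entire header removed, so the "privacy" option tag goes too.
        return ("forward", None, [t for t in proxy_require if t != "privacy"])
    return ("forward", ";".join(unsupported), proxy_require)


print(process_request("header;session;critical", ["privacy"], {"header"}))
print(process_request("header;critical", ["privacy"], {"header", "session"}))
print(process_request("header;session", [], {"header"}))
```

Returning the remaining header value rather than mutating a message keeps the sketch independent of any particular SIP library.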
to the request before its arrival at the privacy service (a practice referred to as “Via stripping”) and then should add a single Via header representing themselves. Note that the bottommost such Via header field value in a request contains an IP address or host name that designates the originating client, and subsequent Via header field values may indicate hosts in the same administrative domain as the client. No Via stripping is required when handling responses.

Contact headers are added by UAs to both requests and responses. A privacy service should replace the value of the Contact header in a message with a URI that does not dereference to the originator of the message such as the anonymous URI. The URI that replaces the existing Contact header field value must dereference to the privacy service. In a manner similar to Via stripping, a privacy service should also strip any Record-Route headers that have been added to a request before it reaches the privacy service—although note that no such headers will be present if there is only one hop between the originating UA and the privacy service, as is recommended above. Such Record-Route headers might also divulge information about the administrative domain of the client.

For the purposes of this document, it is assumed that the privacy service has locally persisted the values of any of the above headers that are so removed, which requires the privacy service to keep a pretty significant amount of state on a per-dialog basis. When further requests or responses associated with the dialog reach the privacy service, it MUST restore values for the Via, Record-Route/Route, or Contact headers that it has previously removed in the interests of privacy. There may be alternative ways (outside the scope of this document) to perform this function that do not require keeping state in the privacy service (usually means that involve encrypting and persisting the values in the signaling somehow).

The following procedures are recommended for handling the Record-Route header field of requests and responses, which provides special challenges to a privacy service:

◾ When a privacy service is processing (on behalf of the originator) a request that contains one or more Record-Route header field values, the privacy service must strip these values from the request and remember both the dialog identifiers and the ordered Record-Route header field values. As described above, it must also replace the Contact header field with a URI indicating itself. When a response with the same dialog identifiers arrives at the privacy service, the privacy service must reapply any Record-Route header field values to the response in the same order, and it must then add a URI representing itself to the Record-Route header field of the response.
◾ If the response contains Record-Route header field values of its own, these must also be included (in order) in the Record-Route header field after the URI representing the privacy service. Note that when a privacy service is handling a request and providing privacy on behalf of the destination of the request, providing privacy for Record-Route headers downstream of the privacy service is significantly more complicated. This document recommends no way of statefully restoring those headers if they are stripped.

20.2.8.2 Session Privacy

If a privacy level of “session” is requested, then the user has requested that the privacy service anonymize the session traffic (e.g., for SIP telephony calls, the audio media) associated with this dialog. The SIP specification dictates that intermediaries such as proxy servers cannot inspect and modify message bodies. The privacy service logical role must therefore act as a B2BUA to provide media privacy, effectively terminating and reoriginating the messages that initiate a session (although in support of session privacy, the privacy service does not need to alter headers characterizing the originator or destination when the request is reoriginated).

To introduce an anonymizer for session traffic, the privacy service needs to control a middle box (RFC 3303) that can provide an apparent source and sink for session traffic. The details of the implementation of an anonymizer, and the modifications that must be made to the SDP of the SIP message bodies in the messages that initiate a session, are outside the scope of this document. The risk, of course, of using such an anonymizer is that the anonymizer itself is party to your communications. For that reason, requesting session-level privacy without resorting to some sort of end-to-end security for the session traffic with RTP (see Section 7.5) media, for example, Secure Real-time Transport Protocol (see Section 7.3), is not recommended.

20.2.8.3 Applying User-Level Privacy Functions at a Privacy Service

If a privacy level of “user” is requested, then the originating user has requested that privacy services perform the user-level privacy functions described earlier. Note that the privacy service must remove any nonessential informational headers that have been added by the UA, including the Subject, Call-Info, Organization, User-Agent, Reply-To, and In-Reply-To. Significantly, user-level privacy could entail the modification of the From header, changing it from its original value to an anonymous value. Before the current issue of the SIP specification, the modification of the values of the To and From headers by intermediaries was not permitted, and would result in improper dialog matching by the end points. Currently, dialog matching uses only the tags in the
To and From headers, rather than the whole header fields. Thus, under the new rules, the URI values in the To and From headers themselves could be altered by intermediaries. However, some legacy clients might consider it an error condition if the value of the URI in the From header is altered between the request and the response.
Also, performing user-level privacy functions may entail the modification of the Call-ID header, since the Call-ID commonly contains a host name or IP address corresponding to the originating client. This field is essential to dialog matching, and it cannot be altered by intermediaries. Therefore, any time that a privacy service needs to modify any dialog-matching headers for privacy reasons, it should act as a transparent B2BUA, and it must persist the former values of the dialog-matching headers. These values must be restored in any messages that are sent to the originating UA.
20.2.8.4 Network-Asserted Identity Privacy
We have described different types of Network-Asserted Identity (NAI) in Section 10.5. The means by which any privacy requirements in respect of the NAI are determined are outside the scope of this document. It shall be possible to indicate within a message containing a NAI that this NAI is subject to a privacy requirement that prevents it being passed to other users. This indication should not carry any semantics as to the reason for this privacy requirement. It shall be possible to indicate that the user has requested that the NAI not be passed to other users. This is distinct from the above indication, in that it implies specific user intent with respect to the NAI. The mechanism shall support Trust Domain policies where the above two indications are equivalent (i.e., the only possible reason for a privacy requirement is a request from the user), and policies where they are not.
In this case, the NAI specification shall require that the mechanism described earlier shall not be used; that is, a trusted node shall not pass the identity to a node it does not trust. However, the mechanism of “sending of NAI to entities outside a Trust Domain” described earlier may not be used to transfer the identity within the trusted network. Note that “anonymity” requests from users or subscribers may well require functionality in addition to the above handling of NAIs. Such additional functionality is beyond the scope of this document.
20.2.9 Location Information Privacy
RFC 6442 (see Section 9.10), which specifies the target’s location information creation and distribution, has described how to maintain the privacy of the target. Location information is considered by most to be highly sensitive information, requiring protection from eavesdropping and altering in transit. RFC 3693 (see Sections 2.8 and 9.10.1) originally articulated rules to be followed by any protocol wishing to be considered a “Using Protocol,” specifying how a transport protocol meets those rules. RFC 6280 updates the guidance in RFC 3693 to include subsequently introduced entities and concepts in the geolocation architecture. RFC 5606 (see Section 2.8) explores the difficulties inherent in mapping the GEOPRIV architecture onto SIP elements. In particular, the difficulties of defining and identifying recipients of location information are given in that document, along with guidance provided earlier on the use of location-by-reference mechanisms to preserve confidentiality of location information from unauthorized recipients.
In a SIP deployment, location information may be added by any of several elements, including the originating UA or a proxy server. In all cases, the Rule Maker associated with that location information decides which entity adds location information and what access control rules apply. For example, a SIP UA that does not support the Geolocation header may rely on a proxy server under the direction of the Rule Maker adding a Geolocation header with a reference to location information. The manner in which the Rule Maker operates on these devices is outside the scope of this document. The manner in which SIP implementations honor the Rule Maker’s stipulations for access control rules (including retention and retransmission) is application specific and not within the scope of SIP operations. Entities in SIP networks that fulfill the architectural roles of the location server or location recipient treat the privacy rules associated with location information per the guidance provided in RFC 6280. In particular, RFC 4119 (see Section 2.8) gives guidance for handling access control rules; SIP implementations should furthermore consult the recommendations in RFC 5606 (see Section 2.8).
20.2.10 Security Considerations
Messages that request privacy require confidentiality and integrity. Without integrity, the requested privacy functions could be downgraded or eliminated, potentially exposing identity information. Without confidentiality, eavesdroppers on the network (or any intermediaries between the user and the privacy service) could see the very personal information that the user has asked the privacy service to obscure. All of the network-provided privacy functions in this document entail a good deal of trust for the privacy service. Users should only trust privacy services that are somehow accountable to them.
Operators of privacy services should be aware that in the eyes of downstream entities, a privacy service will be the only source to which anonymous messages can be traced. Note that authentication mechanisms, including the Digest authentication method described in the SIP specification, are outside the scope of the privacy considerations in this
document. Revealing identity through authentication is highly selective and may not result in the compromise of any private information. Obviously, users that do not wish to reveal their identity to servers that issue authentication challenges may elect not to respond to such challenges.
20.3 Asserted and Preferred Identity for Privacy in SIP
20.3.1 Background
Various providers offering a telephony service over IP networks have selected SIP as a call establishment protocol. Their environments require a way for trusted network elements operated by the service providers (e.g., SIP proxy servers) to communicate the identity of the subscribers to such a service, yet also need to withhold this information from entities that are not trusted when necessary. Such networks typically assume some level of transitive trust among providers and the devices they operate. These networks need to support certain traditional telephony services, and meet basic regulatory and public safety requirements. These include Calling Identity Delivery services, Calling Identity Delivery Blocking, and the ability to trace the originator of a call. While baseline SIP can support each of these services independently, certain combinations cannot be supported without the extensions described in this document. For example, a caller that wants to maintain privacy and consequently provides limited information in the SIP From header field will not be identifiable by recipients of the call unless they rely on some other means to discover the identity of the caller. Masking identity information at the originating UA will prevent certain services, for example, call trace, from working in the PSTN or being performed at intermediaries not privy to the authenticated identity of the user.
Providing privacy in a SIP network is more complicated than in the PSTN. In SIP networks, the participants in a session are typically able to exchange IP traffic directly without involving any SIP service provider. The IP addresses used for these sessions may themselves reveal private information. A general-purpose mechanism for providing privacy in a SIP environment is provided in RFC 3323 (see Section 20.2). This document applies that privacy mechanism to the problem of NAI.
20.3.2 P-Asserted-Identity and P-Preferred-Identity for Privacy
The P-Asserted-Identity defined in SIP (RFC 3325, see Sections 2.8 and 10.4) enables a network of trusted SIP servers to assert the identity of end users or end systems, and to convey indications of end-user requested privacy. The use of this header is only applicable inside a “Trust Domain” as defined in short-term requirements for NAI specified in RFC 3324 (see Section 10.5). Nodes in such a Trust Domain are explicitly trusted by its users and end systems to publicly assert the identity of each party, and to be responsible for withholding that identity outside of the Trust Domain when privacy is requested. The means by which the network determines the identity to assert is outside the scope of this document (although it commonly entails some form of authentication).
A key requirement of RFC 3324 is that the behavior of all nodes within a given Trust Domain “T” is known to comply with a certain set of specifications known as “Spec(T).” Spec(T) must specify behavior for the following:
◾◾ The manner in which users are authenticated
◾◾ The mechanisms used to secure the communication among nodes within the Trust Domain
◾◾ The mechanisms used to secure the communication between UAs and nodes within the Trust Domain
◾◾ The manner used to determine which hosts are part of the Trust Domain
◾◾ The default privacy handling when no Privacy header field is present
◾◾ That nodes in the Trust Domain are compliant with SIP (RFC 3261)
◾◾ That nodes in the Trust Domain are compliant with this document
◾◾ Privacy handling for identity as described in the subsequent section
◾◾ Intermediaries including proxy servers within a Trust Domain working as SIP B2BUAs for supporting privacy in SIP as specified in RFC 3323 (see Section 20.2)
◾◾ The meanings of the terms Identity, NAI, and Trust Domain defined in RFC 3324 (see Section 10.5)
An example of a suitable Spec(T) is shown later.
This document does not offer a general privacy or identity model suitable for interdomain use or use in the Internet at large. Its assumptions about the trust relationship between the user and the network may not apply in many applications. For example, these extensions do not accommodate a model whereby end users can independently assert their identity by use of the extensions defined here. Furthermore, since the asserted identities are not cryptographically certified, they are subject to forgery, replay, and falsification in any architecture that does not meet the requirements of RFC 3324 (see Section 10.5).
The asserted identities also lack an indication of who specifically is asserting the identity, and so it must be assumed that the Trust Domain is asserting the identity. Therefore, the information is only meaningful when securely received from a node known to be a member of the Trust Domain. Despite these limitations, there are sufficiently useful
specialized deployments that meet the assumptions described above, and can accept the limitations that result, to warrant informational publication of this mechanism. An example deployment would be a closed network that emulates a traditional circuit switched telephone network. The mechanism described here relies on a new header field called “P-Asserted-Identity” that contains a URI (commonly a SIP URI) and an optional display name, for example:
P-Asserted-Identity: "Brian.Johnson" sip:[email protected]
A proxy server that handles a message can, after authenticating the originating user in some way (e.g., Digest authentication), insert such a P-Asserted-Identity header field into the message and forward it to other trusted proxies. A proxy that is about to forward a message to a proxy server or UA that it does not trust must remove all the P-Asserted-Identity header field values if the user requested that this information be kept private. The P-Preferred-Identity header is treated similarly for privacy. Although the syntax for these two headers is given in Section 2.4.1, both headers are explained again in detail later in this section for convenience.
determine if the user requested that asserted identity information be kept private.
20.3.4 Hints for Multiple Identities
If a P-Preferred-Identity header field is present in the message that a proxy receives from an entity that it does not trust, the proxy MAY use this information as a hint suggesting which of multiple valid identities for the authenticated user should be asserted. If such a hint does not correspond to any valid identity known to the proxy for that user, the proxy can add a P-Asserted-Identity header of its own construction, or it can reject the request (e.g., with a 403 Forbidden). The proxy MUST remove the user-provided P-Preferred-Identity header from any message it forwards. A UA only sends a P-Preferred-Identity header field to proxy servers in a Trust Domain; UAs must not populate the P-Preferred-Identity header field in a message that is not sent directly to a proxy that is trusted by the UA. Were a UA to send a message containing a P-Preferred-Identity header field to a node outside a Trust Domain, then the hinted identity might not be managed appropriately by the network, which could have negative ramifications for privacy.
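The proxy rules for P-Preferred-Identity hints and for withholding P-Asserted-Identity can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not an implementation taken from RFC 3325: headers are modeled as a plain dictionary of lists, the function names, the `Forbidden` exception, the `privacy_requested` flag, and the example URIs are all hypothetical, and authentication of the user is assumed to have happened already.

```python
class Forbidden(Exception):
    """Hypothetical stand-in for answering the request with 403 Forbidden."""


def assert_identity(headers, valid_identities, default_identity,
                    reject_unknown_hint=False):
    """Process a request received from an authenticated but untrusted UA.

    headers: dict mapping header name -> list of values (assumed model).
    valid_identities: identities the proxy knows to be valid for this user.
    default_identity: identity the proxy asserts when no usable hint exists.
    """
    out = {name: list(values) for name, values in headers.items()}
    # The user-provided hint is always removed before forwarding.
    hints = out.pop("P-Preferred-Identity", [])
    chosen = default_identity
    if hints:
        if hints[0] in valid_identities:
            chosen = hints[0]          # hint matches a valid identity
        elif reject_unknown_hint:
            raise Forbidden("403 Forbidden")
        # otherwise fall back to an identity of the proxy's own construction
    out["P-Asserted-Identity"] = [chosen]
    return out


def forward(headers, next_hop_trusted, privacy_requested):
    """Strip all asserted identities before leaving the Trust Domain
    when the user asked for them to be kept private."""
    out = {name: list(values) for name, values in headers.items()}
    if not next_hop_trusted and privacy_requested:
        out.pop("P-Asserted-Identity", None)
    return out


request = {"P-Preferred-Identity": ["sip:work@example.com"]}
known = {"sip:work@example.com", "sip:home@example.com"}
asserted = assert_identity(request, known, "sip:home@example.com")
outbound = forward(asserted, next_hop_trusted=False, privacy_requested=True)
```

Whether an unknown hint is replaced or rejected is a local policy choice in the text above, which is why the sketch exposes it as a parameter rather than fixing one behavior.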
example, these mechanisms provide no means by which end users can securely share identity information end-to-end without a trusted service provider. Identity information that the user designates as “private” can be inspected by any intermediaries participating in the Trust Domain. This information is secured by transitive trust, which is only as reliable as the weakest link in the chain of trust. When a trusted entity sends a message to any destination with that party’s identity in a P-Asserted-Identity header field, the entity must take precautions to protect the identity information from eavesdropping and interception, to protect the confidentiality and integrity of that identity information. The use of transport or network layer hop-by-hop security mechanisms, such as TLS or IPsec with appropriate cipher suites, can satisfy this requirement. Section 10.4 describes in detail the recommended use of the asserted identity.
20.4 Connected Identity for Privacy in SIP
20.4.1 Overview
SIP (RFC 3261) initiates sessions but also provides information on the identities of the parties at both ends of a session. Users need this information to help determine how to deal with communications initiated by SIP. The identity of the party who answers a call can differ from that of the initial called party for various reasons, such as call forwarding, call distribution, and call pick-up. Furthermore, once a call has been answered, a party can be replaced by a different party with a different identity for reasons such as call transfer, call park and retrieval, and other call services. Although, in some cases, there can be reasons for not disclosing these identities, it is desirable to have a mechanism for providing this information. This document extends the use of the From header field to allow it to convey what is commonly called “connected identity” information (the identity of the connected user) in either direction within the context of an existing INVITE-initiated dialog. It can be used to convey
◾◾ The callee identity to a caller when a call is answered
◾◾ The identity of a potential callee before answer
◾◾ The identity of a user that replaces the caller or callee following a call rearrangement such as call transfer carried out within the PSTN or within a B2BUA using third-party call control techniques
Note that the use of standard SIP call transfer techniques, involving the REFER method, leads to the establishment of a new dialog and hence the normal mechanisms for caller and callee identity apply. The provision of the identity of the responder in a response (commonly called “response identity”) is outside the scope of this document. Note that even if identity were to be conveyed somehow in a response, there would in general be difficulty in authenticating the UAS. Providing identity in a separate request allows normal authentication techniques to be used.
RFC 4916, described here, provides a means for a SIP UA that receives a dialog-forming request to supply its identity to the peer UA by means of a request in the reverse direction, and for that identity to be signed by an authentication service. Because of retargeting of a dialog-forming request (changing the value of the Request-URI), the UA that receives it (the UAS) can have a different identity from that in the To header field. The same mechanism can be used to indicate a change of identity during a dialog, for example, because of some action in the PSTN behind a gateway. RFC 4916 thereby normatively updates RFC 3261 for maintaining the privacy of the identity of the connected user, termed the connected identity.
20.4.2 Terminology
We are repeating the definitions of some terms specifically related to the connected identity for convenience, although these are available in Section 2.2:
◾◾ Caller: The user of the UA that issues an INVITE request to initiate a call
◾◾ Caller identity: The identity (AOR) of a caller
◾◾ Callee: The user of the UA that answers a call by issuing a 2xx response to an INVITE request
◾◾ Callee identity: The identity (AOR) of a callee
◾◾ Potential callee: The user of any UA to which an INVITE request is targeted, resulting in the formation of an early dialog; however, because of parallel or serial forking of the request, it is not necessarily the user that answers the call
◾◾ Connected user: Any user involved in an established call, including the caller, the callee, or any user that replaces the caller or callee following a call rearrangement such as call transfer
◾◾ Connected identity: The identity (AOR) of a connected user
20.4.3 Overview of Solution
A mid-dialog request is used to provide connected identity. The UA client (UAC) for that request inserts its identity in the From header field of the request. To provide authentication, the Identity header field specified in RFC 4474 (see Sections 2.8 and 19.4.8) is inserted by a suitable authentication service
on the path of the mid-dialog request. Unless provided at the UAC, the authentication service is expected to be at a proxy that record-routes and is able to authenticate the UAC. A request in the opposite direction to the INVITE request before or at the time the call is answered can indicate the identity of the potential callee or callee, respectively. A request in the same direction as the INVITE request before answer can indicate a change of caller. A request in either direction after answering can indicate a change of the connected user. In all cases, a dialog (early or confirmed) has to be established before such a request can be sent.
This solution uses the UPDATE method specified in RFC 3311 (see Section 15.4) for the request, or in some circumstances the re-INVITE method. To send the callee identity, the UAS for the INVITE request sends the UPDATE request after sending the 2xx response to the INVITE request and after receiving an ACK request. To send the potential callee identity, RFC 3262 (see Sections 2.5, 2.8.2, and 2.10) is expected to be supported. In this case, the UAS for the INVITE request sends the UPDATE request after receiving and responding to a PRACK request (which occurs after sending a reliable 1xx response to the INVITE request). The UPDATE request could also conceivably be used for other purposes; for example, it could be used during an early dialog to send the potential callee identity at the same time as an SDP offer for early media. To indicate a connected identity change during an established call, either the UPDATE method or the re-INVITE method can be used.
The re-INVITE method would be used if required for other purposes; for example, when a B2BUA performs transfer using third-party call control techniques specified in RFC 3725 (see Section 18.3), it has to issue a re-INVITE request without an SDP offer to solicit an SDP offer from the UA. This solution involves changing the URI (not the tags) in the To and From header fields of mid-dialog requests and their responses, compared with the corresponding values in the dialog-forming request and response. RFC 3261 already contemplated changing the To and From header field URIs.
RFC 4916, therefore, deprecates mandatory reflection of the original To and From URIs in mid-dialog requests and their responses, which constitutes a change to RFC 3261. RFC 4916 makes no provision for proxies that are unable to tolerate a change of URI, since changing the URI has been expected for a considerable time. To cater for any UAs that are not able to tolerate a change of URI, a new option tag “from-change” is introduced for providing a positive indication of support in the Supported header field. By sending a request with a changed From header field URI only to targets that have indicated support for this option, there is no need to send this option tag in a Require header field. In addition to allowing the From header field URI to change during a dialog to reflect the connected identity, this document also requires a UA that has received a connected identity in the URI of the From header field of a mid-dialog request to use that URI in the To header field of any subsequent mid-dialog request sent by that UA. In the absence of a suitable authentication service on the path of the mid-dialog request, the UAS will receive an unauthenticated connected identity (i.e., without a corresponding Identity header field). The implications of this are discussed in the subsequent section.
20.4.4 UA Behavior outside the Context of an Existing Dialog
20.4.4.1 Issuing an INVITE Request
When issuing an INVITE request, a UA compliant with this specification MUST include the “from-change” option tag in the Supported header field. Note that sending the “from-change” option tag does not guarantee that connected identity will be received in subsequent requests.
20.4.4.2 Receiving an INVITE Request
After receiving an INVITE request, a UA compliant with this specification must include the “from-change” option tag in the Supported header field of any dialog-forming response. Note that sending the “from-change” option tag does not guarantee that connected identity will be received in the event of a change of caller. After an early dialog has been formed, if the “from-change” option tag has been received in a Supported header field, the UA may issue an UPDATE request on the same dialog, subject to having sent a reliable provisional response to the INVITE request and having received and responded to a PRACK request. After a full dialog has been formed (after sending a 2xx final response to the INVITE request), if the “from-change” option tag has been received in a Supported header field and an UPDATE request has not already been sent on the early dialog, the UA must issue an UPDATE request on the same dialog. In either case, the UPDATE request must contain the callee’s (or potential callee’s) identity in the URI of the From header field (or an anonymous identity if anonymity is required). Note that even if the URI does not differ from that in the To header field URI of the INVITE request, sending a new request allows the authentication service to assert authentication of this identity, and confirms to the peer UA that the connected identity is the same as that in the To header field URI of the INVITE request.
20.4.5 Behavior of a UA Whose Identity Changes
If the “from-change” option tag has been received in a Supported header field during an established INVITE-initiated
dialog and if the identity associated with the UA changes (e.g., due to transfer) compared with the last identity indicated in the From header field of a request sent by that UA, the UA must issue a request on the same dialog containing the new identity in the URI of the From header field (or an anonymous identity if anonymity is required). For this purpose, the UA must use the UPDATE method unless for other reasons the re-INVITE method is being used at the same time.
20.4.6 General UA Behavior
20.4.6.1 Sending a Mid-Dialog Request
When sending a mid-dialog request, a UA MUST observe the requirements of RFC 4474 (see Sections 2.8 and 19.4.8) when populating the From header field URI, including provisions for achieving anonymity. This will allow an authentication service on the path of the mid-dialog request to insert an Identity header field. When sending a mid-dialog request, a UA must populate the To header field URI with the current value of the remote URI for that dialog, where this is subject to update in accordance with the rules of Section 20.4.6.2 (Section 4.4.2 of RFC 4916) rather than being fixed at the beginning of the dialog in accordance with RFC 3261. After sending a request with a revised From header field URI (i.e., revised compared with the URI sent in the From header field of the previous request on this dialog or in the To header field of the received dialog-forming INVITE request if no request has been sent), the UA MUST send the same URI in the From header field of any future requests on the same dialog, unless the identity changes again.
Also, the UA must be prepared to receive the revised URI in the To header field of subsequent mid-dialog requests and must also continue to be prepared to receive the old URI at least until a request containing the revised URI in the To header field has been received. The mid-dialog request can be rejected in accordance with RFC 4474 (see Sections 2.8 and 19.4.8) if the UAS does not accept the connected identity. If the UAC receives a 428, 436, 437, or 438 response to a mid-dialog request, it should regard the dialog as terminated in the case of a dialog-terminating request and should take no action in the case of any other request. Any attempt to repeat the request or send any other mid-dialog request is likely to result in the same response, since the UA has no control over actions of the authentication service.
20.4.6.2 Receiving a Mid-Dialog Request
If a UA receives a mid-dialog request from the peer UA, the UA can make use of the identity in the From header field URI (e.g., by indicating it to the user). The UA may discriminate between signed and unsigned identities. In the case of a signed identity, the UA should invoke a Verifier, described later, if it cannot rely on the presence of a Verifier on the path of the request. If a UA receives a mid-dialog request from the peer UA in which the From header field URI differs from that received in the previous request on that dialog or that sent in the To header field of the original INVITE request, and if the UA sends a 2xx response, the UA must update the remote URI for this dialog, as defined in RFC 3261 (see Section 3.6). This will cause the new value to be used in the To header field of subsequent requests that the UA sends, in accordance with the rules of Section 20.4.6.1 (Section 4.4.1 of RFC 4916). If any other final response is sent, the UA must not update the remote URI for this dialog.
20.4.7 Authentication Service Behavior
An authentication service must behave in accordance with RFC 4474 (see Sections 2.8 and 19.4.8) when dealing with mid-dialog requests. Note that RFC 4474 is silent on how to behave if the identity in the From header field is not one that the UAC is allowed to assert, and therefore it is a matter for local policy whether to reject the request or forward it without an Identity header field. Policy can be different for a mid-dialog request compared with other requests. Note that when UAs conform to this specification, the authentication service should (subject to the normal rules for authentication) be able to authenticate the sender of a request as being the entity identified in the From header field, and hence will be able to provide a signature for this identity. This is in contrast to UAs that do not support this specification, where retargeting and mid-dialog identity changes can render the From header field inaccurate as a means of identifying the sender of the request.
20.4.8 Verifier Behavior
When dealing with mid-dialog requests, an authentication service must behave in accordance with RFC 4474 (see Sections 2.8 and 19.4.8) updated as stated below. RFC 4474 states that it is a matter of policy whether to reject a request with a 428 Use Identity Header response if there is no Identity header field in the request. A UA may adopt a different policy for mid-dialog requests compared with other requests.
20.4.9 Proxy Behavior
A proxy that receives a mid-dialog request must be prepared for the To header field URI and/or the From header field URI to differ from those that appeared in the dialog-forming request and response. A proxy that is able to provide an authentication service for mid-dialog requests must record-route if Supported: from-change is indicated in the dialog-forming request received by the proxy from the UAC.
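The UA rules for mid-dialog requests in Sections 20.4.5 and 20.4.6 amount to a small piece of per-dialog state. The following is a minimal sketch under stated assumptions, not code from RFC 4916: the `Dialog` class, its field and method names, and the example URIs are hypothetical, tags and all other headers are omitted, and authentication of the identity is left out.

```python
from dataclasses import dataclass


@dataclass
class Dialog:
    """Hypothetical per-dialog state kept by a UA (tags omitted)."""
    local_uri: str                      # URI this UA sends in From
    remote_uri: str                     # URI this UA sends in To
    peer_supports_from_change: bool = False

    def next_request_uris(self):
        """From/To URIs for the next mid-dialog request this UA sends.

        To carries the current remote URI rather than the URI fixed
        when the dialog was formed.
        """
        return {"From": self.local_uri, "To": self.remote_uri}

    def change_local_identity(self, new_uri):
        """Adopt a new identity only if the peer indicated from-change."""
        if not self.peer_supports_from_change:
            return False
        # The same revised URI is then reused in all later requests.
        self.local_uri = new_uri
        return True

    def on_mid_dialog_request(self, from_uri, response_code):
        """Update the remote URI only when a 2xx response is sent and
        the received From URI differs from the stored remote URI."""
        accepted = 200 <= response_code < 300
        if accepted and from_uri != self.remote_uri:
            self.remote_uri = from_uri
            return True
        return False


dialog = Dialog("sip:alice@example.com", "sip:bob@example.com",
                peer_supports_from_change=True)
dialog.on_mid_dialog_request("sip:carol@example.com", 200)
uris = dialog.next_request_uris()   # To now carries the connected identity
```

Keeping the update inside `on_mid_dialog_request` makes the "any other final response" case explicit: the stored remote URI is simply left alone.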
<allOneLine>
Identity:
"xN6gCHR6KxGM+nyiEM13LcWgAFQD3lkni1DPk
wgadxh4BB7G+VwY13uRv5hbCI2VSvKuZ4LY
N0JNoe7v8VAzruKMyi4Bi4nUghR/fFGBrpBSjz
tmfffLTp6SFLxo9XQSVrkm1O4c/4UrKn2ej
Rz+5BULu9n9kWswzKDNjlYlmmc="
</allOneLine>
Identity-Info: <https://example.com/example.cer>;alg=rsa-sha1
Content-Type: application/sdp
Content-Length: 154

v=0
o=UserA 2890844526 2890844526 IN IP4 ua1.example.com
s=Session SDP
c=IN IP4 ua1.example.com
t=0 0
m=audio 49172 RTP/AVP 0
a=rtpmap:0 PCMU/8000

F3. 200 OK:

SIP/2.0 200 OK
<allOneLine>
Via: SIP/2.0/TLS proxy.example.com;
branch=z9hG4bK776asdhds;
received=192.0.2.2
</allOneLine>
<allOneLine>
Via: SIP/2.0/TLS ua1.example.com;
branch=z9hG4bKnashds8;
received=192.0.2.1
</allOneLine>
To: Bob <sip:[email protected]>;tag=2ge46ab5
From: Alice <sip:[email protected]>;tag=13adc987
Call-ID: [email protected]
CSeq: 1 INVITE
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, UPDATE
Supported: from-change
Contact: <sip:[email protected]>
Record-Route: <sip:proxy.example.com;lr>
Content-Type: application/sdp
Content-Length: 154

v=0
o=UserB 2890844536 2890844536 IN IP4 ua2.example.com
s=Session SDP
c=IN IP4 ua2.example.com
t=0 0
m=audio 49172 RTP/AVP 0
a=rtpmap:0 PCMU/8000

<allOneLine>
Via: SIP/2.0/TLS ua1.example.com;
branch=z9hG4bKnashds8;
received=192.0.2.1
</allOneLine>
To: Bob <sip:[email protected]>;tag=2ge46ab5
From: Alice <sip:alice@example.com>;tag=13adc987
Call-ID: [email protected]
CSeq: 1 INVITE
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, UPDATE
Supported: from-change
Contact: <sip:[email protected]>
Record-Route: <sip:proxy.example.com;lr>
Content-Type: application/sdp
Content-Length: 154

v=0
o=UserB 2890844536 2890844536 IN IP4 ua2.example.com
s=Session SDP
c=IN IP4 ua2.example.com
t=0 0
m=audio 49172 RTP/AVP 0
a=rtpmap:0 PCMU/8000

F5. ACK:

ACK sip:[email protected] SIP/2.0
Via: SIP/2.0/TLS ua1.example.com;branch=z9hG4bKnashds9
From: Alice <sip:[email protected]>;tag=13adc987
To: Bob <sip:[email protected]>;tag=2ge46ab5
Call-ID: [email protected]
CSeq: 1 ACK
Max-Forwards: 70
Route: <sip:proxy.example.com;lr>
Content-Length: 0

F6. ACK:

ACK sip:[email protected] SIP/2.0
Via: SIP/2.0/TLS proxy.example.com;branch=z9hG4bK776asdhdt
<allOneLine>
Via: SIP/2.0/TLS ua1.example.com;branch=z9hG4
bKnashds9;received=192.0.2.1
</allOneLine>
From: Alice <sip:[email protected]>;tag=13adc987
To: Bob <sip:[email protected]>;tag=2ge46ab5
Call-ID: [email protected]
CSeq: 1 ACK
Max-Forwards: 69
Content-Length: 0
F9. 200 OK:

SIP/2.0 200 OK
<allOneLine>
Via: SIP/2.0/TLS proxy.example.com;
branch=z9hG4bK776asdhdu;
received=192.0.2.2
</allOneLine>

[Figure 20.2: call-flow diagram; surviving message labels are F6. re-INVITE, F7. 200 OK, and F8. ACK]
Figure 20.2 Call flows for sending revised connected identity during a call. (Copyright IETF. Reproduced with permission.)
Identity-Info: <https://example.com/cert>;alg=rsa-sha1
Content-Length: 0

<allOneLine>
Via: SIP/2.0/TLS ua1.example.com;
branch=z9hG4bKnashds8;
received=192.0.2.1
</allOneLine>

F5. 200 OK:
nontargets of these priv-values. Some nontarget SIP headers/SDP parameters may carry privacy-sensitive information that may need privacy treatment regardless of the privacy level requested. This is further described later. The way in which SIP headers and SDP parameters listed here are obscured may depend on the implementation and network policy. This document does not prevent different variations that may exist based on local policy but tries to provide recommendations for how a privacy service treats SIP headers and SDP parameters.

20.5.3.1 Target SIP Headers for Each Priv-Value
Table 20.2 shows a recommended treatment of each SIP header for each priv-value. Detailed descriptions of the recommended treatment per SIP header are covered in Section 2.8. The “where” column describes the request and response types in which the header needs the treatment to maintain privacy. Values in this column are

◾◾ R: The header needs the treatment when it appears in a request.
◾◾ r: The header needs the treatment when it appears in a response.

The next five columns show the recommended treatment for each priv-value:

◾◾ Delete: The header is recommended to be deleted at a privacy service.
◾◾ Not add: The header is recommended not to be added at a privacy service.
◾◾ Anonymize: The header is recommended to be anonymized at a privacy service. How to anonymize the header depends on the header. Details are provided later.
◾◾ Anonymize*: An asterisk indicates that the involvement of a privacy service and treatment of the relevant header depend on the circumstance. Details are given later.

Any time a privacy service modifies a Call-ID, it must retain the former and modified values as indicated in Section 5.3 of RFC 3323 (see Section 20.2). It must then restore the former value in a Call-ID header and other corresponding headers and parameters (such as In-Reply-To, Replaces, and Target-Dialog) in any messages that are sent using the modified Call-ID to the originating UA. It should also modify a Call-ID header and other corresponding headers/parameters
Target Header   Where   user         header       session   id   history
Contact         R       –            Anonymize    –         –    –
From            R       Anonymize    –            –         –    –
In-Reply-To     R       Delete       –            –         –    –
Record-Route    Rr      –            Anonymize    –         –    –
Referred-By     R       Anonymize*   –            –         –    –
Reply-To        Rr      Delete       –            –         –    –
Subject         R       Delete       –            –         –    –
User-Agent      R       Delete       –            –         –    –
Via             R       –            Anonymize    –         –    –
Warning         R       Anonymize    –            –         –    –
Source: Copyright IETF. Reproduced with permission.
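
The per-header treatment in Table 20.2 lends itself to a simple lookup. The following is a minimal sketch, not a definitive implementation: it covers only the table rows that appear above, it assumes the first two treatment columns correspond to Privacy:user and Privacy:header (which matches the per-header descriptions in this section), and the names TREATMENT and plan_treatment are hypothetical.

```python
# Recommended treatment per (header, priv-value), limited to the rows of
# Table 20.2 shown above. "anonymize*" marks the circumstance-dependent case.
# Assumption: column order is user, header, ... as described in the text.
TREATMENT = {
    ("Contact", "header"): "anonymize",
    ("From", "user"): "anonymize",
    ("In-Reply-To", "user"): "delete",
    ("Record-Route", "header"): "anonymize",
    ("Referred-By", "user"): "anonymize*",
    ("Reply-To", "user"): "delete",
    ("Subject", "user"): "delete",
    ("User-Agent", "user"): "delete",
    ("Via", "header"): "anonymize",
    ("Warning", "user"): "anonymize",
}


def plan_treatment(header_names, priv_values):
    """Return {header: action} for the headers that need treatment."""
    plan = {}
    for name in header_names:
        for priv in priv_values:
            action = TREATMENT.get((name, priv))
            if action is not None:
                plan[name] = action
    return plan


# A request carrying Privacy:user: only user-level targets are selected.
plan = plan_treatment(["From", "Via", "Subject", "To"], ["user"])
print(plan)
```

Headers that are not targets of the requested priv-value (here Via and To) are simply left out of the plan, which mirrors the "–" cells of the table.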
(see Section 2.5). Since the Contact header is essential for routing further requests to the UA, it must include a functional URI even when it is anonymized. A UA must not anonymize a Contact header, unless it can obtain an IP address or contact address that is functional yet has a characteristic of anonymity as indicated in RFC 3323 (see Sections 2.8 and 20.2). Since RFC 3323 was published, there have been proposals that allow UAs to obtain an IP address or contact address with a characteristic of anonymity. The mechanisms are described in Globally Routable UA URI (GRUU) (see Section 4.3), which provides a functional Contact address with a short life span, making it ideal for privacy-sensitive calls, and TURN (see Section 14.3), through which an IP address of a relay can be obtained for use in a Contact header. A privacy service should anonymize a Contact header by replacing the existing Contact header field value with the URI that dereferences to the privacy service when user privacy is requested with Privacy:header, as indicated in RFC 3323 (see Section 20.2). This is generally done by replacing the IP address or host name with that of the privacy service.

20.5.4.1.4 From
This field contains the identity of the user, such as display name and URI. A UA executing a user-level privacy function on its own should anonymize a From header using an anonymous display name and an anonymous URI as indicated in RFC 3323 (see Sections 2.8 and 20.2). A privacy service should anonymize a From header when user privacy is requested with Privacy:user. Note that this does not prevent a privacy service from anonymizing the From header based on local policy. The anonymous display name and anonymous URI mentioned in this section use the display name “Anonymous,” a URI with “anonymous” in the user portion of the From header, and the host name value “anonymous.invalid” as indicated in RFC 3323. The recommended form of the From header for anonymity is

From: "Anonymous" <sip:anonymous@anonymous.invalid>;tag=1928301774

The tag value varies from dialog to dialog, but the rest of this header form is recommended as shown.

20.5.4.1.5 History-Info
could be also expressed for a specific History-Info entry by inserting “privacy=history” in the History-Info header. In such a case, a privacy service should delete the History-Info entry as indicated in RFC 4244. RFC 4244 (see Section 2.8) describes the detailed behavior for dealing with History-Info headers.

20.5.4.1.6 In-Reply-To
The In-Reply-To header contains a Call-ID of the referenced dialog. The replying user may be identified by the Call-ID in an In-Reply-To header.

◾◾ Alice -> INVITE(Call-ID:C1) -> Bob
◾◾ Bob -> INVITE(In-Reply-To:C1) -> Alice

In this case, unless the In-Reply-To header is deleted, Alice might notice that the replying user is Bob because Alice’s UA knows that the Call-ID relates to Bob. A UA executing a user-level privacy function on its own should not add an In-Reply-To header as implied in RFC 3323 (see Sections 2.8 and 20.2). A privacy service must delete the In-Reply-To header when user privacy is requested with Privacy:user as indicated in RFC 3323 (see Sections 2.8 and 20.2). In addition, since an In-Reply-To header contains the Call-ID of the dialog to which it is replying, special attention is required, as described earlier, regardless of the priv-value or presence of a Privacy header. Once a privacy service modifies a Call-ID in the request, a privacy service should restore the former value in an In-Reply-To header, if present in the INVITE request replying to the original request, as long as the privacy service maintains the dialog state.
Example:

◾◾ Alice -> INVITE(Call-ID:C1, Privacy:user) -> PS -> INVITE(Call-ID:C2) -> Bob
◾◾ Bob -> INVITE(In-Reply-To:C2, Privacy:none) -> PS -> INVITE(In-Reply-To:C1) -> Alice

Note that this is possible only if the privacy service maintains the state and retains all the information that it modified to provide privacy even after the dialog has been terminated, which is unlikely. Callback is difficult to achieve when a privacy service is involved in forming the dialog to be referenced.
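
The In-Reply-To example in Section 20.5.4.1.6 depends on the privacy service remembering both the former and the modified Call-ID. A minimal sketch of that bookkeeping follows. The class name CallIdMapper, its methods, and the use of a random UUID as the anonymized Call-ID are assumptions made for illustration; they are not taken from RFC 3323 or RFC 5379.

```python
import uuid


class CallIdMapper:
    """Hypothetical helper that retains original and modified Call-IDs."""

    def __init__(self):
        self._to_anon = {}
        self._to_orig = {}

    def anonymize(self, call_id):
        # Reuse the same anonymized value for repeated requests in a dialog.
        if call_id not in self._to_anon:
            anon = uuid.uuid4().hex  # assumption: any unlinkable value works
            self._to_anon[call_id] = anon
            self._to_orig[anon] = call_id
        return self._to_anon[call_id]

    def restore(self, call_id):
        # Values the service never modified pass through unchanged.
        return self._to_orig.get(call_id, call_id)


ps = CallIdMapper()
c1 = "C1"
c2 = ps.anonymize(c1)  # Alice -> PS -> Bob: Call-ID C1 is sent on as C2

# Bob's callback references the Call-ID he saw; PS restores Alice's value.
callback = {"In-Reply-To": c2}
callback["In-Reply-To"] = ps.restore(callback["In-Reply-To"])
print(callback["In-Reply-To"])
```

The sketch keeps the mapping indefinitely, which is exactly the state retention that the text describes as unlikely in practice once the dialog has been terminated.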
privacy service should not add an Organization header when user privacy is requested with Privacy:header as indicated in RFC 3323.

20.5.4.1.8 P-Asserted-Identity
This header contains a network-verified and network-asserted identity of the user sending a SIP message. A privacy service must delete the P-Asserted-Identity headers when user privacy is requested with Privacy:id as indicated in RFC 3325 (see Sections 2.8 and 20.3), and should delete the P-Asserted-Identity headers when user privacy is requested with Privacy:header before it forwards the message to an entity that is not trusted. It is recommended for a privacy service to remove the P-Asserted-Identity header if user privacy is requested with Privacy:id or Privacy:header even when forwarding to a trusted entity, unless it can be confident that the message will not be routed to an untrusted entity without going through another privacy service.

20.5.4.1.9 Record-Route
This field may reveal information about the administrative domain of the user. To hide Record-Route headers while keeping routability to the sender, privacy services can execute a practice referred to as stripping. Stripping means removing all the Record-Route headers that have been added to the request before its arrival at the privacy service, and then adding a single Record-Route header representing itself. In this case, the privacy service needs to retain the removed headers and restore them in a response. Alternatively, privacy services can remove the Record-Route headers and encrypt them into a single Record-Route header field. In this case, the privacy service needs to decrypt the header and restore the former values in a response.
A privacy service should strip or encrypt any Record-Route headers that have been added to a message before it reaches the privacy service when user privacy is requested with Privacy:header as indicated in RFC 3323 (see Sections 2.8 and 20.2). As in the case of a Call-ID, if a privacy service modifies the Record-Route headers, it must be able to restore Route headers with retained values as indicated in RFC 3323. Some examples where the restoration of the Route headers is necessary and unnecessary are given below.
When a UAC (Alice) requires privacy for a request, a privacy service does not have to restore the Route headers in the subsequent request (see Figure 20.3). Figure 20.3 shows that the restoration of the Route header is unnecessary when the UAC requires privacy.

◾◾ Alice -> INVITE(Privacy:header) -> P1 -> INVITE(Record-Route:P1, Privacy:header) -> PS -> INVITE(Record-Route:PS) -> P2 -> INVITE(Record-Route:P2,PS) -> Bob
◾◾ Bob -> 200 OK(Record-Route:P2,PS) -> P2 -> PS -> 200 OK(Record-Route:P2,PS,P1) -> P1 -> Alice
[Figure 20.3: message-flow diagram among Alice, Proxy 1 (P1), the privacy service (PS), Proxy 2 (P2), and Bob]
Figure 20.3 Example of when restoration of Route header is unnecessary. (Copyright IETF. Reproduced with permission.)
[Figure 20.4: message-flow diagram among Alice, Proxy 1 (P1), the privacy service (PS’), Proxy 2 (P2), and Bob; surviving labels include F1. INVITE, F2. INVITE (RR:P1), and F4 to F7. 200 OK]
Figure 20.4 Example of when restoration of Route header is necessary. (Copyright IETF. Reproduced with permission.)
20.5.4.1.11 Reply-To
This field contains a URI that can be used to reach the user on subsequent callbacks. A UA executing a user-level privacy function on its own should not add a Reply-To header in the message as implied in RFC 3323 (see Sections 2.8 and 20.2). A privacy service must delete a Reply-To header when user privacy is requested with Privacy:user as indicated in RFC 3323 (see Sections 2.8 and 20.2).

20.5.4.1.12 Server
This field contains information about the software used by the UAS to handle the request. A UA executing a user-level privacy function on its own should not add a Server header in the response as implied in Section 4.1 of RFC 3323. A privacy service must delete a Server header in a response when user privacy is requested with Privacy:user. A privacy service should not add a Server header in a response when user privacy is requested with Privacy:header as indicated in RFC 3323 (see Sections 2.8 and 20.2).

20.5.4.1.13 Subject
This field contains free-form text about the subject of the call. It may include text describing something about the user. A UA executing a user-level privacy function on its own should not include any information identifying the caller in a Subject header. A privacy service must delete a Subject header when user privacy is requested with Privacy:user as indicated in RFC 3323 (see Sections 2.8 and 20.2).

20.5.4.1.14 User-Agent
This field contains the UAC’s information. A UA executing a user-level privacy function on its own should not add a User-Agent header as implied in RFC 3323 (see Sections 2.8 and 20.2). A privacy service must delete a User-Agent header when user privacy is requested with Privacy:user as indicated in RFC 3323.

20.5.4.1.15 Via
The bottommost Via header added by a UA contains the IP address and port or host name that are used to reach the UA for responses. Via headers added by proxies may reveal information about the administrative domain of the user. A UA must not anonymize a Via header as indicated in RFC 3323 (see Sections 2.8 and 20.2), unless it can obtain an IP address that is functional yet has a characteristic of anonymity. This may be possible by obtaining an IP address specifically for this purpose either from the service provider or through features such as TURN (see Section 14.3). A privacy service should strip or encrypt any Via headers that have been added before reaching the privacy service when user privacy is requested with Privacy:header as indicated in RFC 3323. Refer to the Record-Route header for details of stripping and encryption as described earlier. A privacy service must restore the original values of Via headers when handling a response in order to route the response to the originator as indicated in RFC 3323. No Via stripping is required when handling responses.

20.5.4.1.16 Warning
This field may contain the host name of the UAS. A UA executing a user-level privacy function on its own should not include the host name representing its identity in a Warning header. A privacy service should anonymize a Warning header by deleting the host name portion (if it represents a UAS’s identity) from the header when user privacy is requested with Privacy:user.

20.5.4.2 Target SDP Parameters
This section describes privacy considerations for each SDP parameter specified in RFC 4566 (see Section 7.7) that may reveal information about the user. When privacy functions for user-inserted information are requested to be executed at a privacy service, UAs must not encrypt SDP bodies in messages as indicated in RFC 3323 (see Sections 2.8 and 20.2).

20.5.4.2.1 c/m Lines
The c and m lines in the SDP body convey the IP address and port for receiving media. A UA must not anonymize the IP address and port in the c and m lines, unless it can obtain an IP address that is functional yet has a characteristic of anonymity as implied in RFC 3323 (see Sections 2.8 and 20.2). This may be possible by obtaining an IP address specifically for this purpose either from the service provider or through features such as TURN (see Section 14.3). A privacy service must anonymize the IP address and port in c and m lines using a functional anonymous IP address and port when user privacy is requested with Privacy:session. This is generally done by replacing the IP address and port present in the SDP with that of a relay server.

20.5.4.2.2 o Line
The user name and IP address in this parameter may reveal information about the user. A UA may anonymize the user name in an o line by setting user name to “-” and anonymize the IP address in the o line by replacing it with a value so that it is sufficiently unique. A privacy service must anonymize the user name and IP address in the o line by setting the user
name to “-” and replacing the IP address with a value so that it is sufficiently unique when user privacy is requested with Privacy:session.

20.5.4.2.3 i/u/e/p Lines
These lines may contain information about the user. A UA executing a session-level privacy function on its own should not include the user’s information in the i, u, e, and p lines. A privacy service should modify the i, u, e, and p lines to delete the user’s identity information when user privacy is requested with Privacy:session.

20.5.4.3 Nontarget SIP Headers/Parameters
20.5.4.3.1 Identity/Identity-Info
The Identity header field specified in RFC 4474 (see Sections 2.8 and 19.4.8) contains a signature used for validating the identity. The Identity-Info header field contains a reference to the certificate of the signer of Identity headers. An Identity-Info header may reveal information about the administrative domain of the user. The signature in an Identity header provides integrity protection over the From, To, Call-ID, CSeq, Date, and Contact headers and over the message body. The integrity protection is violated if a privacy service modifies these headers and/or the message body for the purpose of user privacy protection. Once those integrity-protected headers (such as From and Call-ID) are modified, the Identity/Identity-Info header fields are no longer valid. Thus, a privacy service acting on a request for Privacy:user, Privacy:header, or Privacy:session can invalidate integrity protection provided by an upstream authentication service that has inserted Identity/Identity-Info header fields.
The use of such a privacy service should be avoided if integrity protection needs to be retained. Otherwise, if the privacy service invalidates the integrity protection, it should remove the Identity/Identity-Info header fields. An authentication service downstream of the privacy service may add Identity/Identity-Info header fields if the domain name of the From header field URI has not been anonymized (e.g., “sip:[email protected]”), which makes it possible for the service to authenticate the UAC. This authenticated yet anonymous From header means that “this is a known user in my domain that I have authenticated, but I am keeping its identity private” as indicated in RFC 4474. The desired deployment will have a privacy service located before or colocated with the identity service; thus, integrity and privacy can both be provided seamlessly.

20.5.4.3.2 Path
This field may contain information about the administrative domain and/or the visited domain of the UA. However, the Path header is not the target of any priv-values. Given that the Path header (see Section 2.8) only appears in REGISTER requests and responses and is essential for a call to reach the registered UA in the visited domain, it serves no purpose to withhold or hide the information contained in the Path header; rather, it is harmful. The only reason privacy may be considered desirable is if the visited domain wants to withhold its topology from the home domain of the user. In doing so, the domain withholding the topology needs to ensure that it provides sufficient information so that the home domain can route the call to the visited domain, thus reaching the UA. However, anonymization of network privacy-sensitive information is out of scope.

20.5.4.3.3 Replaces Header/Parameter
The Replaces header (see Section 2.8) and the “replaces” parameter contain identifiers of a dialog to be replaced, which are composed of Call-ID, local tag, and remote tag. The sender of the INVITE with a Replaces header is usually not the originating UA or terminating UA of the target dialog to be replaced. Therefore, the Call-ID within the Replaces header is unlikely to be generated by the sender, and thus this header is outside the anonymization target per priv-value. The “replaces” parameter, which appears in a Refer-To header in a REFER request, is not the target of any particular priv-values either. As described for the Call-ID header earlier, regardless of the priv-value or the presence of a Privacy header, once a privacy service modifies a Call-ID in the request, it should monitor headers that may contain Call-ID and restore the portion of the value representing the modified Call-ID to the original Call-ID value in a Replaces header received.
The main challenge for this to function properly is that a privacy service has to be on a signaling path to the originator for every dialog. This is generally not possible and results in REFER requests not functioning at all times. This is a trade-off that is anticipated when privacy is imposed. The privacy requirements mentioned earlier will cause the Replaces header and “replaces” parameter to contain values that will fail the resulting dialog establishment in some situations. This loss of functionality is allowed and/or intended as illustrated above (i.e., it is not the responsibility of a privacy service to ensure that these features always work). The functionality of the Replaces header/parameter when anonymized depends on the circumstances in which it is used. REFER may work or may not work depending on the following three criteria:

◾◾ Who generated the Call-ID
◾◾ Where the privacy service is on the signaling path
◾◾ Who initiates the REFER with the “replaces” parameter
A few examples that explore when the Replaces header/parameter works or fails are given below.
Example 1: Transfer initiated by the originator; privacy service added for first INVITE and REFER

◾◾ Alice -> INVITE(Call-ID:C1, Privacy:user) -> PS -> INVITE(Call-ID:C2) -> Bob
◾◾ Alice -> REFER(Refer-To:Bob?Replaces=C1, Privacy:user) -> PS -> REFER(Refer-To:Bob?Replaces=C2) -> Carol
◾◾ Carol -> INVITE(Replaces:C2) -> Bob (Succeed)

Example 2: Transfer initiated by the originator; privacy service added only for first INVITE

◾◾ Alice -> INVITE(Call-ID:C1, Privacy:user) -> PS -> INVITE(Call-ID:C2) -> Bob
◾◾ Alice -> REFER(Refer-To:Bob?Replaces=C1) -> Carol
◾◾ Carol -> INVITE(Replaces:C1) -> Bob (Fail)

Note that Example 2 would succeed if the same privacy service (that modifies the Call-ID in the INVITE from Alice) is also added for REFER and modifies the value in the “replaces” parameter from C1 to C2 even if there is no Privacy header in the REFER.
Example 3: Transfer initiated by the originator; privacy service added only for REFER

◾◾ Alice -> INVITE(Call-ID:C1) -> INVITE(Call-ID:C1) -> Bob
◾◾ Alice -> REFER(Refer-To:Bob?Replaces=C1, Privacy:user) -> PS -> REFER(Refer-To:Bob?Replaces=C1) -> Carol
◾◾ Carol -> INVITE(Replaces:C1, Privacy:user) -> PS’ -> INVITE(Replaces:C1) -> Bob (Succeed)

Example 5: Hold; privacy service added only for first INVITE

◾◾ Alice -> INVITE(Call-ID:C1, Privacy:user) -> PS -> INVITE(Call-ID:C2) -> Bob
◾◾ Alice -> REFER(Refer-To:Bob?Replaces=C1) -> Music-Server
◾◾ Music-Server -> INVITE(Replaces:C1) -> Bob (Fail)

Note: Example 5 would succeed if the same privacy service (that modifies the Call-ID in the INVITE from Alice) is added for the INVITE from the Music-Server and modifies the value in a Replaces header from C1 to C2.
As the above examples show, in some scenarios, information carried in the Replaces header/parameter would result in failure of the REFER. This will not happen if the Call-ID is not modified at a privacy service.

20.5.4.3.4 Route
This field may contain information about the administrative domain of the UA, but the Route header is not the target of any priv-value. Route headers appear only in SIP requests to force routing through the listed set of proxies. If a privacy service anonymizes the Route header, the routing does not function. Furthermore, there is no risk in revealing the information in the Route headers to further network entities, including the terminating UA, because a proxy removes the value from the Route header when it replaces the value in the Request-URI as defined in RFC 3261 (see Section 4.2). A privacy service that modifies Record-Route headers may need to restore the values in Route headers as necessary. As indicated in RFC 3323 (see Sections 2.8 and 20.2), if a privacy service modifies the Record-Route headers, it must be able to restore Route headers with retained values. Please refer to the Record-Route header as described earlier for further detail and examples.
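
The outcome of the Replaces examples in Section 20.5.4.3.3 follows from one comparison: the Call-ID carried in the Replaces header must equal the Call-ID that Bob actually holds for the dialog. The sketch below models Examples 1 to 3 under simplifying assumptions: a single privacy service that rewrites C1 to C2 whenever it is on the INVITE path, and that applies the same mapping to the “replaces” parameter whenever it is on the REFER path. The function name replaces_outcome is hypothetical.

```python
def replaces_outcome(ps_on_invite, ps_on_refer):
    """Return 'Succeed' or 'Fail' for a transfer initiated by Alice."""
    original = "C1"  # Call-ID generated by Alice

    # The privacy service retains a mapping only if it saw the first INVITE.
    mapping = {"C1": "C2"} if ps_on_invite else {}
    call_id_at_bob = mapping.get(original, original)

    # Alice always writes the Call-ID she knows into the "replaces" parameter.
    replaces = original
    if ps_on_refer:
        # Assumption: the same service rewrites the value it modified earlier.
        replaces = mapping.get(replaces, replaces)

    return "Succeed" if replaces == call_id_at_bob else "Fail"


example_1 = replaces_outcome(ps_on_invite=True, ps_on_refer=True)
example_2 = replaces_outcome(ps_on_invite=True, ps_on_refer=False)
example_3 = replaces_outcome(ps_on_invite=False, ps_on_refer=True)
print(example_1, example_2, example_3)
```

The model shows why the failures disappear when the Call-ID is not modified at a privacy service: with an empty mapping the two values are always identical.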
not a target for any particular priv-values, and how a privacy service still needs to evaluate and modify the value contained even if no privacy is requested.

20.6 Anonymity in SIP
20.6.1 Overview
SIP (RFC 3261) allows users to make anonymous calls. Anonymity may include personal, location, media, and other information. SIP signaling message headers usually contain personal (e.g., UA, organization, From, Call-ID) and location (e.g., Contact, Via, Route, Record-Route) information, while the SDP that contains the media address information such as IP address, port number, and so on, is carried in the SIP signaling message body. Privacy services, provided by the SIP UA and/or by service providers that reside on the path between source and destination, anonymize the signaling message headers that carry personal, location, media, and other information.
However, the privacy services can anonymize different user- and media-related information of the calling and the called party as required. The intermediaries that reside on the path between source and destination need to be able to modify, replace, and add headers of both SIP signaling and message body. In this respect, the intermediaries need to act as the B2BUA to do so correctly as specified in RFC 3323 (see Section 20.2) because a proxy will not be able to do all the functions that are required for providing anonymity/privacy services.
Although SIP RFC 3261 allows users to make anonymous calls by including a From header field whose display name has the value of “Anonymous,” greater levels of anonymity can be provided using privacy services with the capabilities subsequently defined in RFC 3323 (see Section 20.2), which introduces the Privacy header field. The Privacy header field allows a requesting UA to ask for various levels of anonymity, including user-level, header-level, and session-level anonymity. In addition, RFC 3325 (see Sections 2.8 and 20.3) defines the P-Asserted-Identity header field, used to contain an asserted identity. RFC 3325 also defined the “id” value for the Privacy header field, which is used to request the network to remove the P-Asserted-Identity header field.

20.6.2 UA-Driven Anonymity
20.6.2.1 Background
RFC 3323 (see Section 20.2) defines a privacy mechanism for the SIP that relies on the use of a separate privacy service to remove privacy-sensitive information from SIP messages sent by a UA before forwarding those messages to the final destination. It does not provide any mechanism for taking control of the privacy services by the UA itself. RFC 5767 that is described here specifies the UA-driven privacy mechanism in SIP by allowing a UA to take control of its privacy, rather than being completely dependent on an external privacy service, enhancing overall privacy services defined in RFC 3323 as well as the usage of RFC 3323 specified in RFC 3325 (see Sections 2.8 and 20.3). The privacy-sensitive information is defined in RFC 5767 more specifically as the information that identifies a user who sends the SIP message, as well as other information that can be used to guess the user’s identity. The protection of network privacy (e.g., topology hiding) is not defined in RFC 5767. Privacy-sensitive information includes display name and URI in a From header field that can reveal the user’s name and affiliation (e.g., company name), and IP addresses or host names in a Contact header field, a Via header field, a Call-ID header field, or an SDP (see Section 7.7) body that might reveal the location of a UA. RFC 5767 specifies a mechanism for a UA to generate an anonymous SIP message by utilizing mechanisms such as GRUU (see Section 4.3) and TURN (see Section 14.3) without the need for a privacy service.

20.6.2.2 Treatment of Privacy-Sensitive Information
Some fields of a SIP message potentially contain privacy-sensitive information but are not essential for achieving the intended purpose of the message and can be omitted without any side effects. Other fields are essential for achieving the intended purpose of the message and need to contain anonymized values in order to avoid disclosing privacy-sensitive information. Of the privacy-sensitive information listed earlier, URIs, host names, and IP addresses in Contact, Via, and SDP are required to be functional (i.e., suitable for purpose) even when they are anonymized. With the use of GRUU (see Section 4.3) and TURN (see Section 14.3), a UA can obtain URIs and IP addresses for media and signaling that are functional yet anonymous, and do not identify either the UA or the user. How to obtain a functional anonymous URI and IP address is described below. Host names need to be concealed because the user’s identity can be guessed from them; however, they are not always regarded as critical privacy-sensitive information. In addition, a UA needs to be careful not to include any information that identifies the user in optional SIP header fields such as Subject and User-Agent.

20.6.2.2.1 Anonymous URI Using the GRUU Mechanism
A UA wanting to obtain a functional anonymous URI must support and utilize the GRUU mechanism unless it is able to obtain a functional anonymous URI through other means outside the scope of this document. By sending a REGISTER request requesting GRUU, the UA can obtain an anonymous
URI, which can later be used for the Contact header field. The detailed process on how a UA obtains a GRUU is described in Section 4.3. To use the GRUU mechanism to obtain a functional anonymous URI, the UA must request GRUU in the REGISTER request. If a “temp-gruu” SIP URI parameter and value are present in the REGISTER response, the UA MUST use the value of the “temp-gruu” as an anonymous URI representing the UA. This means that the UA must use this URI as its local target, and that the UA must place this URI in the Contact header field of subsequent requests and responses that require the local target to be sent.
If there is no “temp-gruu” SIP URI parameter in the 200 OK response to the REGISTER request, a UA should not proceed with its anonymization process, unless something equivalent to “temp-gruu” is provided through some administrative means. It is recommended that the UA consult the user before sending a request without a functional anonymous URI when privacy is requested from the user. Owing to the nature of how GRUU works, the domain name is always revealed when GRUU is used. If revealing the domain name in the Contact header field is a concern, use of a third-party GRUU server is a possible solution; however, this is not specified in RFC 5767.

20.6.2.2.2 Anonymous IP Address Using the TURN Mechanism
A UA that is not provided with a functional anonymous IP address through some administrative means must obtain a relayed address (IP address of a relay) if anonymity is desired for use in SDP and in the Via header field. Such an IP address is to be derived from a STUN (see Section 14.3) relay server through the TURN mechanism, which allows a STUN server to act as a relay. Anonymous IP addresses are needed for two purposes. The first is for use in the Via header field of a SIP request. By obtaining an IP address from a STUN relay server, using that address in the Via header field of the SIP request, and sending the SIP request to the STUN relay server, the IP address of the UA will not be revealed beyond the relay server.
The second is for use in SDP as an address for receiving media. By obtaining an IP address from a STUN relay server and using that address in SDP, media will be received via the relay server. Also, media can be sent via the relay server. In this way, neither SDP nor media packets reveal the IP address of the UA. It is assumed that a UA is either manually or automatically configured through means such as the configuration framework (RFC 6011) with the address of one or more STUN relay servers to obtain an anonymous IP address.

20.6.2.3 UA Behavior
This section describes how to generate an anonymous SIP message at a UA. A UA fully compliant with this document must obscure or conceal all the critical UA-inserted privacy-sensitive information in SIP requests and responses, as described below, when user privacy is requested. In addition, how the UA should conceal the noncritical privacy-sensitive information is described. Furthermore, when a UA uses a relay server to conceal its identity, the UA must send requests to the relay server to ensure that the request and response follow the same signaling path.

20.6.2.4 UA Behavior for Critical Privacy-Sensitive Information
20.6.2.4.1 Contact Header Field
When using this header field in a dialog-forming request or response, or in a mid-dialog request or response, this field contains the local target, that is, a URI used to reach the UA for mid-dialog requests and possibly out-of-dialog requests, such as a REFER request (see Section 2.8). The Contact header field can also contain a display name. Since the Contact header field is used for routing further requests to the UA, the UA must include a functional URI even when it is anonymized. When using this header field in a dialog-forming request or response, or in a mid-dialog request or response, the UA must anonymize the Contact header field using an anonymous URI (“temp-gruu”) obtained through the GRUU mechanism, unless an equivalent functional anonymous URI is provided by some other means. For other requests and responses, with the exception of 3xx responses, REGISTER requests, and 200 OK responses to a REGISTER request, the UA must either omit the Contact header field or use an anonymous URI. We have described earlier in detail how to obtain an anonymous URI through GRUU. The UA must omit the display name in a Contact header field or set the display name to “Anonymous.”

20.6.2.4.2 From Header Field in Requests
Without privacy considerations, this field contains the identity of the user, such as the display name and URI. RFCs 3261 (see Sections 2.8 and 3.1) and 3323 (see Section 20.2) recommend setting “sip:anonymous@anonymous.invalid” as a SIP URI in a From header field when user privacy is requested. This raises an issue when the SIP-Identity mechanism specified in RFC 4474 (see Sections 2.8 and 19.4.8) is applied to the message, because SIP-Identity requires an actual domain name in the From header field. A UA generating an anonymous SIP message supporting this specification must anonymize the From header field in one of the two ways described below.

Option 1: A UA anonymizes a From header field using an anonymous display name and an anonymous URI
following the procedure noted in RFC 3323. The example form of the From header field of option 1 is as follows:

From: "Anonymous" <sip:anonymous@anonymous.invalid>;tag=1928301774

Option 2: A UA anonymizes a From header field using an anonymous display name and an anonymous URI with user’s valid domain name instead of “anonymous.invalid.” The example form of the From header field of option 2 is as follows:

From: "Anonymous" <sip:anonymous@example.com>;tag=1928301774

A UA should go with option 1 to conceal its domain name in the From header field. However, SIP-Identity cannot be used with a From header field in accordance with option 1, because the SIP-Identity mechanism uses authentication based on the domain name. If a UA expects the SIP-Identity mechanism to be applied to the request, it is recommended to go with option 2. However, the user’s domain name will be revealed from the From header field of option 2. If the user wants both anonymity and strong identity, a solution would be to use a third-party anonymization service that issues an AOR for use in the From header field of a request, and that also provides a SIP-Identity authentication service. Third-party anonymization service is not specified in RFC 5767.

20.6.2.4.3 Via Header Field in Requests
Without privacy considerations, the bottommost Via header field added to a request by a UA contains the IP address and port or host name that are used to reach the UA for responses. A UA generating an anonymous SIP request supporting this specification must anonymize the IP address in the Via header field using an anonymous IP address obtained through the TURN mechanism, unless an equivalent functional anonymous IP address is provided by some other means. The UA should not include a host name in a Via header field.

20.6.2.4.4 IP Addresses in SDP
A UA generating an anonymous SIP message supporting this specification MUST anonymize IP addresses in SDP, if present, using an anonymous IP address obtained through the TURN mechanism, unless an equivalent functional anonymous IP address is provided by some other means. We have described earlier in detail how to obtain an IP address through TURN.

20.6.2.5 UA Behavior for Noncritical Privacy-Sensitive Information
20.6.2.5.1 Host Names in Other SIP Header Fields
A UA generating an anonymous SIP message supporting this specification should conceal host names in any SIP header fields, such as Call-ID and Warning header fields, if considered privacy sensitive.

20.6.2.5.2 Optional SIP Header Fields
Other optional SIP header fields (such as Call-Info, In-Reply-To, Organization, Referred-By, Reply-To, Server, Subject, User-Agent, and Warning) can contain privacy-sensitive information. A UA generating an anonymous SIP message supporting this specification should not include any information that identifies the user in such optional header fields.

20.6.2.6 Security Considerations
This specification uses GRUU and TURN, and inherits any security considerations described in these documents. Furthermore, if the provider of the caller intending to obscure its identity consists of a small number of people (e.g., small enterprise, small office/home office [SOHO]), the domain name alone can reveal the identity of the caller. The same can be true when the provider is large but the receiver of the call only knows a few people from the source of call. There are mainly two places in the message, the From header field and Contact header field, where the domain name is expected to be functional. The domain name in the From header field can be obscured as described earlier, whereas the Contact header field needs to contain a valid domain name at all times in order to function properly.
Note that, in general, a device will not show the contact address to the receiver, but this does not mean that one cannot find the domain name in a message. In fact, as long as this specification is used to obscure identity, the message will always contain a valid domain name as it inherits key characteristics of GRUU. Also, for UAs that use a temporary GRUU, confidentiality does not extend to parties that are permitted to register to the same AOR or are permitted to obtain temporary GRUUs when subscribed to the “reg” event package (RFC 3680, see Section 5.3) for the AOR. To limit this, it is suggested that the authorization policy for the “reg” event package permit only those subscribers authorized to register to the AOR to receive temporary GRUUs. With this policy, the confidentiality of the temporary GRUU will be the same whether or not the “reg” event package is used. If one wants to assure anonymization,
it is suggested that the user seek and rely on a third-party anonymization service, which is outside the scope of this document. A third-party anonymization service provides registrar and TURN service that have no affiliation with the caller’s provider, allowing the caller to completely withhold its identity.

20.6.3 Rejecting Anonymous Requests
Although users need to be able to make anonymous calls, users that receive such calls retain the right to reject the call because it is anonymous. The SIP does not provide a response code that allows the UAS, or a proxy acting on its behalf, to explicitly indicate that the request was rejected because it was anonymous. The closest response code is 403 Forbidden, which does not convey a specific reason. While it is possible to include a reason phrase in a 403 Forbidden response that indicates to the human user that the call was rejected because it was anonymous, that reason phrase is not useful for automata and cannot be interpreted by callers that speak a different language. An indication that can be understood by an automaton would allow for programmatic handling, including user interface prompts, or conversion to equivalent error codes in the PSTN when the client is a gateway. To remedy this, RFC 5079 that is described here defines the 433 Anonymity Disallowed response code (also see Section 2.6) that indicates that the server refused to fulfill the request because the requestor was anonymous.

20.6.3.1 Server Behavior
A server, generally acting on behalf of the called party, though this need not be the case, may generate a 433 Anonymity Disallowed response when it receives an anonymous request, and the server refuses to fulfill the request because the requestor is anonymous. A request should be considered anonymous when the identity of the originator of the request has been explicitly withheld by the originator. This occurs in any one of the following cases:

◾◾ The From header field contains a URI within the anonymous.invalid domain.
◾◾ The From header field contains a display name whose value is either “Anonymous” or “anonymous.” Note that display names make a poor choice for indicating anonymity, since they are meant to be consumed by humans, not automata. Thus, language variations and even misspelling can cause an automaton to miss a hint in the display name. Despite these problems, a check on the display name is included here because RFC 3261 explicitly calls out the usage of the display name as a way to declare anonymity.
◾◾ The request contained a Privacy header field whose value indicates that the user wishes its identity withheld. Values meeting this criteria are “id,” specified in RFC 3325 (see Sections 2.8 and 20.3), or “user.”
◾◾ The From header field contains a URI that has an explicit indication that it is anonymous. One such example of a mechanism that would meet this criteria is [coexistence]. This criterion is true even if the request has a validated Identity header field.
◾◾ RFC 4474 (see Sections 2.8 and 19.4.8) can be used in concert with anonymized From header fields.

Lack of an NAI, such as the P-Asserted-Identity header field, in and of itself, should not be considered an indication of anonymity. Even though a Privacy header field value of “id” will cause the removal of an NAI, there is no way to differentiate this case from one in which an NAI was not supported by the originating domain. As a consequence, a request without an NAI is considered anonymous only when there is some other indication of this, such as a From header field with a display name of “Anonymous.”
In addition, requests where the identity of the requestor cannot be determined or validated, but it is not a consequence of an explicit action on the part of the requestor, are not considered anonymous. For example, if a request contains a non-anonymous From header field, along with the Identity and Identity-Info header fields (RFC 4474, see Sections 2.8 and 19.4.8), but the certificate could not be obtained from the reference in the Identity-Info header field, it is not considered an anonymous request, and the 433 Anonymity Disallowed response code should not be used.

20.6.3.2 UAC Behavior
A UAC receiving a 433 Anonymity Disallowed must not retry the request without anonymity unless it obtains confirmation from the user that this is desirable. Such confirmation could be obtained through the user interface, or by accessing user-defined policy. If the user has indicated that this is desirable, the UAC may retry the request without requesting anonymity. Note that if the UAC were to automatically retry the request without anonymity in the absence of an indication from the user that this treatment is desirable, then the user’s expectations would not be met. Consequently, a user might think it had completed a call anonymously when it is not actually anonymous. Receipt of a 433 Anonymity Disallowed response to a mid-dialog request should not cause the dialog to terminate, and should not cause the specific usage of that dialog to terminate (RFC 5057, see Sections 3.6.5 and 16.2). A UAC that does not understand or care about the specific semantics of the 433 response will treat it as a 400 Bad Request response.
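
The anonymity criteria of Section 20.6.3.1 can be illustrated with a minimal Python sketch. It is not taken from RFC 5079; the function names (parse_from, is_anonymous_request, screen_request) and the dictionary-of-headers input are assumptions made for illustration, and the From parser is deliberately simplified (no IPv6 literals, no escaped quotes). Only the three checks that can be made from the header text alone are modeled: the anonymous.invalid domain, the display name, and the Privacy values “id” and “user.” A missing P-Asserted-Identity header field is intentionally not treated as a sign of anonymity.

```python
import re

# Display names and Privacy values that the text lists as indications of anonymity.
ANON_DISPLAY_NAMES = {"Anonymous", "anonymous"}
WITHHOLDING_PRIVACY_VALUES = {"id", "user"}

# Simplified (hypothetical) parsers for a From header value; real SIP parsing
# has to follow the full grammar in RFC 3261.
_NAME_ADDR = re.compile(
    r'^\s*(?:"(?P<quoted>[^"]*)"|(?P<token>[^<"]*?))\s*'
    r'<\s*sips?:(?:[^@>]*@)?(?P<host>[^;>:?]+)'
)
_ADDR_SPEC = re.compile(r'^\s*sips?:(?:[^@;]*@)?(?P<host>[^;:?\s]+)')


def parse_from(value):
    """Return (display_name, host) extracted from a From header value."""
    m = _NAME_ADDR.match(value)
    if m:
        quoted = m.group("quoted")
        name = quoted if quoted is not None else m.group("token").strip()
        return name, m.group("host").lower()
    m = _ADDR_SPEC.match(value)
    if m:
        return "", m.group("host").lower()
    raise ValueError("unsupported From header value: %r" % value)


def is_anonymous_request(headers):
    """Apply the header-based anonymity checks described in the text."""
    name, host = parse_from(headers["From"])
    # Case 1: URI within the anonymous.invalid domain.
    if host == "anonymous.invalid" or host.endswith(".anonymous.invalid"):
        return True
    # Case 2: display name of "Anonymous" or "anonymous".
    if name in ANON_DISPLAY_NAMES:
        return True
    # Case 3: Privacy header field value of "id" or "user".
    privacy = {v.strip().lower() for v in headers.get("Privacy", "").split(";")}
    return bool(privacy & WITHHOLDING_PRIVACY_VALUES)


def screen_request(headers, reject_anonymous):
    """Return a (code, reason) rejection, or None if the request may proceed."""
    if reject_anonymous and is_anonymous_request(headers):
        return (433, "Anonymity Disallowed")
    return None
```

A server policy could call screen_request before normal processing and send the returned status if it is not None. The fourth criterion in the list (a URI carrying some other explicit anonymity indication) depends on mechanisms outside this text and is left out of the sketch.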
816 ◾ Appendix A
with the base interpretation of those characters indicated explicitly. The following bases are currently defined:

b = binary
d = decimal
x = hexadecimal

will match only the string that comprises only the lowercase characters, abc.

A.2.4 External Encodings
External representations of terminal value characters will vary according to constraints in the storage or transmission environment. Hence, the same ABNF-based grammar may have multiple external encodings, such as one for a 7-bit US-ASCII environment, another for a binary octet environment, and still a different one when 16-bit Unicode is used. Encoding details are beyond the scope of ABNF, although Section A.5 provides definitions for a 7-bit US-ASCII environment as has been common to much of the Internet. By separating external encoding from the syntax, it is intended that alternate encoding environments can be used for the same syntax.

Note: A quoted string containing alphabetic characters is a special form for specifying alternative characters and is interpreted as a nonterminal representing the set of combinatorial strings with the contained characters, in the specified order but with any mixture of uppercase and lowercase.

A.3.3 Incremental Alternatives: Rule1 =/ Rule2
It is sometimes convenient to specify a list of alternatives in fragments. That is, an initial rule may match one or more alternatives, with later rule definitions adding to the set of
is the same as specifying

ruleset = alt1/alt2/alt3/alt4/alt5

Hence

DIGIT = %x30-39

is equivalent to

DIGIT = "0"/"1"/"2"/"3"/"4"/"5"/"6"/"7"/"8"/"9"

Concatenated numeric values and numeric value ranges cannot be specified in the same string. A numeric value may use the dotted notation for concatenation or it may use the dash notation to specify one value range. Hence, to specify one printable character between end-of-line sequences, the specification could be

This will avoid misinterpretation by casual readers. The sequence group notation is also used within free text to set off an element sequence from the prose.

where <a> and <b> are optional decimal values, indicating at least <a> and at most <b> occurrences of the element. Default values are 0 and infinity so that *<element> allows any number, including zero; 1*<element> requires at least one; 3*3<element> allows exactly 3; and 1*2<element> allows one or two.

A.3.7 Specific Repetition: nRule
A rule of the form

<n>element

is equivalent to
and is equivalent to
A semicolon starts a comment that continues to the end of line. This is a simple way of including useful notes in parallel with the specifications.

comment = ";" *(WSP/VCHAR) CRLF

alternation = concatenation *(*c-wsp "/" *c-wsp concatenation)
HEXDIG = DIGIT/"A"/"B"/"C"/"D"/"E"/"F"
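
The repetition prefixes described in this appendix (the variable form <a>*<b>element and the specific form <n>element) can be checked with a small Python sketch. The helper names repetition_bounds and allows are hypothetical and exist only for illustration; the sketch assumes, per RFC 5234, that <n>element is shorthand for <n>*<n>element, and it represents the default upper bound of infinity as None.

```python
def repetition_bounds(prefix):
    """Return (minimum, maximum) for an ABNF repetition prefix.

    Accepts forms such as "*", "1*", "*2", "3*3", "1*2", or "3".
    A maximum of None stands for infinity (the default upper bound).
    """
    if "*" in prefix:
        low, _, high = prefix.partition("*")
        minimum = int(low) if low else 0        # default lower bound is 0
        maximum = int(high) if high else None   # default upper bound is infinity
        return (minimum, maximum)
    # Specific repetition: <n>element behaves like <n>*<n>element.
    n = int(prefix)
    return (n, n)


def allows(prefix, count):
    """Report whether `count` occurrences satisfy the repetition prefix."""
    minimum, maximum = repetition_bounds(prefix)
    return count >= minimum and (maximum is None or count <= maximum)
```

For example, allows("1*2", 2) holds while allows("1*2", 0) does not, matching the statement that 1*2<element> allows one or two occurrences.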
1123 Requirements for Internet Hosts—Application and Support. R. Braden, Ed. October 1989. (Updates RFC 0822, RFC 0952) (Updated by RFC 1349, RFC 2181, RFC 5321, RFC 5966) (Also STD 0003) (Status: Proposed Standard)
2046 Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types. N. Freed, N. Borenstein. November 1996. (Obsoletes RFC 1521, RFC 1522, RFC 1590) (Updated by RFC 2646, RFC 3798, RFC 5147, RFC 6657) (Status: Proposed Standard)
822 ◾ Appendix B
2069 An Extension to HTTP: Digest Access Authentication. J. Franks, P. Hallam-Baker, J. Hostetler, P. Leach, A. Luotonen, E. Sink, L. Stewart. January 1997. (Obsoleted by RFC 2617) (Status: Proposed Standard)
2076 Common Internet Message Headers. J. Palme. February 1997. (Status: Informational)
2104 HMAC: Keyed-Hashing for Message Authentication. H. Krawczyk, M. Bellare, R. Canetti. February 1997. (Updated by RFC 6151) (Status: Informational)
2141 URN Syntax. R. Moats. May 1997. (Status: Proposed Standard)
2183 Communicating Presentation Information in Internet Messages: The Content-Disposition Header Field. R. Troost, S. Dorner, K. Moore, Ed. August 1997. (Obsoletes RFC 1806) (Updated by RFC 2184, RFC 2231) (Status: Proposed Standard)
2205 Resource ReSerVation Protocol (RSVP)—Version 1 Functional Specification. R. Braden, Ed., L. Zhang, S. Berson, S. Herzog, S. Jamin. September 1997. (Updated by RFC 2750, RFC 3936, RFC 4495, RFC 5946, RFC 6437, RFC 6780) (Status: Proposed Standard)
2224 NFS URL Scheme. B. Callaghan. October 1997. (Status: Informational)
2277 IETF Policy on Character Sets and Languages. H. Alvestrand. January 1998. (Also BCP 0018) (Status: Best Current Practice)
2326 Real Time Streaming Protocol (RTSP). H. Schulzrinne, A. Rao, R. Lanphier. April 1998. (Status: Proposed Standard)
2327 SDP: Session Description Protocol. M. Handley, V. Jacobson. April 1998. (Obsoleted by RFC 4566) (Updated by RFC 3266) (Status: Proposed Standard)
2387 The MIME Multipart/Related Content-type. E. Levinson. August 1998. (Obsoletes RFC 2112) (Status: Proposed Standard)
2392 Content-ID and Message-ID Uniform Resource Locators. E. Levinson. August 1998. (Obsoletes RFC 2111) (Status: Proposed Standard)
2403 The Use of HMAC-MD5-96 within ESP and AH. C. Madson, R. Glenn. November 1998. (Status: Proposed Standard)
2404 The Use of HMAC-SHA-1-96 within ESP and AH. C. Madson, R. Glenn. November 1998. (Status: Proposed Standard)
2451 The ESP CBC-Mode Cipher Algorithms. R. Pereira, R. Adams. November 1998. (Status: Proposed Standard)
2460 Internet Protocol, Version 6 (IPv6) Specification. S. Deering, R. Hinden. December 1998. (Obsoletes RFC 1883) (Updated by RFC 5095, RFC 5722, RFC 5871, RFC 6437, RFC 6564, RFC 6935, RFC 6946, RFC 7045, RFC 7112) (Status: Draft Standard)
2506 Media Feature Tag Registration Procedure. K. Holtman, A. Mutz, T. Hardie. March 1999. (Also BCP 0031) (Status: Best Current Practice)
2507 IP Header Compression. M. Degermark, B. Nordgren, S. Pink. February 1999. (Status: Proposed Standard)
2508 Compressing IP/UDP/RTP Headers for Low-Speed Serial Links. S. Casner, V. Jacobson. February 1999. (Status: Proposed Standard)
2533 A Syntax for Describing Media Feature Sets. G. Klyne. March 1999. (Updated by RFC 2738, RFC 2938) (Status: Proposed Standard)
2543 SIP: Session Initiation Protocol. M. Handley, H. Schulzrinne, E. Schooler, J. Rosenberg. March 1999. (Obsoleted by RFC 3261, RFC 3262, RFC 3263, RFC 3264, RFC 3265) (Status: Proposed Standard)
2585 Internet X.509 Public Key Infrastructure Operational Protocols: FTP and HTTP. R. Housley, P. Hoffman. May 1999. (Status: Proposed Standard)
2606 Reserved Top Level DNS Names. D. Eastlake 3rd, A. Panitz. June 1999. (Updated by RFC 6761) (Also BCP 0032) (Status: Best Current Practice)
2617 HTTP Authentication: Basic and Digest Access Authentication. J. Franks, P. Hallam-Baker, J. Hostetler, S. Lawrence, P. Leach, A. Luotonen, L. Stewart. June 1999. (Obsoletes RFC 2069) (Updated by RFC 7235) (Status: Draft Standard)
2671 Extension Mechanisms for DNS (EDNS0). P. Vixie. August 1999. (Obsoleted by RFC 6891) (Status: Proposed Standard)
2672 Non-terminal DNS Name Redirection. M. Crawford. August 1999. (Obsoleted by RFC 6672) (Updated by RFC 4592, RFC 6604) (Status: Proposed Standard)
2703 Protocol-independent Content Negotiation Framework. G. Klyne. September 1999. (Status: Informational)
2738 Corrections to “A Syntax for Describing Media Feature Sets.” G. Klyne. December 1999. (Updates RFC 2533) (Status: Proposed Standard)
2749 COPS usage for RSVP. S. Herzog, Ed., J. Boyle, R. Cohen, D. Durham, R. Rajan, A. Sastry. January 2000. (Status: Proposed Standard)
2750 RSVP Extensions for Policy Control. S. Herzog. January 2000. (Updates RFC 2205) (Status: Proposed Standard)
2753 A Framework for Policy-based Admission Control. R. Yavatkar, D. Pendarakis, R. Guerin. January 2000. (Status: Informational)
2778 A Model for Presence and Instant Messaging. M. Day, J. Rosenberg, H. Sugano. February 2000. (Status: Informational)
2782 A DNS RR for Specifying the Location of Services (DNS SRV). A. Gulbrandsen, P. Vixie, L. Esibov. February 2000. (Obsoletes RFC 2052) (Updated by RFC 6335) (Status: Proposed Standard)
2784 Generic Routing Encapsulation (GRE). D. Farinacci, T. Li, S. Hanks, D. Meyer, P. Traina. March 2000. (Updated by RFC 2890) (Status: Proposed Standard)
2818 HTTP Over TLS. E. Rescorla. May 2000. (Updated by RFC 5785, RFC 7230) (Status: Informational)
2822 Internet Message Format. P. Resnick, Ed. April 2001. (Obsoletes RFC 0822) (Obsoleted by RFC 5322) (Updated by RFC 5335, RFC 5336) (Status: Proposed Standard)
2824 Call Processing Language Framework and Requirements. J. Lennox, H. Schulzrinne. May 2000. (Status: Informational)
2827 Network Ingress Filtering: Defeating Denial of Service Attacks which employ IP Source Address Spoofing. P. Ferguson, D. Senie. May 2000. (Obsoletes RFC 2267) (Updated by RFC 3704) (Also BCP 0038) (Status: Best Current Practice)
2848 The PINT Service Protocol: Extensions to SIP and SDP for IP Access to Telephone Call Services. S. Petrack, L. Conroy. June 2000. (Status: Proposed Standard)
2849 The LDAP Data Interchange Format (LDIF)—Technical Specification. G. Good. June 2000. (Status: Proposed Standard)
2872 Application and Sub Application Identity Policy Element for Use with RSVP. Y. Bernet, R. Pabbati. June 2000. (Status: Proposed Standard)
2904 AAA Authorization Framework. J. Vollbrecht, P. Calhoun, S. Farrell, L. Gommans, G. Gross, B. de Bruijn, C. de Laat, M. Holdrege, D. Spence. August 2000. (Status: Informational)
2914 Congestion Control Principles. S. Floyd. September 2000. (Updated by RFC 7141) (Also BCP 0041) (Status: Best Current Practice)
2931 DNS Request and Transaction Signatures (SIG(0)s). D. Eastlake 3rd. September 2000. (Updates RFC 2535) (Status: Proposed Standard)
2974 Session Announcement Protocol. M. Handley, C. Perkins, E. Whelan. October 2000. (Status: Experimental)
3006 Integrated Services in the Presence of Compressible Flows. B. Davie, C. Iturralde, D. Oran, S. Casner, J. Wroclawski. November 2000. (Status: Proposed Standard)
3043 The Network Solutions Personal Internet Name (PIN): A URN Namespace for People and Organizations. M. Mealling. January 2001. (Status: Informational)
3044 Using the ISSN (International Serial Standard Number) as URN (Uniform Resource Names) within an ISSN-URN Namespace. S. Rozenfeld. January 2001. (Status: Informational)
3050 Common Gateway Interface for SIP. J. Lennox, H. Schulzrinne, J. Rosenberg. January 2001. (Status: Informational)
3084 COPS Usage for Policy Provisioning (COPS-PR). K. Chan, J. Seligson, D. Durham, S. Gai, K. McCloghrie, S. Herzog, F. Reichmeyer, R. Yavatkar, A. Smith. March 2001. (Status: Proposed Standard)
3087 Control of Service Context using SIP Request-URI. B. Campbell, R. Sparks. April 2001. (Status: Informational)
3095 RObust Header Compression (ROHC): Framework and four profiles: RTP, UDP, ESP, and uncompressed. C. Bormann, C. Burmeister, M. Degermark, H. Fukushima, H. Hannu, L.-E. Jonsson, R. Hakenberg, T. Koren, K. Le, Z. Liu, A. Martensson, A. Miyazaki, K. Svanbro, T. Wiebke, T. Yoshimura, H. Zheng. July 2001. (Updated by RFC 3759, RFC 4815) (Status: Proposed Standard)
3108 Conventions for the use of the Session Description Protocol (SDP) for ATM Bearer Connections. R. Kumar, M. Mostafa. May 2001. (Status: Proposed Standard)
3120 A URN Namespace for XML.org. K. Best, N. Walsh. June 2001. (Status: Informational)
3174 US Secure Hash Algorithm 1 (SHA1). D. Eastlake 3rd, P. Jones. September 2001. (Updated by RFC 4634, RFC 6234) (Status: Informational)
3187 Using International Standard Book Numbers as Uniform Resource Names. J. Hakala, H. Walravens. October 2001. (Status: Informational)
3188 Using National Bibliography Numbers as Uniform Resource Names. J. Hakala. October 2001. (Status: Informational)
3204 MIME Media Types for ISUP and QSIG Objects. E. Zimmerer, J. Peterson, A. Vemuri, L. Ong, F. Audet, M. Watson, M. Zonoun. December 2001. (Updated by RFC 3459, RFC 5621) (Status: Proposed Standard)
3219 Telephony Routing over IP (TRIP). J. Rosenberg, H. Salama, M. Squire. January 2002. (Status: Proposed Standard)
3238 IAB Architectural and Policy Considerations for Open Pluggable Edge Services. S. Floyd, L. Daigle. January 2002. (Status: Informational)
3240 Digital Imaging and Communications in Medicine (DICOM)—Application/dicom MIME Sub-type Registration. D. Clunie, E. Cordonnier. February 2002. (Status: Informational)
3261 SIP: Session Initiation Protocol. J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks, M. Handley, E. Schooler. June 2002. (Obsoletes RFC 2543) (Updated by RFC 3265, RFC 3853, RFC 4320, RFC 4916, RFC 5393, RFC 5621, RFC 5626, RFC 5630, RFC 5922, RFC 5954, RFC 6026, RFC 6141) (Status: Proposed Standard)
3262 Reliability of Provisional Responses in Session Initiation Protocol (SIP). J. Rosenberg, H. Schulzrinne. June 2002. (Obsoletes RFC 2543) (Status: Proposed Standard)
3263 Session Initiation Protocol (SIP): Locating SIP Servers. J. Rosenberg, H. Schulzrinne. June 2002. (Obsoletes RFC 2543) (Status: Proposed Standard)
3264 An Offer/Answer Model with Session Description Protocol (SDP). J. Rosenberg, H. Schulzrinne. June 2002. (Obsoletes RFC 2543) (Updated by RFC 6157) (Status: Proposed Standard)
3265 Session Initiation Protocol (SIP)-Specific Event Notification. A.B. Roach. June 2002. (Obsoletes RFC 2543) (Updates RFC 3261) (Updated by RFC 5367, RFC 5727) (Status: Proposed Standard)
3274 Compressed Data Content Type for Cryptographic Message Syntax (CMS). P. Gutmann. June 2002. (Status: Proposed Standard)
3303 Middlebox Communication Architecture and Framework. P. Srisuresh, J. Kuthan, J. Rosenberg, A. Molitor, A. Rayhan. August 2002. (Status: Informational)
3311 The Session Initiation Protocol (SIP) UPDATE Method. J. Rosenberg. October 2002. (Status: Proposed Standard)
3312 Integration of Resource Management and Session Initiation Protocol (SIP). G. Camarillo, Ed., W. Marshall, Ed., J. Rosenberg. October 2002. (Updated by RFC 4032, RFC 5027) (Status: Proposed Standard)
3313 Private Session Initiation Protocol (SIP) Extensions for Media Authorization. W. Marshall, Ed. January 2003. (Status: Informational)
3320 Signaling Compression (SigComp). R. Price, C. Bormann, J. Christoffersson, H. Hannu, Z. Liu, J. Rosenberg. January 2003. (Updated by RFC 4896) (Status: Proposed Standard)
3323 A Privacy Mechanism for the Session Initiation Protocol (SIP). J. Peterson. November 2002. (Status: Proposed Standard)
3324 Short Term Requirements for Network Asserted Identity. M. Watson. November 2002. (Status: Informational)
3325 Private Extensions to the Session Initiation Protocol (SIP) for Asserted Identity within Trusted Networks. C. Jennings, J. Peterson, M. Watson. November 2002. (Updated by RFC 5876) (Status: Informational)
3326 The Reason Header Field for the Session Initiation Protocol (SIP). H. Schulzrinne, D. Oran, G. Camarillo. December 2002. (Status: Proposed Standard)
3327 Session Initiation Protocol (SIP) Extension Header Field for Registering Non-Adjacent Contacts. D. Willis, B. Hoeneisen. December 2002. (Updated by RFC 5626) (Status: Proposed Standard)
3329 Security Mechanism Agreement for the Session Initiation Protocol (SIP). J. Arkko, V. Torvinen, G. Camarillo, A. Niemi, T. Haukka. January 2003. (Status: Proposed Standard)
3351 User Requirements for the Session Initiation Protocol (SIP) in Support of Deaf, Hard of Hearing and Speech-impaired Individuals. N. Charlton, M. Gasson, G. Gybels, M. Spanner, A. van Wijk. August 2002. (Status: Informational)
3361 Dynamic Host Configuration Protocol (DHCP-for-IPv4) Option for Session Initiation Protocol (SIP) Servers. H. Schulzrinne. August 2002. (Status: Proposed Standard)
3362 Real-time Facsimile (T.38)—image/t38 MIME Sub-type Registration. G. Parsons. August 2002. (Status: Proposed Standard)
3370 Cryptographic Message Syntax (CMS) Algorithms. R. Housley. August 2002. (Obsoletes RFC 2630, RFC 3211) (Updated by RFC 5754) (Status: Proposed Standard)
3372 Session Initiation Protocol for Telephones (SIP-T): Context and Architectures. A. Vemuri, J. Peterson. September 2002. (Status: Best Current Practice)
3388 Grouping of Media Lines in the Session Description Protocol (SDP). G. Camarillo, G. Eriksson, J. Holler, H. Schulzrinne. December 2002. (Obsoleted by RFC 5888) (Status: Proposed Standard)
3398 Integrated Services Digital Network (ISDN) User Part (ISUP) to Session Initiation Protocol (SIP) Mapping. G. Camarillo, A.B. Roach, J. Peterson, L. Ong. December 2002. (Status: Proposed Standard)
3401 Dynamic Delegation Discovery System (DDDS) Part One: The Comprehensive DDDS. M. Mealling. October 2002. (Obsoletes RFC 2915, RFC 2168) (Updates RFC 2276) (Status: Informational)
3402 Dynamic Delegation Discovery System (DDDS) Part Two: The Algorithm. M. Mealling. October 2002. (Obsoletes RFC 2915, RFC 2168) (Status: Proposed Standard)
3403 Dynamic Delegation Discovery System (DDDS) Part Three: The Domain Name System (DNS) Database. M. Mealling. October 2002. (Obsoletes RFC 2915, RFC 2168) (Status: Proposed Standard)
3404 Dynamic Delegation Discovery System (DDDS) Part Four: The Uniform Resource Identifiers (URI). M. Mealling. October 2002. (Obsoletes RFC 2915, RFC 2168) (Status: Proposed Standard)
3406 Uniform Resource Names (URN) Namespace Definition Mechanisms. L. Daigle, D. van Gulik, R. Iannella, P. Faltstrom. October 2002. (Obsoletes RFC 2611) (Also BCP 0066) (Status: Best Current Practice)
3420 Internet Media Type message/sipfrag. R. Sparks. November 2002. (Status: Proposed Standard)
3428 Session Initiation Protocol (SIP) Extension for Instant Messaging. B. Campbell, Ed., J. Rosenberg, H. Schulzrinne, C. Huitema, D. Gurle. December 2002. (Status: Proposed Standard)
3459 Critical Content Multi-purpose Internet Mail Extensions (MIME) Parameter. E. Burger. January 2003. (Updates RFC 3204) (Status: Proposed Standard)
3480 Signalling Unnumbered Links in CR-LDP (Constraint-Routing Label Distribution Protocol). K. Kompella, Y. Rekhter, A. Kullberg. February 2003. (Status: Proposed Standard)
3481 TCP over Second (2.5G) and Third (3G) Generation Wireless Networks. H. Inamura, Ed., G. Montenegro, Ed., R. Ludwig, A. Gurtov, F. Khafizov. February 2003. (Also BCP 0071) (Status: Best Current Practice)
3485 The Session Initiation Protocol (SIP) and 3551 RTP Profile for Audio and Video Conferences
Session Description Protocol (SDP) Static with Minimal Control. H. Schulzrinne,
Dictionary for Signaling Compression S. Casner. July 2003. (Obsoletes RFC 1890)
(SigComp). M. Garcia-Martin, C. Bormann, (Updated by RFC 5761, RFC 7007)
J. Ott, R. Price, A.B. Roach. February 2003. (Also STD 0065) (Status: Internet Standard)
(Updated by RFC 4896) (Status: Proposed
Standard) 3578 Mapping of Integrated Services Digital Network
(ISDN) User Part (ISUP) Overlap Signalling to
3486 Compressing the Session Initiation Protocol the Session Initiation Protocol (SIP).
(SIP). G. Camarillo. February 2003. (Updated G. Camarillo, A.B. Roach, J. Peterson, L. Ong.
by RFC 5049) (Status: Proposed Standard) August 2003. (Status: Proposed Standard)
3487 Requirements for Resource Priority 3581 An Extension to the Session Initiation Protocol
Mechanisms for the Session Initiation (SIP) for Symmetric Response Routing.
Protocol (SIP). H. Schulzrinne. February 2003. J. Rosenberg, H. Schulzrinne. August 2003.
(Status: Informational) (Status: Proposed Standard)
3490 Internationalizing Domain Names in 3605 Real Time Control Protocol (RTCP) attribute
Applications (IDNA). P. Faltstrom, P. Hoffman, in Session Description Protocol (SDP).
A. Costello. March 2003. (Obsoleted by RFC C. Huitema. October 2003.
5890, RFC 5891) (Status: Proposed Standard) (Status: Proposed Standard)
3515 The Session Initiation Protocol (SIP) Refer 3608 Session Initiation Protocol (SIP) Extension
Method. R. Sparks. April 2003. (Status: Header Field for Service Route Discovery
Proposed Standard) During Registration. D. Willis, B. Hoeneisen.
October 2003. (Updated by RFC 5630)
3521 Framework for Session Set-up with Media (Status: Proposed Standard)
Authorization. L-N. Hamer, B. Gage, H. Shieh.
April 2003. (Status: Informational) 3665 Session Initiation Protocol (SIP) Basic Call Flow
Examples. A. Johnston, S. Donovan, R. Sparks,
3523 Internet Emergency Preparedness (IEPREP) C. Cunningham, K. Summers. December 2003.
Telephony Topology Terminology. J. Polk. April (Status: Best Current Practice)
2003. (Status: Informational)
3666 Session Initiation Protocol (SIP) Public
3524 Mapping of Media Streams to Resource Switched Telephone Network (PSTN) Call
Reservation Flows. G. Camarillo, A. Monrad. Flows. A. Johnston, S. Donovan, R. Sparks,
April 2003. (Status: Proposed Standard) C. Cunningham, K. Summers. December 2003.
3530 Network File System (NFS) version 4 Protocol. (Also BCP 0076) (Status: Best Current Practice)
S. Shepler, B. Callaghan, D. Robinson, 3669 Guidelines for Working Groups on Intellectual
R. Thurlow, C. Beame, M. Eisler, D. Noveck. Property Issues. S. Brim. February 2004.
April 2003. (Obsoletes RFC 3010) (Obsoleted (Status: Informational)
by RFC 7530) (Status: Proposed Standard)
3680 A Session Initiation Protocol (SIP) Event
3540 Robust Explicit Congestion Notification (ECN) Package for Registrations. J. Rosenberg.
Signaling with Nonces. N. Spring, March 2004. (Updated by RFC 6140)
D. Wetherall, D. Ely. June 2003. (Status: Proposed Standard)
(Status: Experimental)
3693 Geopriv Requirements. J. Cuellar, J. Morris,
3550 RTP: A Transport Protocol for Real-Time D. Mulligan, J. Peterson, J. Polk. February 2004.
Applications. H. Schulzrinne, S. Casner, (Updated by RFC 6280, RFC 7459)
R. Frederick, V. Jacobson. July 2003. (Status: Informational)
(Obsoletes RFC 1889) (Updated by RFC 5506,
RFC 5761, RFC 6051, RFC 6222, RFC 7022, 3711 The Secure Real-time Transport Protocol
RFC 7160, RFC 7164) (Also STD 0064) (SRTP). M. Baugher, D. McGrew, M. Naslund,
(Status: Internet Standard) E. Carrara, K. Norrman. March 2004. (Updated
by RFC 5506, RFC 6904) (Status: Proposed
Standard)
3725 Best Current Practices for Third Party Call 3862 Common Presence and Instant Messaging
Control (3pcc) in the Session Initiation (CPIM): Message Format. G. Klyne, D. Atkins.
Protocol (SIP). J. Rosenberg, J. Peterson, August 2004. (Status: Proposed Standard)
H. Schulzrinne, G. Camarillo. April 2004.
(Status: Best Current Practice) 3863 Presence Information Data Format (PIDF).
H. Sugano, S. Fujimoto, G. Klyne, A. Bateman,
3735 Guidelines for Extending the Extensible W. Carr, J. Peterson. August 2004.
Provisioning Protocol (EPP). S. Hollenbeck. (Status: Proposed Standard)
March 2004. (Status: Informational)
3873 Stream Control Transmission Protocol (SCTP)
3764 Enumservice registration for Session Initiation Management Information Base (MIB).
Protocol (SIP) Addresses-of-Record. J. Pastor, M. Belinchon. September 2004.
J. Peterson. April 2004. (Updated by RFC 6118) (Status: Proposed Standard)
(Status: Proposed Standard)
3880 Call Processing Language (CPL): A Language
3824 Using E.164 numbers with the Session for User Control of Internet Telephony
Initiation Protocol (SIP). J. Peterson, H. Liu, Services. J. Lennox, X. Wu, H. Schulzrinne.
J. Yu, B. Campbell. June 2004. October 2004. (Status: Proposed Standard)
(Status: Informational)
3891 The Session Initiation Protocol (SIP) “Replaces”
3830 MIKEY: Multimedia Internet KEYing. J. Arkko, Header. R. Mahy, B. Biggs, R. Dean. September
E. Carrara, F. Lindholm, M. Naslund, K. 2004. (Status: Proposed Standard)
Norrman. August 2004. (Updated by RFC 4738,
RFC 6309) (Status: Proposed Standard) 3892 The Session Initiation Protocol (SIP)
Referred-By Mechanism. R. Sparks. September
3840 Indicating User Agent Capabilities in the 2004. (Status: Proposed Standard)
Session Initiation Protocol (SIP). J. Rosenberg,
H. Schulzrinne, P. Kyzivat. August 2004. 3893 Session Initiation Protocol (SIP) Authenticated
(Status: Proposed Standard) Identity Body (AIB) Format. J. Peterson.
September 2004. (Status: Proposed Standard)
3841 Caller Preferences for the Session Initiation
Protocol (SIP). J. Rosenberg, H. Schulzrinne, 3903 Session Initiation Protocol (SIP) Extension for
P. Kyzivat. August 2004. Event State Publication. A. Niemi, Ed. October
(Status: Proposed Standard) 2004. (Status: Proposed Standard)
3842 A Message Summary and Message Waiting 3911 The Session Initiation Protocol (SIP) “Join”
Indication Event Package for the Session Header. R. Mahy, D. Petrie. October 2004.
Initiation Protocol (SIP). R. Mahy. August 2004. (Status: Proposed Standard)
(Status: Proposed Standard) 3957 Authentication, Authorization, and Accounting
3853 S/MIME Advanced Encryption Standard (AES) (AAA) Registration Keys for Mobile IPv4.
Requirement for the Session Initiation Protocol C. Perkins, P. Calhoun. March 2005.
(SIP). J. Peterson. July 2004. (Updates RFC 3261) (Status: Proposed Standard)
(Status: Proposed Standard) 3958 Domain-Based Application Service Location
3856 A Presence Event Package for the Session Using SRV RRs and the Dynamic Delegation
Initiation Protocol (SIP). J. Rosenberg. Discovery Service (DDDS). L. Daigle,
August 2004. (Status: Proposed Standard) A. Newton. January 2005.
(Status: Proposed Standard)
3857 A Watcher Information Event Template-
Package for the Session Initiation Protocol 3959 The Early Session Disposition Type for the
(SIP). J. Rosenberg. August 2004. Session Initiation Protocol (SIP). G. Camarillo.
(Status: Proposed Standard) December 2004. (Status: Proposed Standard)
3858 An Extensible Markup Language (XML) Based 3960 Early Media and Ringing Tone Generation in
Format for Watcher Information. J. Rosenberg. the Session Initiation Protocol (SIP).
August 2004. (Status: Proposed Standard) G. Camarillo, H. Schulzrinne. December 2004.
(Status: Informational)
3859 Common Profile for Presence (CPP).
J. Peterson. August 2004. (Status: Proposed
Standard)
3966 The tel URI for Telephone Numbers. 4035 Protocol Modifications for the DNS Security
H. Schulzrinne. December 2004. Extensions. R. Arends, R. Austein, M. Larson,
(Obsoletes RFC 2806) D. Massey, S. Rose. March 2005. (Obsoletes
(Updated by RFC 5341) (Status: Proposed RFC 2535, RFC 3008, RFC 3090, RFC 3445, RFC
Standard) 3655, RFC 3658, RFC 3755, RFC 3757, RFC 3845)
(Updates RFC 1034, RFC 1035, RFC 2136, RFC
3968 The Internet Assigned Number Authority 2181, RFC 2308, RFC 3225, RFC 3597, RFC 3226)
(IANA) Header Field Parameter Registry for (Updated by RFC 4470, RFC 6014, RFC 6840)
the Session Initiation Protocol (SIP). (Status: Proposed Standard)
G. Camarillo. December 2004. (Updates RFC
3427) (Status: Best Current Practice) 4083 Input 3rd-Generation Partnership Project
(3GPP) Release 5 Requirements on the Session
3969 The Internet Assigned Number Authority Initiation Protocol (SIP). M. Garcia-Martin.
(IANA) Uniform Resource Identifier (URI) May 2005. (Status: Informational)
Parameter Registry for the Session Initiation
Protocol (SIP). G. Camarillo. December 2004. 4086 Randomness Requirements for Security.
(Updates RFC 3427) (Updated by RFC 5727) D. Eastlake 3rd, J. Schiller, S. Crocker. June
(Also BCP 0099) (Status: Best Current Practice) 2005. (Obsoletes RFC 1750) (Also BCP 0106)
(Status: Best Current Practice)
3986 Uniform Resource Identifier (URI): Generic
Syntax. T. Berners-Lee, R. Fielding, L. Masinter. 4103 RTP Payload for Text Conversation.
January 2005. (Obsoletes RFC 2732, RFC 2396, G. Hellstrom, P. Jones. June 2005. (Obsoletes
RFC 1808) (Updates RFC 1738) (Updated by RFC 2793) (Status: Proposed Standard)
RFC 6874, RFC 7320) (Also STD 0066) (Status:
Internet Standard) 4117 Transcoding Services Invocation in the Session
Initiation Protocol (SIP) Using Third Party Call
4028 Session Timers in the Session Initiation Control (3pcc). G. Camarillo, E. Burger,
Protocol (SIP). S. Donovan, J. Rosenberg. April H. Schulzrinne, A. van Wijk. June 2005.
2005. (Status: Proposed Standard) (Status: Informational)
4032 Update to the Session Initiation Protocol (SIP) 4119 A Presence-based GEOPRIV Location Object
Preconditions Framework. G. Camarillo, Format. J. Peterson. December 2005. (Updated
P. Kyzivat. March 2005. (Updates RFC 3312) by RFC 5139, RFC 5491, RFC 7459)
(Status: Proposed Standard) (Status: Proposed Standard)
4033 DNS Security Introduction and Requirements. 4122 A Universally Unique IDentifier (UUID) URN
R. Arends, R. Austein, M. Larson, D. Massey, Namespace. P. Leach, M. Mealling, R. Salz. July
S. Rose. March 2005. (Obsoletes RFC 2535, 2005. (Status: Proposed Standard)
RFC 3008, RFC 3090, RFC 3445, RFC 3655, RFC
3658, RFC 3755, RFC 3757, RFC 3845) (Updates 4145 TCP-Based Media Transport in the Session
RFC 1034, RFC 1035, RFC 2136, RFC 2181, RFC Description Protocol (SDP). D. Yon,
2308, RFC 3225, RFC 3597, RFC 3226) G. Camarillo. September 2005. (Updated by
(Updated by RFC 6014, RFC 6840) RFC 4572) (Status: Proposed Standard)
(Status: Proposed Standard) 4149 Definition of Managed Objects for Synthetic
4034 Resource Records for the DNS Security Sources for Performance Monitoring
Extensions. R. Arends, R. Austein, M. Larson, Algorithms. C. Kalbfleisch, R. Cole,
D. Massey, S. Rose. March 2005. (Obsoletes D. Romascanu. August 2005. (Status: Proposed
RFC 2535, RFC 3008, RFC 3090, RFC 3445, RFC Standard)
3655, RFC 3658, RFC 3755, RFC 3757, RFC 3845) 4152 A Uniform Resource Name (URN) Namespace
(Updates RFC 1034, RFC 1035, RFC 2136, RFC for the Common Language Equipment
2181, RFC 2308, RFC 3225, RFC 3597, RFC 3226) Identifier (CLEI) Code. K. Tesink, R. Fox.
(Updated by RFC 4470, RFC 6014, RFC 6840, August 2005. (Status: Informational)
RFC 6944) (Status: Proposed Standard)
4168 The Stream Control Transmission Protocol
(SCTP) as a Transport for the Session Initiation
Protocol (SIP). J. Rosenberg, H. Schulzrinne,
G. Camarillo. October 2005. (Status: Proposed
Standard)
4179 Using Universal Content Identifier (UCI) as 4305 Cryptographic Algorithm Implementation
Uniform Resource Names (URN). S. Kang. Requirements for Encapsulating Security
October 2005. (Status: Informational) Payload (ESP) and Authentication Header
(AH). D. Eastlake 3rd. December 2005.
4195 A Uniform Resource Name (URN) Namespace (Obsoletes RFC 2402, RFC 2406) (Obsoleted by
for the TV-Anytime Forum. W. Kameyama. RFC 4835) (Status: Proposed Standard)
October 2005. (Status: Informational)
4306 Internet Key Exchange (IKEv2) Protocol.
4198 A Uniform Resource Name (URN) Namespace C. Kaufman, Ed. December 2005. (Obsoletes
for Federated Content. D. Tessman. November RFC 2407, RFC 2408, RFC 2409) (Obsoleted by
2005. (Status: Informational) RFC 5996) (Updated by RFC 5282)
4234 Augmented BNF for Syntax Specifications: (Status: Proposed Standard)
ABNF. D. Crocker, Ed., P. Overell. October 4313 Requirements for Distributed Control of
2005. (Obsoletes RFC 2234) (Obsoleted by RFC Automatic Speech Recognition (ASR), Speaker
5234) (Status: Draft Standard) Identification/Speaker Verification (SI/SV), and
4235 An INVITE-Initiated Dialog Event Package for Text-to-Speech (TTS) Resources. D. Oran.
the Session Initiation Protocol (SIP). December 2005. (Status: Informational)
J. Rosenberg, H. Schulzrinne, R. Mahy, Ed. 4320 Actions Addressing Identified Issues with the
November 2005. (Status: Proposed Standard) Session Initiation Protocol’s (SIP) Non-INVITE
4240 Basic Network Media Services with SIP. Transaction. R. Sparks. January 2006. (Updates
E. Burger, Ed., J. Van Dyke, A. Spitzer. RFC 3261) (Status: Proposed Standard)
December 2005. (Status: Informational) 4321 Problems Identified Associated with the
4244 An Extension to the Session Initiation Protocol Session Initiation Protocol’s (SIP) Non-INVITE
(SIP) for Request History Information. Transaction. R. Sparks. January 2006.
M. Barnes, Ed. November 2005. (Status: Informational)
(Status: Proposed Standard) 4335 The Secure Shell (SSH) Session Channel Break
4248 The telnet URI Scheme. P. Hoffman. October Extension. J. Galbraith, P. Remaker. January
2005. (Obsoletes RFC 1738) (Status: Proposed 2006. (Status: Proposed Standard)
Standard) 4340 Datagram Congestion Control Protocol
4266 The gopher URI Scheme. P. Hoffman. (DCCP). E. Kohler, M. Handley, S. Floyd. March
November 2005. (Obsoletes RFC 1738) (Status: 2006. (Updated by RFC 5595, RFC 5596, RFC
Proposed Standard) 6335, RFC 6773) (Status: Proposed Standard)
4269 The SEED Encryption Algorithm. H.J. Lee, 4343 Domain Name System (DNS) Case Insensitivity
S.J. Lee, J.H. Yoon, D.H. Cheon, J.I. Lee. Clarification. D. Eastlake 3rd. January 2006.
December 2005. (Obsoletes RFC 4009) (Updates RFC 1034, RFC 1035, RFC 2181)
(Status: Informational) (Status: Proposed Standard)
4289 Multipurpose Internet Mail Extensions (MIME) 4346 The Transport Layer Security (TLS) Protocol
Part Four: Registration Procedures. N. Freed, Version 1.1. T. Dierks, E. Rescorla. April 2006.
J. Klensin. December 2005. (Obsoletes RFC 2048) (Obsoletes RFC 2246) (Obsoleted by RFC 5246)
(Also BCP 0013) (Status: Best Current Practice) (Updated by RFC 4366, RFC 4680, RFC 4681,
RFC 5746, RFC 6176, RFC 7465)
4301 Security Architecture for the Internet Protocol. (Status: Proposed Standard)
S. Kent, K. Seo. December 2005. (Obsoletes
RFC 2401) (Updates RFC 3168) (Updated by 4353 A Framework for Conferencing with the
RFC 6040) (Status: Proposed Standard) Session Initiation Protocol (SIP). J. Rosenberg.
February 2006. (Status: Informational)
4302 IP Authentication Header. S. Kent. December
2005. (Obsoletes RFC 2402) (Status: Proposed 4375 Emergency Telecommunications Services (ETS)
Standard) Requirements for a Single Administrative
Domain. K. Carlberg. January 2006.
4303 IP Encapsulating Security Payload (ESP). S. Kent. (Status: Informational)
December 2005. (Obsoletes RFC 2406) (Status:
Proposed Standard)
4408 Sender Policy Framework (SPF) for Authorizing 4504 SIP Telephony Device Requirements and
Use of Domains in E-Mail, Version 1. M. Wong, Configuration. H. Sinnreich, Ed., S. Lass,
W. Schlitt. April 2006. (Obsoleted by RFC 7208) C. Stredicke. May 2006. (Status: Informational)
(Updated by RFC 6652) (Status: Experimental)
4508 Conveying Feature Tags with the Session
4411 Extending the Session Initiation Protocol (SIP) Initiation Protocol (SIP) REFER Method.
Reason Header for Preemption Events. J. Polk. O. Levin, A. Johnston. May 2006.
February 2006. (Status: Proposed Standard) (Status: Proposed Standard)
4412 Communications Resource Priority for the 4515 Lightweight Directory Access Protocol (LDAP):
Session Initiation Protocol (SIP). String Representation of Search Filters.
H. Schulzrinne, J. Polk. February 2006. M. Smith, Ed., T. Howes. June 2006. (Obsoletes
(Status: Proposed Standard) RFC 2254) (Status: Proposed Standard)
4457 The Session Initiation Protocol (SIP) P-User- 4542 Implementing an Emergency
Database Private-Header (P-Header). Telecommunications Service (ETS) for
G. Camarillo, G. Blanco. April 2006. (Status: Real-Time Services in the Internet Protocol
Informational) Suite. F. Baker, J. Polk. May 2006. (Updated by
RFC 5865) (Status: Informational)
4458 Session Initiation Protocol (SIP) URIs for
Applications such as Voicemail and Interactive 4566 Session Description Protocol. M. Handley,
Voice Response (IVR). C. Jennings, F. Audet, V. Jacobson, C. Perkins. July 2006. (Obsoletes
J. Elwell. April 2006. (Status: Informational) RFC 2327, RFC 3266) (Status: Proposed
Standard)
4474 Enhancements for Authenticated Identity
Management in the Session Initiation 4567 Key Management Extensions for Session
Protocol (SIP). J. Peterson, C. Jennings. August Description Protocol (SDP) and Real Time
2006. (Status: Proposed Standard) Streaming Protocol (RTSP). J. Arkko,
F. Lindholm, M. Naslund, K. Norrman, E. Carrara.
4475 Session Initiation Protocol (SIP) Torture July 2006. (Status: Proposed Standard)
Test Messages. R. Sparks, Ed., A. Hawrylyshen,
A. Johnston, J. Rosenberg, H. Schulzrinne. May 4568 Session Description Protocol (SDP) Security
2006. (Status: Informational) Descriptions for Media Streams. F. Andreasen,
M. Baugher, D. Wing. July 2006.
4479 A Data Model for Presence. J. Rosenberg. July (Status: Proposed Standard)
2006. (Status: Proposed Standard)
4572 Connection-Oriented Media Transport over
4483 A Mechanism for Content Indirection in the Transport Layer Security (TLS) Protocol in
Session Initiation Protocol (SIP) Messages. the Session Description Protocol (SDP).
E. Burger, Ed. May 2006. (Status: Proposed J. Lennox. July 2006. (Updates RFC 4145)
Standard) (Status: Proposed Standard)
4484 Trait-Based Authorization Requirements for the 4574 The Session Description Protocol (SDP) Label
Session Initiation Protocol (SIP). J. Peterson, Attribute. O. Levin, G. Camarillo. August 2006.
J. Polk, D. Sicker, H. Tschofenig. August 2006. (Status: Proposed Standard)
(Status: Informational)
4575 A Session Initiation Protocol (SIP) Event
4488 Suppression of Session Initiation Protocol (SIP) Package for Conference State. J. Rosenberg,
REFER Method Implicit Subscription. O. Levin. H. Schulzrinne, O. Levin, Ed. August 2006.
May 2006. (Status: Proposed Standard) (Status: Proposed Standard)
4497 Interworking between the Session Initiation 4579 Session Initiation Protocol (SIP) Call Control—
Protocol (SIP) and QSIG. J. Elwell, F. Derks, Conferencing for User Agents. A. Johnston,
P. Mourot, O. Rousseau. May 2006. O. Levin. August 2006. (Status: Best Current
(Status: Best Current Practice) Practice)
4582 The Binary Floor Control Protocol (BFCP). 4730 A Session Initiation Protocol (SIP) Event
G. Camarillo, J. Ott, K. Drage. November 2006. Package for Key Press Stimulus (KPML).
(Status: Proposed Standard) E. Burger, M. Dolly. November 2006.
(Status: Proposed Standard)
4583 Session Description Protocol (SDP) Format for
Binary Floor Control Protocol (BFCP) Streams. 4733 RTP Payload for DTMF Digits, Telephony Tones,
G. Camarillo. November 2006. (Status: and Telephony Signals. H. Schulzrinne,
Proposed Standard) T. Taylor. December 2006. (Obsoletes RFC 2833)
(Updated by RFC 4734, RFC 5244)
4585 Extended RTP Profile for Real-time Transport (Status: Proposed Standard)
Control Protocol (RTCP)-Based Feedback
(RTP/AVPF). J. Ott, S. Wenger, N. Sato, 4734 Definition of Events for Modem, Fax, and Text
C. Burmeister, J. Rey. July 2006. (Updated by Telephony Signals. H. Schulzrinne, T. Taylor.
RFC 5506) (Status: Proposed Standard) December 2006. (Obsoletes RFC 2833)
(Updates RFC 4733) (Status: Proposed
4612 Real-Time Facsimile (T.38)—audio/t38 MIME Standard)
Sub-type Registration. P. Jones, H. Tamura.
August 2006. (Status: Historic) 4740 Diameter Session Initiation Protocol (SIP)
Application. M. Garcia-Martin, Ed.,
4628 RTP Payload Format for H.263 Moving RFC 2190 M. Belinchon, M. Pallares-Lopez, C. Canales-
to Historic Status. R. Even. January 2007. Valenzuela, K. Tammi. November 2006.
(Status: Informational) (Status: Proposed Standard)
4629 RTP Payload Format for ITU-T Rec. H.263 Video. 4825 The Extensible Markup Language (XML)
J. Ott, C. Bormann, G. Sullivan, S. Wenger, Configuration Access Protocol (XCAP).
R. Even, Ed. January 2007. (Obsoletes RFC J. Rosenberg. May 2007. (Status: Proposed
2429) (Updates RFC 3555) (Status: Proposed Standard)
Standard)
4826 Extensible Markup Language (XML) Formats for
4646 Tags for Identifying Languages. A. Phillips, Representing Resource Lists. J. Rosenberg.
M. Davis. September 2006. (Obsoletes RFC May 2007. (Status: Proposed Standard)
3066) (Obsoleted by RFC 5646) (Status: Best
Current Practice) 4867 RTP Payload Format and File Storage Format for
the Adaptive Multi-Rate (AMR) and Adaptive
4647 Matching of Language Tags. A. Phillips, M. Davis. Multi-Rate Wideband (AMR-WB) Audio
September 2006. (Obsoletes RFC 3066) (Also Codecs. J. Sjoberg, M. Westerlund,
BCP 0047) (Status: Best Current Practice) A. Lakaniemi, Q. Xie. April 2007. (Obsoletes
4648 The Base16, Base32, and Base64 Data RFC 3267) (Status: Proposed Standard)
Encodings. S. Josefsson. October 2006. 4906 Transport of Layer 2 Frames Over MPLS.
(Obsoletes RFC 3548) (Status: Proposed L. Martini, Ed., E. Rosen, Ed., N. El-Aawar, Ed.
Standard) June 2007. (Status: Historic)
4660 Functional Description of Event Notification 4916 Connected Identity in the Session Initiation
Filtering. H. Khartabil, E. Leppanen, Protocol (SIP). J. Elwell. June 2007. (Updates
M. Lonnfors, J. Costa-Requena. September RFC 3261) (Status: Proposed Standard)
2006. (Updated by RFC 6665) (Status: Proposed
Standard) 4919 IPv6 over Low-Power Wireless Personal Area
Networks (6LoWPANs): Overview,
4661 An Extensible Markup Language (XML)-Based Assumptions, Problem Statement, and Goals.
Format for Event Notification Filtering. N. Kushalnagar, G. Montenegro,
H. Khartabil, E. Leppanen, M. Lonnfors, C. Schumacher. August 2007.
J. Costa-Requena. September 2006. (Status: Informational)
(Status: Proposed Standard)
4958 A Framework for Supporting Emergency
4662 A Session Initiation Protocol (SIP) Event Telecommunications Services (ETS) within a
Notification Extension for Resource Lists. Single Administrative Domain. K. Carlberg.
A.B. Roach, B. Campbell, J. Rosenberg. August July 2007. (Status: Informational)
2006. (Status: Proposed Standard)
4960 Stream Control Transmission Protocol. 5057 Multiple Dialog Usages in the Session
R. Stewart, Ed. September 2007. (Obsoletes Initiation Protocol. R. Sparks. November 2007.
RFC 2960, RFC 3309) (Updated by RFC 6096, (Status: Informational)
RFC 6335, RFC 7053) (Status: Proposed
Standard) 5069 Security Threats and Requirements for
Emergency Call Marking and Mapping.
4975 The Message Session Relay Protocol (MSRP). T. Taylor, Ed., H. Tschofenig, H. Schulzrinne,
B. Campbell, Ed., R. Mahy, Ed., C. Jennings, Ed. M. Shanmugam. January 2008. (Status:
September 2007. (Status: Proposed Standard) Informational)
4976 Relay Extensions for the Message Sessions 5079 Rejecting Anonymous Requests in the Session
Relay Protocol (MSRP). C. Jennings, R. Mahy, Initiation Protocol (SIP). J. Rosenberg.
A. B. Roach. September 2007. (Status: December 2007. (Status: Proposed Standard)
Proposed Standard)
5112 The Presence-Specific Static Dictionary
5002 The Session Initiation Protocol (SIP) P-Profile- for Signaling Compression (Sigcomp).
Key Private Header (P-Header). G. Camarillo, M. Garcia-Martin. January 2008.
G. Blanco. August 2007. (Status: Informational) (Status: Proposed Standard)
5009 Private Header (P-Header) Extension to the 5155 DNS Security (DNSSEC) Hashed Authenticated
Session Initiation Protocol (SIP) for Denial of Existence. B. Laurie, G. Sisson,
Authorization of Early Media. R. Ejzak. R. Arends, D. Blacka. March 2008.
September 2007. (Status: Informational) (Updated by RFC 6840, RFC 6944)
(Status: Proposed Standard)
5012 Requirements for Emergency Context
Resolution with Internet Technologies. 5168 XML Schema for Media Control. O. Levin,
H. Schulzrinne, R. Marshall, Ed. January 2008. R. Even, P. Hagendorf. March 2008.
(Status: Informational) (Status: Informational)
5022 Media Server Control Markup Language 5196 Session Initiation Protocol (SIP) User Agent
(MSCML) and Protocol. J. Van Dyke, E. Burger, Capability Extension to Presence Information
Ed., A. Spitzer. September 2007. (Obsoletes Data Format (PIDF). M. Lonnfors, K. Kiss.
RFC 4722) (Status: Informational) September 2008. (Status: Proposed Standard)
5027 Security Preconditions for Session Description 5226 Guidelines for Writing an IANA Considerations
Protocol (SDP) Media Streams. F. Andreasen, Section in RFCs. T. Narten, H. Alvestrand. May
D. Wing. October 2007. (Updates RFC 3312) 2008. (Obsoletes RFC 2434) (Also BCP 0026)
(Status: Proposed Standard) (Status: Best Current Practice)
5029 Definition of an IS-IS Link Attribute Sub-TLV. 5234 Augmented BNF for Syntax Specifications:
JP. Vasseur, S. Previdi. September 2007. ABNF. D. Crocker, Ed., P. Overell. January 2008.
(Status: Proposed Standard) (Obsoletes RFC 4234) (Updated by RFC 7405)
(Also STD 0068) (Status: Internet Standard)
5031 A Uniform Resource Name (URN) for
Emergency and Other Well-Known Services. 5239 A Framework for Centralized Conferencing.
H. Schulzrinne. January 2008. (Updated by M. Barnes, C. Boulton, O. Levin. June 2008.
RFC 7163) (Status: Proposed Standard) (Status: Proposed Standard)
5039 The Session Initiation Protocol (SIP) and Spam. 5244 Definition of Events for Channel-Oriented
J. Rosenberg, C. Jennings. January 2008. Telephony Signalling. H. Schulzrinne, T. Taylor.
(Status: Informational) June 2008. (Updates RFC 4733)
(Status: Proposed Standard)
5049 Applying Signaling Compression (SigComp) to
the Session Initiation Protocol (SIP). 5245 Interactive Connectivity Establishment (ICE): A
C. Bormann, Z. Liu, R. Price, G. Camarillo, Ed. Protocol for Network Address Translator
December 2007. (Updates RFC 3486) (NAT) Traversal for Offer/Answer Protocols.
(Status: Proposed Standard) J. Rosenberg. April 2010. (Obsoletes RFC 4091,
RFC 4092) (Updated by RFC 6336)
(Status: Proposed Standard)
5246 The Transport Layer Security (TLS) Protocol 5366 Conference Establishment Using Request-
Version 1.2. T. Dierks, E. Rescorla. August 2008. Contained Lists in the Session Initiation
(Obsoletes RFC 3268, RFC 4346, RFC 4366) Protocol (SIP). G. Camarillo, A. Johnston.
(Updates RFC 4492) (Updated by RFC 5746, October 2008. (Status: Proposed Standard)
RFC 5878, RFC 6176, RFC 7465) (Status:
Proposed Standard) 5367 Subscriptions to Request-Contained Resource
Lists in the Session Initiation Protocol (SIP).
5280 Internet X.509 Public Key Infrastructure G. Camarillo, A.B. Roach, O. Levin. October
Certificate and Certificate Revocation List 2008. (Updates RFC 3265) (Status: Proposed
(CRL) Profile. D. Cooper, S. Santesson, Standard)
S. Farrell, S. Boeyen, R. Housley, W. Polk. May
2008. (Obsoletes RFC 3280, RFC 4325, RFC 5368 Referring to Multiple Resources in the Session
4630) (Updated by RFC 6818) (Status: Proposed Initiation Protocol (SIP). G. Camarillo,
Standard) A. Niemi, M. Isomaki, M. Garcia-Martin,
H. Khartabil. October 2008. (Status: Proposed
5318 The Session Initiation Protocol (SIP) P-Refused- Standard)
URI-List Private-Header (P-Header).
J. Hautakorpi, G. Camarillo. December 2008. 5369 Framework for Transcoding with the Session
(Status: Informational) Initiation Protocol (SIP). G. Camarillo.
October 2008. (Status: Informational)
5322 Internet Message Format. P. Resnick, Ed.
October 2008. (Obsoletes RFC 2822) 5370 The Session Initiation Protocol (SIP)
(Updates RFC 4021) (Updated by RFC 6854) Conference Bridge Transcoding Model.
(Status: Draft Standard) G. Camarillo. October 2008. (Status: Proposed
Standard)
5359 Session Initiation Protocol Service Examples.
A. Johnston, Ed., R. Sparks, C. Cunningham, 5373 Requesting Answering Modes for the Session
S. Donovan, K. Summers. October 2008. Initiation Protocol (SIP). D. Willis, Ed., A. Allen.
(Also BCP 0144) (Status: Best Current Practice) November 2008. (Status: Proposed Standard)
5360 A Framework for Consent-Based 5379 Guidelines for Using the Privacy Mechanism
Communications in the Session Initiation for SIP. M. Munakata, S. Schubert, T. Ohba.
Protocol (SIP). J. Rosenberg, G. Camarillo, Ed., February 2010. (Status: Informational)
D. Willis. October 2008. (Status: Proposed 5389 Session Traversal Utilities for NAT (STUN).
Standard) J. Rosenberg, R. Mahy, P. Matthews, D. Wing.
5361 A Document Format for Requesting Consent. October 2008. (Obsoletes RFC 3489) (Updated
G. Camarillo. October 2008. (Status: Proposed by RFC 7350) (Status: Proposed Standard)
Standard) 5390 Requirements for Management of Overload in
5362 The Session Initiation Protocol (SIP) Pending the Session Initiation Protocol. J. Rosenberg.
Additions Event Package. G. Camarillo. December 2008. (Status: Informational)
October 2008. (Status: Proposed Standard) 5393 Addressing an Amplification Vulnerability in
5363 Framework and Security Considerations for Session Initiation Protocol (SIP) Forking
Session Initiation Protocol (SIP) URI-List Proxies. R. Sparks, Ed., S. Lawrence,
Services. G. Camarillo, A.B. Roach. October A. Hawrylyshen, B. Campen. December 2008.
2008. (Status: Proposed Standard) (Updates RFC 3261) (Status: Proposed
Standard)
5364 Extensible Markup Language (XML) Format
Extension for Representing Copy Control 5405 Unicast UDP Usage Guidelines for Application
Attributes in Resource Lists. M. Garcia-Martin, Designers. L. Eggert, G. Fairhurst. November
G. Camarillo. October 2008. (Status: Proposed 2008. (Also BCP 0145) (Status: Best Current
Standard) Practice)
5365 Multiple-Recipient MESSAGE Requests in the 5411 A Hitchhiker’s Guide to the Session Initiation
Session Initiation Protocol (SIP). M. Garcia- Protocol (SIP). J. Rosenberg. February 2009.
Martin, G. Camarillo. October 2008. (Status: (Status: Informational)
Proposed Standard)
5432 Quality of Service (QoS) Mechanism Selection 5621 Message Body Handling in the Session
in the Session Description Protocol (SDP). Initiation Protocol (SIP). G. Camarillo.
J. Polk, S. Dhesikan, G. Camarillo. March 2009. September 2009. (Updates RFC 3204, RFC 3261,
(Status: Proposed Standard) RFC 3459) (Status: Proposed Standard)
5483 ENUM Implementation Issues and Experiences. 5626 Managing Client-Initiated Connections in the
L. Conroy, K. Fujiwara. March 2009. Session Initiation Protocol (SIP). C. Jennings,
(Status: Informational) Ed., R. Mahy, Ed., F. Audet, Ed. October 2009.
(Updates RFC 3261, RFC 3327) (Status:
5491 GEOPRIV Presence Information Data Format Proposed Standard)
Location Object (PIDF-LO) Usage
Clarification, Considerations, and 5627 Obtaining and Using Globally Routable User
Recommendations. J. Winterbottom, Agent URIs (GRUUs) in the Session Initiation
M. Thomson, H. Tschofenig. March 2009. Protocol (SIP). J. Rosenberg. October 2009.
(Updates RFC 4119) (Updated by RFC 7459) (Status: Proposed Standard)
(Status: Proposed Standard)
5628 Registration Event Package Extension for
5502 The SIP P-Served-User Private-Header Session Initiation Protocol (SIP) Globally
(P-Header) for the 3GPP IP Multimedia (IM) Routable User Agent URIs (GRUUs). P. Kyzivat.
Core Network (CN) Subsystem. J. van Elburg. October 2009. (Status: Proposed Standard)
April 2009. (Status: Informational)
5629 A Framework for Application Interaction in the
5506 Support for Reduced-Size Real-Time Transport Session Initiation Protocol (SIP). J. Rosenberg.
Control Protocol (RTCP): Opportunities and October 2009. (Status: Proposed Standard)
Consequences. I. Johansson, M. Westerlund.
April 2009. (Updates RFC 3550, RFC 3711, RFC 5630 The Use of the SIPS URI Scheme in the Session
4585) (Status: Proposed Standard) Initiation Protocol (SIP). F. Audet. October
2009. (Updates RFC 3261, RFC 3608) (Status:
5547 A Session Description Protocol (SDP) Offer/ Proposed Standard)
Answer Mechanism to Enable File Transfer.
M. Garcia-Martin, M. Isomaki, G. Camarillo, 5631 Session Initiation Protocol (SIP) Session
S. Loreto, P. Kyzivat. May 2009. Mobility. R. Shacham, H. Schulzrinne,
(Status: Proposed Standard) S. Thakolsri, W. Kellerer. October 2009.
(Status: Informational)
5552 SIP Interface to VoiceXML Media Services.
D. Burke, M. Scott. May 2009. 5638 Simple SIP Usage Scenario for Applications in
(Status: Proposed Standard) the Endpoints. H. Sinnreich, Ed., A. Johnston,
E. Shim, K. Singh. September 2009.
5553 Resource Reservation Protocol (RSVP) (Status: Informational)
Extensions for Path Key Support. A. Farrel, Ed.,
R. Bradford, JP. Vasseur. May 2009. (Status: 5658 Addressing Record-Route Issues in the Session
Proposed Standard) Initiation Protocol (SIP). T. Froment, C. Lebel,
B. Bonnaerens. October 2009.
5583 Signaling Media Decoding Dependency in the (Status: Proposed Standard)
Session Description Protocol (SDP). T. Schierl,
S. Wenger. July 2009. (Status: Proposed 5688 A Session Initiation Protocol (SIP) Media
Standard) Feature Tag for MIME Application Subtypes.
J. Rosenberg. January 2010. (Status: Proposed
5589 RFC 5589 Session Initiation Protocol (SIP) Call Standard)
Control—Transfer. R. Sparks, A. Johnston, Ed.,
D. Petrie. June 2009. (Status: Best Current 5707 Media Server Markup Language (MSML).
Practice) A. Saleem, Y. Xin, G. Sharratt. February 2010.
(Status: Informational)
5606 Implications of “retransmission-allowed” for
SIP Location Conveyance. J. Peterson, 5727 Change Process for the Session Initiation
T. Hardie, J. Morris. August 2009. Protocol (SIP) and the Real-time Applications
(Status: Informational) and Infrastructure Area. J. Peterson,
C. Jennings, R. Sparks. March 2010. (Obsoletes
RFC 3427) (Updates RFC 3265, RFC 3969)
(Status: Best Current Practice)
5890 Internationalized Domain Names for Applications (IDNA): Definitions and Document Framework. J. Klensin. August 2010. (Obsoletes RFC 3490) (Status: Proposed Standard)
6050 A Session Initiation Protocol (SIP) Extension for the Identification of Services. K. Drage. November 2010. (Status: Informational)
6051 Rapid Synchronisation of RTP Flows. C. Perkins, T. Schierl. November 2010. (Updates RFC 3550) (Status: Proposed Standard)
6061 Uniform Resource Name (URN) Namespace for the National Emergency Number Association (NENA). B. Rosen. January 2011. (Status: Informational)
6066 Transport Layer Security (TLS) Extensions: Extension Definitions. D. Eastlake 3rd. January 2011. (Obsoletes RFC 4366) (Status: Proposed Standard)
6068 The “mailto” URI Scheme. M. Duerst, L. Masinter, J. Zawinski. October 2010. (Obsoletes RFC 2368) (Status: Proposed Standard)
6072 Certificate Management Service for the Session Initiation Protocol (SIP). C. Jennings, J. Fischl, Ed. February 2011. (Status: Proposed Standard)
6076 Basic Telephony SIP End-to-End Performance Metrics. D. Malas, A. Morton. January 2011. (Status: Proposed Standard)
6086 Session Initiation Protocol (SIP) INFO Method and Package Framework. C. Holmberg, E. Burger, H. Kaplan. January 2011. (Obsoletes RFC 2976) (Status: Proposed Standard)
6116 The E.164 to Uniform Resource Identifiers (URI) Dynamic Delegation Discovery System (DDDS) Application (ENUM). S. Bradner, L. Conroy, K. Fujiwara. March 2011. (Obsoletes RFC 3761) (Status: Proposed Standard)
6117 IANA Registration of Enumservices: Guide, Template, and IANA Considerations. B. Hoeneisen, A. Mayrhofer, J. Livingood. March 2011. (Obsoletes RFC 3761) (Status: Proposed Standard)
6140 Registration for Multiple Phone Numbers in the Session Initiation Protocol (SIP). A.B. Roach. March 2011. (Updates RFC 3680) (Status: Proposed Standard)
6141 Re-INVITE and Target-Refresh Request Handling in the Session Initiation Protocol (SIP). G. Camarillo, Ed., C. Holmberg, Y. Gao. March 2011. (Updates RFC 3261) (Status: Proposed Standard)
6157 IPv6 Transition in the Session Initiation Protocol (SIP). G. Camarillo, K. El Malki, V. Gurbani. April 2011. (Updates RFC 3264) (Status: Proposed Standard)
6188 The Use of AES-192 and AES-256 in Secure RTP. D. McGrew. March 2011. (Status: Proposed Standard)
6189 ZRTP: Media Path Key Agreement for Unicast Secure RTP. P. Zimmermann, A. Johnston, Ed., J. Callas. April 2011. (Status: Informational)
6228 Session Initiation Protocol (SIP) Response Code for Indication of Terminated Dialog. C. Holmberg. May 2011. (Status: Proposed Standard)
6280 An Architecture for Location and Location Privacy in Internet Applications. R. Barnes, M. Lepinski, A. Cooper, J. Morris, H. Tschofenig, H. Schulzrinne. July 2011. (Updates RFC 3693, RFC 3694) (Also BCP 0160) (Status: Best Current Practice)
6337 Session Initiation Protocol (SIP) Usage of the Offer/Answer Model. S. Okumura, T. Sawada, P. Kyzivat. August 2011. (Status: Informational)
6350 vCard Format Specification. S. Perreault. August 2011. (Obsoletes RFC 2425, RFC 2426, RFC 4770) (Updates RFC 2739) (Updated by RFC 6868) (Status: Proposed Standard)
6357 Design Considerations for Session Initiation Protocol (SIP) Overload Control. V. Hilt, E. Noel, C. Shen, A. Abdelal. August 2011. (Status: Informational)
6416 RTP Payload Format for MPEG-4 Audio/Visual Streams. M. Schmidt, F. de Bont, S. Doehla, J. Kim. October 2011. (Obsoletes RFC 3016) (Status: Proposed Standard)
6442 Location Conveyance for the Session Initiation Protocol. J. Polk, B. Rosen, J. Peterson. December 2011. (Status: Proposed Standard)
6443 Framework for Emergency Calling Using Internet Multimedia. B. Rosen, H. Schulzrinne, J. Polk, A. Newton. December 2011. (Status: Informational)
6455 The WebSocket Protocol. I. Fette, A. Melnikov. December 2011. (Status: Proposed Standard)
6501 Conference Information Data Model for Centralized Conferencing (XCON). O. Novo, G. Camarillo, D. Morgan, J. Urpalainen. March 2012. (Status: Proposed Standard)
6503 Centralized Conferencing Manipulation Protocol. M. Barnes, C. Boulton, S. Romano, H. Schulzrinne. March 2012. (Status: Proposed Standard)
6505 A Mixer Control Package for the Media Control Channel Framework. S. McGlashan, T. Melanchuk, C. Boulton. March 2012. (Status: Proposed Standard)
6517 Mandatory Features in a Layer 3 Multicast BGP/MPLS VPN Solution. T. Morin, Ed., B. Niven-Jenkins, Ed., Y. Kamite, R. Zhang, N. Leymann, N. Bitar. February 2012. (Status: Informational)
6567 Problem Statement and Requirements for Transporting User-to-User Call Control Information in SIP. A. Johnston, L. Liess. April 2012. (Status: Informational)
6665 SIP-Specific Event Notification. A.B. Roach. July 2012. (Obsoletes RFC 3265) (Updates RFC 3261, RFC 4660) (Status: Proposed Standard)
6698 The DNS-Based Authentication of Named Entities (DANE) Transport Layer Security (TLS) Protocol: TLSA. P. Hoffman, J. Schlyter. August 2012. (Updated by RFC 7218) (Status: Proposed Standard)
6714 Connection Establishment for Media Anchoring (CEMA) for the Message Session Relay Protocol (MSRP). C. Holmberg, S. Blau, E. Burger. August 2012. (Status: Proposed Standard)
6787 Media Resource Control Protocol Version 2 (MRCPv2). D. Burnett, S. Shanmugham. November 2012. (Status: Proposed Standard)
6881 Best Current Practice for Communications Services in Support of Emergency Calling. B. Rosen, J. Polk. March 2013. (Also BCP 0181) (Status: Best Current Practice)
6910 Completion of Calls for the Session Initiation Protocol (SIP). D. Worley, M. Huelsemann, R. Jesske, D. Alexeitsev. April 2013. (Status: Proposed Standard)
6989 Additional Diffie–Hellman Tests for the Internet Key Exchange Protocol Version 2 (IKEv2). Y. Sheffer, S. Fluhrer. July 2013. (Updates RFC 5996) (Status: Proposed Standard)
7022 Guidelines for Choosing RTP Control Protocol (RTCP) Canonical Names (CNAMEs). A. Begen, C. Perkins, D. Wing, E. Rescorla. September 2013. (Obsoletes RFC 6222) (Updates RFC 3550) (Status: Proposed Standard)
7088 Session Initiation Protocol Service Example -- Music on Hold. D. Worley. February 2014. (Status: Informational)
7158 The JavaScript Object Notation (JSON) Data Interchange Format. T. Bray, Ed. March 2014. (Obsoleted by RFC 7159) (Status: Proposed Standard)
7203 An Incident Object Description Exchange Format (IODEF) Extension for Structured Cybersecurity Information. T. Takahashi, K. Landfield, Y. Kadobayashi. April 2014. (Status: Proposed Standard)
7230 Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing. R. Fielding, Ed., J. Reschke, Ed. June 2014. (Obsoletes RFC 2145, RFC 2616) (Updates RFC 2817, RFC 2818) (Status: Proposed Standard)
7231 Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content. R. Fielding, Ed., J. Reschke, Ed. June 2014. (Obsoletes RFC 2616) (Updates RFC 2817) (Status: Proposed Standard)
7232 Hypertext Transfer Protocol (HTTP/1.1): Conditional Requests. R. Fielding, Ed., J. Reschke, Ed. June 2014. (Obsoletes RFC 2616) (Status: Proposed Standard)
7233 Hypertext Transfer Protocol (HTTP/1.1): Range Requests. R. Fielding, Ed., Y. Lafon, Ed., J. Reschke, Ed. June 2014. (Obsoletes RFC 2616) (Status: Proposed Standard)
7234 Hypertext Transfer Protocol (HTTP/1.1): Caching. R. Fielding, Ed., M. Nottingham, Ed., J. Reschke, Ed. June 2014. (Obsoletes RFC 2616) (Status: Proposed Standard)
7235 Hypertext Transfer Protocol (HTTP/1.1): Authentication. R. Fielding, Ed., J. Reschke, Ed. June 2014. (Obsoletes RFC 2616) (Updates RFC 2617) (Status: Proposed Standard)
7250 Using Raw Public Keys in Transport Layer Security (TLS) and Datagram Transport Layer Security (DTLS). P. Wouters, Ed., H. Tschofenig, Ed., J. Gilmore, S. Weiler, T. Kivinen. June 2014. (Status: Proposed Standard)
7315 Private Header (P-Header) Extensions to the Session Initiation Protocol (SIP) for the 3GPP. R. Jesske, K. Drage, C. Holmberg. July 2014. (Obsoletes RFC 3455) (Status: Informational)
7339 Session Initiation Protocol (SIP) Overload Control. V. Gurbani, Ed., V. Hilt, H. Schulzrinne. September 2014. (Status: Proposed Standard)
7433 A Mechanism for Transporting User-to-User Call Control Information in SIP. A. Johnston, J. Rafferty. January 2015. (Status: Proposed Standard)
7434 Interworking ISDN Call Control User Information with SIP. K. Drage, Ed., A. Johnston. January 2015. (Status: Proposed Standard)
7462 URNs for the Alert-Info Header Field of the Session Initiation Protocol (SIP). L. Liess, Ed., R. Jesske, A. Johnston, D. Worley, P. Kyzivat. March 2015. (Updates RFC 3261) (Status: Proposed Standard)
7463 Shared Appearances of a Session Initiation Protocol (SIP) Address of Record (AOR). A. Johnston, Ed., M. Soroushnejad, Ed., V. Venkataramanan. March 2015. (Updates RFC 3261, RFC 4235) (Status: Proposed Standard)