Session Initiation Protocol Tutorial
Audience
This tutorial has been prepared for professionals aspiring to learn the basics of SIP and
make a career in telecom testing.
Prerequisites
Before proceeding with this tutorial, you should have a good grasp of basic
networking concepts, including TCP, UDP, HTTP, SMTP, and VoIP.
Table of Contents
About this Tutorial
Audience
Prerequisites
Copyright & Disclaimer
Table of Contents
1. Introduction
VoIP Technology
SIP Overview
Where Does SIP Fit In?
7. SDP
Purpose of SDP
Session Description Parameters
An SDP Example
14. B2BUA
B2BUA: How It Works
Functions of B2BUA
Example of B2BUA
1. Introduction
Session Initiation Protocol (SIP) is one of the most common protocols used in VoIP
technology. It is an application layer protocol that works in conjunction with other
application layer protocols to control multimedia communication sessions over the
Internet.
VoIP Technology
Before moving further, let us first understand a few points about VoIP.
VoIP is a technology that allows you to deliver voice and multimedia (video,
pictures) content over the Internet. It is one of the cheapest ways to communicate
anytime, anywhere an Internet connection is available.
Advantages of VoIP include:
Low cost
Portability
No extra cables
Flexibility
Video conferencing
For a VoIP call, all that you need is a computer, laptop, or mobile device with Internet
connectivity. The following figure depicts how a VoIP call takes place.
SIP Overview
Given below are a few points to note about SIP:
SIP uses a client-server architecture, borrows the use of URLs and URIs
from HTTP, and takes its text encoding scheme and header style from SMTP.
SIP works together with SDP (Session Description Protocol), which describes a session,
and RTP (Real-time Transport Protocol), which delivers voice and video over an IP
network.
Other SIP applications include file transfer, instant messaging, video conferencing,
online games, and streaming multimedia distribution.
Typically, the SIP protocol is used for internet telephony and multimedia distribution
between two or more endpoints. For example, one person can initiate a telephone call to
another person using SIP, or someone may create a conference call with many
participants.
The SIP protocol was designed to be very simple, with a limited set of commands. It is
also text-based, so anyone can read a SIP message passed between the endpoints in a
SIP session.
2. Network Elements
There are some entities that help SIP in creating its network. In SIP, every network
element is identified by a SIP URI (Uniform Resource Identifier) which is like an address.
Following are the network elements:
User Agent
Proxy Server
Registrar Server
Redirect Server
Location Server
User Agent
The user agent is the endpoint and one of the most important network elements of a SIP
network. An endpoint can initiate, modify, or terminate a session. User agents are the most
intelligent elements of a SIP network. A user agent could be a softphone, a mobile, or a laptop.
User agents are logically divided into two parts:
User Agent Client (UAC): The entity that sends a request and receives a
response.
User Agent Server (UAS): The entity that receives a request and sends a
response.
SIP is based on a client-server architecture where the caller's phone acts as a client that
initiates a call and the callee's phone acts as a server that responds to the call.
Proxy Server
It is the network element that takes a request from a user agent and forwards it to another
user.
It has some intelligence to understand a SIP request and send it ahead with the
help of URI.
Stateless Proxy Server: It simply forwards the message received. This type of
server does not store any information of a call or a transaction.
Registrar Server
The registrar server accepts registration requests from user agents. It helps users to
authenticate themselves within the network. It stores the URI and the location of users in
a database to help other SIP servers within the same domain.
Take a look at the following example that shows the process of a SIP Registration.
Here the caller wants to register with the TMC domain. It sends a REGISTER request
to TMC's registrar server, and the server returns a 200 OK response once it has
authorized the client.
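The registration request described above can be sketched as a plain-text message. This is a minimal illustration, not a complete client: the function name `build_register`, the host names, the branch and tag values, and the `tmc.example` domain are all hypothetical placeholders, and a real user agent would generate unique branch, tag, and Call-ID values and handle authentication challenges.

```python
def build_register(aor, contact, registrar, call_id, cseq=1, expires=3600):
    """Return a minimal REGISTER request as text (illustrative sketch only)."""
    lines = [
        f"REGISTER sip:{registrar} SIP/2.0",
        # Hypothetical client address and branch value.
        "Via: SIP/2.0/UDP client.example.com:5060;branch=z9hG4bKreg1",
        "Max-Forwards: 70",
        # In a first-party registration, From and To both carry the AOR.
        f"From: <{aor}>;tag=reg-tag-1",
        f"To: <{aor}>",
        f"Call-ID: {call_id}",
        f"CSeq: {cseq} REGISTER",
        f"Contact: <{contact}>",
        f"Expires: {expires}",
        "Content-Length: 0",
    ]
    # SIP lines end with CRLF; an empty line closes the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


message = build_register(
    aor="sip:alice@tmc.example",
    contact="sip:alice@client.example.com",
    registrar="tmc.example",
    call_id="reg-0001@client.example.com",
)
print(message)
```

The registrar would store the mapping from the AOR in the To header to the address in the Contact header and answer with 200 OK.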
Redirect Server
The redirect server receives requests and looks up the intended recipient of the request in
the location database created by the registrar.
The redirect server uses the database for getting location information and responds with
3xx (Redirect response) to the user. We will discuss response codes later in this tutorial.
Location Server
The location server provides information about a callee's possible locations to the redirect
and proxy servers.
Only a proxy server or a redirect server can contact a location server.
The following figure depicts the roles played by each of the network elements in
establishing a session.
The lowest layer of SIP is its syntax and encoding. Its encoding is specified using
an augmented Backus-Naur Form grammar (BNF).
At the second level is the transport layer. It defines how a Client sends requests
and receives responses and how a Server receives requests and sends responses
over the network. All SIP elements contain a transport layer.
The layer above the transaction layer is called the transaction user. Each of the
SIP entities, except the stateless proxy, is a transaction user.
The following image shows the basic call flow of a SIP session.
9. The BYE travels directly from Alice to Bob, bypassing the proxy server.
10. Finally, Bob sends a 200 OK response to confirm the BYE and the session is
terminated.
11. In the above basic call flow, three transactions (marked as 1, 2, 3) take place.
The complete call (from INVITE to 200 OK) is known as a dialog.
SIP Trapezoid
How does a proxy help to connect one user with another? Let us find out with the help of
the following diagram.
The topology shown in the diagram is known as a SIP trapezoid. The process takes place
as follows:
1. When a caller initiates a call, an INVITE message is sent to the proxy server. Upon
receiving the INVITE, the proxy server attempts to resolve the address of the callee
with the help of the DNS server.
2. After getting the next route, the caller's proxy server (Proxy 1, also known as the
outbound proxy server) forwards the INVITE request to the callee's proxy server, which
acts as an inbound proxy server (Proxy 2) for the callee.
3. The inbound proxy server contacts the location server to get information about the
callee's address where the user registered.
4. After getting information from the location server, it forwards the call to its
destination.
5. Once the user agents know each other's addresses, they can bypass the proxies, i.e.,
the conversation passes directly between them.
4. SIP Messaging
The opening line of a request contains a method that defines the request, and a
Request-URI that defines where the request is to be sent.
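As a small sketch of that structure, the opening line of a request can be split into its method, Request-URI, and protocol version. The function name `parse_request_line` and the example URI are hypothetical; the sketch assumes single spaces between the three parts and does not validate the URI itself.

```python
def parse_request_line(line):
    """Split a SIP request line into (method, request_uri)."""
    parts = line.strip().split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {line!r}")
    method, request_uri, version = parts
    if version != "SIP/2.0":
        raise ValueError(f"unsupported SIP version: {version}")
    return method, request_uri


method, uri = parse_request_line("INVITE sip:bob@example.com SIP/2.0")
print(method, uri)
```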
Request Methods
SIP requests are the messages used to establish communication. To complement them,
there are SIP responses that generally indicate whether a request succeeded or failed.
SIP requests are identified by their methods. A method names the specific action that the
sender asks another user agent or server to take.
There are two kinds of methods:
Core Methods
Extension Methods
Core Methods
There are six core methods as discussed below.
INVITE
INVITE is used to initiate a session with a user agent. In other words, an INVITE
method is used to establish a media session between the user agents.
INVITE can contain the media information of the caller in the message body.
A successful INVITE request establishes a dialog between the two user agents
which continues until a BYE is sent to terminate the session.
INVITE Example
The following code shows how INVITE is used.
INVITE sips:[email protected] SIP/2.0
Via: SIP/2.0/TLS client.ANC.com:5061;branch=z9hG4bK74bf9
Max-Forwards: 70
From: Alice<sips:[email protected]>;tag=1234567
To: Bob<sips:[email protected]>
Call-ID: [email protected]
CSeq: 1 INVITE
Contact: <sips:[email protected]>
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, NOTIFY
Supported: replaces
Content-Type: application/sdp
Content-Length: ...

v=0
o=Alice 2890844526 2890844526 IN IP4 client.ANC.com
s=Session SDP
c=IN IP4 client.ANC.com
t=3034423619 0
m=audio 49170 RTP/AVP 0
a=rtpmap:0 PCMU/8000
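Because SIP is text-based, a message like the one above can be taken apart with ordinary string handling. The following is a rough sketch under simplifying assumptions: the helper names `split_message` and `sdp_media` and the sample addresses are hypothetical, header lines are assumed not to be folded across lines, and repeated headers simply overwrite each other in the dictionary.

```python
def split_message(raw):
    """Split a SIP message into (start_line, headers, body)."""
    # An empty line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return start_line, headers, body


def sdp_media(body):
    """Return (media, port, transport, formats) for each SDP m= line."""
    result = []
    for line in body.splitlines():
        if line.startswith("m="):
            media, port, transport, *formats = line[2:].split()
            result.append((media, int(port), transport, formats))
    return result


raw = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "CSeq: 1 INVITE\r\n"
    "Content-Type: application/sdp\r\n"
    "\r\n"
    "v=0\r\n"
    "m=audio 49170 RTP/AVP 0\r\n"
    "a=rtpmap:0 PCMU/8000\r\n"
)
start_line, headers, body = split_message(raw)
print(start_line)
print(headers["Content-Type"])
print(sdp_media(body))
```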
BYE
BYE is the method used to terminate an established session. This is a SIP request that can
be sent by either the caller or the callee to end a session.
BYE request normally routes end to end, bypassing the proxy server.
REGISTER
A REGISTER request performs the registration of a user agent. This request is sent by a
user agent to a registrar server.
It carries the AOR (Address of Record) of the user being registered in the To
header.
One user agent can send a REGISTER request on behalf of another user agent. This
is known as third-party registration. Here, the From header contains the URI of the
party submitting the registration on behalf of the party identified in the To header.
CANCEL
CANCEL is used to terminate a session which is not established. User agents use this
request to cancel a pending call attempt initiated earlier.
CANCEL is a hop-by-hop request, i.e., it travels through the elements between the
user agents, and the response to it is generated by the next stateful
element.
ACK
ACK is used to acknowledge the final responses to an INVITE method. An ACK always goes
in the direction of the INVITE. An ACK may contain an SDP body (media characteristics) if
it was not available in the INVITE.
ACK may not be used to modify the media description that has already been sent
in the initial INVITE.
A stateful proxy receiving an ACK must determine whether or not the ACK should
be forwarded downstream to another proxy or user agent.
For 2xx responses, ACK is end to end, but for all other final responses, it works
on a hop-by-hop basis when stateful proxies are involved.
OPTIONS
OPTIONS method is used to query a user agent or a proxy server about its capabilities and
discover its current availability. The response to a request lists the capabilities of the user
agent or server. A proxy never generates an OPTIONS request.
Extension Methods
SUBSCRIBE
SUBSCRIBE is used by user agents to establish a subscription for the purpose of getting
notification about a particular event.
After the time period passes, the subscription will automatically terminate.
You can renew the subscription by sending another SUBSCRIBE within the dialog
before the expiration time.
Users can unsubscribe by sending another SUBSCRIBE method with an Expires value of
0 (zero).
NOTIFY
NOTIFY is used by user agents to report the occurrence of a particular event. Usually a
NOTIFY is sent within a dialog when a subscription exists between the subscriber and the
notifier.
NOTIFY contains an Event header field indicating the event and a Subscription-State
header field indicating the current state of the subscription.
PUBLISH
PUBLISH is used by a user agent to send event state information to a server.
PUBLISH is mostly useful when there are multiple sources of event information.
REFER
REFER is used by a user agent to refer another user agent to access a URI for the dialog.
REFER must contain a Refer-To header. This is a mandatory header for REFER.
A REFER request is answered with a 202 Accepted response, which indicates that the
other user agent has accepted the reference.
INFO
INFO is used by a user agent to send call signalling information to another user agent with
which it has established a media session.
UPDATE
UPDATE is used to modify the state of a session before the session is established. For
example, a user agent could change the codec with UPDATE.
PRACK
PRACK is used to acknowledge the receipt of a reliably transferred provisional response
(1xx).
The PRACK method applies to all provisional responses except the 100 Trying
response, which is never reliably transported.
MESSAGE
It is used to send an instant message using SIP. An IM usually consists of short messages
exchanged in real time by participants engaged in text conversation.
A 200 OK response is normally received to indicate that the message has been
delivered at its destination.
5. Response Codes
A SIP response is a message generated by a user agent server (UAS) or SIP server to
reply to a request generated by a client. It could be a formal acknowledgement to prevent
retransmission of requests by a UAC.
A response may contain additional header fields with information needed by a UAC.
The 1xx to 5xx classes have been borrowed from HTTP, and 6xx was introduced in SIP.
1xx responses are provisional, and the rest are final responses.
1. 1xx: Provisional/Informational Responses
2. 2xx: Success Responses
3. 3xx: Redirect Responses
4. 4xx: Client Failure Responses
5. 5xx: Server Failure Responses
6. 6xx: Global Failure Responses
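The six classes listed above follow directly from the first digit of the status code, which the short sketch below makes explicit. The names `RESPONSE_CLASSES`, `response_class`, and `is_final` are hypothetical helpers chosen for this illustration.

```python
RESPONSE_CLASSES = {
    1: "Provisional/Informational",
    2: "Success",
    3: "Redirect",
    4: "Client Failure",
    5: "Server Failure",
    6: "Global Failure",
}


def response_class(code):
    """Map a SIP status code (100-699) to its response class."""
    if not 100 <= code <= 699:
        raise ValueError(f"not a SIP status code: {code}")
    return RESPONSE_CLASSES[code // 100]


def is_final(code):
    """1xx responses are provisional; every other class is final."""
    return response_class(code) != RESPONSE_CLASSES[1]


for code in (100, 180, 200, 302, 404, 503, 603):
    print(code, response_class(code), "final" if is_final(code) else "provisional")
```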
Informational (1xx)
Informational responses are used to indicate call progress. Normally the responses are
end to end (except 100 Trying). The main objective of informational responses is to stop
retransmission of INVITE requests.
Informational responses include the following responses:
100 Trying
180 Ringing
This response is used to indicate that an INVITE has been received by the user
agent and alerting is taking place.
181 Call Is Being Forwarded
This response is used to indicate that the call has been forwarded to another
endpoint.
It gives status to the caller, as a forwarding operation may result in the call
taking longer to be answered.
182 Queued
This response is used to indicate that the INVITE has been received and will be
processed in a queue.
Success (2xx)
This class of responses is meant for indicating that a request has been accepted. It includes
the following responses:
200 OK
202 Accepted
202 Accepted indicates that the UAS has received and understood the request,
but that the request may not have been authorized or processed by the server.
Redirection (3xx)
Generally, responses in this class are sent by redirect servers in response to an INVITE.
They are also known as redirect class responses. The class includes the following responses:
300 Multiple Choices
It contains multiple Contact header fields to indicate that the location service has
returned multiple possible locations for the SIP URI in the Request-URI.
301 Moved Permanently
This redirection response contains a Contact header field with the new URI of the
called party.
302 Moved Temporarily
This redirection response contains a URI that is currently valid but is not
permanent.
That is, the location is valid only for the specified duration of time.
305 Use Proxy
This response points to a proxy server that has authoritative
information about the calling party.
This response could be sent by a UAS that uses a proxy for incoming call screening.
380 Alternative Service
This response returns a URI that indicates the type of service the called party
would like.
Client Failure (4xx)
400 Bad Request
This indicates that the server could not understand the request.
The request might be missing required header fields such as To, From, Call-ID, or
CSeq.
401 Unauthorized
The response contains a WWW-Authenticate header field, which requests correct
credentials from the calling user agent.
The user agent then sends a subsequent REGISTER with the correct credentials.
403 Forbidden
403 Forbidden is sent when the server has understood the request, found the
request to be correctly formulated, but will not service the request.
404 Not Found
It indicates that the server could not find the SIP URI indicated in the request.
405 Method Not Allowed
It indicates that the request contains a method that is not allowed.
It contains an Allow header field, which informs the UAC which methods are
acceptable.
406 Not Acceptable
The Accept header field in the request did not contain any options supported by
the UAS.
407 Proxy Authentication Required
This response sent by a proxy indicates that the UAC first has to authenticate itself
with the proxy before the request can be processed.
408 Request Timeout
This response is sent when the time period specified in the Expires header field of
the INVITE request has passed.
422 Session Timer Interval Too Small
The minimum allowed interval is indicated in the required Min-SE header field.
The calling party may retry the request without the Session-Expires header field
or with a value greater than or equal to the specified minimum.
423 Interval Too Brief
The response must contain a Min-Expires header field listing the minimum
expiration interval that the registrar will accept.
480 Temporarily Unavailable
This response indicates that the request has reached the correct destination, but
the called party is not available for some reason.
The response should contain a Retry-After header indicating when the request
may be able to be fulfilled.
483 Too Many Hops
This response indicates that the request has been forwarded the maximum
number of times allowed by the Max-Forwards header, which is typically set to 70 in
the request.
486 Busy Here
This indicates the user agent is busy and cannot accept the call.
487 Request Terminated
This response can be sent by a UA that has received a CANCEL request for a
pending INVITE request.
A 200 OK is sent to acknowledge the CANCEL, and a 487 is sent to cancel the
INVITE transaction.
Server Failure (5xx)
500 Server Internal Error
500 indicates that the server has experienced some kind of error that is
preventing it from processing the request.
It is a server failure that tells the client it may retry the request
at this server after several seconds.
501 Not Implemented
It indicates that the server is unable to process the request because it is not
supported.
502 Bad Gateway
It indicates that some problem in the other network is preventing the request from
being processed.
503 Service Unavailable
The request can be retried after a few seconds, or after the expiration of the
Retry-After header field.
504 Gateway Timeout
This response is sent when the request failed due to a timeout that occurred in the
other network to which the gateway connects.
It is a server error class response because the call is failing due to a failure of the
server in accessing resources outside the SIP network.
505 Version Not Supported
The server rejects a request that carries a SIP version number it does not support.
The rejection is indicated in this message.
513 Message Too Large
This response is used by a UAS to indicate that the request size was too large for
it to process.
Global Failure (6xx)
600 Busy Everywhere
This response indicates that the called party is busy and that the call to the specified
Request-URI cannot be answered at any other location.
603 Decline
This response could indicate the called party is busy, or simply does not want to
accept the call.
604 Does Not Exist Anywhere
This response is similar to the 404 Not Found response but indicates that the
user in the Request-URI cannot be found anywhere.
This response should only be sent by a server having access to all the information
about the user.
606 Not Acceptable
This response indicates that some aspect of the desired session is not acceptable
to the UAS, and as a result, the session cannot be established.
The response may contain a Warning header field with a numerical code
describing exactly what was not acceptable.
6. SIP Headers
A header is a component of a SIP message that conveys information about the message.
It is structured as a sequence of header fields.
SIP header fields in most cases follow the same rules as HTTP header fields. Header fields
are defined as Header: field, where Header is used to represent the header field name,
and field is the set of tokens that contains the information. Each field consists of a
field-name followed by a colon (":") and the field-value (i.e., field-name: field-value).
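Compact form
Some commonly used header fields also have a single-letter compact form:
To (t)
Via (v)
Call-ID (i)
Contact (m)
From (f)
Subject (s)
Content-Length (l)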
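A parser typically normalizes compact names to their full form before looking at a header. The sketch below covers only the seven headers listed in this section, using the single-letter forms defined in RFC 3261; the names `COMPACT_FORMS` and `expand_header_name` are hypothetical.

```python
# Compact (single-letter) forms for the headers listed in this section.
COMPACT_FORMS = {
    "t": "To",
    "v": "Via",
    "i": "Call-ID",
    "m": "Contact",
    "f": "From",
    "s": "Subject",
    "l": "Content-Length",
}


def expand_header_name(name):
    """Return the full header name for a compact form, else the name unchanged."""
    name = name.strip()
    # Header names are case-insensitive, so "V" and "v" both mean Via.
    return COMPACT_FORMS.get(name.lower(), name)


for raw_name in ("v", "F", "i", "Max-Forwards"):
    print(raw_name, "->", expand_header_name(raw_name))
```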
Accept
The Accept header field describes acceptable media types using the type/sub-type format
commonly used on the Internet.
A list of media types can have preferences set using q value parameters.
Accept-Encoding
The Accept-Encoding header field is used to specify acceptable message body encoding
schemes.
Encoding can be used to ensure a SIP message with a large message body fits
inside a single UDP datagram.
The use of q value parameters can set preferences. If the UAS cannot use any of the
listed schemes, a 406 Not Acceptable response is returned. If the header field is not
included, the assumed encoding is identity (that is, no encoding).
To
To indicates the final recipient of the request. Any response generated by a UA will
contain this header field with the addition of a tag. It is a mandatory header.
Any response generated by a proxy must have a tag added to the To header
field.
From
From header field indicates the originator of the request. It is one of two addresses used
to identify a dialog.
A From header field may contain a tag used to identify a particular call.
It may contain a display name, in which case the URI is enclosed in <>.
It is a mandatory header.
Call-ID
The Call-ID header field is mandatory in all SIP requests and responses. It is used to
uniquely identify a call between two user agents.
All registrations for a user agent should use the same Call-ID.
Via
Via is used to record the SIP route taken by a request which helps to route a response
back to the originator.
A proxy forwarding the request adds a Via header field containing its own address
to the top of the list of Via header fields.
A proxy or UA generating a response to a request copies all the Via header fields
from the request in order into the response, then sends the response to the
address specified in the top Via header field.
A proxy receiving a response checks that the top Via header field matches its own
address.
The top Via header field is then removed, and the response is forwarded to the
address specified in the next Via header field.
Via header fields contain protocol name, version number, and transport
(SIP/2.0/UDP, SIP/2.0/TCP, etc.) and may contain port numbers and parameters
such as received, rport, branch, maddr, and ttl.
A branch parameter is added to Via header fields by UAs and proxies, which is
computed as a hash function of the Request-URI, and the To, From, Call-ID, and
CSeq number.
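The Via handling described above behaves like a stack: each proxy pushes its address on the way out and pops it on the way back. The sketch below models only that bookkeeping, representing each Via entry as a bare address string; the function names and host names are hypothetical, and real Via values also carry the protocol, transport, and parameters such as branch.

```python
def forward_request(via_stack, proxy_address):
    """A proxy adds its own address to the top of the Via list."""
    return [proxy_address] + via_stack


def build_response_vias(request_vias):
    """A response copies all Via entries from the request, in order."""
    return list(request_vias)


def forward_response(via_stack, proxy_address):
    """Check the top Via is ours, remove it, and return (next_hop, remaining)."""
    if not via_stack or via_stack[0] != proxy_address:
        raise ValueError("top Via does not match this proxy")
    remaining = via_stack[1:]
    next_hop = remaining[0] if remaining else None
    return next_hop, remaining


# Request path: UAC -> proxy1 -> proxy2 -> UAS
vias = ["uac.example.com"]
vias = forward_request(vias, "proxy1.example.com")
vias = forward_request(vias, "proxy2.example.com")

# Response path: UAS -> proxy2 -> proxy1 -> UAC
response_vias = build_response_vias(vias)
hop1, response_vias = forward_response(response_vias, "proxy2.example.com")
hop2, response_vias = forward_response(response_vias, "proxy1.example.com")
print(hop1, hop2, response_vias)
```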
CSeq
The CSeq header field is a required header field in every request. It contains a decimal
number that increases for each request.
Usually, it increases by 1 for each new request, with the exception of CANCEL
and ACK requests, which use the CSeq number of the INVITE request to which they
refer.
The CSeq header field is used by UACs to match a response to the request it
answers.
For example, a UAC that sends an INVITE request then a CANCEL request can tell
by the method in the CSeq of a 200 OK response if it is a response to the
invitation or cancellation request.
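The INVITE and CANCEL case can be sketched as a lookup keyed on both parts of the CSeq value. This is only an illustration of the matching idea: the helper names and the `pending` dictionary are hypothetical, and a real UAC also matches on the Via branch and Call-ID.

```python
def parse_cseq(value):
    """Split a CSeq value such as '1 INVITE' into (number, method)."""
    number, method = value.split()
    return int(number), method


def match_response(pending, response_cseq):
    """Find the pending request a response belongs to, or None."""
    return pending.get(parse_cseq(response_cseq))


# CANCEL reuses the CSeq number of the INVITE it refers to, so only the
# method part tells the two transactions apart.
pending = {
    (1, "INVITE"): "original invitation",
    (1, "CANCEL"): "cancellation of the invitation",
}
print(match_response(pending, "1 CANCEL"))
print(match_response(pending, "1 INVITE"))
```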
Contact
The Contact header field is used to tell the other user the address of the request
originator. Once a Contact header field has been received, the URI can be cached and used
for routing future requests within a dialog.
For example, a Contact header field in a 200 OK response to an INVITE can allow the
acknowledgment ACK message and all future requests during this call to bypass proxies
and go directly to the called party.
Record-Route
The Record-Route header field is used to force routing through a proxy for all subsequent
requests in a session (dialog) between two UAs.
Normally, the presence of a Contact header field allows UAs to send messages directly
bypassing the proxy chain used in the initial request.
A proxy inserting its address into a Record-Route header field overrides this and
forces future requests to include a Route header field containing the address of
the proxy, so that the proxy stays in the path.
A proxy wishing to implement this inserts the header field containing its own URI,
or adds its URI to an already present Record-Route header field.
The URI is constructed so that the URI resolves back to the proxy server. The
UAS copies the Record-Route header field into the 200 OK response to the
request.
The header field is forwarded unchanged by proxies back to the UAC. The UAC
then stores the Record-Route proxy list plus a Contact header field if present in
the 200 OK for use in a Route header field in all subsequent requests.
Organization
The Organization header field is used to indicate the organization to which the originator
of the message belongs.
Like all SIP header fields, it can be used by proxies for making routing decisions
and by UAs for making call screening decisions.
Retry-After
It is used to indicate when a resource or service may be available again.
In 404 Not Found, 600 Busy Everywhere, and 603 Decline responses, it indicates
when the called UA may be available again.
Subject
The optional Subject header field is used to indicate the subject of the media session.
The contents of the header field can also be displayed during alerting to aid the user in
deciding whether to accept the call.
Example:
Subject: How are you?
Supported
The Supported header field is used to list one or more options implemented by a UA or
server.
If a UAC lists an option in a Supported header field, proxies or UASs may use the
option during the call.
If the option must be used or supported, the Require header field is used instead.
Example:
Supported: 100rel
Expires
The Expires header field is used to indicate the time interval in which the request or
message contents are valid.
When present in an INVITE request, the header field sets a time limit on the
completion of the INVITE request.
That is, the UAC must receive a final response (non-1xx) within the time period
or the INVITE request is automatically cancelled with a 408 Request Timeout
response.
Once the session is established, the value from the Expires header field in the
original INVITE has no effect; the Session-Expires header field must be used for
this purpose.
If present in a REGISTER request, the header field sets the time limit on the URIs
in Contact header fields that do not contain an expires parameter.
User-Agent
This header field is used to convey information about the UA originating the request.
Event
This header field is used in a SUBSCRIBE or NOTIFY method to indicate which event
package is being used by the method.
In a SUBSCRIBE, it lists the event package to which the client would like to
subscribe.
In a NOTIFY, it lists the event package that the notification contains state
information about.
Join
The Join header field is used in an INVITE to request that the dialog (session) be joined
with an existing dialog (session).
The parameters of the Join header field identify a dialog by the Call-ID, To tag,
and From tag in a similar way to the Replaces header field.
If the Join header field references a point-to-point dialog between two user
agents, the Join header field is effectively a request to turn the call into a
conference call.
If the dialog is already part of a conference, the Join header field is a request to
be added into the conference.
Proxy-Authorization
The Proxy-Authorization header field is used to carry the credentials of a UA in a request
to a server.
If the credentials are correct, any remaining entries are kept in the request when
it is forwarded to the next proxy.
Proxy-Require
The Proxy-Require header field is used to list features and extensions that a UA requires
a proxy to support in order to process the request.
A 420 Bad Extension response is returned by the proxy, listing any unsupported
feature in an Unsupported header field.
Max-Forwards
The Max-Forwards header field is used to indicate the maximum number of hops that a
SIP request may take.
The value of the header field is decremented by each proxy that forwards the
request.
A proxy receiving the header field with a value of zero discards the message and
sends a 483 Too Many Hops response back to the originator.
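The decrement-and-reject rule can be sketched in a few lines. The names `TooManyHops`, `proxy_forward`, and `hops_until_rejected` are hypothetical; the sketch models only the counter, not the forwarding itself.

```python
class TooManyHops(Exception):
    """Raised when a proxy must answer with 483 Too Many Hops."""


def proxy_forward(max_forwards):
    """Return the Max-Forwards value a proxy writes into the forwarded request."""
    if max_forwards <= 0:
        # The request arrived with zero hops left, so it is not forwarded.
        raise TooManyHops("483 Too Many Hops")
    return max_forwards - 1


def hops_until_rejected(initial):
    """Count how many proxies forward a request before one rejects it."""
    value, hops = initial, 0
    while True:
        try:
            value = proxy_forward(value)
        except TooManyHops:
            return hops
        hops += 1


print(proxy_forward(70))
print(hops_until_rejected(70))
```

With the usual starting value of 70, seventy proxies can forward the request and the next one in the chain rejects it, which bounds the damage a routing loop can do.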
Priority
The Priority header field is used by a UAC to set the urgency of a request. Values are
non-urgent, normal, urgent, and emergency.
Refer-To
The Refer-To header field is a mandatory header field in a REFER request, which contains
the URI or URL resource that is being referenced. It may contain any type of URI, from a
sip or sips URI to a tel URI.
Referred-By
The Referred-By header field is an optional header field in a REFER request and a
request triggered by a REFER.
It provides the recipient of a triggered request with information that the request
was generated as a result of a REFER and the originator of the REFER.
Replaces
Replaces is used for replacing an existing call with a new call.
If the Replaces header field matches no dialog, the INVITE must be rejected with
a 481 Dialog Does Not Exist response.
Request-Disposition
The Request-Disposition header field can be used to request servers to either proxy or
redirect the request.
Example:
Request-Disposition: redirect
Require
The Require header field is used to list features and extensions that a UAC requires a UAS
to support in order to process the request.
A 420 Bad Extension response is returned by the UAS listing any unsupported features in
an Unsupported header field.
Example:
Require: 100rel
Route
The Route header field is used to provide routing information for requests.
RFC 3261 introduces two types of routing: strict routing and loose routing,
which have meanings similar to the IP routing modes of the same names.
In strict routing, a proxy must use the first URI in the Route header field to
rewrite the Request-URI, which is then forwarded.
In loose routing, a proxy does not rewrite the Request-URI, but either forwards
the request to the first URI in the Route header field or to another loose routing
element.
In loose routing, the request must route through every server in the Route list
before it may be routed based on the Request-URI.
In strict routing, the request must only route through the set of servers in the
Route header field with the Request-URI being rewritten at each hop.
A proxy or UAC can tell if the next element in the route set supports loose routing
by the presence of an lr parameter.
Example:
Route: sip:[email protected];lr
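Checking for the lr parameter amounts to inspecting the URI parameters of a Route entry. The following is a minimal sketch with a hypothetical helper name and hypothetical proxy addresses; it ignores quoting and other corner cases that a full URI parser would handle.

```python
def is_loose_router(route_value):
    """Return True if the URI in a Route entry carries the lr parameter."""
    value = route_value.strip()
    # If the URI is wrapped in angle brackets, look only inside them so that
    # header parameters after '>' are not mistaken for URI parameters.
    if "<" in value and ">" in value:
        value = value[value.index("<") + 1 : value.index(">")]
    uri_params = value.split(";")[1:]
    return any(p.split("=", 1)[0].strip().lower() == "lr" for p in uri_params)


print(is_loose_router("<sip:proxy.example.com;lr>"))
print(is_loose_router("sip:proxy.example.com;transport=tcp"))
```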
RAck
The RAck header field is used within a PRACK request to reliably
acknowledge a provisional response that contained an RSeq header field.
Its value is a combination of the RSeq number and the CSeq from the provisional response.
The reliable sequence number is incremented for each response sent reliably.
Example:
RAck: 3452337 17 INVITE
Session-Expires
The Session-Expires header field is used to specify the expiration time of the session.
SIP-If-Match
The SIP-If-Match header field is part of the SIP publication mechanism. It is included in a
PUBLISH request meant to refresh, modify, or remove previously published state.
The header field contains the entity tag of the state information that was returned
in a SIP-ETag header field in a 2xx response to an earlier PUBLISH.
If the entity-tag is no longer valid, the server will return a 412 Conditional
Request Failed response.
Example:
SIP-If-Match: 56jforRr1pd
Subscription-State
The Subscription-State header field is a required header field in a NOTIFY request. It
indicates the current state of a subscription. Values defined include active, pending, or
terminated.
Example:
Subscription-State: terminated;reason=rejected
The header field contains an integer number of seconds that represents the
minimum expiration interval that the registrar will accept.
A client receiving this header field can update the expiration intervals of the
registration request accordingly and resend the REGISTER request.
Min-SE
The Min-SE header field is a required header field in a 422 Session Timer Interval Too
Small response.
The header field may also be present in an INVITE or UPDATE request containing a Session-Expires
header field. It contains an integer number of seconds.
Proxy-Authenticate
The Proxy-Authenticate header field is used in a 407 Proxy Authentication Required
authentication challenge by a proxy server to a UAC.
It contains the nature of the challenge so that the UAC may formulate credentials in a
Proxy-Authorization header field in a subsequent request.
SIP-ETag
The SIP-ETag header field is part of the SIP publication mechanism. The SIP-ETag header
field is returned in a 2xx response to a PUBLISH request.
It contains an entity tag uniquely identifying the state information that has been
processed.
This entity tag can then be used to do conditional publications on this data
including refreshing, modifying, and removing.
Unsupported
The Unsupported header field is used to indicate features that are not supported by the
server.
The header field is used in a 420 Bad Extension response to a request containing an
unsupported feature listed in a Require header field.
Example:
Unsupported: 100rel
WWW-Authenticate
The WWW-Authenticate header field is used in a 401 Unauthorized authentication
challenge by a UA or registrar server to a UAC.
It contains the nature of the challenge so that the UAC may formulate credentials in an
Authorization header field in a subsequent request.
RSeq
The RSeq header field is used in provisional (1xx class) responses to INVITEs to request
reliable transport.
The header field may only be used if the INVITE request contained the Supported: 100rel
header field.
The RSeq header field contains a reliable sequence number that is an integer
randomly initialized by the UAS.
Each subsequent provisional response sent reliably for this dialog will have a
monotonically increasing RSeq number.
The UAS will retransmit a reliably sent response until a PRACK is received with a
RAck containing the reliable sequence number and CSeq.
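The interplay between RSeq and RAck can be sketched as follows. This is a minimal illustration, assuming one INVITE transaction and ignoring retransmission timers; the class name ReliableProvisional and its methods are hypothetical and not part of any SIP library.

```python
import random


class ReliableProvisional:
    """Tracks reliably sent 1xx responses for one INVITE (illustrative only)."""

    def __init__(self, cseq_number, method="INVITE"):
        self.cseq_number = cseq_number
        self.method = method
        # RSeq is randomly initialized, then increases for each reliable response.
        self.rseq = random.randint(1, 2**31 - 1)
        self.unacknowledged = set()

    def send_reliable_response(self):
        """Return the RSeq value to place in the next reliable 1xx response."""
        value = self.rseq
        self.unacknowledged.add(value)
        self.rseq += 1
        return value

    def receive_prack(self, rack_header):
        """Process a RAck value such as '3452337 17 INVITE'; True if it matches."""
        rseq_text, cseq_text, method = rack_header.split()
        matches = (int(rseq_text) in self.unacknowledged
                   and int(cseq_text) == self.cseq_number
                   and method == self.method)
        if matches:
            # Acknowledged: the UAS can stop retransmitting this response.
            self.unacknowledged.discard(int(rseq_text))
        return matches
```

A PRACK whose RAck repeats the RSeq and CSeq of an outstanding response returns True once; a second PRACK for the same response, or one with a different CSeq number, returns False.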
Only those encoding schemes listed in an Accept-Encoding header field may be used.
Content-Disposition
The Content-Disposition header field is used to describe the function of a message body.
Values include session, icon, alert, and render.
The value session indicates that the message body contains information to describe a
media session.
Content-Language
The Content-Language header field is used to indicate the language of a message body.
It contains a language tag, which identifies the language.
Example:
Content-Language: en
Content-Length
The Content-Length header field is used to indicate the number of octets in the message body.
A Content-Length: 0 indicates no message body.
Content-Type
The Content-Type header field is used to specify the Internet media type in the message
body.
If an Accept header field was present in the request, the response Content-Type
must contain a listed type, or a 415 Unsupported Media Type response must be
returned.
MIME-Version
The MIME-Version header field is used to indicate the version of MIME protocol used to
construct the message body.
SIP, like HTTP, is not considered MIME compliant because parsing and semantics are
defined by the SIP standard, not the MIME specification. Version 1.0 is the default value.
Example:
MIME-Version: 1.0
7. SDP
SDP stands for Session Description Protocol. It is used to describe multimedia sessions in
a format understood by the participants over a network. Based on this description, a
party decides whether, when, and how to join a conference.
SDP is generally carried in the message body of the Session Initiation Protocol
(SIP).
SDP is defined in RFC 2327. An SDP message is composed of a series of lines, called
fields, whose names are abbreviated by a single lower-case letter, and are in a
required order to simplify parsing.
Purpose of SDP
The purpose of SDP is to convey information about media streams in multimedia sessions
to help participants join a particular session or gather information about it.
It conveys the name and purpose of the session, the media, protocols, codec
formats, timing and transport information.
The format has entries in the form of <type>=<value>, where the
<type> defines a unique session parameter and the <value> provides a specific
value for that parameter.
Each line begins with a single lower-case letter, for example, x. There are never
any spaces between the letter and the =, and there is exactly one space between
each parameter. Each field has a defined number of parameters.
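The line format described above is simple enough to parse with a few lines of Python. The following is a minimal sketch; the function name parse_sdp_line is hypothetical, and splitting every value on single spaces is a simplification (a free-text field such as the session name would normally be kept whole).

```python
def parse_sdp_line(line):
    """Split one SDP line into (type, [parameters])."""
    type_letter, sep, value = line.partition("=")
    # The type must be exactly one lower-case letter directly followed by "=".
    if sep != "=" or len(type_letter) != 1 or not type_letter.islower():
        raise ValueError(f"not a valid SDP line: {line!r}")
    # Parameters are separated by exactly one space.
    return type_letter, value.split(" ")
```

For example, parse_sdp_line("c=IN IP4 192.168.2.1") yields the type c with three parameters, while a line such as "v =0" is rejected because of the space before the equals sign.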
v= (protocol version)
o= (owner/creator and session identifier)
s= (session name)
i=* (session information)
u=* (URI of description)
e=* (email address)
p=* (phone number)
c=* (connection information -not required if included in all media)
b=* (bandwidth information)
z=* (time zone adjustments)
k=* (encryption key)
a=* (zero or more session attribute lines)
Protocol Version
The v= field contains the SDP version number. Because the current version of SDP is 0, a
valid SDP message will always begin with v=0.
Origin
The o= field contains information about the originator of the session and session
identifiers. This field is used to uniquely identify the session.
The version is a numeric field that is increased for each change to the session;
an NTP timestamp is the recommended value.
URI
The optional u= field contains a uniform resource identifier (URI) with more information
about the session.
Connection Data
The c= field contains information about the media connection.
The address-type is defined as IP4 for IPv4 addresses and IP6 for IPv6 addresses.
The connection-address is the IP address or host that will be sending the media
packets, which could be either multicast or unicast. For multicast, it takes the
following form:
connection-address=base-multicast-address/ttl/number-of-addresses
where ttl is the time-to-live value, and number-of-addresses indicates how many
contiguous multicast addresses are included starting with the base-multicast
address.
Bandwidth
The optional b= field contains information about the bandwidth required. It is of the
form:
b=modifier:bandwidth-value
Media Announcements
The optional m= field contains information about the type of media session. The field
contains:
m=media port transport format-list
The media parameter is either audio, video, text, application, message, image, or
control. The port parameter contains the port number.
The transport parameter contains the transport protocol or the RTP profile used.
The format-list contains more information about the media. Usually, it contains
media payload types defined in RTP audio video profiles.
Example:
m=audio 49430 RTP/AVP 0 6 8 99
One of these four codecs can be used for the audio media session. If the intention is to
establish three audio channels, three separate media fields would be used.
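The four parts of a media field can be pulled apart as shown in this minimal sketch. The names MediaDescription and parse_media_line are hypothetical, and the sketch assumes a plain integer port (it does not handle a port followed by a port count).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MediaDescription:
    media: str          # audio, video, text, application, ...
    port: int           # port number for the media
    transport: str      # transport protocol or RTP profile, e.g. RTP/AVP
    formats: List[str]  # format-list, usually RTP payload types


def parse_media_line(line):
    """Parse an m= line into its media, port, transport and format-list parts."""
    if not line.startswith("m="):
        raise ValueError("expected an m= line")
    media, port, transport, *formats = line[2:].split()
    return MediaDescription(media, int(port), transport, formats)
```

Applied to the example line m=audio 49430 RTP/AVP 0 6 8 99, this returns the media type audio, port 49430, transport RTP/AVP, and a format-list with four payload types.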
Attributes
The optional a= field contains attributes of the preceding media session. This field can be
used to extend SDP to provide more information about the media. If not fully
understood by an SDP user, the attribute field can be ignored. There can be one or more
attribute fields for each media payload type listed in the media field.
Attributes in SDP can be either
session level, or
media level.
Session level means that the attribute is listed before the first media line in the SDP. If
this is the case, the attribute applies to all the media lines below it.
Media level means it is listed after a media line. In this case, the attribute only applies to
this particular media stream.
SDP can include both session level and media level attributes. If the same attribute
appears as both, the media level attribute overrides the session level attribute for that
particular media stream. Note that the connection data field can also be either session
level or media level.
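The precedence rule between session level and media level attributes can be sketched as follows. The function name effective_attributes is hypothetical, and storing attributes in a dictionary is a simplification: attributes that legitimately repeat (such as several rtpmap lines) would keep only the last value here.

```python
def effective_attributes(sdp_text):
    """Return (m_line, attributes) pairs, media level overriding session level."""
    session_attrs = {}
    media_sections = []  # list of (m_line, attrs) pairs in document order
    for line in sdp_text.strip().splitlines():
        line = line.strip()
        if line.startswith("m="):
            media_sections.append((line, {}))
        elif line.startswith("a="):
            name, _, value = line[2:].partition(":")
            # Before the first m= line the attribute is session level;
            # afterwards it belongs to the most recent media line.
            target = media_sections[-1][1] if media_sections else session_attrs
            target[name] = value
    # Merge: media level entries replace session level entries of the same name.
    return [(m_line, {**session_attrs, **attrs}) for m_line, attrs in media_sections]
```

If a session description sets a=lang:en before the first media line and a=lang:fr under the audio line only, the audio stream ends up with fr while any other stream inherits en.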
An SDP Example
Given below is an example session description, taken from RFC 2327:
v=0
o=PPT 40467 40468 IN IP4 192.168.2.1
s=-
c=IN IP4 192.168.2.1
b=AS:49
t=0 0
b=RR:0
b=RS:0
a=rtpmap:97 AMR/8000/1
m=audio 6000 RTP/AVP 96
a=fmtp:102 0-15
a=ptime:20
a=maxptime:240
The use of SDP with SIP is described in RFC 3264, the SDP offer/answer model. The default
message body type in SIP is application/sdp.
The calling party lists the media capabilities that they are willing to receive in SDP,
usually in either an INVITE or in an ACK.
The called party lists their media capabilities in the 200 OK response to the INVITE.
A typical SIP use of SDP includes the following fields: version, origin, subject, time,
connection, and one or more media and attribute.
The subject and time fields are not used by SIP but are included for compatibility.
In the SDP standard, the subject field is a required field and must contain at least
one character, suggested to be s=- if there is no subject.
The time field is usually set to t=0 0. SIP uses the connection, media, and attribute
fields to set up sessions between UAs.
The version is incremented each time the SDP is changed. If the SDP being sent
is unchanged from that sent previously, the version is kept the same.
As the type of media session and codec to be used are part of the connection
negotiation, SIP can use SDP to specify multiple alternative media types and to
selectively accept or decline those media types.
Example
In the following example, the caller Tom wants to set up an audio and video call with two
possible audio codecs and a video codec in the SDP carried in the initial INVITE:
v=0
o=John 0844526 2890844526 IN IP4 172.22.1.102
s=-
c=IN IP4 172.22.1.102
t=0 0
m=audio 6000 RTP/AVP 97 98
a=rtpmap:97 AMR/8000/1
a=rtpmap:98 AMR-WB/16000/1
m=video 49172 RTP/AVP 32
a=rtpmap:32 MPV/90000
The codecs are referenced by the RTP/AVP payload type numbers 97, 98, and 32.
The called party Marry answers the call, chooses the first codec for the first media field,
and declines the second media field, wanting only an AMR audio session.
v=0
o=Marry 2890844526 2890844526 IN IP4 172.22.1.110
s=-
c=IN IP4 200.201.202.203
t=0 0
m=audio 60000 RTP/AVP 97
a=rtpmap:97 AMR/8000/1
m=video 0 RTP/AVP 32
If this audio-only call is not acceptable, then Tom would send an ACK then a BYE to cancel
the call. Otherwise, the audio session would be established and RTP packets exchanged.
As this example illustrates, unless the number and order of media fields is maintained, the
calling party would not know for certain which media sessions were being accepted and
declined by the called party.
The offer/answer rules are summarized in the following sections.
The answer must have the same number of m= lines in the same order as the
offer.
The listed payloads for each media type must be a subset of the payloads listed in
the offer.
For dynamic payloads, the same dynamic payload number does not need to be
used in each direction. Usually, only a single payload is selected.
The origin (o=) line version number must either be the same as the last one sent,
which indicates that this SDP is identical to the previous exchange, or it may be
incremented by one, which indicates new SDP that must be parsed.
The offer must include all existing media lines and they must be sent in the same
order.
Additional media streams can be added to the end of the m= line list.
An existing media stream can be deleted by setting the port number to 0. This
media line must remain in the SDP and all future offer/answer exchanges for this
session.
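Some of the rules above lend themselves to a mechanical check. The sketch below covers only the m= line count and order and the payload subset rule; the names media_lines and check_answer are hypothetical. It compares payload numbers literally, so an answer that legitimately maps a dynamic payload to a different number would be flagged; a real implementation would compare the rtpmap entries instead.

```python
def media_lines(sdp_text):
    """Return [(media, port, [payloads])] for every m= line, in order."""
    result = []
    for line in sdp_text.strip().splitlines():
        line = line.strip()
        if line.startswith("m="):
            media, port, _transport, *payloads = line[2:].split()
            result.append((media, int(port), payloads))
    return result


def check_answer(offer_sdp, answer_sdp):
    """Return a list of rule violations (empty when the answer looks consistent)."""
    offer, answer = media_lines(offer_sdp), media_lines(answer_sdp)
    if len(offer) != len(answer):
        return ["answer must contain the same number of m= lines as the offer"]
    problems = []
    for index, (offered, answered) in enumerate(zip(offer, answer)):
        o_media, _o_port, o_payloads = offered
        a_media, a_port, a_payloads = answered
        if o_media != a_media:
            problems.append(f"m= line {index}: media type differs from the offer")
        elif a_port != 0 and not set(a_payloads) <= set(o_payloads):
            # Port 0 means the stream is declined, so its payloads are not checked.
            problems.append(f"m= line {index}: payloads are not a subset of the offer")
    return problems
```

An answer that accepts payload 97 from an offer of 97 and 98 and declines the video stream with port 0 produces no violations, while an answer listing a payload that was never offered is reported.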
Call Hold
One party in a call can temporarily place the other on hold. This is done by sending an
INVITE with an identical SDP to that of the original INVITE but with a=sendonly attribute
present.
The call is made active again by sending another INVITE with the a=sendrecv attribute
present. The following illustration shows the call flow of a call hold.
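Building the SDP for hold and resume amounts to swapping one attribute line. The sketch below assumes a session description with a single media stream, so the direction attribute can simply be appended at the end; the function name set_direction is hypothetical. A real re-INVITE would also increment the version in the o= line, which is omitted here.

```python
DIRECTIONS = {"sendrecv", "sendonly", "recvonly", "inactive"}


def set_direction(sdp_text, direction):
    """Return the SDP with any existing direction attribute replaced."""
    if direction not in DIRECTIONS:
        raise ValueError(f"unknown direction: {direction}")
    kept = []
    for line in sdp_text.strip().splitlines():
        line = line.strip()
        if line.startswith("a=") and line[2:] in DIRECTIONS:
            continue  # drop the old direction attribute
        kept.append(line)
    # Appended last, so it applies to the single media stream assumed here.
    kept.append(f"a={direction}")
    return "\n".join(kept) + "\n"
```

Calling set_direction(sdp, "sendonly") gives the body for the hold request, and calling it again with "sendrecv" restores the original description.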
9. SIP Mobility
Personal mobility is the ability to have a constant identifier across a number of devices.
SIP supports basic personal mobility using the REGISTER method, which allows a mobile
device to change its IP address and point of connection to the Internet and still be able to
receive incoming calls.
SIP can also support service mobility, the ability of a user to keep the same services
when mobile.
The first registration is with the new service operator, which binds the Contact URI
of the device with the new service provider's AOR URI.
The second REGISTER request is routed back to the original service provider and
provides the new service provider's AOR as the Contact URI.
As shown later in the call flow, when a request comes in to the original service provider's
network, the INVITE is redirected to the new service provider who then routes the call to
the user.
The second registration message with the roaming URI would be:
REGISTER sip:home.registrar2.in SIP/2.0
Via: SIP/2.0/UDP 172.22.1.102:5060;branch=z9hG4bKah4vn2u
Max-Forwards: 70
To: Tom <sip:[email protected]>
From: Tom <sip:[email protected]>;tag=45375
Call-ID:[email protected]
CSeq: 6421 REGISTER
Contact: <sip:[email protected]>
Content-Length: 0
The first INVITE shown in the above figure would be sent to sip:registrar2.in;
the second INVITE would be sent to sip:[email protected], which would be
forwarded to sip:[email protected]. It reaches Tom and allows the session to be
established. Periodically both registrations would need to be refreshed.
Performs a re-INVITE to allow the signalling and media flow to the new IP address.
If the UA can receive media from both networks, the interruption is negligible. If this is
not the case, a few media packets may be lost, resulting in a slight interruption to the call.
The existing dialog between Tom and Jerry includes the old visited proxy server.
The new dialog using the new wireless network requires the inclusion of the new
visited proxy server.
As a result, an INVITE with Replaces is sent by Tom, which creates a new dialog
that includes the new visited proxy server but not the old visited proxy server.
When Jerry accepts the INVITE, a BYE is automatically sent to terminate the old
dialog that routes through the old visited proxy server that is now no longer
involved in the session.
The resulting media session is established using Tom's new IP address from the
SDP in the INVITE.
Service Mobility
Services in SIP can be provided in either proxies or in UAs. Providing service mobility along
with personal mobility can be challenging unless the user's devices are identically
configured with the same services.
SIP can easily support service mobility over the Internet. When connected to Internet, a
UA configured to use a set of proxies in India can still use those proxies when roaming in
Europe. It does not have any impact on the quality of the media session as the media
always flows directly between the two UAs and does not traverse the SIP proxy servers.
Endpoint resident services are available only when the endpoint is connected to the
Internet. A terminating service such as a call forwarding service implemented in an
endpoint will fail if the endpoint has temporarily lost its Internet connection. Hence some
services are implemented in the network using SIP proxy servers.
Sometimes a proxy server forwards a single SIP call to multiple SIP endpoints. This process
is known as forking. Here a single call can ring many endpoints at the same time.
With SIP forking, you can have your desk phone ring at the same time as your softphone
or a SIP phone on your mobile, allowing you to take the call from either device easily.
In an office, for example, if the boss is away or unable to pick up the call, SIP forking
allows the secretary to answer calls to his extension.
Forking is possible only if a stateful proxy is available, as the proxy needs to process the
many responses it receives and forward one of them.
We have two types of forking:
Parallel Forking
Sequential Forking
Parallel Forking
In this scenario, the proxy server forks the INVITE to, say, two devices (UA2, UA3) at
the same time. Both devices generate 180 Ringing, and whichever answers the call
generates a 200 OK. The response that reaches the originator first (suppose it is from UA2)
establishes a session with UA2. For the other branch, a CANCEL is triggered.
If both responses are received simultaneously, the one to forward is selected based on
the q-value.
Sequential Forking
In this scenario, the proxy server will fork the INVITE to one device (UA2). If UA2 is
unavailable or busy at that time, then the proxy will fork it to another device (UA3).
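One way a proxy could decide between the two forking styles is to use the q-values of the registered contacts. The sketch below assumes a common policy, not something stated in this tutorial: contacts are tried in descending q-value order, and contacts with equal q-values are tried in parallel. The function name forking_plan and the contact URIs are hypothetical.

```python
from itertools import groupby


def forking_plan(contacts):
    """Group (uri, q) contacts into rounds: highest q first, equal q together."""
    ordered = sorted(contacts, key=lambda c: c[1], reverse=True)
    # Each round is a list of URIs that receive the INVITE at the same time.
    return [[uri for uri, _q in group]
            for _q_value, group in groupby(ordered, key=lambda c: c[1])]
```

Two contacts with the same q-value end up in a single round (parallel forking), while contacts with different q-values produce one round each, tried one after another (sequential forking).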
The CSeq spaces in the two directions of a call leg are independent. Within a single
direction, the sequence number is incremented for each transaction.
Voicemail
Voicemail is very common nowadays for enterprise users. It is a telephony application that
comes into the picture when the called party is unavailable or unable to receive the call: the
PBX asks the calling party to leave a voice message.
The user agent will either get a 3xx response or be redirected to the voicemail server if the
called party's number is unreachable. However, some kind of SIP extension is needed to indicate
to the voicemail system which mailbox to use, that is, which greeting to play and where to store
the recorded message. There are two ways to achieve this:
The following illustration shows how the Request-URI carries the mailbox identifier and
the reason (here 486).
As we know, a proxy server can be either stateless or stateful. Here, in this chapter, we
will discuss more on proxy servers and SIP routing.
Stateless proxies forget about the SIP request once it has been forwarded.
Stateful proxies remember the request after it has been forwarded, so they can
use it for advanced routing. Stateful proxies maintain transaction state. Stateful
here implies transaction state, not call state.
Stateful proxies can fork and retransmit if required (for example, for call forward
busy).
Via
Via headers are inserted by servers into requests to detect loops and to help responses
find their way back to the client. They are used only for routing responses to their
destination.
A UA generates and adds its own address in a Via header field when sending a
request.
A proxy forwarding the request adds a Via header field containing its own address
to the top of the list of Via header fields.
A proxy or UA generating a response to a request copies all the Via header fields
from the request in order into the response, then sends the response to the address
specified in the top Via header field.
A proxy receiving a response checks the top Via header field and matches its own
address. If it does not match, the response is discarded.
The top Via header field is then removed, and the response forwarded to the
address specified in the next Via header field.
A received parameter is added to a Via header field if a UA or proxy receives the request
from a different address than that specified in the top Via header field.
A branch parameter is added to Via header fields by UAs and proxies, which is
computed as a hash function of the Request-URI, and the To, From, Call-ID, and
CSeq number.
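The way Via header fields are stacked on the request path and unwound on the response path can be sketched with a plain list. This is a simplified model that keeps only the address of each Via entry; the function names and the example.com host names are hypothetical.

```python
def forward_request(via_stack, own_address):
    """A UA or proxy puts its own address on top of the Via list."""
    return [own_address] + via_stack


def forward_response(via_stack, own_address):
    """A proxy checks the top Via, removes it, and returns (next_hop, remaining)."""
    if not via_stack or via_stack[0] != own_address:
        # The top Via is not ours: the response is discarded.
        return None, via_stack
    remaining = via_stack[1:]
    next_hop = remaining[0] if remaining else None
    return next_hop, remaining
```

A request sent by ua1.example.com through proxy1.example.com and proxy2.example.com arrives with the Via list proxy2, proxy1, ua1. The response carries the same list, and each proxy removes its own entry before passing the response to the next address.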
SIP (softphone) and PSTN (traditional telephone) are different networks that speak
different languages, so a translator (here, a gateway) is needed to communicate between
these two networks.
Let us take an example to show how a SIP phone places a telephone call to the PSTN through
a PSTN gateway.
In this example, Tom (sip:[email protected]) uses a SIP phone and Jerry uses a global
telephone number +91401234567.
3. The gateway initiates the call into the PSTN by selecting an SS7 ISUP trunk to the
next telephone switch in the PSTN.
4. The dialled digits from the INVITE are mapped into the ISUP IAM. The ISUP address
complete message (ACM) is sent back by the PSTN to indicate that the trunk has
been created.
5. The telephone generates ringtone and it goes to telephone switch. The gateway maps
the ACM to the 183 Session Progress response containing an SDP indicating the RTP
port that the gateway will use to bridge the audio from the PSTN.
6. Upon reception of the 183, the caller's UAC begins receiving the RTP packets sent
from the gateway and presents the audio to the caller so they know that the call is
progressing in the PSTN.
7. The call completes when the called party answers the telephone, which causes the
telephone switch to send an answer message (ANM) to the gateway.
8. The gateway then cuts the PSTN audio connection through in both directions and
sends a 200 OK response to the caller. As the RTP media path is already established,
the gateway repeats the SDP from the 183, which causes no changes to the RTP connection.
9. The UAC sends an ACK to complete the SIP signalling exchange. As there is no
equivalent message in ISUP, the gateway absorbs the ACK.
10. The caller sends a BYE to the gateway to terminate the call. The gateway maps the BYE
into the ISUP release message (REL).
11. The gateway sends the 200 OK to the BYE and receives a release complete message (RLC)
from the PSTN.
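The message mapping performed by the gateway in the steps above can be summarized as a pair of lookup tables. This is only a sketch of the translations named in the text; the table and function names are hypothetical, and a real gateway also has to map parameters, cause codes, and timers.

```python
# SIP requests arriving at the gateway and the ISUP message each one triggers.
SIP_TO_ISUP = {
    "INVITE": "IAM",  # dialled digits are mapped into the initial address message
    "ACK": None,      # no ISUP equivalent: the gateway absorbs it
    "BYE": "REL",     # release
}

# ISUP messages arriving from the PSTN and the SIP response each one triggers.
ISUP_TO_SIP = {
    "ACM": "183 Session Progress",
    "ANM": "200 OK",
}


def gateway_translate(message):
    """Return the message the gateway sends on the other side, or None."""
    if message in SIP_TO_ISUP:
        return SIP_TO_ISUP[message]
    if message in ISUP_TO_SIP:
        return ISUP_TO_SIP[message]
    raise KeyError(f"no mapping defined for {message!r}")
```

For instance, gateway_translate("INVITE") gives IAM, gateway_translate("ANM") gives 200 OK, and gateway_translate("ACK") gives None because the ACK is absorbed.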
First, it converts an analog voice signal to its equivalent digital form so that it can
be easily transmitted.
Thereafter, it converts the compressed digital signal back to its original analog form
so that it can be replayed.
There are many codecs available in the market; some are free while others require
licensing. Codecs vary in sound quality and, accordingly, in bandwidth.
Hardware devices such as phones and gateways support several different codecs. While
talking to each other, they negotiate which codec they will use.
Here, in this chapter, we will discuss a few popular SIP audio codecs that are widely used.
G.711
G.711 is a codec that was introduced by ITU in 1972 for use in digital telephony. The codec
has two variants: A-Law is used in Europe and on international telephone links, while
u-Law (also written μ-law) is used in the U.S.A. and Japan.
The bitrate is 64 kbit/s for one direction, so a call consumes 128 kbit/s.
G.711 is the same codec used by the PSTN network, hence it provides the best
voice quality. However it consumes more bandwidth than other codecs.
It works best in local area networks where we have a lot of bandwidth available.
G.729
G.729 is a codec with low bandwidth requirements; it provides good audio quality.
The codec algorithm encodes each 10 ms frame into 10 bytes, so the resulting bitrate is
8 kbit/s in one direction.
G.729 is a licensed codec. End-users who want to use this codec should buy
hardware that implements it (be it a VoIP phone or gateway).
G.723.1
G.723.1 is the result of a competition that ITU announced with the aim to design a codec
that would allow calls over 28.8 and 33 kbit/s modem links.
We have two variants of G.723.1. They both operate on audio frames of 30 ms (i.e.
240 samples), but the algorithms differ.
The bitrate of the first variant is 6.4 kbit/s, while for the second variant, it is 5.3
kbit/s.
The encoded frames for the two variants are 24 and 20 bytes long, respectively.
GSM 06.10
GSM 06.10 is a codec designed for GSM mobile networks. It is also known as GSM Full
Rate.
This variant of the GSM codec can be freely used, so you will often find it in open
source VoIP applications.
The codec operates on audio frames 20 ms long (i.e. 160 samples) and it
compresses each frame to 33 bytes, so the resulting bitrate is about 13 kbit/s.
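The bitrates quoted in this chapter follow directly from the frame size and frame duration. The helper below is a small sketch of that arithmetic; the function name bitrate_kbps is hypothetical. The 10 ms frame duration used for G.729 is the standard value for that codec.

```python
def bitrate_kbps(frame_bytes, frame_ms):
    """Bitrate in kbit/s for one direction.

    Bits per frame divided by the frame duration in milliseconds gives
    bits per millisecond, which is the same as kbit/s.
    """
    return frame_bytes * 8 / frame_ms


codecs = {
    "G.729": (10, 10),            # 10 bytes every 10 ms
    "G.723.1 (6.4)": (24, 30),    # 24 bytes every 30 ms
    "G.723.1 (5.3)": (20, 30),    # 20 bytes every 30 ms
    "GSM 06.10": (33, 20),        # 33 bytes every 20 ms
}
for name, (size, duration) in codecs.items():
    print(f"{name}: {bitrate_kbps(size, duration):.1f} kbit/s")
```

The GSM 06.10 result comes out slightly above the nominal 13 kbit/s because the encoded frame is packed into whole bytes.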
14. B2BUA
Functions of B2BUA
A B2BUA provides the following functions:
Often, B2BUAs are also implemented in media gateways to bridge the media streams for
full control over the session.
Example of B2BUA
Many private branch exchange (PBX) enterprise telephone systems incorporate B2BUA
logic.
Some firewalls have built-in ALG (Application Layer Gateway) functionality, which
allows a firewall to authorize SIP and media traffic while still maintaining a high level of
security.
Another common type of B2BUA is known as a Session Border Controller (SBC).