Showing posts with label RESTful. Show all posts
Showing posts with label RESTful. Show all posts

RESTful communication over WebSocket

This article shows how to implement generic resource oriented communication on top of synchronous channel such as WebSocket. This is a continuation of my previous article on REST and SIP [1] and provides more concrete thoughts because I now have an actual implementation of part of this in my web conferencing application. Other people have commented on the idea of REST on WebSocket [2]. (Using the term RESTful, which inherently is stateless, is confusing with a stateful WebSocket transport. Changing the title of this article to "Resource oriented and event-based communication over WebSocket" is more appropriate.)

Following are the basic principles of such a mechanism.
  1. Enable resource-oriented (instead of RPC-style) communication.
  2. Enable asynchronous notification when a resource (or its child resource) changes.
  3. Enable asynchronous event subscribe, publish, and notification.
  4. Enable Unix file system style access control on resources.
  5. Enable the concept of transient vs persistent resources.
Consider my favorite example of web conferencing application. The logged in users list is represented as resource /login relative to the hosting domain, and the calls list as /call. If the provider supports concept of registered subscribers, those can be at /user. For example, /user/[email protected] can be my user profile.

Now let us look at how the four motivational points apply in this example.

1) Enable resource-oriented (instead of RPC-style) communication.

Every resource has its own representation, e.g., in JSON or XML. For example, /login/[email protected] can be {"name": "Bob Smith", "email": "[email protected]", "has-video": true, "has-audio": true}. The client-server communication can be over HTTP using standard RESTful or over WebSocket to access these resources.

Over WebSocket, the client sends a request of the form '{"method":"PUT","resource":"/login/[email protected]", "type":"application/json","entity":{"name":"Bob Smith", ...}}' to login. The server treats this as same as that received on just HTTP using RESTful PUT request.

A resource-oriented (instead of RPC-style) communication allows us to keep all the business logic in the client, which uses the server only as a data store. The standard HTTP methods allow access to such data, e.g., POST to create, PUT to update, GET to read and DELETE to delete. POST is a special method that must return the resource identifier of the newly created resource. For example, when a client does POST /call to create a new call, the server returns {"id": "conf123"} to indicate that the new resource identifier is "conf123" relative to /call and call be accessed at "/call/conf123".

2) Enable asynchronous notification when a resource (or its child resource) changes.

Many web applications including web conferencing need the notion of asynchronous notifications, e.g., when a user is online, or a user joins/leaves a call. Traditionally, Internet communication has used protocols such as SIP and XMPP for asynchronous notifications. With the advent of WebSocket (and the popular socket.io project) it is possible to implement persistent client-server connection for asynchronous notifications and messages within the web browser.

In this mechanism, a generic notification architecture is applied to resources. A new method named "SUBSCRIBE" is used to subscribe to any resource. A subscriber receives notification whenever the resource or its immediate children are changed (created, updated or deleted). For example, a conference participant sends the following over WebSocket: '{"method":"SUBSCRIBE","resource":"/call/conf123"}'. Whenever the server detects that a new PUT is done for "/call/conf123/participant12" or a new POST is done for "/call/conf123" it sends a notification message to the subscriber over WebSocket: '{"notify":"UPDATE","resource":"/call/conf123","type":"application/json","entity":{ ... child resource}, "create":"participant12"}'. On the other hand, if the moderator does a PUT on "/call/conf123", then the server sends a notification as '{"notify":"PUT","resource":"/call/conf123","type":"application/json", "entity":{... parent resource}}'. In summary, the server generates the notification to both the modified resource "/call/conf123/participant12" as well as the parent resource, "/call/conf123".

The notification message contains a new "notify" attribute instead of re-using the "method" attribute to indicate the type of notification. For example, "PUT", "POST", "DELETE" means that the resource identified in "resource" attribute has been modified using that method by another client. In this case the "type" and "entity" attribute represent the "resource". Similarly, "UPDATE" means that a child resource has been modified and the details of the child resource identifier is in "create", "update" or "delete" attribute. In this case the "type" and "entity" attribute represent the child resource identified in "create", "update" or "delete".

The concept of notifications when a resource change is available in ActionScript programming language. For example, a markup text can use width="{size}" to bind the "width" property of a user interface component to the "size" variable. Whenever the "size" changes the "width" is updated. A property change event is dispatched to enable the notification. Similarly in our mechanism, a resource can be subscribed for to detect change in its value or the value of its children resources by the client application.

3) Enable asynchronous event subscribe, publish, and notification

The previous point enables a client to receive notification when a resource changes and these notifications are server generated notifications. Additionally, we need a generic end-to-end publish-subscribe mechanism to allow a client to send notification to all the subscribers without dealing with a resource modification. This allows end-to-end notifications from one client to others, via the server.

When a client subscribes to a resource, it also receives generic notifications sent by another client on that resource. A new NOTIFY method is defined. For example, if a client sends '{"method":"NOTIFY","resource":"/login/[email protected]","type":"application/json","data":{"invite-to":"/call/conf123","invite-id":"6253"}}', and another client is subscribed to /login/[email protected], then it receives a notification message as '{"notify":"NOTIFY", "resource":"/login/[email protected]","type":"application/json","data":{...}}'. In summary, the server just passes the "data" attribute to all the subscribers. The "notify" value of "NOTIFY" means an end-to-end notification generated because another client sent a NOTIFY method.

In a web conferencing application, most of the notifications are associated with a resource, e.g., conference membership change, presence status change, etc. Some notifications such as call invitation or cancel can be independent of a resource, and the NOTIFY mechanism can be used. For example, sending a NOTIFY to /login/[email protected] is received by all the subscribers of this resource.

4) Enable Unix file system style access control on resources.

Without an authentication and access control mechanism, the resource oriented communication protocol described earlier becomes useless. Fortunately, it is possible to design a generic access control mechanism similar to Unix file system. Essentially, each resource is treated as a file and a directory. In analogy, all the child resources of this resource belong to the directory, whereas the resource entity belongs to the file. The service provider can configure top-level directories with user permissions, e.g., anyone can add child to "/user", and once added will be owned by that user. Thus if user Bob creates /user/bob, then Bob owns the subtree of this resource. It is up to Bob to configure the permissions of its child resources. For example, it can configure /user/bob/inbox to be writable by anyone but readable only by self, e.g., permissions "rwx-w--w-". This allows a web based voice and video mail application.

Unlike traditional file system data model with create, update, read and write, we also need permissions bit for subscription. For example, only Bob should be allowed to subscribe to /user/bob so that other users cannot get notifications sent to Bob. The concept of group is little vague but can be incorporated in this mechanism as well. Finally, a notion of anonymous user needs to be added so that any client which does not have account with the service provider can also get permissions if needed.

In summary, the permissions bits become five bits for each of the four categories: self, group, others-authenticated, others-anonymous. The four bits define permissions to allow create, read, update, write and subscribe. Existing authentication such as HTTP basic/digest, cookies or oAuth based sessions can be used to actually authenticate the client.

5) Enable the concept of transient vs persistent resources.

In software programming, application developers typically use local and global variables to represent transient and persistent data respectively. A similar concept is needed in the generic resource oriented communication application. So far we have seen how to read, write, update and create resources. Each resource can be transient, so that it is deleted when the client which created the resource is disconnected, or persistent which remains even after the client disconnects. For example, when a client POSTs to /call/conf123, it wants that resource to be transient which gets deleted when the client is disconnected. This causes the resource to be used as a conference membership resource, and the server notifies other participants when an existing participant is disconnected. On the other hand, when a client POSTs to /user/[email protected], it wants it to be the persistent user profile which is available even when this user has disconnected.

The concept of transient and persistent in the resource-oriented system allows the client to easily create a variety of applications without having to write custom kludges. In general a new resource should be created as transient by default, unless the client requests a persistent resource. Whenever the client disconnects the WebSocket all the transient resources (or local variables) of that client are deleted, and appropriate notifications are sent to the subscribers as needed.

Implementation

I have implemented part of this concept in my web conferencing project. The server side (called as service provider) is a generic PHP "restserver.php" application that accepts WebSocket connections and uses a back-end MySQL database to store and manage resources and subscriptions. Each connected client is assigned a unique client-id. There are two database tables: the resource table has fields resource-id (e.g., "/login/[email protected]"), parent-resource-id (e.g., "/login"), type (e.g., "application/json"), entity (i.e., actual JSON representation string), and client-id, whereas the subscribe table has fields resource-id of the target resource and client-id of the subscriber. The subscriptions are always transient, i.e., when the client disconnects the all subcribe rows are removed for that client-id. The resources can be transient or persistent. By default any new resource is created as transient and the client-id is stored in that row. When the client disconnects all the resources with the matching client-id are removed and appropriate notifications generated. A client can create persistent resource by supplying "persistent": true attribute in the PUT or POST request, and the server puts empty client-id for that new resource.

The generic "restserver.php" server application can be used in a variety of web communication applications, and we have shown it to work with web conferencing and slides presentation application.


REST, Flash Player, Flex

In doing some experiments with Flex and RESTful architecture, I realized that there seems to be a whole lot of problems. Flash Player was designed to support traditional HTTP web access such as form posting or resource retrieval using GET and POST. So a number of features that are used in RESTful design are not supported by Flash Player. People have written additional client-side or on server-side kludges to work around the problems with HTTP support in Flash Player. Most of the server side changes are hacks, and the client-side changes are in external third-party libraries, which are sometimes missing crucial features like cookies, TLS, etc. Even in the best possible scenario, you still need to provide crossdomain policy file even if the Flash application is accessing resources on the same server. One of the reasons I suspect is that Flash Player relies on the browser for HTTP support, and hence supports only the lowest common set of features, which are not enough for REST architecture.

What is the solution? Depends on what you want to do.
  1. If you have control over server-side of the system, you need to incorporate certain kludges to support Flash restrictions. For example, provide crossdomain policy-file, map some header or URL to method PUT and DELETE, return 200 success response with message body containing actual error code (e.g., 404 vs 405 vs 501) or headers. Any of these techniques looks like a hack at best.
  2. If you have control over the client, you can use an existing third-party RESTful client library in ActionScript. However, be prepared to provide a crossdomain policy-file or incorporate a proxy. Alternatively, you can also perform some Flash Player-specific translations on the fly in your proxy.
  3. Use Flash Player's ExternalInterface mechanism and incorporate your RESTful client code in JavaScript. This is sometimes not easy, error-free or feasible. Moreover, now your Flash application depends on your Javascript.
What are the problems with these solutions?
  1. You cannot build a general purpose RESTful web application (server-side) and expect Flash Player application to use them. You will need to be Flash Player-aware.
  2. You cannot build a general purpose Flash application (client-side) to use an existing RESTful web service without additional dependencies and network elements (proxies).
What is the real solution?
  1. It looks like HTTP+REST won't really be able to solve all the problems in ActionScript for Flash Player without Adobe's blessings, e.g., incorporate full HTTP stack in the Flash Player or provide a different way other than crossdomain to do authentication of scripts.
Some additional references [1, 2, 3, 4]

REST and SIP

This article describes a RESTful SIP application server architecture.

Why do we need this?
SIP is the protocol of choice for Internet session initiation and control such as for VoIP or multimedia calls. Although SIP is similar to HTTP in many respects, there are crucial differences in the design. Two of the major difficulties among web developers in adopting SIP are (1) no existing SIP-based web tools similar to programming libraries for HTTP and XMPP on Flash Player, (2) the initial cost to get started with basic working system is huge with lot of specifications, e.g., for NAT and firewall traversal. On the other hand, web developers are used to building applications on top of HTTP which works for most cases out of the box. More recently RESTful architectures are gaining popularity among web services. In the absence of easy to use web tools for SIP and large set of specifications for a SIP system, web developers tend to resort to quick and dirty hacks which in the end are short term and not interoperable. Hence there is a need for a easy to use RESTful architecture for SIP-based systems that allows quick application development by web developers. This article proposes such an architecture.

What exactly is difficult?
SIP supports both UDP and TCP transports. Many earlier systems implemented UDP, whereas both transports are a must for SIP proxy servers. In client-server communication, with several clients behind NAT and firewall, UDP causes problem. Secondly, with UDP you also need the reliability of transactions and hence the transaction state machines in SIP. The SIP request forking and early media feature have created lot of stir and confusion among developers. Several other telephony-style features are also not needed for many Internet oriented SIP applications that do not talk to a phone network. The NAT and firewall traversal are defined outside core SIP, e.g., using rport, sip-outbound. A developer usually prefers to have an integrated application library and API that is quick and easy to use. Moreover with lots of RFCs related to SIP, it becomes difficult to figure out what specifications are core and what are optional for a particular use case. A number of new web-based video communication systems use proprietary technologies such as on Flash Player because of lack of a ready-to-use SIP library to satisfy the needs.

To solve the difficulties faced by web developers, a subset of the core features of SIP are needed as an easy to use API. Such an API could be available as a built-in browser feature or a plugin. Once the core set of resources are identified, rest of the API can be customized by the application server providers and developers, or in separate communities.

What use cases are considered?
SIP is designed to be used consistently in different use cases such as client-to-client communication, client-to-server as well as server-to-server. The core SIP says that each SIP user agent (application client) has both UAC (client) and UAS (server). In this article I refer to client as a user agent and server as an application server, which are different from SIP terminology. Since the target audience for the proposal is application developers, only the client-server interface needs to be considered. The backend application server can translate the client-server request to appropriate SIP messaging for server-to-server case if needed, e.g., for service provider's network you may need high performance UDP based server-to-server SIP messages.

What are the SIP-related resources?
Once we focus on a small subset of the problem -- define RESTful API for client-server communication to access a SIP application server -- rest of the solution falls in place naturally. In particular, the SIP application server will provide two core resources: "/login" and "/call" to represent list of currently logged in users and list of active calls. Additionally, it can provide user profiles of signed up users at "/user" which internally may contain things like voicemail resources for the user. The client uses standard HTTP requests, with some additional methods as shown below, to access the resources and interact with others. One difference with standard RESTful architecture is that the client-server connection may be long lived, and also used for notification from server to client. In that sense it does not remain pure RESTful.

Login: The SIP registration and unregistration are mapped to "/login/{email}" resource, e.g., "/login/[email protected]". Doing a "POST /login/{email}" with message body containing your contacts, can be used to REGISTER. The response will return your unique identifier for the login resource, e.g., "/login/{email}/{contact_id}. Later, you can use "DELETE /login/{email}/{contact_id}" to un-REGISTER or a subsequent "PUT /login/{email}/{contact_id}" to do a REGISTER refresh. The actual representation of the login contact information can be in XML, JSON or plain text and is application dependent. For example one could combine the presence update including rich presence with the registration method. Clearly the login update requires appropriate authentication.
 POST /login/[email protected]      -- new registration
request-body: {"contact": "sip:[email protected]:5062"}
response-body: {"url": "/https/blog.kundansingh.com/login/[email protected]", "id": 1, "expires": 3600}

PUT /login/[email protected]/1 -- registration refresh
request-body: "sip:[email protected]:5062"

DELETE /login/[email protected]/1 -- unregister

GET /login/[email protected] -- get list of contact locations
response-body: [{"id": 1, "contact": "sip:[email protected]:5062", ...},...]
Call: The call is split into two part: conference resource and invitation. The conference is represented using a "/call/{call_id}" resource, where a client can "POST /call" to create a new call identifier, or "POST /call/{call_id}" to join an existing call. The conference resource represents the list of participants in a call.
 POST /call             -- create a new call context
request-body: {"subject": "some discussion topic", ...}
response-body: {"id": "123", "url": "/https/blog.kundansingh.com/call/123" }

POST /call/123 -- join a call
request-body: {"url": "/https/blog.kundansingh.com/login/[email protected]", "session": "rtsp://...", ...}
response-body: {"id": 2, "url": "/https/blog.kundansingh.com/call/123/2", ...}

GET /call/123 -- get participant list and call info
response-body: {"subject": "some discussion topic",
"children": [{"url": "/https/blog.kundansingh.com/call/123/2", "session": "rtsp://..."}]
}

Invite: Call invitation requires a new message such as "SEND". For example, "SEND /login/{email}" sends the given message body to the target logged in user. Similarly, "CANCEL /login/{email}/1" cancels a previously sent message it is not already sent. The message body gives additional details such as whether the message is a call invitation or an instant message. The message body is application dependent. The SIP application server does not need to understand the message body, as long as it can send a SEND message from one client to another. This makes a SEND more closer to an XMPP instead of a SIP INVITE. If the callee wants to accept the call invitation, it joins the particular session URL independently.
 SEND /login/[email protected]     -- send call invitation
request-body: {"command": "invite", "url": "/https/blog.kundansingh.com/call/123", "id": 567}

SEND /login/[email protected] -- cancel an invitation
request-body: {"command": "cancel", "url": "/https/blog.kundansingh.com/call/123", "id": 567}

SEND /login/[email protected] -- sending a response
request-body: {"command": "reject", "url": "/https/blog.kundansingh.com/call/123", "id": 567, "reason": ...}
Event: SIP includes an event subscription and notification mechanism which can be used in several applications including presence updates and conference membership updates. Similarly, one needs to define new mechanism to subscribe to any resource and get notification of a change. This gives rise to a concept known as active-resource. The idea is as follows: if a client does a GET on active resource, and does not terminate the connection, then the client keeps getting the initial state of the resource, as well as any future updates until the connection is terminated. The future updates may include the full state or a difference depending on the request parameter.
 GET /call/123          -- keep track of membership information
response 1: ... -- initial membership information
response 2: ... -- any addition or deletion in the membership

GET /login/[email protected] -- keep track of presence updates
response 1: ... -- initial presence information
response 2: ... -- subsequent presence updates.
Profile and messages: The SIP application server will host user profile at "/user/{user_id}". The concept of user identifier will be implementation dependent. In particular, the client could "POST /user" to create a new user account, and get the identifier in the response. It can then do a "GET /user/{user_id}" to know various URLs to get contact location of this user. It can then do a GET on that URL to fetch the contacts or do a SEND on that URL to send a message or call invitation.
 POST /user                            -- signup with a new account
request-body: {"email": "[email protected]", ...}
response-body: {"id": "[email protected]", "url": "/https/blog.kundansingh.com/user/[email protected]" }

POST /user/[email protected]/message -- send offline messages (voice/video mail)
request-body: {"url": "rtsp://..."}

GET /user/[email protected]/message -- retrieve list of messages
response-body: [{"url": "rtsp://...", ...]
Miscelleneous: There are several other design questions that are left unanswered in the above text. Most of these can be intuitively answered. For example, the HTTP authentication credential defines the sender of a message, i.e., SIP "From" header. The sequential or parallel forking is a decision left to the client application. The decision whether to use a SDP or XML-based session description is application and implementation dependent. For example, if the client is creating a conference on RTSP server, it will just send the RTSP URL in the call invitations. Similarly, for Flash Player conferencing it will send an RTMP URL in the call invitation. The call property such as participant's session description can be fetched by accessing the call resource on the server. Thus, whether an RTSP/RTMP server is used to host a conference or a multicast address is used is all client or application dependent. The application server will provide tools to allow such freedom.

Conclusion: A RESTful interface to SIP application server is an interesting idea described in this article. The idea looks feasible and doable using existing software and tools, and hopefully will benefit both the web developer and SIP community in getting wider usage of SIP systems. The goal is not to replace SIP, but to provide a new mechanism that allows web-centric applications to use services of a SIP application server and to allow building such easy to use SIP application server.

Several of the pieces described in this article are already implemented in Python, e.g., RESTful server tools, video conferencing application server, SIP-RTMP translation and SIP server and client library. The next step would be to combine these pieces to build a complete REST and SIP project. If you are interested in doing the project feel free to get in touch with me!

REST, RESTful and restlite

This post announces a new open source software: http://code.google.com/p/restlite/

What is restlite? Restlite is a light-weight Python implementation of server tools for quick prototyping of your RESTful web service. Instead of building a complex framework, it aims at providing functions and classes that allows your to build your own application.

restlite = REST + Python + JSON + XML + SQLite + authentication

Features
  1. Very lightweight module with single file in pure Python and no other dependencies hence ideal for quick prototyping.
  2. Two levels of API: one is not intrusive (for low level WSGI) and other is intrusive (for high level @resource).
  3. High level API can conveniently use sqlite3 database for resource storage.
  4. Common list and tuple-based representation that is converted to JSON and/or XML.
  5. Supports pure REST as well as allows browser and Flash Player access (with GET, POST only).
  6. Integrates unit testing using doctest module.
  7. Handles HTTP cookies and authentication.
  8. Integrates well with WSGI compliant applications.

Motivation: As you may have noticed, the software provides tools such as (1) regular expression based request matching and dispatching WSGI compliant router, (2) high-level resource representation using a decorator and variable binding, (3) functions for converting from unified list representation to JSON and XML, and (3) data model and authentication classes. These tools can be used independent of each other. For example, you just need the router function to implement RESTful web services. If you also want to do high-level definitions of your resources you can use the @resource decorator, or bind functions to convert your function or object to WSGI compliant application that can be given to the router. You can return any representation from your application. However, if you want to support multiple consistent representations of XML and JSON, you can use the represent function of request.response method to do so. Finally, you can have any data model you like, but implementations of common SQL style data model and HTTP basic and cookie based authentication are provided for you to use if needed.

This software is provided with a hope to help you quickly realize RESTful services in your application without having to deal with the burden of large and complex frameworks. Any feedback is appreciated. If you have trouble using the software or want to learn more on how to use, feel free to send me a note!