
NDT CIvan Paper2

The document proposes a framework for accessing data resources on the Internet exposed as RESTful web services through a common interface. The framework aims to provide a unified way to access different data sources by defining a common data model and mappings between the generic model and specific models of each data source. It includes modules for a SOAP service, dispatcher, security registry, and web service invoker to process search requests against single or multiple data sources using a common or generic request format.

Uploaded by

cosmiivan4612
Copyright
© Attribution Non-Commercial (BY-NC)

A framework for accessing data resources on the Internet exposed as RESTful web services

Cosmina Ivan
Department of Computer Science, Technical University of Cluj-Napoca, Romania
[email protected]

Dan Borza
Technical University of Cluj-Napoca, Romania
[email protected]

Abstract
Over the last decade, Internet users have been using web search intensively. However, the ever-increasing number of web search providers makes it difficult to search for data in different data sources. Moreover, developers face the problem of accessing different resources exposed as web services while having to support various sets of constraints and rules. Our framework aims at identifying a common way of accessing these different data sources. We also provide a common and simple data model for some of these sources, thus helping the developer, and ultimately the end-user, find and combine results from more than one web search provider. As a result, applications that search for a resource simultaneously at different web search providers can be developed more easily.


1. Introduction
Internet search has become a common part of our lives. We use it on a daily basis to find important information. We are also witnessing an increase in the number of Internet search providers. The existence of multiple data sources on the Internet is beneficial to the community: more providers mean greater competition and, in turn, an increase in the quality of their services. Despite these advantages, the ever-increasing number of data sources is becoming more and more difficult to manage for the end-user and the software developer alike. For instance, the end-user has to open multiple browser windows and enter the search query in each of them. Wouldn't it be simpler if s/he could access a portal, enter the query string only once and select the most appropriate web search provider(s)? The results from the various sources would then be displayed in the browser in a common interface.

The developer also faces a challenge in having to design and implement software clients: one for every data source. Each of the web search providers or data sources on the Internet offers a specific way of accessing its data: a different URL, different query parameters, a different authentication mechanism, etc. Wouldn't it be easier if s/he could write a client in the same manner for each of these sources? Still, the greatest difficulty may lie in the fact that each provider offers its data in a different format and structure. Some use XML, others JSON, etc. Even if the data format is the same (e.g. XML), the structure differs. There is no common way of specifying a unified model of data structure and semantics, although research is ongoing [7].

Web services have emerged in the last decade as an important way to achieve interoperability and integration between heterogeneous software systems, as [2] suggests. More complex applications can be built from existing ones with the use of web services, which offer a common interface between the systems. Two major types of web services are in use at the moment: SOAP-based and REST-based [8]. SOAP (Simple Object Access Protocol) is a suitable protocol since it is platform and language independent. Moreover, it can be mapped onto various transport protocols, HTTP being the one most often used. However, because of its XML format it induces a considerable transport overhead [10]. RESTful (REpresentational State Transfer) web services have emerged lately as an alternative to SOAP services [9][11]. They use HTTP as the transport protocol with its well-known operations (GET, POST, PUT, DELETE, etc.). RESTful web services are therefore well suited to accessing data over the Internet, since they use HTTP directly with its simple, built-in operations.

Recently, the major web search providers published a set of REST APIs. These APIs can be used by developers to build software that uses their search functions. Microsoft [5], Yahoo [4], Google [6], Amazon and eBay are well-known examples. Let's consider the following scenario: we want to search for the query computers in the news section and retrieve the results in XML format. Both Microsoft and Yahoo offer a way to accomplish the task. At Yahoo!, we would have to submit the following URL: http://boss.yahooapis.com/ysearch/news/v1/computers?appid={appID}&format=xml. Microsoft would provide the same service at the following URL: http://api.search.live.net/xml.aspx?Appid={appID}&query=computers&sources=news. The appID is a parameter specific to each search provider. This simple example shows that each web search provider has a specific way of building the request URL, a different set of parameters, etc.

Databases might be queried using this approach too. Suppose we have a database with users stored in it. Each user has an id, a name and an address. This database might be exposed through web services at the URL http://database.company.com. If we would like to search for the user with id=5, the following HTTP request would suffice: GET http://database.company.com/users/id=5. We could imagine other scenarios, such as listing all the users with the request GET http://database.company.com/users. Alternatively, we could query for all the users whose names start with A using GET http://database.company.com/users/name=A*. We conclude that accessing a database over the Internet can be as simple as issuing the corresponding HTTP GET request.

Still, there is no unified way of accessing all these data sources (web search results or databases). We propose a solution that aims to facilitate access to various RESTful data sources over the Internet by offering a common registry that stores information about these data sources, such as their URLs, expected parameters and descriptions. We provide a generic way to form requests for all of the web services. The requests contain the service id, the query string and the needed parameters. Search queries can be executed against multiple data sources with ease, and only one authentication mechanism is needed.
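To make the difference between the two provider URLs from the news-search scenario concrete, the following minimal sketch builds both of them from the same query. The URL shapes are taken from the example in the text; the class and method names are hypothetical, and the endpoints themselves may no longer be reachable.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of the framework: it only illustrates that the
// same query must be placed differently in each provider's URL.
final class ProviderUrls {

    // Form-encodes a value for use in a query string.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Yahoo! places the query in the URL path and the format in the query string.
    static String yahooNews(String appId, String query) {
        String pathSegment = encode(query).replace("+", "%20");
        return "http://boss.yahooapis.com/ysearch/news/v1/" + pathSegment
                + "?appid=" + encode(appId) + "&format=xml";
    }

    // Microsoft places the query and the section in the query string.
    static String microsoftNews(String appId, String query) {
        return "http://api.search.live.net/xml.aspx?Appid=" + encode(appId)
                + "&query=" + encode(query) + "&sources=news";
    }
}
```

Calling ProviderUrls.yahooNews(appId, "computers") and ProviderUrls.microsoftNews(appId, "computers") yields two structurally different URLs for the same logical search, which is the mismatch a common request format is meant to hide.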

Figure 1. Conceptual Architecture

Moreover, we propose a simple yet intuitive way of defining a generic data model for more than one source. In this way, one could query Yahoo! and Microsoft using the same set of generic parameters, instead of forming two different requests, each using specific parameters. Given N different data sources, we provide the mechanism to define one generic model and N mappings between the generic data model and the models used by the specific data sources. As a proof of concept we build an application on top of the framework that demonstrates how to form a request on a single data source, a request on multiple data sources, and a generic request using a common data model between two data sources.

2. Architecture and Design


In order to build a framework for accessing different web search providers in a common way, we followed the principles of Service-Oriented Architecture (SOA). We use the existing web services as portals to the data sources and to retrieve the results. Access to our framework is available through a SOAP client. With this client we build a request and send it to the business logic, where the request is decoded and the appropriate action takes place. We present the logical architecture of the framework and briefly describe the logical modules shown in Fig. 1.

The SOAP Service Module enables the client to execute a search request against one or more search sources. This module exposes a web service that can be called to perform queries on the database, alongside other operations (logging, building custom requests). The SOAP client exposes a method named submitRequest, which takes as its parameter an instance of an implementation of the abstract class GenericRequest. Our framework provides three implementations of GenericRequest: SingleSearchRequest, MultipleSearchRequest and GenericSearchRequest. SingleSearchRequest forms a request on a single data source. MultipleSearchRequest is composed of at least one SingleSearchRequest and is used for executing queries on more than one data source. GenericSearchRequest is a special type of request used to form generic search requests, which use a common data model for different data sources. We pass it a generic set of parameters, and the business logic maps this generic set onto the specific set of parameters understood by each web service provider.
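The request types described above can be sketched as a small class hierarchy. This is only an illustration under stated assumptions: the class names and the parameter keys service-id and query-string come from the text, while the fields, accessors and validation rules are hypothetical and not taken from the framework's source.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Common supertype accepted by submitRequest (name from the text).
abstract class GenericRequest { }

// A request against exactly one data source, carrying service-specific parameters.
final class SingleSearchRequest extends GenericRequest {
    private final Map<String, String> params;

    SingleSearchRequest(Map<String, String> params) {
        // Assumption: every request must name its service and its query.
        if (!params.containsKey("service-id") || !params.containsKey("query-string")) {
            throw new IllegalArgumentException("service-id and query-string are required");
        }
        this.params = Collections.unmodifiableMap(new HashMap<>(params));
    }

    String getServiceId() { return params.get("service-id"); }
    Map<String, String> getParams() { return params; }
}

// A request against several data sources: at least one SingleSearchRequest.
final class MultipleSearchRequest extends GenericRequest {
    private final List<SingleSearchRequest> requests;

    MultipleSearchRequest(List<SingleSearchRequest> requests) {
        if (requests.isEmpty()) {
            throw new IllegalArgumentException("at least one SingleSearchRequest is required");
        }
        this.requests = Collections.unmodifiableList(new ArrayList<>(requests));
    }

    List<SingleSearchRequest> getRequests() { return requests; }
}

// A request expressed in the common data model; the target service ids and the
// generic parameter map are assumed fields.
final class GenericSearchRequest extends GenericRequest {
    private final List<String> serviceIds;
    private final Map<String, String> genericParams;

    GenericSearchRequest(List<String> serviceIds, Map<String, String> genericParams) {
        this.serviceIds = Collections.unmodifiableList(new ArrayList<>(serviceIds));
        this.genericParams = Collections.unmodifiableMap(new HashMap<>(genericParams));
    }

    List<String> getServiceIds() { return serviceIds; }
    Map<String, String> getGenericParams() { return genericParams; }
}
```

Keeping the three request types under one abstract parent lets a single submitRequest method accept all of them and leaves the choice of handling strategy to the business logic.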

The Browser Web App Module offers a web-browser-based front end to the framework. This module is still in progress and will enable the end-user to build requests graphically. The user will be able to select various data sources and execute a simultaneous search against them. The results will be displayed in a uniform style.

The Dispatcher module analyzes the request and routes it to the specific handler. The types of requests are: user login, user logout, user single search, user multiple search and user generic search. If the request is of type user login or logout, the Security Registry is invoked to handle it. If the request is of type user single search, user multiple search or user generic search, the WSInvoker module is used to handle the search request.

The Security Registry module uses a database to authenticate and authorize the users. It receives the requests and, based on the user credentials, grants or denies a specific privilege.

The WSInvoker module lies at the core of our framework. It receives the search request and analyzes it. This module makes extensive use of the SingleWsInvoker class, which starts a thread that executes the search on a single data source with parameters specific to that data source. SingleWsInvoker uses an instance of the GenericRestClient class, which uses the Jersey API to communicate with a REST resource on the Internet. If the request received is of type SingleSearchRequest, only one instance of the SingleWsInvoker class is used. If the request is of type MultipleSearchRequest, the module launches several SingleWsInvoker threads concurrently. The most important case is that of GenericSearchRequest: if the request is of this type, the generic parameters are mapped to the specific parameters, forming a SingleSearchRequest or a MultipleSearchRequest, which is then executed as described above.
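The concurrent execution performed by the WSInvoker module for a multiple search can be sketched as follows. This is a minimal illustration, not the framework's code: it replaces the SingleWsInvoker threads with tasks on a standard ExecutorService, and the singleInvoker function is a hypothetical stand-in for the call that a SingleWsInvoker would make through GenericRestClient.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

final class ConcurrentInvokerSketch {

    // Runs one search per parameter map in parallel and returns the raw
    // responses in the same order as the requests.
    static List<String> invokeAll(List<Map<String, String>> requests,
                                  Function<Map<String, String>, String> singleInvoker) {
        if (requests.isEmpty()) {
            return new ArrayList<>();
        }
        // One worker per data source, so that no search waits for another.
        ExecutorService pool = Executors.newFixedThreadPool(requests.size());
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (Map<String, String> request : requests) {
                Callable<String> task = () -> singleInvoker.apply(request);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // blocks until that search finishes
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for search results", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A search failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Collecting the futures in request order keeps each response associated with the data source that produced it, which matters when the results are later merged or converted into a generic response.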
In order to achieve the mapping of generic parameters to service-specific ones, the WSInvoker module relies on the Request Builder module, described below.

The WS Registry module contains a database with information regarding the web services. The developer has access to this registry and collects from it the information needed to form a request. The information available to the developer is the following: the names of the services, their unique ids, and the parameters they use, alongside their descriptions.

The Request Builder module is the module that enables the use of generic requests and generic responses. It uses a database that defines the generic requests, the generic responses and the mappings between these and the service-specific parameters. This module has methods that receive a GenericSearchRequest and, using the specific mapping, return a SingleSearchRequest or a MultipleSearchRequest that can be easily handled by the WSInvoker module. Moreover, given a specific response retrieved from the SingleWsInvoker, they form a GenericResponse composed of generic response fields.

The RESTful Data Providers are the targets of the search queries. They are external APIs provided by web search providers (such as Yahoo! and Microsoft) and by RESTful databases. These are the data sources available on the Internet that are used to form the results.
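The generic-to-specific mapping performed by the Request Builder can be illustrated with a small in-memory sketch. It is an assumption-laden example: the text states that the mappings live in a database, whereas this sketch keeps them in a map, and the class name, the method names and the generic parameter names used in the usage note are all hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class RequestBuilderSketch {

    // service id -> (generic parameter name -> service-specific parameter name)
    private final Map<String, Map<String, String>> mappings = new HashMap<>();

    // Registers the mapping for one data source.
    void register(String serviceId, Map<String, String> genericToSpecific) {
        mappings.put(serviceId, new HashMap<>(genericToSpecific));
    }

    // Translates generic parameters into the parameter set of one service.
    Map<String, String> toSpecific(String serviceId, Map<String, String> genericParams) {
        Map<String, String> mapping = mappings.get(serviceId);
        if (mapping == null) {
            throw new IllegalArgumentException("No mapping registered for service " + serviceId);
        }
        Map<String, String> specific = new LinkedHashMap<>();
        specific.put("service-id", serviceId);
        for (Map.Entry<String, String> entry : genericParams.entrySet()) {
            String specificName = mapping.get(entry.getKey());
            if (specificName != null) {
                specific.put(specificName, entry.getValue());
            }
            // Generic parameters without a mapping are dropped for this service.
        }
        return specific;
    }
}
```

For instance, registering a mapping for service 1 that sends a hypothetical generic parameter max-results to count and offset to start would turn the generic values 10 and 0 into the count and start parameters used in the Yahoo example. With N data sources, N such registrations suffice, which mirrors the one generic model plus N mappings described earlier.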

3. Implementation Details
We use the Java platform (JDK 1.6) to build the framework. We chose this platform because it offers a complete stack of tools that integrate well and are easy to use, yet are powerful enough to model such a complex task. We use the JAX-WS specification with its Metro implementation to implement and access the SOAP module. To implement access to the REST resources on the Internet, we used the JAX-RS specification with its Jersey implementation. Of course, we could use plain HTTP communication, but this is more cumbersome, as [3] illustrates; the option we adopted is the one the author recommends. Jersey is the reference implementation of JAX-RS, which will be shipped as part of the Java EE 6 specification. To illustrate the ease of use of Jersey, let's consider the following REST web service invocation:

    import com.sun.jersey.api.client.Client;
    import com.sun.jersey.api.client.WebResource;

    // finalUrl is the complete request URL, including the query parameters
    Client client = Client.create();
    WebResource wr = client.resource(finalUrl);
    String response = wr.get(String.class);

EJB 3.0 technology is used to implement the business logic. The modules described earlier are designed as beans in order to achieve SOA goals such as separating service implementations from service interfaces through remote interfaces, modularity, reusability of code, service granularity and integration of other existing systems such as databases. EJB 3.0 also offers container management for multithreading on the server side, which makes server-side programming relatively easy. Persistence is provided by means of entity beans and annotations. The container can easily handle references to database resources that are stored on the server. The application server used is GlassFish.

4. Results
Next, we offer an example of accessing a web service. The service used is Yahoo web search, which is stored in the registry at id=1. The query string is computers. First, we build the request by using a parameter map. service-id and query-string are general parameters. Using them, we are able to identify the service we want to access and the string we want to search for. If, for instance, we want to obtain information about the available services, their parameters and expected values, we can use the SOAP endpoint. For this example we assume that we know that service-id=1 is the id assigned to Yahoo web search. start, format and count are parameters specific to Yahoo web search. Information about these is available through the SOAP endpoint or on the Yahoo site [4].

    import java.util.HashMap;
    import java.util.Map;

    Map<String, String> params = new HashMap<String, String>();
    params.put("service-id", "1");
    params.put("query-string", "computers");
    params.put("start", "0");
    params.put("format", "xml");
    params.put("count", "10");
    SingleSearchRequest request = new SingleSearchRequest(params);

After building the request, we invoke the web service through the SOAP endpoint.

    SOAPServicePort service = Utils.getSOAPServicePort();
    String result = service.submitRequest(request);

After retrieving the result, we can parse it as we wish. As seen, accessing a resource is very easy. We are able to identify the corresponding service's id and its parameters in a straightforward manner (using the SOAP endpoint). Building the parameter list and submitting the request is also a task that involves no difficulty. This process is particular to each request, since it is not a generic request: it uses specific parameters, defined by the web service invoked. Unfortunately, at this stage we are not able to provide a generic request example, since the module is in the design and implementation phase.

5. Related work
In his book [1], Mark D. Hansen uses SOA for integrating eBay, Amazon and Yahoo! Shopping. He builds a SOA application for searching and buying items from different locations. This was the starting point of our approach. Creative Commons Search (http://search.creativecommons.org/) offers a mechanism for accessing various resources on the Internet, but without trying to integrate the results. Its browser-based search portal has the option of searching Google, Yahoo, Flickr, blip.tv, OWL and SpinXpress. However, it merely offers a portal to these services by displaying their pages within its own. Leapfish (http://www.leapfish.com/) offers functionality similar to that of our framework, but also without trying to offer a unified view of the different data sources. We consider it somewhat better than Creative Commons Search, in the sense that it can search different sources (web, news, blogs) simultaneously and display the results in a uniform style.

As shown by [7], achieving a common data representation over the Internet is a current research issue. This is one of the reasons we were not able to offer a fully generic approach towards web search. Even so, we could provide a mechanism for defining and accessing generic data sources by means of specific mappings. The different formats and structures of the data provided by the data sources are also a major obstacle to finding such an approach.

6. Future developments
More web search providers should be integrated using this pattern. Besides that, refinements could be made in the way we map generic data sources to specific data sources. Further research is needed in the fields of the semantic web, ontologies and semantic web services to overcome the issues mentioned before. A more comprehensive security mechanism could also be integrated into the current framework, and the Request Builder module is yet to be implemented.

7. References
[1] Mark D. Hansen, SOA Using Java Web Services, Prentice Hall, 2007
[2] Michael Rosen, Boris Lublinsky, Kevin T. Smith, Marc J. Balcer, Applied SOA: Service-Oriented Architecture and Design, Wiley Publishing, 2009
[3] Martin Kalin, Java Web Services: Up and Running, 1st ed., O'Reilly, 2009
[4] Yahoo Web Services, http://developer.yahoo.com
[5] Microsoft Live Search, http://dev.live.com/livesearch
[6] Google REST API, http://code.google.com/apis/ajaxsearch
[7] D.F. Huynh, D.R. Karger, Adopting a Common Data Model for End-User Web Programming Tools, http://davidhuynh.net/media/papers/2009/chi2009eup.pdf, 2009
[8] Web Service, http://en.wikipedia.org/wiki/Web_service
[9] REST, http://en.wikipedia.org/wiki/Representational_State_T ransf
