Towards A New Generation
Towards a New Generation of Social Networks: Merging Social Web with Semantic Web
Liana Razmerita (Copenhagen Business School, Denmark, [email protected])
Rokas Firantas, Martynas Jusevičius (IT University of Copenhagen, Denmark, [email protected], [email protected])
Abstract: This article investigates several well-known social network applications, such as Facebook, LinkedIn, Last.fm, and Flickr, and identifies semantics and social data portability as the main technical issues that need to be addressed in the future. We argue that these issues can be addressed by building social networks as Semantic Web applications with FOAF, SIOC, and Linked Data technologies, and show how this can be done by implementing a prototype application using Java and core Semantic Web technologies. The prototype shows how features from semantic websites such as Freebase and DBpedia can be reused in social applications and lead to more relevant content and stronger social connections. In addition, the article discusses a number of issues that need to be addressed in the design and implementation of semantic social networks, which could serve as a guideline for the implementation of other semantic social applications.
Keywords: social networks, Semantic Web, Linked Data, semantic interoperability, ontology
Categories: H.1.2, H.5.1, H.5.2, H.3.4, H.5.3, H.5.4
1 Introduction
The Social Web, or Web 2.0, has fundamentally changed the way in which people interact, communicate and share information. Online collaboration, interaction and participation are facilitated by social networking sites (LinkedIn, Facebook) and social applications (YouTube, Flickr), where the social dimension, and in particular a user community tightly integrated into the application domain, is crucial for the success of the application. User communities are created around various domains such as music sharing, photo sharing, video sharing, social bookmarking, and professional networks. On the Social Web, users are provided with means to express their views and are encouraged to provide content. User-generated content and metadata are provided in simple forms such as tags, ratings, comments, blogs, podcasts or wikis. Social networking applications are driven exclusively by the content provided by the users, which is one of the defining characteristics of the Social Web. As a result, large amounts of data and information are gathered in walled websites. Relevant information is often poorly structured, buried in low-quality information or surrounded by highly subjective data. Most social networking applications are walled websites, and the online communities are like islands in a sea [Bojars, 08]. Despite their tremendous success, social networking websites have a number of limitations that are identified and discussed below. Lack of interoperability between
data and lack of semantics in different social network applications limit access to relevant content available on various social networking sites, and limit the integration and reuse of available data and information. This may result in growing dissatisfaction in the user community and reduced usability of the websites. This article shows, through the implementation of a prototype, that Semantic Web technologies can be used to build a next generation of social networks that overcome the limitations of current social network applications and enable features they do not currently exploit. The article emphasizes the need for semantics and for mechanisms to better structure this information and make it interoperable.
Research combining social networks and the Semantic Web is an interdisciplinary field, attracting researchers from both the social and computer sciences, and more of it is required to address the abovementioned limitations. An important line of this research focuses on the extraction of semantic data from existing social applications, its representation and its analysis. Existing work in this area explores the possibilities of extracting ontologies from user-contributed folksonomies through collaborative tagging systems and of integrating ontologies with folksonomies [Specia, 07; Xu, 06], while other approaches propose the development and evolution of lightweight ontologies in a collaborative way [Angeletou, 07; Mika, 07]. Researchers seem to agree that folksonomies and lightweight ontologies have more properties in common than differences and will be further integrated; thus, in the future, a community-based bottom-up approach might prevail over top-down controlled engineering efforts.
Much of the current research on representing simple user profiles is based on the Friend of a Friend (FOAF) project,¹ which aims to create a Web of machine-readable pages describing people, the links between them and the things they create and do. FOAF is currently an important source of RDF data available on the Web and has already been used for social network analysis [Ding, 05; Finin, 05; Paolillo, 05]. A related initiative is the Semantically-Interlinked Online Communities (SIOC) project,² which provides an ontology for describing items and relationships from Internet discussion methods (such as blogs, forums, and mailing lists) to facilitate the interconnection of these methods by publishing metadata [Breslin, 07; Breslin, 05]. Many recent papers highlight a growing interest in portability issues among social network applications; these issues are regarded as fundamental problems, and semantic technologies, mainly FOAF, are being proposed to solve them [Mika, 05]. Theoretical work combines the Semantic Web (SW) and social networks, especially for the analysis of social networks and the extraction of knowledge from existing data [Ding, 04]. However, neither the creation of new end-user semantic social applications nor their design and implementation is well explored. Existing social network applications do not employ SW technologies, although most of the standards infrastructure is already in place. Most are walled websites, which provide limited means for users and developers to control, publish, and access social data. This limits possibilities for reuse and integration, which are the driving forces
¹ http://www.foaf-project.org/
² http://www.sioc-project.org/
behind both Web 2.0 and the Semantic Web, and results in growing dissatisfaction in the user community. This article discusses some of the limitations of current social network applications and shows how Semantic Web technologies can be used to build a new generation of social networks which overcome these limitations and enable new features not currently exploited. The article is structured into six sections. Section 2 discusses a number of common features and technological limitations of current social networking applications. Section 3 identifies and describes a number of issues that need to be handled in the design and implementation of the next generation of semantic social networks. Section 4 presents a concrete usage scenario for a semantic social application. Section 5 presents a semantic social network prototype featuring semantic mashups. The last section summarizes the role of semantics in the development of a new generation of social networks that better harness and integrate the collective knowledge made available by the various social applications.
the public, which is later most often used in various mashups. Most sites publish RSS or custom XML data feeds, while others provide programmatic methods to control the site to the same extent as the graphical interface.
Social networking applications have a number of technological limitations, summarized below:
- It is not possible to export or import profile data from one application to another.
- It is not possible to export or import social relationships from one application to another.
- There is usually less data available in machine-readable formats than the application contains.
- Application Programming Interfaces (APIs) are based on a variety of custom formats and protocols, some of which are non-standard (such as Facebook Query Language (FQL), a language similar to SQL, and Facebook Markup Language (FBML), which is HTML with custom Facebook elements).
Our observations fit well with statements by initiatives such as Open Social Web,³ Social Network Portability,⁴ DataPortability,⁵ OpenID,⁶ and OpenSocial,⁷ which have emerged as a result of growing dissatisfaction in user communities. Semantic social networks will still focus on the community dimension while drawing on Semantic Web technologies to aggregate content. Some examples of semantic social networks are discussed in section 2.1.
2.1 Examples of Semantic Applications
Freebase is an open database of the world's information. It acquires structured data spanning different domains such as music, people and locations from various sources such as Wikipedia and MusicBrainz.⁸ The data is aggregated, with identical or related concepts linked together. In addition, users in the community can add, edit, and even upload data. Topics in Freebase are organized by types, which are grouped into domains. An important feature is that users can not only fill predefined types with instance data or edit it, but can also create their own types and define their properties, i.e.
they can create new schemas and extend Freebase's domain model using the same interface. Furthermore, it provides an open but proprietary API for its data and encourages its use in applications and mashups. DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web via a semantic representation. It provides an RDF dataset extracted from Wikipedia, which contains mostly free text but also structured information such as categories, lists, infoboxes, links to external
pages, etc. DBpedia makes it possible to perform complex queries (such as "German musicians who were born in Berlin") over a SPARQL query interface. DBpedia is a prime example of Linked Data publishing and can be browsed using semantic browsers. It is interlinked with other semantic datasets such as Geonames,⁹ MusicBrainz, etc.
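To make the idea of a structured query over DBpedia more concrete, the following Python sketch writes down a SPARQL query for musicians born in Berlin and encodes it as an HTTP GET request URL. The class and property names (dbo:MusicalArtist, dbo:birthPlace, dbr:Berlin) are assumptions about the DBpedia vocabulary and may differ between releases; the query also omits the nationality constraint, so it only approximates the example in the text. The sketch builds the URL but does not send the request.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Public DBpedia SPARQL endpoint.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"

# Assumption: dbo:MusicalArtist, dbo:birthPlace and dbr:Berlin are illustrative
# vocabulary choices; the exact terms depend on the DBpedia release in use.
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?musician ?name WHERE {
  ?musician a dbo:MusicalArtist ;
            dbo:birthPlace dbr:Berlin ;
            rdfs:label ?name .
  FILTER (lang(?name) = "en")
}
LIMIT 20
"""


def build_request_url(endpoint, query):
    """Encode a SPARQL query in the 'query' parameter of a GET URL."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_request_url(DBPEDIA_ENDPOINT, QUERY)

# Round trip: decoding the URL gives back the original query text.
decoded_query = parse_qs(urlsplit(url).query)["query"][0]
```

A client would fetch this URL with an Accept header naming the desired result format, for example the SPARQL XML results format.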
and navigate through different data sources through RDF links, just as conventional Web browsers navigate through HTML links. However, this kind of presentation is probably too advanced for mainstream Web users (see Figure 1).
Figure 1: Tabulator view
It can be assumed that a Semantic Web application interface visualizes its domain ontology in such a way that each class and instance has its own page, linked to others through class-instance and instance-instance relationships. This generic approach is used in many semantic websites, and is probably best illustrated by Freebase. Another approach, which we call specific, is used by conventional Web applications, as well as social networks. Every type of information (such as a car, a user, or an event) has its own specific user interface. For each new type a new interface has to be created; the same interface cannot be used for different types, and interfaces have to be fixed when the schema changes. This approach is not feasible on the Semantic Web, where ontologies are meant to be extended, reused, and integrated from different sources. If social networks are to become extensible semantic applications, it is likely that they will have to adopt the generic approach.
3.3 Domain model
Social network applications (Last.fm, Flickr, LinkedIn, etc.) are usually developed for different application domains such as music, photos, and business. However, they share a common property: the domains are fixed and non-extensible. Users are encouraged to contribute and improve application data, but this is restricted to instance data for predefined types. Semantic applications such as Freebase take a different approach and allow users to edit the domain model itself: not only to fill in instance data, but to extend and edit types, add new types, and define properties in the underlying ontology. Following this approach, social network applications would empower users to express their identities by creating or reusing concepts and relationships relevant to them, and to share them with others.
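The combination of a generic interface and a user-extensible domain model can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: triples are plain tuples of prefixed names held in memory, and every identifier (TripleStore, render_page, the ex: resources) is hypothetical rather than part of the prototype. The point is that the same rendering function works unchanged for a type that a user adds at run time.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"


class TripleStore:
    """A tiny in-memory set of (subject, predicate, object) triples."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def properties_of(self, subject):
        """Group all values of a subject by predicate, in sorted order."""
        grouped = defaultdict(list)
        for s, p, o in sorted(self.triples):
            if s == subject:
                grouped[p].append(o)
        return dict(grouped)


def render_page(store, subject):
    """Generic view: one page per resource, with no type-specific template."""
    lines = [f"== {subject} =="]
    for predicate, values in store.properties_of(subject).items():
        lines.append(f"{predicate}: {', '.join(values)}")
    return "\n".join(lines)


store = TripleStore()

# Predefined part of the (hypothetical) domain model.
store.add("ex:Museum", RDF_TYPE, "owl:Class")
store.add("ex:city_museum", RDF_TYPE, "ex:Museum")
store.add("ex:city_museum", "rdfs:label", "City Museum")

# A user extends the domain model at run time with a new type and property.
store.add("ex:Festival", RDF_TYPE, "owl:Class")
store.add("ex:summer_festival", RDF_TYPE, "ex:Festival")
store.add("ex:summer_festival", "ex:headliner", "ex:some_band")

# No new interface code is needed to display an instance of the new type.
page = render_page(store, "ex:summer_festival")
```

A specific interface would instead need a new template for ex:Festival before any festival could be displayed.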
The domain model could be left to the community to control and develop in whichever direction currently interests it most, keeping it relevant over time. People would connect
through things they have in common, achieving object-centered sociality [Breslin, 07]. In the future, this may be achieved by integrating lightweight ontology development into the means of user collaboration and content contribution. To implement this approach, applications are usually modelled in RDF/OWL, which offers more expressivity than object-oriented and relational models. This data model is based on formal semantics and can therefore be interpreted unambiguously by different agents. Furthermore, such applications need to reuse the FOAF and SIOC ontologies, which are currently the state-of-the-art representations of social networks on the Semantic Web, as well as other relevant ontologies. Most current SW applications are also static and fixed in the sense that ontologies are known and mapped manually at design time [Razmerita, 03]. Although semantic technologies are designed with extensibility and openness in mind, current programming languages and tools are not able to fully exploit these qualities. It is expected that future semantic applications will use multiple ontologies, discovering and integrating them on request.
3.4 Publishing and reusing data and metadata
Large amounts of meaningfully interlinked RDF data available on the Web are crucial for achieving the Semantic Web vision. However, many social networks do not offer interfaces and APIs to access application data. Others make the contents of the website (such as lists of users, songs, or pictures) available via a simple read-only REST interface in a software-processable data format, usually a custom schema of XML, Atom, or RSS. Some provide full APIs with add/update methods, invoked via various interfaces such as REST, XML-RPC, SOAP, Atom, or OpenSocial. The variety of publishing formats (especially non-standard ones) makes reuse difficult. We advocate that semantic social networks should publish their data in RDF, which was designed specifically for distributed knowledge representation.
Furthermore, all resources in social networks (including non-information, real-world resources) should be given URIs, distinguished from the URIs of the representations that describe them, and published as Linked Data.¹⁵ APIs should be replaced by SPARQL endpoints, which would allow structured queries to be run remotely against application data. Semantic data representation and advanced interfaces would help to overcome the portability issues of proprietary APIs, interconnect social networks with different data sources, enable the use of semantic browsers, and facilitate semantic mashups.
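As an illustration of publishing profile data in RDF, the following Python sketch serializes a minimal profile to N-Triples using FOAF terms (foaf:Person, foaf:name, foaf:homepage, foaf:knows) and reads the foaf:knows links back in, which is the export/import step that current applications lack. The function names and the example.org URIs are hypothetical, and the parser handles only the URI-to-URI triples this sketch emits; a real application would use an RDF library. Note that each person is identified by a URI ending in #me, distinct from the URI of the document that describes them.

```python
import re

FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Matches only triples whose subject, predicate and object are all URIs.
URI_TRIPLE = re.compile(r"^<([^>]*)> <([^>]*)> <([^>]*)> \.$")


def escape_literal(text):
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def profile_to_ntriples(person_uri, name, homepage=None, knows=()):
    """Export a minimal FOAF profile as N-Triples text."""
    lines = [
        f"<{person_uri}> <{RDF_TYPE}> <{FOAF}Person> .",
        f'<{person_uri}> <{FOAF}name> "{escape_literal(name)}" .',
    ]
    if homepage:
        lines.append(f"<{person_uri}> <{FOAF}homepage> <{homepage}> .")
    for friend_uri in knows:
        lines.append(f"<{person_uri}> <{FOAF}knows> <{friend_uri}> .")
    return "\n".join(lines)


def friends_from_ntriples(text, person_uri):
    """Import side: collect the foaf:knows targets of one person."""
    friends = []
    for line in text.splitlines():
        match = URI_TRIPLE.match(line.strip())
        if match and match.group(1) == person_uri and match.group(2) == FOAF + "knows":
            friends.append(match.group(3))
    return friends


# Hypothetical URIs: the person (#me) is distinct from the profile document.
sarah = "http://example.org/people/sarah#me"
tom = "http://example.org/people/tom#me"

exported = profile_to_ntriples(sarah, "Sarah", knows=[tom])
imported_friends = friends_from_ntriples(exported, sarah)
```

Because the exported text uses only shared vocabulary and global URIs, another application could import it without knowing anything about the exporter's internal schema.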
¹⁵ http://linkeddata.org/
what events are there, and where and when they take place. Most of the information is online, but it is spread across many different websites and is also incomplete. In a club-goers' portal she usually finds a list of parties for the next week. However, it is compiled by editors from the official announcements received from clubs, and consequently many smaller venues and underground events are left out. For an overview of current exhibitions and opening hours, she has to visit the homepages of several museums, check another portal to see the movie timetable, and finally put all the dates and times on one list to see how they match her schedule. Sarah often likes to try something new, so even when she finds an event of interest, she may have no previous knowledge of the venue it takes place in. She therefore has to look up the address on yet another website, a mapping service, to find her way there. Now, imagine she comes across a website where users themselves are allowed and encouraged to contribute content about events and places. Places are listed by type (such as Club or Museum) and location, which can be shown on the map. Events are also shown by type (such as Party or Exhibition), and are related to place and time. Sarah can immediately see a list of events of any type, at any place, or on any date, as well as a list of places at a certain location. Moreover, users are able to indicate which events they are attending and which places they usually prefer; on this basis, lists of the most popular events and places are built, which might help her to make her decision. She might subscribe to a feed of upcoming events and follow it in her RSS reader. If Sarah wants to contribute to the site or make use of the social features, she has to sign up and create her profile with some personal information. After that, she can add friends as well as favourite events and places to her profile.
From an analysis of events and places she and her friends like, the application might suggest some new possibilities. If Sarah knows an event will be taking place but is not listed on the website, she can simply fill in a short description form, set the place for it, and save it. Since the website is built with semantic technologies, Sarah is also able to import friend profiles from other social networks that publish them in a semantic format. She is also able to run structured queries against the data in the website and put a list of events she will be attending in a mashup on her homepage.
endpoint, retrieves its description and homepage address in real time and presents it to the user in the same fashion as local properties and values. The prototype is implemented on a RESTful Web framework which treats HTTP resources as first-class objects and follows W3C standards. On top of the data layer, it uses the Model View Controller (MVC) pattern, which is often used for the logical separation of the application domain and presentation layers [Krasner, 88].
Figure 2: Browsing the ontology in a Tabulator view
5.1 Model View Controller (MVC) design pattern
Model View Controller is a design pattern used in software engineering. In complex computer applications that present lots of data to the user, one often wishes to separate data (model) and user interface (view) concerns, so that changes to the user interface do not impact the data handling and the data can be reorganized without changing the user interface. The MVC design pattern solves this problem by decoupling data access and business logic from data presentation and user interaction, by introducing an intermediate component: the controller. The controller translates interactions with the view into actions to be performed by the model. In a stand-alone GUI client, user interactions could be button clicks or menu selections, whereas in a web application they appear as HTTP GET and POST requests. The actions performed by the model include activating business processes or changing the state of the model. Based on the user interactions and the outcome of the model actions, the controller responds by selecting an appropriate view.
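The division of responsibilities described above can be sketched as follows. This is a generic, minimal Python illustration and not the prototype's code: the class and function names are hypothetical, requests are passed in as plain arguments rather than through a Web framework, and the views do no HTML escaping. The controller maps HTTP GET and POST requests onto model actions and then selects a view based on the outcome.

```python
class EventModel:
    """Model: owns the data and the operations that change it."""

    def __init__(self):
        self._events = []

    def add_event(self, title, place):
        event = {"id": len(self._events) + 1, "title": title, "place": place}
        self._events.append(event)
        return event

    def list_events(self):
        return list(self._events)


# Views: turn model data into a representation (no escaping in this sketch).
def list_view(events):
    items = "".join(f"<li>{e['title']} at {e['place']}</li>" for e in events)
    return f"<ul>{items}</ul>"


def created_view(event):
    return f"<p>Created event {event['id']}: {event['title']}</p>"


def error_view(message):
    return f"<p>Error: {message}</p>"


class EventController:
    """Controller: translates requests into model actions and picks a view."""

    def __init__(self, model):
        self.model = model

    def handle(self, method, path, form=None):
        if path != "/events":
            return 404, error_view("unknown resource")
        if method == "GET":
            return 200, list_view(self.model.list_events())
        if method == "POST":
            form = form or {}
            if not form.get("title") or not form.get("place"):
                return 400, error_view("title and place are required")
            event = self.model.add_event(form["title"], form["place"])
            return 201, created_view(event)
        return 405, error_view("method not allowed")


controller = EventController(EventModel())
status_created, body_created = controller.handle(
    "POST", "/events", {"title": "Opening party", "place": "City Museum"}
)
status_list, body_list = controller.handle("GET", "/events")
```

Because the views only receive plain data, the model could be replaced by an ontology-backed store without changing the view functions.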
Figure 3: Domain ontology graphical view
Within the prototype, the Model is the ontology layer, the Java code is the Controller, and Views are generated by combining SPARQL query results and transforming them into XHTML using XSLT. The application domain is modelled as an RDF/OWL ontology stored in an RDF triple store, accessed using Jena,¹⁶ and queried using SPARQL. The domain ontology extends classes such as foaf:Person and adds a number of new classes, such as Place and Event, as represented in Figure 3. FOAF and SIOC classes and properties are reused. Views become representations of REST resources (XHTML, RDF). They join several SPARQL XML results and transform them directly to output XHTML using XSLT, or serve raw RDF/XML for the Linked Data interface, depending on the HTTP Accept header. The Controller dispatches requests to resources, which have explicit URIs, implement HTTP methods, can be related to domain instances using foaf:topic, and return view representations. Most current object-oriented languages are statically typed and do not allow classes to be changed or extended at run time. Thus, with the existing tools, it is not easy to map an Event class in OWL to an Event class in Java so that it can be changed or extended at run time. Tools such as RDFReactor¹⁷ and Elmo¹⁸ generate object-oriented Java code from ontologies, but this code is static and not extensible at run time, and therefore it was not used.
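The view step can be illustrated with a short Python sketch. It is not the prototype's implementation: the prototype uses Java and XSLT, whereas this sketch swaps in Python's ElementTree to turn a document in the SPARQL Query Results XML Format into an XHTML list, and uses a deliberately rough Accept-header check that ignores q-values. The variable names event and label, the sample result, and the example.org URI are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C SPARQL Query Results XML Format.
SPARQL_NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Hypothetical result of a query selecting ?event and ?label.
SAMPLE_RESULTS = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="event"/><variable name="label"/></head>
  <results>
    <result>
      <binding name="event"><uri>http://example.org/events/1</uri></binding>
      <binding name="label"><literal>Opening party</literal></binding>
    </result>
  </results>
</sparql>"""


def results_to_xhtml(results_xml):
    """Render each result row as a linked list item."""
    root = ET.fromstring(results_xml)
    ul = ET.Element("ul")
    for result in root.findall("sr:results/sr:result", SPARQL_NS):
        uri = result.findtext("sr:binding[@name='event']/sr:uri", namespaces=SPARQL_NS)
        label = result.findtext("sr:binding[@name='label']/sr:literal", namespaces=SPARQL_NS)
        link = ET.SubElement(ET.SubElement(ul, "li"), "a", href=uri)
        link.text = label
    return ET.tostring(ul, encoding="unicode")


def choose_representation(accept_header):
    """Rough content negotiation: RDF/XML only if explicitly requested."""
    requested = [part.split(";")[0].strip() for part in accept_header.split(",")]
    if "application/rdf+xml" in requested:
        return "application/rdf+xml"
    return "application/xhtml+xml"


xhtml = results_to_xhtml(SAMPLE_RESULTS)
```

A semantic browser asking for application/rdf+xml would receive the raw RDF description of the same resource, while an ordinary browser would receive the XHTML view.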
discussed in section 2. Based on this analysis, in section 3, the authors propose four main areas (creation of metadata, user interface design, open domain models, publishing and reusing data and metadata) that require further investigations and developments in order to implement semantic social networks. Semantics is the key for overcoming current limitations of social applications. Furthermore, social data portability issues are leading to dissatisfaction in both user and developer communities. This is caused by limited amounts of social data published openly and lack of tools to import them as well as formats and APIs of limited interoperability. Social networks would benefit from Semantic Web technologies. FOAF, SIOC, and Linked Data can solve portability issues and enable data reuse. The prototype described in section 5 shows how features from semantic websites such as Freebase and DBpedia can be reused in social applications and lead to more relevant content and stronger social connections. New generations of social applications may take advantage of the advanced data model that SW technologies provide. Semantic data representations and advanced interfaces would help to overcome portability issues of proprietary APIs, interconnect social networks with different data sources, enable the use of semantic browsers and facilitate semantic mashups. Domain models could be collaboratively developed by users of the application. This approach requires a new generic user interface based on classes, instances, and properties. This could lead to more up-to-date and relevant content, which in turn would facilitate social connections through points of common interest. Another interesting approach that could be further explored in the future is the use of an AJAX-enabled application interface, a form-based interface for SPARQL and dynamic, run-time object-ontology mapping tools. 
Acknowledgment
A short version of this article was presented at the Adaptation and Personalisation for Web 2.0 conference, associated with the User Modeling, Adaptation and Personalisation 2009 conference.
References
[Andersen, 05] Andersen, B.: (2005). Meta Tags: The Poor Man's RDF? Retrieved May 2008, from http://weblog.scifihifi.com/2005/08/05/meta-tags-the-poor-mans-rdf/
[Angeletou, 07] Angeletou, S., Sabou, M., Specia, L., & Motta, E.: (2007). Bridging the Gap Between Folksonomies and the Semantic Web: An Experience Report. Paper presented at the ESWC 2007, Workshop on Bridging the Gap between Semantic Web and Web 2.0.
[Ankolekar, 08] Ankolekar, A., Krötzsch, M., Tran, T., & Vrandecic, D.: The two cultures: Mashing up Web 2.0 and the Semantic Web. Web Semantics: Science, Services and Agents on the World Wide Web, 6(1), 70-75, 2008.
[Berners-Lee, 06] Berners-Lee, T., Hall, W., Hendler, J. A., O'Hara, K., Shadbolt, N., & Weitzner, D. J.: (2006). A Framework for Web Science. Foundations and Trends in Web Science, 1(1), 1-130.
[Bojars, 08] Bojars, U., Breslin, J. G., Peristeras, V., Tummarello, G., & Decker, S.: (2008). Interlinking the Social Web with Semantics. IEEE Intelligent Systems, 23(3), 29-40.
[Breslin, 07] Breslin, J., & Decker, S.: The Future of Social Networks on the Internet: The Need for Semantics. IEEE Internet Computing magazine, 11(6), 86-90, 2007.
[Breslin, 05] Breslin, J. G., Harth, A., Bojars, U., & Decker, S.: (2005). Towards Semantically-Interlinked Online Communities.
[Ding, 04] Ding, L., Finin, T., & Joshi, A.: Analyzing Social Networks on the Semantic Web? IEEE Intelligent Systems (Trends & Controversies), 8(6), 2004.
[Ding, 05] Ding, L., Zhou, L., Finin, T., & Joshi, A.: (2005). How the Semantic Web is Being Used: An Analysis of FOAF Documents.
[Finin, 05] Finin, T., Ding, L., Zhou, L., & Joshi, A.: Social networking on the semantic web. The Learning Organization, 12(5), 418-435, 2005.
[Krasner, 88] Krasner, G. E., & Pope, S. T.: (1988). A cookbook for using the model-view controller user interface paradigm in Smalltalk-80. Journal of Object-Oriented Programming, SIGS publication, 1(3), 26-49.
[Mika, 05] Mika, P.: Flink: Semantic Web technology for the extraction and analysis of social networks. Web Semantics: Science, Services and Agents on the World Wide Web, 3(2-3), 211-223, 2005.
[Mika, 07] Mika, P.: Ontologies are us: A unified model of social networks and semantics. Journal of Web Semantics, 5, 5-15, 2007.
[Paolillo, 05] Paolillo, J. C., Mercure, S., & Wright, E.: (2005). The social semantics of LiveJournal FOAF: Structure and change from 2004 to 2005.
[Razmerita, 03] Razmerita, L., Angehrn, A., & Maedche, A.: (2003). Ontology-based User Modeling for Knowledge Management Systems. Paper presented at the UM Pittsburgh, USA.
[Specia, 07] Specia, L., & Motta, E.: (2007). Integrating Folksonomies with the Semantic Web. Paper presented at the Lecture Notes In Computer Science.
[Xu, 06] Xu, Z., Fu, Y., Mao, J., & Su, D.: (2006). Towards the semantic web: Collaborative tag suggestions. Paper presented at the Collaborative Web Tagging Workshop at WWW 2006, Edinburgh, UK.