UNIT
Introduction to Semantic Web, Limitations of current Web, Development of Semantic Web, Emergence of the Social Web,
Social Network analysis, Development of Social Network Analysis, Key concepts and measures in network analysis, Historical
overview of privacy and security, Major paradigms for understanding privacy and security
The Semantic Web is the application of advanced knowledge technologies to the Web and distributed systems in general.
Information that is missing or hard to access for our machines can be made accessible using ontologies. Ontologies are formal specifications, which allows a computer to emulate human ways of reasoning with knowledge.
Ontologies carry a social commitment toward using a set of concepts and relationships in an agreed way.
The Semantic Web adds another layer on the Web architecture that requires agreements to ensure interoperability.
Consider our primary interface to the vast information that constitutes the Web: the search engine. When searching, for example, for a music player with a given amount of storage, the search engine will not know that 4GB is the capacity of the music player.
The problem is that general-purpose search engines do not know anything about music players, their properties, or how to compare such properties.
An even bigger problem for our machines is collecting and aggregating product information from the Web. The information extraction methods used for this purpose face a very difficult task, and it is easy to see why if we consider how a typical product description page looks to the eyes of a computer.
Even if an algorithm can determine that the page describes a music player, information about the
product is very difficult to spot.
Further, what one vendor calls “capacity” another may call “memory”. In order to compare music players from different shops, we need to determine that these two properties are actually the same, so that we can directly compare their values.
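The alignment step described above can be sketched in a few lines. The shop data and the canonical property name below are made-up illustrations, not from the text:

```python
# A minimal sketch (with made-up shop data) of aligning vendor-specific
# attribute names onto one shared vocabulary so values become comparable.
CANONICAL = {
    "capacity": "storage",  # term used by one vendor
    "memory": "storage",    # another vendor's term for the same property
}

def normalize(product):
    """Rename known attributes to their agreed canonical names."""
    return {CANONICAL.get(key, key): value for key, value in product.items()}

shop_a = {"name": "PlayerX", "capacity": "4GB"}
shop_b = {"name": "PlayerY", "memory": "8GB"}

a, b = normalize(shop_a), normalize(shop_b)
# Both records now expose the same "storage" key and can be compared.
print(a["storage"], b["storage"])  # 4GB 8GB
```

On the Semantic Web, this shared vocabulary is exactly what an ontology provides, with the added benefit that the agreement is published and machine-readable rather than hard-coded in each application.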
Google Scholar and CiteSeer are the two most well-known examples of specialized search engines for scientific publications.
They suffer from the typical weaknesses of information extraction: for example, when searching for York Sure, the name of a Semantic Web researcher, Scholar also returns publications that were published in New York but otherwise have nothing to do with the researcher in question. The cost of such errors is very low, however: most of us simply ignore the incorrect results.
In the first case, the search is limited to the stores known by the system. The second method, on the other hand, is limited by the human effort required for maintaining product categories as well as for locating websites and implementing methods of information extraction. As a result, these comparison sites feature only a limited number of vendors, product types and attributes.
How to improve the current Web?
Increasing automatic linking among data
Increasing recall and precision in search
Increasing automation in data integration
Increasing automation in the service life cycle
Adding semantics to data and services is the solution!
The vision of extending the current human-focused Web with machine-processable descriptions of web content was first formulated in 1996 by Tim Berners-Lee, the original inventor of the Web.
The Semantic Web has been actively promoted since by the World Wide Web Consortium (W3C), the organization chiefly responsible for setting technical standards on the Web. It has also attracted significant support from research agencies on both sides of the Atlantic, reshaping much of the AI research agenda in a relatively short period of time.
It is difficult to precisely delineate the boundaries of this network. In research on the Semantic Web community, it has been taken to include researchers who have submitted publications or held an organizing role at any of the past International Semantic Web Conferences. Members of this community come mostly from academia (79%) and to a lesser degree from industry (21%). Geographically, the community covers much of the United States and Europe, with some activity in Japan and Australia.
Logic-based languages for knowledge representation and reasoning were developed in the research field of Artificial Intelligence. As the potential for connecting information sources on a Web scale emerged, the languages that had been used in the past to describe the content of the knowledge bases of stand-alone expert systems were adapted to the open, distributed environment of the Web.
Since the exchange of knowledge in standard languages is crucial for the interoperability of tools
and services on the Semantic Web, these languages have been standardized by the W3C.
Technology adoption
The Semantic Web was originally conceptualized as an extension of the current Web, i.e. as the application of metadata for describing Web content. In this vision, the content that is already on the Web would be enriched with machine-processable metadata to create the Semantic Web.
Difficulties:
The problem is that, as a technology for developers, users of the Web never experience the Semantic Web directly, which makes it difficult to convey Semantic Web technology to stakeholders. Further, most of the time the gains for developers are achieved over the long term, i.e. when data and services need to be reused and re-purposed. The Semantic Web also suffers from the fax effect.
When the first fax machines were introduced, they came with a very hefty price tag. Yet they
were almost useless. The usefulness of a fax comes from being able to communicate with other
fax users. In this sense every fax unit sold increases the value of all fax machines in use.
For the Semantic Web, at the beginning the price of technological investment is similarly high: one has to adopt the new technology, which requires an investment in learning, and the technology needs time to become more reliable.
The fax system also required a certain kind of agreement to get working on a global scale: all fax machines needed to adopt the same protocol for communicating over the telephone line. This is similar to the case of the Web, where global interoperability is guaranteed by the standard protocol for communication (HTTP).
Our machines can also help in this task to the extent that some of the meaning can be described in formal rules (e.g. if A is true, B should follow). But formal knowledge typically captures only the smaller part of the intended meaning, and thus there needs to be a common grounding in an external reality that is shared by those at the separate ends of the line.
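Such formal rules can be applied mechanically by repeatedly deriving conclusions from known facts (forward chaining). The sketch below uses illustrative rule and fact names of its own invention:

```python
# A toy forward-chaining sketch of formal rules of the form
# "if A is true, B should follow". Rule and fact names are illustrative.
rules = [
    ({"is_music_player"}, "has_storage_capacity"),
    ({"has_storage_capacity"}, "storage_is_comparable"),
]
facts = {"is_music_player"}

# Repeatedly apply every rule whose premises hold, until nothing new
# can be derived (a fixed point is reached).
changed = True
while changed:
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print(sorted(facts))
```

Note how the machine can chain two rules together to reach a conclusion neither rule states directly, which is exactly the kind of reasoning the formal part of the meaning enables; the grounding of the symbols themselves still has to come from human agreement.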
To track the adoption of Semantic Web technology on the Web, a set of temporal queries was executed using the search engine AltaVista.
Each query measured the number of documents with the given term(s) at the given point in time.
The figure below shows the number of documents with the terms basketball, Computer Science, and XML. The flat curve for the term basketball validates this strategy: the popularity of basketball is roughly stable over this time period. Computer Science takes a smaller and smaller share of the Web as the Web shifts from scientific use to everyday use. The share of XML, a popular pre-Semantic Web technology, seems to grow and then stabilize as it becomes a regular part of the toolkit of Web developers.
Fig. 2. Number of web pages with the terms basketball, Computer Science, and XML over time, as a fraction of the number of pages with the term web.
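The measurement behind the figure can be sketched as follows: a term's share is the number of pages containing it divided by the number of pages containing the term web. The counts below are made-up placeholders, not the study's actual data:

```python
# Sketch of the relative-share measurement used in the figures.
# Page counts here are invented placeholders for illustration only.
counts = {"web": 1_000_000, "basketball": 30_000, "XML": 45_000}

def share(term):
    """Fraction of 'web' pages that also contain the given term."""
    return counts[term] / counts["web"]

for term in ("basketball", "XML"):
    print(f"{term}: {share(term):.3f}")
```

Normalizing by the count for web is what makes measurements at different points in time comparable, since it cancels out the overall growth of the Web itself.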
Against this general backdrop, we can look at the share of Semantic Web-related terms and formats, in particular the terms RDF, OWL and the number of ontologies (Semantic Web Documents) in RDF or OWL. As Fig. 3 shows, most of the curves flattened out after January 2004. It is not known at this point whether the dip in the share of the Semantic Web is significant. While the use of RDF has settled at a relatively high level, OWL has yet to break out from a very low trajectory.
Fig. 3. Number of web pages with the terms RDF, OWL and the number of ontologies in RDF or OWL over time. Again, the number is relative to the number of pages with the term web.
We can also compare the share of pages mentioning Semantic Web formats with the actual number of Semantic Web documents using each format. The resulting “talking vs. doing” curve shows the phenomenon of technology hype in the cases of XML, RDF and OWL: this is the point where the technology “makes the press”, after which it becomes increasingly used on the Web.
Fig. 4. The hype cycle of Semantic Web-related technologies, as shown by the number of web pages about a given technology relative to its usage.
The five-stage hype cycle of Gartner Research is defined as follows: The first phase of a Hype
Cycle is the “technology trigger” or breakthrough, product launch or other event that generates
significant press and interest. In the next phase, a frenzy of publicity typically generates over-
enthusiasm and unrealistic expectations. There may be some successful applications
of a technology, but there are typically more failures. Technologies enter the “trough
of disillusionment” because they fail to meet expectations and quickly become
unfashionable. Although the press may have stopped covering the technology, some
businesses continue through the “slope of enlightenment” and experiment to
understand the benefits and practical application of the technology. A technology
reaches the “plateau of productivity” as the benefits of it become widely
demonstrated and accepted. The technology becomes increasingly stable and evolves
in second and third generations. The final height of the plateau varies according to
whether the technology is broadly applicable or benefits only a niche market.
As with other network technologies, hype is unavoidable for the adoption of the Semantic Web.
At present, however, the technology is not yet reaching the mainstream user and developer community of the Web.
Wider adoption could inspire even more confidence in the corporate world. This could lead to an earlier realization of the vision of the Semantic Web as a “web of data”, which could ultimately result in a resurgence of general interest in the Web.