Semantic Web
UNIT - I
Introduction: Introduction to Semantic Web, the Business Case for the Semantic Web, XML,
and Its Impact on the Enterprise.
“The first step is putting data on the Web in a form that machines can naturally understand, or
converting it to that form. This creates a Semantic Web—a web of data that can be processed
directly or indirectly by machines.”
The Semantic Web is a logical extension of the current Web rather than a distant possibility.
The Semantic Web is both achievable and desirable.
According to the vision of Tim Berners-Lee we can define the Semantic Web as a machine
processable web of smart data.
We can also further define smart data as data that is application-independent, composable,
classified, and part of a larger information ecosystem (ontology).
The World Wide Web Consortium (W3C) has established an Activity dedicated to
implementing the vision of the Semantic Web. The following diagram shows an example of
the Semantic Web with a Web of documents and a Web of data.
The World Wide Web, commonly known as the web, has become an integral part of modern
society. It has revolutionized the way we communicate, work, and access information. However,
the web we know today has evolved significantly since its inception in 1989. In this section we are
going to see an overview of the evolution of the web, from its early days to the current era of web
4.0.
In the first-generation web (Web 1.0), we can read and share information on web pages.
It is based on bookmarking and hyperlinking, and its pages are static.
The web as we know it today began in 1989 when Tim Berners-Lee, a British computer
scientist, proposed a new system for sharing information over the internet.
This system, which he called the World Wide Web, was based on a simple set of protocols
that allowed users to access and share information using hypertext links. The first website,
which was created by Berners-Lee, went live in 1991.
In the early days of the web, websites were primarily static pages that provided information
to users.
These pages were created using HTML (Hypertext Markup Language), a markup
language that allowed developers to structure web pages using tags. Websites were
primarily created by developers and were often difficult for users to navigate.
In the second-generation web, we can read, write, and interact with each other. In Web
2.0, dynamic pages and user-generated content replace static pages.
The next stage in the evolution of the web was the emergence of Web 2.0 in the early
2000s. This era was characterized by the emergence of user-generated content and social
media platforms.
The term Web 2.0 was popularized by Tim O’Reilly, a technology entrepreneur and writer, to
describe a new generation of web-based applications that were more interactive and
dynamic.
Web 2.0 was characterized by the emergence of social media platforms such
as Facebook, Twitter, and LinkedIn. These platforms allowed users to create and share
content with each other, and they became a central part of everyday life for many people.
The emergence of Web 2.0 also saw the rise of e-commerce, with companies such as
Amazon and eBay becoming dominant players in the online retail space.
Web 3.0: The Semantic Web
In the third-generation web, machines, and not only humans, can interpret information.
Web 3.0 is also known as the Semantic Web.
The next stage in the evolution of the web is Web 3.0, also known as the Semantic Web.
The Semantic Web is characterized by a shift from the current web, which is focused on
content, to a web that is focused on meaning.
The Semantic Web aims to create a more intelligent web that can understand and interpret
the meaning of information, making it easier for users to find what they are looking for.
The Semantic Web is based on a set of technologies and standards, including RDF
(Resource Description Framework) and OWL (Web Ontology Language), which
allow data to be represented in a machine-readable format. This allows machines to
understand the meaning of information and to make intelligent decisions based on that
information.
Web 4.0: The Intelligent Web
The fourth-generation web is also known as the Symbiotic Web. In it, humans and
machines interact with each other.
Web 4.0, also known as the Intelligent Web, is the next stage in the evolution of the web.
This era is characterized by the emergence of artificial intelligence (AI) and machine
learning (ML) technologies, which are being used to create more intelligent and
personalized web experiences for users.
The Intelligent Web is based on a combination of AI and ML technologies, including
natural language processing (NLP), image recognition, and predictive analytics. These
technologies allow websites to provide personalized recommendations, automate tasks,
and interact with users in a more human-like way.
The Intelligent Web is already being used in a variety of industries, including healthcare,
finance, and e-commerce.
For example, healthcare companies are using AI and ML to analyse patient data and to
develop personalized treatment plans.
E-commerce companies are using these technologies to provide personalized
recommendations to users based on their browsing and purchasing history.
The first step is a paradigm shift in the way we think about data. Historically, data has been
locked away in proprietary applications.
Data was seen as secondary to processing the data. This incorrect attitude gave rise to
the expression “garbage in, garbage out,” or GIGO. GIGO basically reveals the flaw in the
original argument by establishing the dependency between processing and data.
Figure 1.2 displays the progression of data along a continuum of increasing intelligence.
Figure 1.2 shows four stages of the smart data continuum.
The Semantic Web is not just for the World Wide Web.
It represents a set of technologies that will work equally well on internal corporate
intranets.
The Semantic Web will resolve several key problems facing current information
technology architectures.
Information Overload
Information overload is the most obvious problem in need of a solution, and technology
experts have been warning us about it for 50 years. This condition results from having a rapid
rate of growth in the amount of information available, while days remain 24 hours long and
our brains remain in roughly the same state of development as they were when cavemen
communicated by scrawling messages in stone.
Stovepipe Systems
A stovepipe system is a system where all the components are hardwired to only work
together. Therefore, information only flows in the stovepipe and cannot be shared by other
systems or organizations that need it.
For example, the client can only communicate with specific middleware that only
understands a single database with a fixed schema.
Poor Content Aggregation
Putting together information from disparate sources is a recurring problem in a number
of areas, such as financial account aggregation, portal aggregation, comparison shopping, and
content mining.
Unfortunately, the most common technique for these activities is screen scraping.
The main drawback of screen scraping is that it scrapes content written in
HTML, which describes the format (type size, paragraph spacing, etc.) but gives no clue
about the meaning of a document.
XML is the syntactic foundation layer of the Semantic Web. All other technologies
providing features for the Semantic Web will be built on top of XML.
Requiring other Semantic Web technologies (like the Resource Description
Framework) to be layered on top of XML guarantees a base level of interoperability.
The technologies that XML is built upon are Unicode characters and Uniform
Resource Identifiers (URIs).
The Unicode characters allow XML to be authored using international characters.
URIs are used as unique identifiers for concepts in the Semantic Web.
XML alone is not enough for the Semantic Web, because XML only provides syntactic
interoperability. In other words, sharing an XML document adds meaning to the content
only when both parties know and understand the element names.
For example, if I label something <price> $12.00 </price> and you label that field on
the invoice <cost> $12.00 </cost>, there is no way that a machine will know those two mean
the same thing unless Semantic Web technologies like RDF and ontologies are added.
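The price/cost mismatch can be sketched in a few lines of Python. This is only an illustration: the two documents and the `SAME_AS` dictionary are hypothetical, and the dictionary is a stand-in for the kind of equivalence statement an ontology would supply, not an actual RDF or OWL mechanism.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents that describe the same amount
# with different element names.
seller_doc = "<item><price>12.00</price></item>"
buyer_doc = "<item><cost>12.00</cost></item>"

def amounts(xml_text, element_name):
    """Return the text of every element called element_name."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter(element_name)]

# A purely syntactic lookup finds nothing in the buyer's document.
syntactic_hits = amounts(buyer_doc, "price")

# Hypothetical vocabulary mapping, standing in for an ontology
# statement such as "cost means the same as price".
SAME_AS = {"cost": "price"}

def amounts_with_mapping(xml_text, concept):
    """Collect values from elements whose name maps to the given concept."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter()
            if SAME_AS.get(el.tag, el.tag) == concept]

semantic_hits = amounts_with_mapping(buyer_doc, "price")
print(syntactic_hits, semantic_hits)
```

The parser handles both documents without complaint, which is the syntactic interoperability XML provides; only the extra mapping lets the program treat the two names as one concept.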
Web services are software services identified by a URI that are described, discovered, and
accessed using Web protocols.
The important point about Web services is that they consume and produce XML. Thus, the
first way that Web services fit into the Semantic Web is by furthering the adoption of XML,
or more smart data.
As Web services proliferate, they become similar to Web pages in that they are more
difficult to discover.
Semantic Web technologies will be necessary to solve the Web service discovery problem.
There are several research efforts under way to create Semantic Web-enabled Web
services. Figure 1.3 demonstrates the various convergences that combine to form Semantic
Web services.
The third way that Web services fit into the Semantic Web is in enabling Web services
to interact with other Web services.
It is important to state that the concepts described so far (classes, subclasses, properties)
are not rigorous enough for inference. To each of these basic concepts, additional formalisms
are added. For example, a property can be further specialized as a symmetric property or a
transitive property. Here are the rules that define those formalisms:
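As a rough illustration of what the symmetric and transitive formalisms mean, the following Python sketch treats a property as a set of (subject, object) pairs and derives the pairs each formalism implies. The function names and the sample facts are invented for this example and are not taken from any Semantic Web library.

```python
def symmetric_closure(pairs):
    """If (a, b) holds for a symmetric property, (b, a) holds too."""
    return set(pairs) | {(b, a) for a, b in pairs}

def transitive_closure(pairs):
    """If (a, b) and (b, c) hold for a transitive property, so does (a, c)."""
    closure = set(pairs)
    while True:
        derived = {(a, d) for a, b in closure for c, d in closure if b == c}
        if derived <= closure:      # nothing new can be derived
            return closure
        closure |= derived

# Hypothetical facts, invented for illustration.
sibling_of = {("Ann", "Bob")}                    # a symmetric property
ancestor_of = {("Eve", "Ann"), ("Ann", "Cal")}   # a transitive property

siblings = symmetric_closure(sibling_of)
ancestors = transitive_closure(ancestor_of)
print(siblings, ancestors)
```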
1.9.4 Rules.
With XML, RDF, and inference rules, the Web can be transformed from a collection of
documents into a knowledge base.
An inference rule allows us to derive conclusions from a set of premises. A well-known
logic rule called “modus ponens” states the following:
If P is TRUE, then Q is TRUE. P is TRUE.
Therefore, Q is TRUE.
An example of modus ponens is as follows:
An apple is tasty if it is not cooked. This apple is not cooked. Therefore, it is tasty.
The Semantic Web can use information in an ontology with logic rules to infer new
information. Let’s look at a common genealogical example of how to infer the “uncle” relation
as depicted in Figure 1.7:
If a person C is a male and “childOf” a person A, then person C is a “sonOf” person A.
If a person B is a male and siblingOf a person A, then person B is a “brotherOf” person
A.
If a person C is a “sonOf” person A, and person B is a “brotherOf” person A, then person
B is the “uncleOf” person C.
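The three genealogical rules above can be sketched directly as set comprehensions in Python. The facts (Ann, Bob, Cal) are hypothetical; a real Semantic Web system would express the facts in RDF and the rules in a rule language, so this is only a minimal model of the inference.

```python
# Hypothetical facts; the names are invented for illustration.
male = {"Bob", "Cal"}
child_of = {("Cal", "Ann")}      # Cal is a child of Ann
sibling_of = {("Bob", "Ann")}    # Bob is a sibling of Ann

# Rule 1: male(C) and childOf(C, A)  =>  sonOf(C, A)
son_of = {(c, a) for c, a in child_of if c in male}

# Rule 2: male(B) and siblingOf(B, A)  =>  brotherOf(B, A)
brother_of = {(b, a) for b, a in sibling_of if b in male}

# Rule 3: sonOf(C, A) and brotherOf(B, A)  =>  uncleOf(B, C)
uncle_of = {(b, c)
            for c, a in son_of
            for b, a2 in brother_of
            if a == a2}

print(uncle_of)
```

None of the input facts mentions an uncle; the "uncleOf" pair is new information derived purely from the rules.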
1.9.5 Trust.
The five directions of the Semantic Web (logical assertions, classification, formal class
models, rules, and trust) will move corporate intranets and the Web into a semantically rich
knowledge base where smart software agents and Web services can process information and
achieve complex tasks.
We may have projects that could share lessons learned, provide competitive intelligence
information, and save us a lot of time and work.
If we had a corporate knowledge base that could be searched and analysed by software
agents, we could have Web based applications that save us a lot of time and money.
Information sharing and communication are paramount in any organization, but as most
organizations grow and collect more information, this is a major struggle.
We all understand the importance of not reinventing the wheel, but how many times
have we unintentionally duplicated efforts? When organizations get larger, communication
gaps are inevitable.
With a little bit of effort, a corporate knowledge base could at least include a registry of
descriptions of projects and what each team is building.
Imagine how easy it would be for the employees to be able to find relevant information.
Using Semantic Web enabled Web services can allow us to create such a registry.
Administration and Automation
A side effect of having such a knowledge base is the ability of software programs to
automate administrative tasks.
Booking travel is an example of a painful task that the Semantic Web and Web services
could make easy.
Making travel arrangements can be an administrative nightmare. Everyone has personal
travel preferences and must take items such as the following into consideration:
Transportation preference (car, train, bus, plane)
Hotel preference and rewards associated with hotel
Airline preference and frequent flyer miles
Hotel proximity to meeting places
Hotel room preferences (nonsmoking, king, bar, wireless network in lobby)
Rental car options and associated rewards
Price (lodging and transportation per diem rates for the company)
Part-III XML
3. Introduction XML:
3.1 What Is XML?
XML has become the universal syntax and framework for exchanging data between
organizations. By agreeing on a standard schema, organizations can produce text
documents that can be validated, transmitted, and parsed by any application regardless of
hardware or operating system.
XML provides a universally accepted language for creating semantically rich new markup
languages in a particular domain.
In other words, we can apply XML to create new markup languages.
Any language created via the rules of XML, like the Mathematical Markup Language
(MathML) or the Chemical Markup Language (CML), is called an application of XML.
A markup language’s primary concern is how to add semantic information about the
raw content in a document; thus, the vocabulary of a markup language is the external “marks”
to be attached or embedded in a document.
3.4 What are the Characteristics of XML?
There are three important characteristics of XML that make it useful in a variety of systems
and solutions:
XML is extensible: XML allows you to create your own self-descriptive tags, or
language, that suit your application.
XML carries the data, does not present it: XML allows you to store the data
irrespective of how it will be presented.
XML tags are not predefined like HTML tags
XML documents form a tree structure that starts at "the root" and branches to "the
leaves".
<?xml version="1.0"?>
<contact_info>
<name>Rajesh</name>
<company>TCS</company>
<phone>9333332354</phone>
</contact_info>
You can notice there are two kinds of information in the above example:
The following diagram depicts the syntax rules to write different types of markups and text in
an XML document.
Let us see each component of the above diagram in detail:
3.8 XML Declaration
The XML document can optionally have an XML declaration. It is written as below:
<?xml version="1.0" encoding="UTF-8"?>
Where version is the XML version and encoding specifies the character encoding used in
the document.
The XML declaration is case sensitive and must begin with "<?xml", where
"xml" is written in lower case.
If present, the XML declaration must be the first statement in the XML document.
The HTTP protocol can override the encoding value that you put in the XML
declaration.
3.9 Tags and Elements
Element Syntax: Each XML element must be closed, either with a matching end tag or, for an
empty element, with a single self-closing tag, as shown below:
<element>...</element>
<element/>
An XML element can contain multiple XML elements as its children, but the child
elements must not overlap; that is, the end tag of an element must have the same name as
the most recent unmatched start tag.
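The following example shows incorrectly nested tags:
<contact_info>
<company>TCS
</contact_info>
</company>
The following example shows correctly nested tags:
<?xml version="1.0"?>
<contact_info>
<company>TCS</company>
</contact_info>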
An XML document can have only one root element. For example, following is not a correct
XML document, because both the x and y elements occur at the top level without a root
element:
<x>...</x>
<y>...</y>
The following example shows a correctly formed XML document with a single root element:
<root>
<x>...</x>
<y>...</y>
</root>
3.13 Attributes:
An attribute specifies a single property for the element, using a name/value pair. An
XML element can have one or more attributes. For example:
<a href="http://www.w3.org/Index.html">XML is Meta Language</a>
Attribute names are written without quotation marks, whereas attribute values must always
appear in quotation marks. The following example demonstrates incorrect XML syntax:
<a href=http://www.w3.org/Index.html>XML is Meta Language</a>
In the above syntax, the attribute value is not enclosed in quotation marks.
3.14 XML References
References usually allow you to add or include additional text or markup in an XML
document. References always begin with the symbol "&", which is a reserved character, and
end with the symbol ";". XML has two types of references:
Entity References: An entity reference contains a name between the start and the end
delimiters. For example, &amp; where amp is the name. The name refers to a predefined string
of text and/or markup.
Character References: These contain a hash mark ("#") followed by a number, as in &#65;.
The number always refers to the Unicode code point of a character. In this
case, 65 refers to the letter "A".
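To see how a parser resolves both kinds of reference, the short Python sketch below parses a made-up note element that uses the predefined entity reference for the ampersand and two character references for code point 65 (one decimal, one hexadecimal). The sample text is an assumption for illustration only.

```python
import xml.etree.ElementTree as ET

# "&amp;" is a predefined entity reference; "&#65;" and "&#x41;" are the
# decimal and hexadecimal character references for the same code point.
doc = "<note>Tom &amp; Jerry: &#65; &#x41;</note>"

# The parser replaces each reference with the character it stands for.
text = ET.fromstring(doc).text
print(text)
```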
An Example XML Document
XML documents are formed as element trees.
An XML tree starts at a root element and branches from the root to child elements. All
elements can have sub elements (child elements):
<root>
<child>
<subchild> .............. </subchild>
</child>
</root>
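The tree shape of the document above can be made visible by walking it with Python's standard `xml.etree.ElementTree` module. The `walk` helper is a hypothetical name introduced here; the sketch simply prints each element indented by its depth.

```python
import xml.etree.ElementTree as ET

doc = "<root><child><subchild>text</subchild></child></root>"
root = ET.fromstring(doc)

def walk(element, depth=0):
    """Yield (depth, tag) pairs in document order, parents before children."""
    yield depth, element.tag
    for child in element:          # iterating an element yields its children
        yield from walk(child, depth + 1)

outline = list(walk(root))
for depth, tag in outline:
    print("  " * depth + tag)
```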
The terms parent, child, and sibling are used to describe the relationships between
elements.
Parents have children. Children have parents. Siblings are children on the same
level (brothers and sisters).
The first key principle of XML is that markup is separate from content. A corollary to that
principle is that markup can surround or contain content.
An XML element is an XML container consisting of a start tag, content (contains
character data, sub elements, or both), and an end tag—except for empty elements, which use
a single tag denoting both the start and end of the element.
The content of an element can be other elements. Following is an example of an element:
<footnote>
</footnote>
Here we have one element, called “footnote,” which contains character data and two
subelements: “author” and “title.”
An XML document validated against a DTD is both "Well Formed" and "Valid".
A "Valid" XML document is a "Well Formed" XML document, which also conforms to
the rules of a DTD.
DTD is the basic building block of XML.
The purpose of a DTD is to define the structure of an XML document. It defines the
structure with a list of legal elements.
A document type definition defines the rules and the legal elements and attributes for an
XML document.
<!DOCTYPE book[
!DOCTYPE book defines that the root element of the document is book
!ELEMENT book defines that the book element must contain the elements:
2) External DTD.
1) Internal / Embedded DTD.
“student.xml”
The XML specification defined two levels of conformance for XML documents: well-
formed and valid. Well-formedness is mandatory, while validity is optional.
A well-formed XML document complies with all the W3C syntax rules of XML
(explicitly called out in the XML specification as well-formedness constraints) like naming,
nesting, and attribute quoting.
Well Formed XML Documents
An XML document with correct syntax is called "Well Formed".
The syntax rules are given below.
XML documents must have a root element
XML elements must have a closing tag
XML tags are case sensitive
XML elements must be properly nested
XML attribute values must be quoted
This requirement guarantees that an XML processor can parse (break into identifiable
components) the document without error.
If a compliant XML processor encounters a well-formedness violation, the specification
requires it to stop processing the document and report a fatal error to the calling application.
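The fatal-error behaviour can be observed with Python's standard parser, which raises `ParseError` on any well-formedness violation. The `is_well_formed` helper and the sample strings are assumptions made for this sketch; each bad sample breaks one of the syntax rules listed earlier.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False on a fatal error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Hypothetical samples: one good document and one violation per rule.
samples = {
    "ok": "<a><b x='1'>text</b></a>",
    "two roots": "<x/><y/>",
    "missing end tag": "<a><b>text</a>",
    "case mismatch": "<a>text</A>",
    "overlapping": "<a><b></a></b>",
    "unquoted attribute": "<a x=1/>",
}

results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'well formed' if ok else 'fatal error'}")
```

Note that this only tests well-formedness; checking validity against a DTD or schema needs a validating parser.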
A valid XML document references and satisfies a schema.
Valid XML Documents
A "well formed" XML document is not the same as a "valid" XML document.
A "valid" XML document must be well formed. In addition, it must conform to a
document type definition or schema.
There are two different document type definitions that can be used with XML:
XML Schema is analogous to a database schema, which defines the column names and
data types in database tables.
XML Schema became a W3C Recommendation (synonymous with standard) on May
2, 2001.
XML Schema is not the only definition language, and you may hear about others like
Document Type Definitions (DTDs), RELAX NG, and Schematron (see the sidebar titled
“Other Schema Languages”).
As shown in Figure 3.5, we have two types of documents: a schema document (or
definition document) and multiple instance documents that conform to the schema.
A good analogy to remember the difference between these two types of documents is
that a schema definition is a blueprint (or template) of a type and each instance is an incarnation
of that template. This also demonstrates the two roles that a schema can play:
Template for a form generator to generate instances of a document type
Validator to ensure the accuracy of documents
XML Schemas allow validation of instances to ensure the accuracy of field values and
document structure at the time of creation.
The accuracy of fields is checked against the type of the field; for example, a quantity
typed as an integer or money typed as a decimal.
The structure of a document is checked for things like legal element and attribute
names, correct number of children, and required attributes.
All XML documents should be checked for validity before they are transferred to
another partner or system.
Here is an example of an element called “author” that can contain any number of text
characters:
The preceding element declaration enables an instance document to have an element like this:
If a built-in data type does not constrain the values the way the document designer
wants, XML Schema allows the definition of custom data types.
A complex type is an element that either contains other elements or has attached
attributes.
Let’s first examine an element with attached attributes and then a more complex element
that contains child elements.
Here is a definition for a book element that has two attributes called “title” and “pages”:
<xsd:element name="book">
<xsd:complexType>
</xsd:complexType>
</xsd:element>
<xsd:element name="product">
<xsd:complexType>
<xsd:sequence>
</xsd:sequence>
<xsd:attribute name="id" type="xsd:ID" />
</xsd:complexType>
</xsd:element>
</product>
Namespaces are a simple mechanism for creating globally unique names for the
elements and attributes of the markup language.
This is important for two reasons: to deconflict the meaning of identical names in
different markup languages and to allow different markup languages to be mixed together
without ambiguity.
Unfortunately, namespaces were not fully compatible with DTDs, and therefore their
adoption has been slow.
The current markup definition languages, like XML Schema, fully support namespaces.
Namespaces are implemented by requiring every XML name to consist of two parts: a
prefix and a local part. Here is an example of a fully qualified element name:
<xsd:integer>
The local part is the identifier for the meta data (in the preceding example, the local
part is “integer”), and the prefix is an abbreviation for the actual namespace in the namespace
declaration.
The actual namespace is a unique Uniform Resource Identifier (URI). Here is a sample
namespace declaration:
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
The preceding example declares a namespace for all the XML Schema elements to be used in
a schema document.
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.mycompany.com/markup">
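The point that the namespace URI, and not the prefix, is the real identity of a name can be checked with a short Python sketch. It assumes the standard XML Schema namespace URI `http://www.w3.org/2001/XMLSchema`; the two tiny schema documents and the prefix `x` in the lookup are invented for illustration.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Two hypothetical documents that bind different prefixes to the same URI.
doc_a = f'<xsd:schema xmlns:xsd="{XSD_NS}"><xsd:element name="a"/></xsd:schema>'
doc_b = f'<xs:schema xmlns:xs="{XSD_NS}"><xs:element name="a"/></xs:schema>'

root_a = ET.fromstring(doc_a)
root_b = ET.fromstring(doc_b)

# ElementTree expands each prefixed name to "{namespace-URI}local-part",
# so the prefix chosen by the author disappears.
print(root_a.tag)
same_name = root_a.tag == root_b.tag

# Look up children by namespace URI, using a prefix map of our own choosing.
elements = root_b.findall("x:element", {"x": XSD_NS})
print(same_name, len(elements))
```

Because both roots expand to the same qualified name, software can mix vocabularies from different markup languages without the name clashes that bare local names would cause.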
XML Schema Example
The following example shows the employee schema for employee details.
employee.xsd
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="employee">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="firstname" type="xsd:string"/>
        <xsd:element name="lastname" type="xsd:string"/>
        <xsd:element name="email" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
The following example shows the employee XML document (the XML instance) for the above
employee schema.
employee.xml
<?xml version="1.0"?>
<employee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:noNamespaceSchemaLocation="./employee.xsd">
  <firstname>vimal</firstname>
  <lastname>jaiswal</lastname>
  <email>[email protected]</email>
</employee>
DTD vs XSD
There are many differences between DTD (Document Type Definition) and XSD (XML Schema
Definition). In short, DTD provides less control on XML structure whereas XSD (XML schema)
provides more control.
An XML parser reads the document and checks that it is well formed; a validating parser also
checks it against a DTD or schema.
Parsers can be categorized as validating and non-validating.
Validating parser: It requires a document type declaration and reports an error if
the document does not conform to the DTD and its constraints.
Non-validating parser: It does not validate against the DTD and only checks that the
document is well formed.
Types of Parsers:
There are three common ways to parse an XML document: by using the Simple API
for XML (SAX), by building a Document Object Model (DOM), and by employing a new
technique called pull parsing.
SAX is a style of parsing called event-based parsing where each information class in the
instance document generates a corresponding event in the parser as the document is traversed.
SAX parsers are useful for parsing very large XML documents or in low-memory
environments.
Pull parsing is a new technique that aims for both low-memory consumption and high
performance.
Pull parsing is also an event-based parsing technique; however, the events are read by
the application (pulled) and not automatically triggered as in SAX.
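The difference between pushed and pulled events can be sketched with two parsers from the Python standard library: `xml.sax` for the SAX style and `xml.etree.ElementTree.XMLPullParser` for the pull style. The orders document and the `IdCollector` class are hypothetical names chosen for this illustration.

```python
import xml.sax
import xml.etree.ElementTree as ET

DOC = b"<orders><order id='1'/><order id='2'/></orders>"

# Push style (SAX): the parser calls the handler for every event it meets.
class IdCollector(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.ids = []

    def startElement(self, name, attrs):
        if name == "order":
            self.ids.append(attrs["id"])

handler = IdCollector()
xml.sax.parseString(DOC, handler)

# Pull style: events are queued, and the application reads them
# when it chooses to.
pull_parser = ET.XMLPullParser(events=("start",))
pull_parser.feed(DOC)
pull_parser.close()

pulled_ids = []
for event, element in pull_parser.read_events():
    if element.tag == "order":
        pulled_ids.append(element.get("id"))

print(handler.ids, pulled_ids)
```

Both approaches see the same events; what differs is who drives the loop, the parser (SAX) or the application (pull).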
The majority of applications use the DOM approach to parse XML.
The Document Object Model (DOM) is a language-neutral data model and application
programming interface (API) for programmatic access and manipulation of XML and HTML.
The Document Object Model is an in-memory representation of an XML or HTML
document and methods to manipulate it.
DOMs can be loaded from XML documents, saved to XML documents, or dynamically
generated by a program.
The DOM has provided a standard set of classes and APIs for browsers and
programming languages to represent XML and HTML.
The DOM is represented as a set of interfaces with specific language bindings to those
interfaces.
Unlike XML instances and XML schemas, which reside in files on disk, the DOM is an
in-memory representation of a document.
An object is a specific instance of a class.
Figure 3.6 graphically portrays a class and two objects
The DOM in Figure 3.7 can also be accessed using specific subclasses of Node for each major
part of the document like Document, DocumentFragment, Element, Attr (for attribute), Text,
and Comment.
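A minimal sketch of the DOM as an in-memory, editable tree, using Python's `xml.dom.minidom` binding. The contact_info data reuses the earlier example in these notes; the added `source` attribute is an invented detail.

```python
from xml.dom import minidom

# Load: build an in-memory DOM from XML text.
dom = minidom.parseString("<contact_info><name>Rajesh</name></contact_info>")
root = dom.documentElement                 # an Element node

# Manipulate: new nodes are created by the Document, then attached.
company = dom.createElement("company")
company.appendChild(dom.createTextNode("TCS"))   # a Text node
root.appendChild(company)
root.setAttribute("source", "example")           # stored as an Attr node

child_names = [node.nodeName for node in root.childNodes]

# Save: serialize the modified tree back to XML text.
serialized = root.toxml()
print(child_names)
print(serialized)
```

Because the whole document is held in memory, the DOM suits random access and editing, whereas the event-based parsers are the better fit for very large inputs.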
Types of DOM Levels?
The DOM has steadily evolved by increasing the detail of the representation, increasing the
scope of the representation, and adding new manipulation methods. This is accomplished by
dividing the DOM into conformance levels, where each new level adds to the feature set.
There are currently three DOM levels:
DOM Level 1. This set of classes represents XML 1.0 and HTML 4.0 documents.
DOM Level 2. This extends Level 1 to add support for namespaces; cascading style
sheets, level 2 (CSS2); alternate views; user interface events; and enhanced tree manipulation
via interfaces for traversal and ranges.
DOM Level 3. This extends Level 2 by adding support for mixed vocabularies (different
namespaces), XPath expressions, load and save methods, and a representation of abstract
schemas (includes both DTD and XML Schema).
XPath is a language to select a set of nodes within a document. Load and save methods
specify a standard way to load an XML document into a DOM and a way to save a DOM into
an XML document.
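Node selection with path expressions can be tried out with `xml.etree.ElementTree`, which supports only a limited subset of XPath rather than the full language. The library document below is hypothetical sample data.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data for illustration.
doc = """
<library>
  <book pages="250"><title>XML Basics</title></book>
  <book pages="90"><title>RDF Primer</title></book>
  <magazine><title>Web Weekly</title></magazine>
</library>
""".strip()
root = ET.fromstring(doc)

# All title elements that are direct children of a book element.
book_titles = [t.text for t in root.findall("./book/title")]

# Books selected by an attribute predicate.
short_books = root.findall("./book[@pages='90']")

# Every title anywhere below the root.
all_titles = [t.text for t in root.findall(".//title")]

print(book_titles, len(short_books), all_titles)
```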
Part- IV
4.1. Impact of XML on Enterprise IT
XML is spreading through all areas of the enterprise, from the IT department to the
intranet, extranet, Web sites, and databases.
XML has become integrated with the majority of commercial products on the market,
either as a primary or enabling technology.
The current and future impact of XML in 10 specific areas is given below.
Data exchange and interoperability.
XML has become the universal syntax for exchanging data between organizations. By
agreeing on a standard schema, organizations can produce text documents that can be
validated, transmitted, and parsed by any application regardless of hardware or operating
system.
The government has become a major adopter of XML and is moving all reporting
requirements to XML. Companies report financial information via XML, and local
governments report regulatory information.
XML has been called the next Electronic Data Interchange (EDI) system, which
formerly was extremely costly, was cumbersome, and used binary encoding.
E-business
Business-to-business (B2B) transactions have been revolutionized through XML. B2B
revolves around the exchange of business messages to conduct business transactions.
There are dozens of commercial products supporting numerous business vocabularies
developed by RosettaNet, OASIS, and other organizations.
Enterprise Application Integration (EAI).
Enterprise Application Integration is the assembling of legacy applications, databases,
and systems to work together to support integrated Web views, e-commerce, and Enterprise
Resource Planning (ERP).
The Open Applications Group is a nonprofit consortium of companies that defines standards
for application integration. It currently boasts over 250 live sites and more than 100 vendors
(including SAP, PeopleSoft, and Oracle) supporting the Open Applications Group Integration
Specification (OAGIS) in their products.
Enterprise IT architectures.
The impact of XML on IT architectures has grown increasingly important as a bridge
between the Java 2 Enterprise Edition (J2EE) platform and Microsoft’s .NET platform.
Large companies are implementing both architectures and turning to XML Web services
to integrate them.
Additionally, XML is influencing development on every tier of the N-tier network. On
the client tier, XML is transformed via XSLT to multiple presentation languages like Scalable
Vector Graphics (SVG).
On the Web tier, XML is used primarily as the integration format of choice and merged
in middleware.
XML is also used to configure and deploy Web-tier applications, such as those built with
JavaServer Pages (JSP) and Active Server Pages (ASP).
In the back-end tier, XML is being stored and queried in relational databases and native
XML databases.
Content Management Systems (CMS).
CMS is a Web-based system to manage the production and distribution of content to
intranet and Internet sites.
XML technologies are central to these systems in order to separate raw content from
its presentation.
Content can be transformed on the fly via the Extensible Stylesheet Language
Transformation (XSLT) to browsers or wireless clients.
Knowledge management and e-learning.
Knowledge management involves the capturing, cataloging, and distribution of
corporate knowledge on intranets.
It treats corporate knowledge as an asset.
Electronic learning (e-learning) is part of the knowledge acquisition for employees
through online training.
XML is driving the future of knowledge management.
Portals and data integration
A portal is a customizable, multipaned view tailored to support a specific community of
users.
XML is supported via standard transformation portlets that use XSLT to generate
specific presentations of content (as discussed previously under Content Management
Systems), syndication of content, and the integration of Web services.
A portlet is a dynamically pluggable application that generates content for one pane (or
sub window) in a portal.
Syndication is the reuse of content from another site. The most popular format for
syndication is an XML-based format called the Resource Description Framework Site
Summary (RSS).
All the major portal vendors are integrating Web services into their portal products.
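To illustrate syndication, here is a minimal sketch of how a portlet might pull headlines out of a feed from another site. The feed content is made up, and for brevity the sketch assumes the simpler, namespace-free RSS 2.0 layout rather than the RDF-based layout named above.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed; titles and links are placeholders.
FEED = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item><title>First story</title><link>http://example.com/1</link></item>
    <item><title>Second story</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""


def headlines(feed_xml):
    """Return (title, link) pairs for every item in the feed."""
    channel = ET.fromstring(feed_xml).find("channel")
    return [(item.findtext("title"), item.findtext("link"))
            for item in channel.findall("item")]


items = headlines(FEED)
for title, link in items:
    print(title, "->", link)
```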
Customer relationship management (CRM)
CRM systems enable an organization’s sales and marketing staff to understand, track,
inform, and service their customers. CRM involves many of the other systems we have
discussed here, such as portals, content management systems, data integration, and databases
(see next item), where XML is playing a major role. XML is becoming the glue to tie all these
systems together to enable the sales force or customers (directly) to access information when
they want and wherever they are (including wireless).
Databases and data mining.
XML has had a greater effect on relational database management systems (DBMS) than
object-oriented programming did, which gave rise to object-oriented database management
systems (OODBMS).
XML has created a new category of databases, called native XML databases, dedicated
exclusively to the storage and retrieval of XML.
All the major database vendors have responded to this challenge by supporting XML
translation between relational tables and XML schemas.
Additionally, all of the database vendors are further integrating XML into their systems
as a native data type.
This trend toward storing and retrieving XML will accelerate with the completion of the
W3C XQuery specification.
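The translation between relational tables and XML can be sketched as follows. This is an illustrative mapping only (one element per row, one child element per column), not any vendor's actual scheme; the product table and its columns are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# In-memory database with a hypothetical "product" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER, name TEXT, price REAL)")
conn.executemany("INSERT INTO product VALUES (?, ?, ?)",
                 [(1, "Widget", 9.5), (2, "Gadget", 20.0)])


def table_to_xml(conn, table):
    """Map each row to an element and each column to a child element.

    The table name is interpolated directly, so it must come from trusted code.
    """
    cursor = conn.execute("SELECT * FROM %s ORDER BY id" % table)
    columns = [d[0] for d in cursor.description]
    root = ET.Element(table + "s")
    for row in cursor:
        record = ET.SubElement(root, table)
        for column, value in zip(columns, row):
            ET.SubElement(record, column).text = str(value)
    return root


products = table_to_xml(conn, "product")
xml_text = ET.tostring(products, encoding="unicode")
print(xml_text)
```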
Collaboration technologies and peer-to-peer (P2P)
Collaboration technologies allow individuals to interact and participate in joint activities
from disparate locations over computer networks.
P2P is a specific decentralized collaboration protocol.
XML is being used for collaboration at the protocol level.
The more computers understand, the more effectively they can handle complex
tasks.
We have not yet invented all the ways a semantically aware computing system can
drive new business and decrease your operation costs. But to get there, we must push
beyond simple meta data modelling to knowledge modelling and standard knowledge
processing. Here are three emerging steps beyond simple meta data: semantic levels,
rule languages, and inference engines.
Semantic Levels
The following figure shows the evolution of data fidelity required for
semantically aware applications.
Instead of just meta data, we will have an information stack composed of semantic
levels. We are currently at Level 1 with XML Schema, which is represented as
modelling the properties of our data classes.
We are capturing and processing meta data about isolated data classes like purchase
orders, products, employees, and customers.
On the left side of the diagram, we associate a simple physical metaphor to the state
of each level.
Level 1 is analogous to describing singular concepts or objects.
In Level 2, we will move beyond data modelling (simple meta data properties) to
knowledge modelling. This includes the Resource Description Framework (RDF)
and taxonomies.
Knowledge modelling enables us to model statements both about the relationships
between Level 1 objects and about how those objects operate. This is diagrammed as
connection between our objects in Figure 3.9.
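The step from Level 1 to Level 2 can be sketched in a few lines of Python. Level 1 objects carry only their own properties, while Level 2 adds statements that connect them, written here as subject-predicate-object triples in the style of RDF. All identifiers and predicate names are hypothetical.

```python
# Level 1: isolated objects described only by their own properties.
purchase_order = {"id": "po-17", "total": 120.0}
customer = {"id": "cust-3", "name": "Acme Ltd"}

# Level 2: statements that relate those objects, written as
# (subject, predicate, object) triples in the style of RDF.
# All identifiers and predicate names here are hypothetical.
triples = {
    ("po-17", "placedBy", "cust-3"),
    ("po-17", "contains", "prod-9"),
    ("prod-9", "suppliedBy", "supplier-2"),
}


def objects_of(subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


print(objects_of("po-17", "placedBy"))  # ['cust-3']
```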
Beyond the knowledge statements of Level 2 are the superstructures or “closed world
modelling” (CWM) of Level 3. The technologies that implement these sophisticated
models of systems are called ontologies.
Rules and Logic
The semantic levels of information provide the input for software systems.
The operations that a software system uses to manipulate the semantic information
will be standardized into one or more rule languages.
In general, a rule specifies an action if certain conditions are met. The general form
is this: if (x) then y.
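The general form if (x) then y can be written down directly as a condition paired with an action. The sketch below is not a standard rule language; the Rule class and the free-shipping rule are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    """A rule pairs a condition x with an action y: if (x) then y."""
    condition: Callable[[dict], bool]
    action: Callable[[dict], None]

    def apply(self, facts):
        """Run the action only when the condition holds; report whether it fired."""
        if self.condition(facts):
            self.action(facts)
            return True
        return False


# Hypothetical business rule: orders above 100 get free shipping.
free_shipping = Rule(
    condition=lambda order: order["total"] > 100,
    action=lambda order: order.update(shipping=0.0),
)

order = {"total": 120.0, "shipping": 7.5}
fired = free_shipping.apply(order)
print(fired, order)
```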
Inference Engines
Applying rules and logic to semantic data requires standard, embeddable inference
engines. These programs will execute a set of rules on a specific instance of data using
an ontology.
An early example of these types of inferencing engines is the open-source software
Closed World Machine (CWM).
CWM is an inference engine that allows you to load ontologies or closed worlds, then
it executes a rule language on that world.
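What such an engine does can be sketched as a simple forward-chaining loop: load an ontology and some instance data, then apply the rules repeatedly until no new facts appear. This is an illustrative toy under stated assumptions, not how CWM itself is implemented; the class names and the single rule are hypothetical.

```python
# Ontology: a tiny class hierarchy (hypothetical names).
ontology = {
    ("Manager", "subClassOf", "Employee"),
    ("Employee", "subClassOf", "Person"),
}
# Instance data: one specific fact.
data = {("alice", "type", "Manager")}


def type_inheritance(facts):
    """If X has type C and C is a subclass of D, then X has type D."""
    return {(x, "type", d)
            for (x, p1, c) in facts if p1 == "type"
            for (c2, p2, d) in facts if p2 == "subClassOf" and c2 == c}


def infer(facts, rules):
    """Apply every rule repeatedly until no new facts appear."""
    known = set(facts)
    while True:
        new = set().union(*(rule(known) for rule in rules)) - known
        if not new:
            return known
        known |= new


closure = infer(ontology | data, [type_inheritance])
print(sorted(f for f in closure if f[0] == "alice"))
```

The engine derives that alice is an Employee and a Person even though neither fact was stated, which is the kind of conclusion an ontology plus rules makes available.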
So, meta data is a starting point for semantic representation and processing. The rise of
meta data is related to the ability to reuse meta data between organizations and systems.
Part-V: Topics beyond the Syllabus
The Semantic Web's expansion and the tools it brings to the table are putting machines'
analytical skills to work in the areas of content creation, management, learning, and support
in general.
The developing semantic web of material and data is a big potential to tap into when it
The Semantic Web will continue to give birth to new careers, businesses, and global
innovators.
Connect content sets from both inside and outside the company.
The oil and gas industry was reported to be using RDF/OWL in 2007 to combine data
from various sources and standardize data exchange, sharing, and integration across
The BBC website employed semantic web technology to dynamically display material
Facebook may then utilise this information to figure out what a user likes, provide