XML Mod4
◼ a family of technologies:
- XML 1.1
- XLink
- XPointer & XPath
- XML Namespaces
- XML Schema
The Difference Between XML and HTML
<?xml version="1.0"?>
<book>
  <author> Deitel &amp; Deitel </author>
  <title> Internet and World Wide Web </title>
  <price> 850 </price>
</book>
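Unlike HTML, which browsers parse leniently, XML must be well-formed: a bare & in character data or a missing end tag is an error. A minimal sketch of that check, assuming Python's standard xml.etree.ElementTree parser; the helper name is_well_formed and the three sample strings are made up for illustration:

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample fragments, not taken from any real catalogue.
bare_ampersand = "<author>Deitel & Deitel</author>"
escaped_ampersand = "<author>Deitel &amp; Deitel</author>"
unclosed_tag = "<book><title>Internet and World Wide Web</book>"

print(is_well_formed(bare_ampersand))     # a bare & is rejected
print(is_well_formed(escaped_ampersand))  # &amp; is accepted
print(is_well_formed(unclosed_tag))       # a missing end tag is rejected
```

The parser turns &amp; back into & when the text is read, so the application still sees the original author string.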
A Simple XML Document with Attributes
Attributes with name and value
<author>Gerhard Weikum</author>
<title> The Web in Ten Years </title>
In order to evolve
XML Components
A DTD can be
➢ Internal
  ➢ The DTD is part of the document file
➢ External
  ➢ The DTD and the document are in separate files
  ➢ An external DTD may reside
    ➢ In the local file system (where the document is)
    ➢ In a remote file system
Connecting a Document with its DTD
➢ An internal DTD
<?xml version="1.0"?>
<!DOCTYPE db [<!ELEMENT ...> … ]>
<db> ... </db>
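The name after <!DOCTYPE must match the root element for the document to be valid. A small sketch that reads both names back, assuming Python's standard xml.dom.minidom, which parses the internal DTD but does not validate against it; the person element and its declarations are invented for illustration:

```python
from xml.dom import minidom

# Hypothetical document with an internal DTD; only the db root comes from the slide.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE db [
  <!ELEMENT db (person*)>
  <!ELEMENT person (#PCDATA)>
]>
<db><person>Jeff Cohen</person></db>"""

doc = minidom.parseString(DOCUMENT)

doctype_name = doc.doctype.name          # name given after <!DOCTYPE
root_name = doc.documentElement.tagName  # name of the root element

print(doctype_name, root_name, doctype_name == root_name)
```

A validating parser would go further and check every element against the declarations inside the brackets.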
Specifying the Structure
<!ELEMENT E content-model>
Content-model ::=
E' | P1 , P2 | P1 | P2 | P? | P+ | P* | (P)
– E' an element type
– P1 , P2 concatenation
– P1 | P2 disjunction
– P? optional
– P+ one or more occurrences
– P* zero or more occurrences (the Kleene closure)
– (P) grouping
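One way to read a content model is as a regular expression over the names of an element's children. The sketch below hand-translates a hypothetical model (name, greet?, email*) into a Python regex and checks two elements against it. It ignores text and attributes, and it is not how a validating parser is implemented:

```python
import re
import xml.etree.ElementTree as ET


def child_tags(element: ET.Element) -> str:
    """Encode the child element names as a space-terminated string."""
    return "".join(child.tag + " " for child in element)


# Hypothetical content model: (name, greet?, email*)
# Each name is followed by one space so adjacent names cannot run together.
PERSON_MODEL = re.compile(r"(name )(greet )?(email )*")


def matches_model(element: ET.Element, model: re.Pattern) -> bool:
    """Return True if the sequence of child names matches the whole model."""
    return model.fullmatch(child_tags(element)) is not None


ok = ET.fromstring(
    "<person><name>Jeff Cohen</name><greet>Dr. Cohen</greet></person>"
)
wrong_order = ET.fromstring(
    "<person><greet>Dr. Cohen</greet><name>Jeff Cohen</name></person>"
)

print(matches_model(ok, PERSON_MODEL))
print(matches_model(wrong_order, PERSON_MODEL))
```

Concatenation becomes juxtaposition, and ?, + and * keep their usual regex meaning, which is why the two notations line up so closely.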
Element Type Definition
<name>Jeff Cohen</name>
<greet> Dr. Cohen</greet>
<email>[email protected]</email>
Some Difficult Structures
<!ELEMENT employee
((name, age, ssn) | (age, ssn, name) |
(ssn, name, age) | ... )>
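The difficulty here appears to be that a DTD has no "any order" operator, so children that may come in any order need one alternative per permutation. A sketch that generates the full declaration and shows how quickly it grows; the helper names are hypothetical:

```python
from itertools import permutations
from math import factorial


def unordered_content_model(names):
    """Build a content model accepting the given children in any order."""
    alternatives = ["(" + ", ".join(p) + ")" for p in permutations(names)]
    return "(" + " | ".join(alternatives) + ")"


def employee_declaration(names):
    """Wrap the content model in an <!ELEMENT employee ...> declaration."""
    return "<!ELEMENT employee " + unordered_content_model(names) + ">"


fields = ["name", "age", "ssn"]
print(employee_declaration(fields))

# The number of alternatives is n!, so it grows very quickly.
for n in (3, 4, 5):
    print(n, "children need", factorial(n), "alternatives")
```

Three children already need six alternatives and five need 120, which is one of the motivations usually given for moving from DTDs to XML Schema.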
•Structure •Format
•Schema Workflow
DTD versus Schema
XML Schemas
• Information Items
• Elements
• Attributes
Declaration & Definition
• Declaration Components
– are associated by (qualified) names to
information items being validated.
– It is like declaring objects in OOP.
• Definition Components
– define internal schema components that can be
used in other schema components.
– Type definition is like defining classes in OOP.
Type Definitions
• Inheritance
Each complex type definition is either
– a restriction of a complex type definition
– an extension of a simple or complex type definition
– a restriction of the ur-type definition.
• Example
<xs:complexType name="personName">
  <xs:sequence>
    <xs:element name="title"
                minOccurs="0"/>
    …
  </xs:sequence>
</xs:complexType>

<xs:complexType name="extendedName">
  <xs:complexContent>
    <xs:extension base="personName">
      <xs:sequence>
        …
        <xs:element name="generation"
                    minOccurs="0"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
An XML Instance Document Example
XML Schema
Reusability and Conformance
• Two mechanisms
– Including and redefining existing XML
Schema components in an XML Schema
– Extending or Restricting existing data types in
an XML Schema definition
Building Reusable XML Schema- (1)
• xs:include
– Similar to a copy and paste of the included schema
– The calling schema is not allowed to override the
definitions of the included schema
• xs:redefine
– Similar to xs:include, except that it lets you
redefine declarations from the included schema
Conformance Example: note.xml and note.xsd
note.xml
<?xml version="1.0"?>
<note timestamp="2002-12-20">
  <to>Dove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

note.xsd
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="note">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="to" type="xs:string"/>
      <xs:element name="from" type="xs:string"/>
      <xs:element name="heading" type="xs:string"/>
      <xs:element name="body" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="timestamp" type="xs:date"/>
  </xs:complexType>
</xs:element>
</xs:schema>
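To make "conformance" concrete, here is a hand-written stand-in for what a schema validator would check for this note: the root name, the child sequence, and the timestamp format. It is a sketch only. Python's standard library has no XSD validator, the function name conforms_to_note is made up, and date.fromisoformat only approximates xs:date:

```python
import xml.etree.ElementTree as ET
from datetime import date

# Child elements in the order required by the xs:sequence in note.xsd.
EXPECTED_CHILDREN = ["to", "from", "heading", "body"]


def conforms_to_note(xml_text: str) -> bool:
    """Approximate the note.xsd rules by hand (not a real XSD validation)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False  # not even well-formed
    if root.tag != "note":
        return False
    if [child.tag for child in root] != EXPECTED_CHILDREN:
        return False
    # The attribute is declared without use="required", so it may be absent.
    timestamp = root.get("timestamp")
    if timestamp is not None:
        try:
            date.fromisoformat(timestamp)  # rough stand-in for xs:date
        except ValueError:
            return False
    return True


NOTE_XML = """<?xml version="1.0"?>
<note timestamp="2002-12-20">
  <to>Dove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""

print(conforms_to_note(NOTE_XML))
print(conforms_to_note(NOTE_XML.replace("2002-12-20", "not-a-date")))
```

A real validator reads these rules from note.xsd instead of having them hard-coded, which is the point of writing the schema.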
XML Parsers
• Microsoft Visual
• Database tools
• An XML IDE provides comprehensive XML
development support and complements
other software development tools
eXtensible Stylesheet Language
What is XSL?
◼ XSL stands for Extensible Stylesheet Language
◼ CSS was designed for styling HTML pages, and can be used to
style XML pages
◼ XSL was designed specifically to style XML pages, and is
much more sophisticated than CSS
◼ XSL consists of three languages:
◼ XSLT (XSL Transformations) is a language used to transform XML
documents into other kinds of documents (most commonly HTML, so
they can be displayed)
◼ XPath is a language to select parts of an XML document to transform
with XSLT
◼ XSL-FO (XSL Formatting Objects) is a vocabulary for describing page
layout and formatting, an alternative to CSS aimed mainly at print output
◼ Web browsers do not render XSL-FO directly, and we won’t cover it
How does it work?
◼ The XML source document is parsed into an XML
source tree
◼ You use XPath to define templates that match parts of
the source tree
◼ You use XSLT to transform the matched part and put
the transformed information into the result tree
◼ The result tree is output as a result document
◼ For parts of the source document that are not matched by a
template, XSLT's built-in rules typically copy the text content
(but not the element tags) to the result
Simple XPath
Simple XSLT
◼ <xsl:for-each select="//book"> loops through every
book element, everywhere in the document
◼ <xsl:for-each select="//book">
<xsl:value-of select="title"/>
chooses the content of the title element for each book
in the XML document
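The leading // means "at any depth", not just directly under the root. A sketch of the same selection using the limited XPath subset in Python's xml.etree.ElementTree, where .//book plays the role of //book; the <shelf> wrapper is invented here only to show the difference in depth:

```python
import xml.etree.ElementTree as ET

# Hypothetical library; the <shelf> element is added for illustration only.
LIBRARY = """<library>
  <book><title>XML</title><author>Gregory Brill</author></book>
  <shelf>
    <book><title>Java and XML</title><author>Brett McLaughlin</author></book>
  </shelf>
</library>"""

root = ET.fromstring(LIBRARY)

# ".//book" finds book elements at any depth below the root, in document order.
all_titles = [book.findtext("title") for book in root.findall(".//book")]

# "book" finds only the direct children of the root.
top_level_titles = [book.findtext("title") for book in root.findall("book")]

print(all_titles)
print(top_level_titles)
```

Inside the loop, findtext("title") is evaluated relative to the current book, just as select="title" is relative to the current node in xsl:for-each.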
Using XSL to create HTML
◼ Our goal is to turn this:
<?xml version="1.0"?>
<library>
  <book>
    <title>XML</title>
    <author>Gregory Brill</author>
  </book>
  <book>
    <title>Java and XML</title>
    <author>Brett McLaughlin</author>
  </book>
</library>
◼ Into HTML that displays something like this:
Book Titles:
• XML
• Java and XML
Book Authors:
• Gregory Brill
• Brett McLaughlin
▪ Note that we’ve grouped titles and authors separately
What we need to do
◼ We need to save our XML into a file (let’s call it books.xml)
◼ We need to create a file (say, books.xsl) that describes
how to select elements from books.xml and embed
them into an HTML page
◼ We do this by intermixing the HTML and the XSL in the
books.xsl file
◼ We need to add a line to our books.xml file to tell it to
refer to books.xsl for formatting information
books.xml, revised
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="books.xsl"?>
<library>
  <book>
    <title>XML</title>
    <author>Gregory Brill</author>
  </book>
  <book>
    <title>Java and XML</title>
    <author>Brett McLaughlin</author>
  </book>
</library>
▪ The xml-stylesheet line tells you where to find the XSL file
Desired HTML
<html>
  <head>
    <title>Book Titles and Authors</title>
  </head>
  <body>
    <h2>Book titles:</h2>
    <ul>
      <li>XML</li>
      <li>Java and XML</li>
    </ul>
    <h2>Book authors:</h2>
    <ul>
      <li>Gregory Brill</li>
      <li>Brett McLaughlin</li>
    </ul>
  </body>
</html>
▪ The text of the list items is data extracted from the XML document
▪ Everything else is our HTML template
▪ We don’t necessarily know how much data we will have
XSL outline
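<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html> ... </html>
  </xsl:template>
</xsl:stylesheet>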
Selecting titles and authors
<h2>Book titles:</h2>
<ul>
  <xsl:for-each select="//book">
    <li>
      <xsl:value-of select="title"/>
    </li>
  </xsl:for-each>
</ul>
<h2>Book authors:</h2>
...same thing, replacing title with author
▪ Notice the xsl:for-each loop
▪ Notice that XSL can rearrange the data; the HTML result
can present information in a different order than the XML
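For comparison, the same regrouping can be written procedurally. This is a sketch in Python's standard xml.etree.ElementTree, not XSLT and not the books.xsl stylesheet; the function names are made up. It loops over the books twice, once for titles and once for authors, as the two xsl:for-each loops do:

```python
import xml.etree.ElementTree as ET

BOOKS_XML = """<library>
  <book><title>XML</title><author>Gregory Brill</author></book>
  <book><title>Java and XML</title><author>Brett McLaughlin</author></book>
</library>"""


def bullet_list(heading, items):
    """Return an <h2> heading followed by a <ul> with one <li> per item."""
    h2 = ET.Element("h2")
    h2.text = heading
    ul = ET.Element("ul")
    for item in items:
        ET.SubElement(ul, "li").text = item
    return [h2, ul]


def books_to_html(xml_text):
    """Build the result tree from the source tree and serialize it as HTML."""
    source = ET.fromstring(xml_text)
    books = source.findall(".//book")

    html = ET.Element("html")
    head = ET.SubElement(html, "head")
    ET.SubElement(head, "title").text = "Book Titles and Authors"
    body = ET.SubElement(html, "body")
    # Two passes over the same books: first all titles, then all authors.
    body.extend(bullet_list("Book titles:", [b.findtext("title") for b in books]))
    body.extend(bullet_list("Book authors:", [b.findtext("author") for b in books]))
    return ET.tostring(html, encoding="unicode", method="html")


result = books_to_html(BOOKS_XML)
print(result)
```

The XSLT version states the same thing declaratively: the HTML template is written out literally and the loops are embedded where the data should appear.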
All of books.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="books.xsl"?>
<library>
  <book>
    <title>XML</title>
    <author>Gregory Brill</author>
  </book>
  <book>
    <title>Java and XML</title>
    <author>Brett McLaughlin</author>
  </book>
</library>
▪ Note: if you do View Source, this is what you will see, not the
resultant HTML
All of books.xsl
How to use it
◼ In a modern browser you can just open the XML file
◼ Older browsers will ignore the XSL and just show you
the XML contents as continuous text
◼ You can use a program such as Xalan, MSXML,
or Saxon to create the HTML as a file
◼ This can be done on the server side, so that all the client
side browser sees is plain HTML
◼ The server can create the HTML dynamically from the
information currently in XML