XML Mod4
◼ XML is a family of technologies:
- XML 1.1
- XLink
- XPointer & XPath
- CSS, XSL, XSLT
- XML DOM
- XML Namespaces
- XML Schemas
The Difference Between XML and HTML
<?xml version="1.0"?>
<Book>
  <author> Deitel &amp; Deitel </author>
  <title> Internet and World Wide Web </title>
  <price> 850 </price>
</Book>
A Simple XML Document with Attributes
Attributes with name and value
<article>
<author>Gerhard Weikum</author>
<title> The Web in Ten Years </title>
<text>
In order to evolve
</text>
</article>
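The article listing above carries no attributes, so here is a minimal sketch of the same document with two added. The attribute names (id, lang) and their values are assumptions chosen for illustration; they are not part of the original example.

```xml
<?xml version="1.0"?>
<!-- Hypothetical attributes: each one is a name="value" pair
     written inside the start tag of an element. -->
<article id="a1">
  <author>Gerhard Weikum</author>
  <title lang="en">The Web in Ten Years</title>
  <text>
    In order to evolve
  </text>
</article>
```

An attribute name may appear only once per element, and its value must always be quoted.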
XML Components
A DTD can be
➢ Internal
  ➢ The DTD is part of the document file
➢ External
  ➢ The DTD and the document are in separate files
  ➢ An external DTD may reside
    ➢ In the local file system (where the document is)
    ➢ In a remote file system
Connecting a Document with its DTD
➢ An internal DTD
<?xml version="1.0"?>
<!DOCTYPE db [<!ELEMENT ...> … ]>
<db> ... </db>
➢ An external DTD
<!DOCTYPE db SYSTEM "http://www.schemaauthority.com/schema.dtd">
Specifying the Structure
<!ELEMENT E content-model>
Content-model ::=
EMPTY | ANY | #PCDATA | E’ |
P1, P2 | P1 | P2 | P1? | P1+ | P1* | (P)
– E’ element type
– P1 , P2 concatenation
– P1 | P2 disjunction
– P? optional
– P+ one or more occurrences
– P* the Kleene closure
– (P) grouping
Element Type Definition
<addressbook>
<person>
<name>Jeff Cohen</name>
<greet> Dr. Cohen</greet>
<email>[email protected]</email>
</person>
</addressbook>
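One way the address book above might be declared is sketched below as an internal DTD. The content models are assumptions (the slide does not show the declarations), and the email address is a placeholder.

```xml
<?xml version="1.0"?>
<!DOCTYPE addressbook [
  <!-- Zero or more person elements -->
  <!ELEMENT addressbook (person*)>
  <!-- A name, then an optional greet, then any number of emails -->
  <!ELEMENT person (name, greet?, email*)>
  <!-- Leaf elements hold character data only -->
  <!ELEMENT name  (#PCDATA)>
  <!ELEMENT greet (#PCDATA)>
  <!ELEMENT email (#PCDATA)>
]>
<addressbook>
  <person>
    <name>Jeff Cohen</name>
    <greet>Dr. Cohen</greet>
    <!-- Placeholder address, not taken from the slide -->
    <email>jeff.cohen@example.org</email>
  </person>
</addressbook>
```

Under this sketch a person with no greet or no email is still valid, but a person whose email precedes the name is not, because the comma fixes the order.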
Some Difficult Structures
<!ELEMENT employee
((name, age, ssn) | (age, ssn, name) |
(ssn, name, age) | ...
)>
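A DTD has no operator for "these children, in any order", so every ordering has to be spelled out. The sketch below is one possible way to write the three-child case; it factors the six orderings by their first element so that the model stays deterministic, which the XML specification asks of content models. The sample values are made up.

```xml
<?xml version="1.0"?>
<!DOCTYPE employee [
  <!-- name, age and ssn each appear exactly once, in any order.
       Each branch is chosen by the first child, so a validator
       never has to guess which alternative it is in. -->
  <!ELEMENT employee (
      (name, ((age, ssn) | (ssn, age))) |
      (age,  ((name, ssn) | (ssn, name))) |
      (ssn,  ((name, age) | (age, name)))
  )>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT age  (#PCDATA)>
  <!ELEMENT ssn  (#PCDATA)>
]>
<employee>
  <!-- Hypothetical values; this ordering matches the third branch -->
  <ssn>000-00-0000</ssn>
  <name>Jeff Cohen</name>
  <age>40</age>
</employee>
```

With n children the number of orderings is n!, which is why this structure is considered difficult in a DTD.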
Schema Workflow
[Figure: schema workflow diagram relating Information, Structure, and Format]
DTD versus Schema
XML Schemas
•Information items: elements, attributes
Declaration & Definition
• Declaration Components
– are associated by (qualified) names to
information items being validated.
– It is like declaring objects in OOP.
• Definition Components
– define internal schema components that can be
used in other schema components.
– Type definition is like defining classes in OOP.
Type Definitions
• Inheritance
Each complex type definition is either
– a restriction of a complex type definition
– an extension of a simple or complex type definition
– a restriction of the ur-type definition.
• Example
<xs:complexType name="personName">
  <xs:sequence>
    <xs:element name="title" minOccurs="0"/>
    …
  </xs:sequence>
</xs:complexType>

<xs:complexType name="extendedName">
  <xs:complexContent>
    <xs:extension base="personName">
      <xs:sequence>
        …
        <xs:element name="generation" minOccurs="0"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
An XML Instance Document Example
•book.xsd
XML Schema
Reusability and Conformance
• Two mechanisms
– Including and redefining existing XML Schema
components in an XML Schema definition
– Extending or Restricting existing data types in
an XML Schema definition
Building Reusable XML Schema (1)
• xs:include
– Similar to a copy and paste of the included schema
– The calling schema cannot override the definitions of the
included schema.
• xs:redefine
– Similar to xs:include, except that it lets you redefine
type and group definitions from the included schema.
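The difference between the two elements can be sketched as follows. The file name names.xsd is hypothetical and is assumed to define a complex type called personName; the element names are also made up for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- With xs:include the definitions of names.xsd would be
       pulled in unchanged:
         <xs:include schemaLocation="names.xsd"/>
       xs:redefine pulls them in too, but lets this schema
       replace selected ones. -->
  <xs:redefine schemaLocation="names.xsd">
    <!-- The redefined type must be derived from itself:
         here personName is extended with one more child. -->
    <xs:complexType name="personName">
      <xs:complexContent>
        <xs:extension base="personName">
          <xs:sequence>
            <xs:element name="generation" type="xs:string" minOccurs="0"/>
          </xs:sequence>
        </xs:extension>
      </xs:complexContent>
    </xs:complexType>
  </xs:redefine>

  <!-- Every use of personName in this schema now sees the
       extended definition. -->
  <xs:element name="author" type="personName"/>
</xs:schema>
```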
Conformance Example: note.xml and note.xsd
<?xml version="1.0"?>
<note timestamp="2002-12-20">
  <to>Dove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="note">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="to" type="xs:string"/>
      <xs:element name="from" type="xs:string"/>
      <xs:element name="heading" type="xs:string"/>
      <xs:element name="body" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="timestamp" type="xs:date"/>
  </xs:complexType>
</xs:element>
</xs:schema>
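The instance above does not say which schema it should conform to. One common way to state this, sketched here, is the xsi:noNamespaceSchemaLocation attribute; it assumes note.xsd sits next to note.xml, and a validator is free to treat it as a hint only.

```xml
<?xml version="1.0"?>
<!-- xsi:noNamespaceSchemaLocation points a validator at the schema
     for elements that are in no namespace, as in note.xsd. -->
<note timestamp="2002-12-20"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="note.xsd">
  <!-- Children must appear in the order given by xs:sequence:
       to, from, heading, body. -->
  <to>Dove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
```

Two changes that would make this document non-conforming: writing the timestamp as 20/12/2002, which is not a valid xs:date, or placing from before to, which breaks the sequence.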
XML Parsers
[Figure: an XML IDE shown alongside other development tools: Microsoft Visual Studio .NET; database, web, and programming development tools; Java programming tools]
•An XML IDE provides comprehensive XML development support and complements other software development tools
XSL
eXtensible Stylesheet Language
XSLT
What is XSL?
◼ XSL stands for Extensible Stylesheet Language
◼ CSS was designed for styling HTML pages, and can be used to
style XML pages
◼ XSL was designed specifically to style XML pages, and is
much more sophisticated than CSS
◼ XSL consists of three languages:
◼ XSLT (XSL Transformations) is a language used to transform XML
documents into other kinds of documents (most commonly HTML, so
they can be displayed)
◼ XPath is a language to select parts of an XML document to transform
with XSLT
◼ XSL-FO (XSL Formatting Objects) is a replacement for CSS
◼ Web browsers do not implement XSL-FO (standalone processors such as Apache FOP do), and we won’t cover it
How does it work?
◼ The XML source document is parsed into an XML
source tree
◼ You use XPath to define templates that match parts of
the source tree
◼ You use XSLT to transform the matched part and put
the transformed information into the result tree
◼ The result tree is output as a result document
◼ Parts of the source document that are not matched by a
template fall through to XSLT's built-in rules: their text content
is copied to the result tree, but their element tags are not
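The steps above can be sketched as a small stylesheet. It assumes a source document shaped like the library of book elements (each with a title and an author) used later in these slides; the result element names titles and item are made up for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Matches the root of the source tree: write a wrapper element
       into the result tree, then continue with the children. -->
  <xsl:template match="/">
    <titles>
      <xsl:apply-templates/>
    </titles>
  </xsl:template>

  <!-- Matches each title element and transforms it into an item. -->
  <xsl:template match="title">
    <item><xsl:value-of select="."/></item>
  </xsl:template>

  <!-- An empty template suppresses author elements. Without it no
       template would match them, and the built-in rules would copy
       each author's text into the result. -->
  <xsl:template match="author"/>

</xsl:stylesheet>
```

No template matches library or book either; the built-in rules simply walk through them to reach the title and author children.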
Simple XPath
Simple XSLT
◼ <xsl:for-each select="//book"> loops through every
book element, everywhere in the document
◼ <xsl:for-each select="//book">
<xsl:value-of select="title"/>
</xsl:for-each>
chooses the content of the title element for each book
in the XML document
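For context, the loop has to live inside a template in a stylesheet. A minimal complete sketch follows; the text output method and the newline after each title are choices made here for illustration, not something the slide specifies.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Produce plain text rather than XML or HTML -->
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- //book selects every book element at any depth -->
    <xsl:for-each select="//book">
      <!-- Inside the loop, "title" is relative to the current book -->
      <xsl:value-of select="title"/>
      <!-- Line feed after each title -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Run against a document containing several book elements, this should list the titles one per line, in document order.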
Using XSL to create HTML
◼ Our goal is to turn this:
<?xml version="1.0"?>
<library>
  <book>
    <title>XML</title>
    <author>Gregory Brill</author>
  </book>
  <book>
    <title>Java and XML</title>
    <author>Brett McLaughlin</author>
  </book>
</library>
◼ Into HTML that displays something like this:
Book Titles:
• XML
• Java and XML
Book Authors:
• Gregory Brill
• Brett McLaughlin
▪ Note that we’ve grouped titles and authors separately
What we need to do
◼ We need to save our XML into a file (let’s call it
books.xml)
◼ We need to create a file (say, books.xsl) that describes
how to select elements from books.xml and embed
them into an HTML page
◼ We do this by intermixing the HTML and the XSL in the
books.xsl file
◼ We need to add a line to our books.xml file to tell it to
refer to books.xsl for formatting information
books.xml, revised
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="books.xsl"?>
<library>
  <book>
    <title>XML</title>
    <author>Gregory Brill</author>
  </book>
  <book>
    <title>Java and XML</title>
    <author>Brett McLaughlin</author>
  </book>
</library>
▪ The xml-stylesheet line tells you where to find the XSL file
Desired HTML
<html>
  <head>
    <title>Book Titles and Authors</title>
  </head>
  <body>
    <h2>Book titles:</h2>
    <ul>
      <li>XML</li>
      <li>Java and XML</li>
    </ul>
    <h2>Book authors:</h2>
    <ul>
      <li>Gregory Brill</li>
      <li>Brett McLaughlin</li>
    </ul>
  </body>
</html>
▪ The text inside the <li> elements is data extracted from the XML document
▪ Everything else is our HTML template
▪ We don’t necessarily know how much data we will have
XSL outline
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
</xsl:template>
</xsl:stylesheet>
Selecting titles and authors
<h2>Book titles:</h2>
<ul>
  <xsl:for-each select="//book">
    <li>
      <xsl:value-of select="title"/>
    </li>
  </xsl:for-each>
</ul>
<h2>Book authors:</h2>
...same thing, replacing title with author
▪ Notice the xsl:for-each loop
▪ Notice that XSL can rearrange the data; the HTML result
can present information in a different order than the XML
All of books.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="books.xsl"?>
<library>
  <book>
    <title>XML</title>
    <author>Gregory Brill</author>
  </book>
  <book>
    <title>Java and XML</title>
    <author>Brett McLaughlin</author>
  </book>
</library>
▪ Note: if you do View Source, this is what you will see, not the
resultant HTML
All of books.xsl
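The listing for this slide is not available in the text. Combining the XSL outline with the title and author loops shown earlier gives roughly the following; treat it as a reconstruction under that assumption rather than the original file.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- One template for the whole document: the HTML page is the
       template, with XSL instructions where the data goes. -->
  <xsl:template match="/">
    <html>
      <head>
        <title>Book Titles and Authors</title>
      </head>
      <body>
        <h2>Book titles:</h2>
        <ul>
          <!-- One list item per book, holding its title -->
          <xsl:for-each select="//book">
            <li><xsl:value-of select="title"/></li>
          </xsl:for-each>
        </ul>
        <h2>Book authors:</h2>
        <ul>
          <!-- Same loop again, this time picking the author -->
          <xsl:for-each select="//book">
            <li><xsl:value-of select="author"/></li>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

Because each loop walks all the books, the page grows with the data: adding a third book to books.xml adds one item to each list without touching the stylesheet.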
How to use it
◼ In a modern browser you can just open the XML
file
◼ Older browsers will ignore the XSL and just show you
the XML contents as continuous text
◼ You can use a program such as Xalan, MSXML,
or Saxon to create the HTML as a file
◼ This can be done on the server side, so that all the client
side browser sees is plain HTML
◼ The server can create the HTML dynamically from the
information currently in XML