Chapter 1 XML Basic3
Outline
• Introduction
• XML tree
• XML syntax rules
• XML entity references
• XML elements
• XML attributes
• XML namespaces
What is XML?
• XML stands for eXtensible Markup Language
• XML is a markup language much like HTML
• XML was designed to describe data, not to display
data
• XML tags are not predefined. You must define your
own tags
• XML is designed to be self-descriptive
• XML is a W3C Recommendation
The Difference between XML and HTML
XML Makes Your Data More Available
• Different applications can access your data, not only in HTML
pages, but also from XML data sources.
• With XML, your data can be available to all kinds of "reading
machines" (handheld computers, voice machines, news feeds,
etc.), which also makes it more accessible to blind people and
people with other disabilities.
Exercise: Write the xml code for the following tree
Solution
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J. K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
Cont’
• The root element in the example is
<bookstore>. All <book> elements in the
document are contained within <bookstore>.
• The <book> element has 4 children: <title>,
<author>, <year>, <price>.
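The tree structure described above can be inspected programmatically. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module and a shortened copy of the bookstore document; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the bookstore document from the solution slide.
BOOKSTORE_XML = """
<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J. K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>
"""

root = ET.fromstring(BOOKSTORE_XML)

# The root element contains every <book> element.
root_tag = root.tag

# Iterating over an element yields its direct children, in document order.
first_book_children = [child.tag for child in root[0]]

# findall("book") returns the <book> children of the root.
titles = [book.findtext("title") for book in root.findall("book")]

print(root_tag)
print(first_book_children)
print(titles)
```

Each `<book>` is a child of the root, and each `<title>`, `<author>`, `<year>`, and `<price>` is a child of its `<book>`.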
XML Syntax Rules
2. XML Tags are Case Sensitive
3. XML Elements Must be Properly Nested
• In HTML, you might see improperly nested
elements:
<b><i>This text is bold and
italic</b></i>
In XML, all elements must be properly
nested within each other:
<b><i>This text is bold and
italic</i></b>
In the example above, "properly nested"
simply means that since the <i> element is
opened inside the <b> element, it must be
closed inside the <b> element.
4. XML Documents Must Have a Root Element
5. XML Attribute Values Must be Quoted
• XML elements can have attributes in name/value pairs just
like in HTML.
• In XML, the attribute values must always be quoted.
• Study the two XML documents below. The first one is
incorrect, the second is correct:
<note date=12/11/2007>
<to>Tove</to>
<from>Jani</from>
</note>
<note date="12/11/2007">
<to>Tove</to>
<from>Jani</from>
</note>
The error in the first document is that the date attribute in the
note element is not quoted.
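The syntax rules above can be checked with any XML parser, because a parser must reject a document that breaks them. The sketch below assumes Python's standard `xml.etree.ElementTree` module; the helper name `is_well_formed` and the sample strings are illustrative, not part of the original slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts xml_text, False if it raises ParseError."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# One small document per rule; the labels are for display only.
examples = {
    "mismatched tag case": "<Message>hi</message>",
    "improper nesting": "<b><i>This text is bold and italic</b></i>",
    "proper nesting": "<b><i>This text is bold and italic</i></b>",
    "two root elements": "<to>Tove</to><from>Jani</from>",
    "unquoted attribute": "<note date=12/11/2007><to>Tove</to></note>",
    "quoted attribute": '<note date="12/11/2007"><to>Tove</to></note>',
}

results = {label: is_well_formed(doc) for label, doc in examples.items()}

for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'NOT well-formed'}")
```

Only the properly nested and properly quoted documents are accepted; the other four each violate one of the rules.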
Entity References
• Some characters have a special meaning
in XML.
• If you place a character like "<" inside an
XML element, it will generate an error
because the parser interprets it as the
start of a new element.
• This will generate an XML error:
<message>if salary < 1000 then</message>
• To avoid this error, replace the "<" character with
an entity reference:
<message>if salary &lt; 1000 then</message>
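There are 5 predefined entity references in XML:
&lt;    <    less than
&gt;    >    greater than
&amp;   &    ampersand
&apos;  '    apostrophe
&quot;  "    quotation mark
Note: Only the characters "<" and "&" are strictly illegal in XML text. The greater-than
character is legal, but it is a good habit to replace it.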
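When XML is generated by a program, the replacement is usually done by a library function rather than by hand. The sketch below assumes Python's standard `xml.sax.saxutils.escape`, which by default replaces `&`, `<`, and `>`, together with `xml.etree.ElementTree` to show that the parser turns the entity reference back into the original character.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw_text = "if salary < 1000 then"

# Placing the raw text directly inside an element is rejected by the parser.
try:
    ET.fromstring(f"<message>{raw_text}</message>")
    unescaped_is_rejected = False
except ET.ParseError:
    unescaped_is_rejected = True

# escape() replaces "<" with the entity reference "&lt;".
escaped = escape(raw_text)
document = f"<message>{escaped}</message>"

# After parsing, the entity reference is resolved back to "<".
parsed_text = ET.fromstring(document).text

print(escaped)
print(parsed_text)
```

The application that reads the document therefore sees the original text, while the document itself stays well-formed.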
Comments in XML
The syntax for writing comments in XML is similar
to that of HTML.
<!-- This is a comment -->
White-space is preserved in XML
HTML truncates multiple white-space characters to
one single white-space:
With XML, the white-space in a document is not truncated.
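A short sketch of the white-space behavior, assuming Python's standard `xml.etree.ElementTree` module and a hypothetical `<greeting>` element: the run of spaces inside the element's text is handed to the application unchanged.

```python
import xml.etree.ElementTree as ET

# Text with a run of ten spaces between the two words.
spaced = "Hello" + " " * 10 + "Tove"
document = f"<greeting>{spaced}</greeting>"

# The parser returns the element text exactly as written.
parsed_text = ET.fromstring(document).text

print(repr(parsed_text))
print(len(parsed_text) == len(spaced))
```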
XML Elements
• What is an XML Element?
• An XML element is everything from
(including) the element's start tag to
(including) the element's end tag.
• An element can contain:
– other elements
– text
– attributes
– or a mix of all of the above...
Example
<bookstore>
<book category="CHILDREN">
<title>Harry Potter</title>
<author>J. K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title>Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
Cont’
• In the example above, <bookstore> and
<book> have element content,
because they contain other elements.
<book> also has
an attribute (category="CHILDREN").
<title>, <author>, <year>, and <price>
have text content because they contain
text.
Empty XML Elements
• An alternative syntax can be used for XML
elements with no content:
• Instead of writing a book element (with no
content) like this:
<book></book>
• It can be written like this:
<book />
• This sort of element syntax is called self-closing.
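The two spellings of an empty element are equivalent to a parser. A minimal sketch, assuming Python's standard `xml.etree.ElementTree` module, compares the parsed results of both forms:

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring("<book></book>")
short_form = ET.fromstring("<book />")

# Both forms produce an element with the same tag, no text, and no children.
same_tag = long_form.tag == short_form.tag
both_without_text = long_form.text is None and short_form.text is None
both_without_children = len(long_form) == 0 and len(short_form) == 0

print(same_tag, both_without_text, both_without_children)
```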
XML Naming Rules
• XML elements must follow these naming rules:
– Element names must start with a letter or underscore
– Names cannot start with a number or punctuation
character
– Names cannot start with the letters xml (or XML, or Xml,
etc.)
– Names can contain letters, digits, hyphens,
underscores, and periods
– Names cannot contain spaces
Any name can be used; no words are reserved (except xml).
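The naming rules above can be approximated with a regular expression. The sketch below is a simplification that only considers ASCII letters; real XML also allows many non-English letters, and colons are permitted but reserved for namespaces. The function name `follows_naming_rules` is hypothetical.

```python
import re

# First character: letter or underscore.
# Remaining characters: letters, digits, periods, underscores, hyphens.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def follows_naming_rules(name: str) -> bool:
    """Check a candidate element name against the simplified rules on the slide."""
    # Names starting with "xml" in any letter case are reserved.
    if name.lower().startswith("xml"):
        return False
    # fullmatch requires the whole name to fit the pattern, so spaces are rejected.
    return NAME_PATTERN.fullmatch(name) is not None


candidates = ["book_title", "_id", "first-name", "2nd_book", "xmlData", "first name"]
checked = {name: follows_naming_rules(name) for name in candidates}

for name, ok in checked.items():
    print(f"{name!r}: {'allowed' if ok else 'not allowed'}")
```

Note that `first-name` is allowed by the rules even though the best practices that follow recommend avoiding hyphens.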
Best Naming Practices
• Make names descriptive: <first_name>, <last_name>.
• Make names short and simple, like this: <book_title> not like
this: <the_title_of_the_book>.
• Avoid "-". If you name something "first-name", some
software may think you want to subtract name from first.
• Avoid ".". If you name something "first.name", some
software may think that "name" is a property of the object
"first."
• Avoid ":". Colons are reserved to be used for something
called namespaces (more later).
• Non-English letters like éòá are perfectly legal in XML, but
watch out for problems if your software doesn't support
them.
Common Naming Styles
Imagine that the author of the XML document added some extra information to it:
<note>
<date>2008-01-10</date>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
Avoid XML Attributes?
• Some of the problems with using attributes are:
– attributes cannot contain multiple values (elements can)
– attributes cannot contain tree structures (elements can)
– attributes are not easily expandable (for future changes)
• Attributes are difficult to read and maintain. Use
elements for data. Use attributes for information
that is not relevant to the data.
• Don't end up like this:
<note day="10" month="01" year="2008"
to="Tove" from="Jani" heading="Reminder"
body="Don't forget me this weekend!">
</note>
XML Attributes for Metadata
Sometimes ID references are assigned to elements. These IDs
can be used to identify XML elements in much the same way as
the id attribute in HTML. This example demonstrates this:
<messages>
<note id="501">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
<note id="502">
<to>Jani</to>
<from>Tove</from>
<heading>Re: Reminder</heading>
<body>I will not</body>
</note>
</messages>
Cont’
• The id attributes above are for identifying the
different notes. They are not part of the notes
themselves.
• Metadata (data about data) should be stored
as attributes, and the data itself should be
stored as elements.
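An id attribute makes it easy for a program to pick out one specific note. The sketch below assumes Python's standard `xml.etree.ElementTree` module, whose `find` method supports the `[@attribute='value']` predicate; the document is the messages example from the previous slide.

```python
import xml.etree.ElementTree as ET

MESSAGES_XML = """
<messages>
  <note id="501">
    <to>Tove</to>
    <from>Jani</from>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend!</body>
  </note>
  <note id="502">
    <to>Jani</to>
    <from>Tove</from>
    <heading>Re: Reminder</heading>
    <body>I will not</body>
  </note>
</messages>
"""

root = ET.fromstring(MESSAGES_XML)

# Metadata: the id attribute of every note, read with get().
note_ids = [note.get("id") for note in root.findall("note")]

# Select the note whose id attribute equals "502".
reply = root.find("note[@id='502']")

# Data: the child elements of the selected note.
reply_heading = reply.findtext("heading")
reply_body = reply.findtext("body")

print(note_ids)
print(reply_heading, "/", reply_body)
```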
XML Namespaces
Name Conflicts
•In XML, element names are defined by the developer. This often results in a
conflict when trying to mix XML documents from different XML applications.
•This XML carries HTML table information:
<table>
<tr>
<td>Apples</td>
<td>Bananas</td>
</tr>
</table>
•This XML carries information about a table (a piece of furniture):
<table>
<name>African Coffee Table</name>
<width>80</width>
<length>120</length>
</table>
Cont’
• If these XML fragments were added together,
there would be a name conflict. Both contain
a <table> element, but the elements have
different content and meaning.
Solving the Name Conflict Using a Prefix
•Name conflicts in XML can easily be avoided using a name prefix.
•This XML carries information about an HTML table, and a piece of furniture:
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
•In the example above, there will be no conflict because the two <table>
elements have different names.
XML Namespaces - The xmlns Attribute
• When using prefixes in XML, a so-called namespace for the
prefix must be defined.
• The namespace is defined by the xmlns attribute in the start
tag of an element.
• The namespace declaration has the following syntax: xmlns:prefix="URI".
<root>
<h:table xmlns:h="http://www.w3.org/TR/html4/">
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table xmlns:f="http://www.w3schools.com/furniture">
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
</root>
Cont’
• In the example above, the xmlns attributes in
the <h:table> and <f:table> start tags give the h: and f:
prefixes a qualified namespace.
• When a namespace is defined for an element,
all child elements with the same prefix are
associated with the same namespace.
•Namespaces can be declared in the elements where they are
used or in the XML root element:
<root xmlns:h="http://www.w3.org/TR/html4/"
xmlns:f="http://www.w3schools.com/furniture">
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
</root>
• Note: The namespace URI is not used by the parser to look
up information.
• The purpose of the URI is to give the namespace a unique name.
However, companies often use the namespace URI as a pointer
to a web page containing namespace information.
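To see how a parser treats prefixes and namespace URIs, the following minimal sketch uses Python's standard `xml.etree.ElementTree` module, which expands every prefixed name to the form `{URI}localname`. The `NS` mapping is an assumption of this sketch: the prefixes used for searching are local to the program and only the URIs have to match the document. No network access takes place, which illustrates that the URI is just a unique name.

```python
import xml.etree.ElementTree as ET

ROOT_XML = """
<root xmlns:h="http://www.w3.org/TR/html4/"
      xmlns:f="http://www.w3schools.com/furniture">
  <h:table>
    <h:tr>
      <h:td>Apples</h:td>
      <h:td>Bananas</h:td>
    </h:tr>
  </h:table>
  <f:table>
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>
</root>
"""

# Prefix-to-URI mapping used only for the searches below.
NS = {
    "h": "http://www.w3.org/TR/html4/",
    "f": "http://www.w3schools.com/furniture",
}

root = ET.fromstring(ROOT_XML)

# The two <table> elements no longer clash: each tag carries its namespace URI.
expanded_tags = [child.tag for child in root]

# Search within one namespace at a time.
cells = [td.text for td in root.findall("h:table/h:tr/h:td", NS)]
furniture_name = root.findtext("f:table/f:name", namespaces=NS)

print(expanded_tags)
print(cells)
print(furniture_name)
```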
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<h2>My CD Collection</h2>
<table border="1">
<tr>
<th style="text-align:left">Title</th>
<th style="text-align:left">Artist</th>
</tr>
<xsl:for-each select="catalog/cd">
<tr>
<td><xsl:value-of select="title"/></td>
<td><xsl:value-of select="artist"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
XML Encoding
Viewing XML Files
XML editor tools
I recommend:
• EditiX
• Altova XMLSpy
QUIZ
1. Select which of the following XML documents
are well-formed XML documents.
a . <productname>Electric Water
Heater&piccoro_</productname>
b . <productname>”Water
Purifier(<<6>>)”</productname>
c . <productname>Dehumidifier "XZ001"
</productname>
d . <productname/ >