Chapter 5 XML

XML is a markup language that structures data to make it more readable for humans and machines. It allows users to define their own tags to provide context to data. XML has several advantages over HTML, including being platform-independent and allowing one document to be displayed in different formats. XML documents are also modular and reusable. The document discusses XML syntax rules, validation via DTDs and schemas, and how XSLT can transform XML documents into other formats like HTML.
CHAPTER FIVE

eXtensible Markup Language (XML)

Why XML?

Although HTML is widely used for formatting and structuring Web documents, it is not suitable for specifying structured data that is extracted from databases.

A new language, XML, has emerged as the standard for structuring and exchanging data over the Web.

What is XML?

• XML stands for eXtensible Markup Language.
• A markup language is used to provide information about a document.
• Tags are added to the document to provide the extra information.
• XML tags give a reader some idea of what the data means.
• XML and HTML have a similar syntax; both are derived from SGML.

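The Basic Rules

• XML is case sensitive.
• Every start tag must have a matching end tag (an empty element may be written as a single self-closing tag such as <br/>).
• Elements must be properly nested.
• The XML declaration, if present, must be the first statement in the document.
• Every document must contain exactly one root element.
• Attribute values must be enclosed in quotation marks.
• Certain characters, such as < and &, are reserved for parsing and must be escaped (&lt; and &amp;) when they appear as data.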
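
These rules are what a parser checks when it decides whether a document is well formed. As a minimal sketch, assuming Python's standard xml.etree.ElementTree module (the helper name is_well_formed and the sample documents are made up for this illustration), each broken rule below should make the parser reject the document:

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, one per rule.
samples = {
    "correct document": '<note lang="en"><to>Ali</to></note>',
    "case mismatch": "<note></Note>",
    "missing end tag": "<note><to>Ali</note>",
    "bad nesting": "<note><b><i>Ali</b></i></note>",
    "two root elements": "<note/><note/>",
    "unquoted attribute": "<note lang=en/>",
    "unescaped ampersand": "<note>Ali & Sons</note>",
    "escaped ampersand": "<note>Ali &amp; Sons</note>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the first and last samples are accepted; every other sample breaks exactly one of the rules.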

Encoding

• XML uses Unicode to encode characters.
• Unicode text can be stored in several encodings, such as UTF-8, UTF-16, and UTF-32.
• The most common one used in the West is UTF-8.
• UTF-8 is a variable-length code: each character is encoded in 1 to 4 bytes.
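
The variable length of UTF-8 is easy to observe. The short sketch below, in Python, encodes four sample characters (chosen here only for illustration) and counts the bytes each one needs; UTF-8 uses between one and four bytes per character:

```python
# Four sample characters: Latin "A", "e" with acute accent,
# the euro sign, and an emoji (written as escapes to stay ASCII-safe).
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

# Map each character to the number of bytes in its UTF-8 encoding.
byte_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, n in byte_lengths.items():
    print(f"U+{ord(ch):04X} takes {n} byte(s) in UTF-8")
```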

Example:
<?xml version="1.0"?>
<address>
  <name>
    <first>Mohammed</first>
    <last>Ali</last>
  </name>
  <email>[email protected]</email>
  <phone>05278743</phone>
  <birthday>
    <year>2001</year>
    <month>01</month>
    <day>09</day>
  </birthday>
</address>

XML Files are Trees

• An XML document has a single root node.
• A preorder traversal is usually used to visit the nodes.

address
    name
        first
        last
    email
    phone
    birthday
        year
        month
        day
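
A preorder traversal visits each node before its children, which is the same order in which the tags open in the file. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the function name preorder and the e-mail value are placeholders chosen for this illustration:

```python
import xml.etree.ElementTree as ET

# The address example; the e-mail value is a placeholder.
ADDRESS_XML = """<address>
  <name><first>Mohammed</first><last>Ali</last></name>
  <email>ali@example.com</email>
  <phone>05278743</phone>
  <birthday><year>2001</year><month>01</month><day>09</day></birthday>
</address>"""


def preorder(element, depth=0):
    """Yield (depth, tag) pairs, visiting each node before its children."""
    yield depth, element.tag
    for child in element:
        yield from preorder(child, depth + 1)


root = ET.fromstring(ADDRESS_XML)
for depth, tag in preorder(root):
    print("    " * depth + tag)
```

The printed, indented list has the same shape as the tree of the address document.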

HTML vs XML

HTML                               XML
• Fixed set of tags                • Extensible set of tags
• Presentation oriented            • Content oriented
• No data validation capabilities  • Standard data infrastructure
• Single presentation              • Allows multiple output forms
• Tags are used for display        • Tags are used to describe documents and data
Validation

• A well-formed document has a tree structure and obeys all the XML rules.
• A particular application may add more rules in either a DTD (document type definition) or a schema.
• Many specialized DTDs and schemas have been created to describe particular areas.
• DTDs were developed first, so they are not as comprehensive as schemas.

DTD: Document Type Definitions

• A DTD describes the tree structure of a document and something about its data.
• There are two data types, PCDATA and CDATA.
    • PCDATA is parsed character data.
    • CDATA is character data, not usually parsed.
• A DTD determines how many times a node may appear, and how child nodes are ordered.

DTD for the address Example

<!ELEMENT address (name, email, phone, birthday)>
<!ELEMENT name (first, last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT last (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
<!ELEMENT birthday (year, month, day)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT month (#PCDATA)>
<!ELEMENT day (#PCDATA)>
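
To see what "how child nodes are ordered" means in practice, the sketch below checks a document against the same content model. Python's standard library does not validate against DTDs, so this is a hand-written stand-in rather than a DTD validator: the CONTENT_MODEL table is copied by hand from the declarations above, and the function name check_structure and both test documents are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-copied from the DTD: element name -> required child sequence.
# An empty tuple stands for (#PCDATA): text only, no child elements.
CONTENT_MODEL = {
    "address": ("name", "email", "phone", "birthday"),
    "name": ("first", "last"),
    "birthday": ("year", "month", "day"),
    "first": (), "last": (), "email": (), "phone": (),
    "year": (), "month": (), "day": (),
}


def check_structure(element):
    """Return a list of problems; an empty list means the structure matches."""
    if element.tag not in CONTENT_MODEL:
        return [f"undeclared element <{element.tag}>"]
    expected = CONTENT_MODEL[element.tag]
    actual = tuple(child.tag for child in element)
    problems = []
    if actual != expected:
        problems.append(
            f"<{element.tag}>: expected children {expected}, found {actual}"
        )
    for child in element:
        problems.extend(check_structure(child))
    return problems


good = ET.fromstring(
    "<address><name><first>A</first><last>B</last></name>"
    "<email>e</email><phone>p</phone>"
    "<birthday><year>2001</year><month>01</month><day>09</day></birthday>"
    "</address>"
)
# Wrong order, missing children, and an empty <name>.
bad = ET.fromstring("<address><email>e</email><name/></address>")

print(check_structure(good))
for problem in check_structure(bad):
    print(problem)
```

A real validating parser reads the DTD itself and also handles occurrence indicators such as ?, * and +, which this sketch leaves out.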

Schemas

• Schemas are themselves XML documents.
• They were standardized after DTDs and provide more information about the document.
• They have a number of data types including string, decimal, integer, boolean, date, and time.
• They divide elements into simple and complex types.
• They also determine the tree structure and how many children a node may have.

Schema for the First address Example

<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="address">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="email" type="xs:string"/>
        <xs:element name="phone" type="xs:string"/>
        <xs:element name="birthday" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

• Here name is a single string and birthday is a single xs:date value (written as, for example, 1920-01-09), so this schema fits a flat address document rather than the nested example shown earlier.
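
Because a schema is itself an XML document, any XML parser can read it. The sketch below, using Python's standard xml.etree.ElementTree module, loads the address schema (with the standard XML Schema namespace URI and without the XML declaration) and lists each declared element with its type. It only reads the schema; it does not validate a document against it, and the variable names are chosen for this illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URI of XML Schema, bound to the xs: prefix in the schema.
XS = "http://www.w3.org/2001/XMLSchema"

SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="address">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="email" type="xs:string"/>
        <xs:element name="phone" type="xs:string"/>
        <xs:element name="birthday" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

root = ET.fromstring(SCHEMA)

# Collect every element declaration with its declared type.
# A declaration without a type attribute defines a complex type inline.
declared = [
    (el.get("name"), el.get("type", "complex type"))
    for el in root.iter(f"{{{XS}}}element")
]

for name, type_name in declared:
    print(f"{name}: {type_name}")
```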

XSLT
Extensible Stylesheet Language Transformations

• XSLT is used to transform one XML document into another, often an HTML document.
• A program (an XSLT processor) takes one XML document as input and produces another as output.
• If the resulting document is in HTML, it can be viewed in a web browser.
• This is a good way to display XML data.

A Style Sheet to Transform address.xml

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="address">
    <html><head><title>Address Book</title></head>
      <body>
        <xsl:value-of select="name"/>
        <br/><xsl:value-of select="email"/>
        <br/><xsl:value-of select="phone"/>
        <br/><xsl:value-of select="birthday"/>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

Result:
AMU MCA
[email protected]
123-45-6789
1920-01-09

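
The effect of such a template can be imitated by hand to make the input-to-output idea concrete. Python's standard library has no XSLT processor, so the sketch below is not XSLT: it is a hand-written Python stand-in that builds the same kind of page (one field per line, separated by line breaks). The function name address_to_html and all input values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat input; the values are placeholders, not data from the slides.
ADDRESS_XML = """<address>
  <name>Mohammed Ali</name>
  <email>ali@example.com</email>
  <phone>05278743</phone>
  <birthday>2001-01-09</birthday>
</address>"""


def address_to_html(xml_text):
    """Build an HTML page with one address field per line."""
    address = ET.fromstring(xml_text)
    html = ET.Element("html")
    head = ET.SubElement(html, "head")
    ET.SubElement(head, "title").text = "Address Book"
    body = ET.SubElement(html, "body")
    body.text = address.findtext("name")
    for field in ("email", "phone", "birthday"):
        br = ET.SubElement(body, "br")
        br.tail = address.findtext(field)  # text that follows the line break
    return ET.tostring(html, encoding="unicode", method="html")


page = address_to_html(ADDRESS_XML)
print(page)
```

With a real XSLT processor the template stays in a separate file, so the presentation can change without touching the program or the data.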
Parsers

• There are two principal models for parsers.
• SAX – Simple API for XML
    • Uses a callback method
    • Similar to event listeners in Java (javax)
• DOM – Document Object Model
    • Creates a parse tree
    • Requires a tree traversal
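
The difference between the two models shows up clearly in code. The following is a minimal sketch using Python's standard xml.sax and xml.dom.minidom modules; the class name TagCollector, the function name dom_tags, and the sample document are chosen for this illustration. With SAX the parser calls the handler while it reads; with DOM the program receives a finished tree and walks it.

```python
import xml.sax
from xml.dom import minidom

DOC = (
    "<address><name><first>Mohammed</first><last>Ali</last></name>"
    "<phone>05278743</phone></address>"
)


class TagCollector(xml.sax.ContentHandler):
    """SAX: the parser calls startElement back for every opening tag."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)


def dom_tags(node):
    """DOM: walk the finished parse tree and collect element names."""
    tags = [node.tagName]
    for child in node.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            tags.extend(dom_tags(child))
    return tags


# SAX: events arrive while the document is being read.
handler = TagCollector()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_result = handler.tags

# DOM: the whole tree is built first, then traversed.
dom_result = dom_tags(minidom.parseString(DOC).documentElement)

print(sax_result)
print(dom_result)
```

Both approaches report the elements in the same order, but SAX never holds the whole document in memory, while DOM allows the tree to be revisited and modified.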

Advantages of XML

• XML uses human, not computer, language. XML is readable and understandable, even by novices, and no more difficult to code than HTML.
• XML is platform independent and programming-language independent, so it can be used on any system and survives changes in technology.
• XML is text (Unicode) based.
    • Takes up less space.
    • Can be transmitted efficiently.
• One XML document can be displayed differently in different media.
    • HTML, video, CD, DVD
    • You only have to change the XML document in order to change all the rest.
• XML documents can be modularized. Parts can be reused.
Disadvantages of XML

• More difficult, demanding, and precise than HTML.
• Lack of browser support and end-user applications.
• The redundancy in XML syntax causes higher storage and transmission costs when the volume of data is large.
• XML files are usually large because of the format's verbose nature, although the size also depends on who writes the document.
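
The cost of the repeated tags can be measured directly. The sketch below, in Python with the standard xml.etree.ElementTree module, stores the same two made-up records once as XML and once as plain comma-separated lines, then compares the sizes. The records and element names are hypothetical, and the exact numbers depend on the tag names chosen.

```python
import xml.etree.ElementTree as ET

# Hypothetical records: (first name, last name, phone).
records = [("Mohammed", "Ali", "05278743"), ("Sara", "Omar", "05511223")]

# Build an XML document with one <address> element per record.
root = ET.Element("addresses")
for first, last, phone in records:
    entry = ET.SubElement(root, "address")
    ET.SubElement(entry, "first").text = first
    ET.SubElement(entry, "last").text = last
    ET.SubElement(entry, "phone").text = phone

xml_bytes = ET.tostring(root, encoding="utf-8")
csv_bytes = "\n".join(",".join(r) for r in records).encode("utf-8")

print(f"XML: {len(xml_bytes)} bytes, comma-separated: {len(csv_bytes)} bytes")
```

Every value is wrapped in a start tag and an end tag, so the markup outweighs the data here; the comma-separated form is smaller but no longer describes itself.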
