Chapter 6 XML

XML stands for eXtensible Markup Language. It provides additional information about a document by adding tags. Unlike HTML tags, which say how to display a document, XML tags describe the meaning of the data. To be considered well-formed, an XML document must follow specific rules, such as nesting tags properly and matching tag names exactly, including case. XML is commonly used to transfer data between systems and applications.

28-Jan-20

Wollo University
Kombolcha Institute of Technology
College of Informatics
Department of Information System

Introduction to Internet Programming I
Chapter 6
Introduction to Extensible Markup Language (XML)

Instructor: Habtamu Abate (M.Sc.)
Email: [email protected]

What is XML

• XML stands for eXtensible Markup Language.
• A markup language is used to provide information about a document.
• Tags are added to the document to provide the extra information.
• HTML tags tell a browser how to display the document.
• XML tags give a reader some idea what some of the data means.

What is XML Used For?

• XML documents are used to transfer data from one place to another, often over the Internet.
• XML subsets are designed for particular applications.
• One is RSS (Rich Site Summary or Really Simple Syndication). It is used to send breaking news bulletins from one web site to another.
• A number of fields have their own subsets. These include chemistry, mathematics, and book publishing.
• Most of these subsets are registered with the W3C (World Wide Web Consortium) and are available for anyone's use.

Advantages of XML

• XML is text (Unicode) based.
  • Takes up less space.
  • Can be transmitted efficiently.
• One XML document can be displayed differently in different media.
  • HTML, video, CD, DVD, and so on.
  • You only have to change the XML document in order to change all the rest.
• XML documents can be modularized. Parts can be reused.


Example of an HTML Document

<html>
  <head><title>Example</title></head>
  <body>
    <h1>This is an example of a page.</h1>
    <h2>Some information goes here.</h2>
  </body>
</html>

Example of an XML Document

<?xml version="1.0"?>
<address>
  <name>Alice Lee</name>
  <email>[email protected]</email>
  <phone>212-346-1234</phone>
  <birthday>1985-03-22</birthday>
</address>
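The XML example can be read by a program that looks fields up by tag name. The following is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module (the slides do not prescribe a language for this); the names ADDRESS_XML and record are illustrative only.

```python
import xml.etree.ElementTree as ET

# The address document from the slide, held in a string.
ADDRESS_XML = """<?xml version="1.0"?>
<address>
  <name>Alice Lee</name>
  <email>[email protected]</email>
  <phone>212-346-1234</phone>
  <birthday>1985-03-22</birthday>
</address>"""

root = ET.fromstring(ADDRESS_XML)

# Each child is identified by the tag that describes it, not by its position.
record = {child.tag: child.text for child in root}

print(root.tag)
print(record["name"], record["birthday"])
```

Because the tags name the data, the program can ask for "birthday" directly instead of guessing which line holds it.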

Difference Between HTML and XML

• HTML tags have a fixed meaning and browsers know what it is.
• XML tags are different for different applications, and users know what they mean.
• HTML tags are used for display.
• XML tags are used to describe documents and data.

XML Rules

• Tags are enclosed in angle brackets.
• Tags come in pairs with start-tags and end-tags.
• Tags must be properly nested.
  • <name><email>…</name></email> is not allowed.
  • <name><email>…</email></name> is.
• Tags that do not have end-tags must be terminated by a '/'.
  • <br /> is an example from HTML written in XML syntax (XHTML).
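The nesting and end-tag rules can be tried out with any XML parser. This is a small sketch, assuming Python's standard xml.etree.ElementTree module; the helper name is_well_formed is made up for this illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# End-tags in the wrong order: the elements overlap instead of nesting.
overlapping = "<name><email>a</name></email>"
# The inner element closes before the outer one.
nested = "<name><email>a</email></name>"
# An element with no end-tag is terminated with '/'.
empty = "<p>line one<br />line two</p>"

print(is_well_formed(overlapping), is_well_formed(nested), is_well_formed(empty))
```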


More XML Rules

• Tags are case sensitive.
  • <address> is not the same as <Address>
• Tag names may not begin with the letters xml in any combination of cases (XML, Xml, and so on); such names are reserved.
• Tags may not contain '<' or '&'.
• Tag names follow Java naming conventions, except that a colon and some other characters, such as a hyphen or period, are also allowed. They must begin with a letter or underscore and may not contain white space.
• Documents must have a single root element that contains all the others.

Encoding

• XML (like Java) uses Unicode to encode characters.
• Unicode comes in many flavors. The most common one used in the West is UTF-8.
• UTF-8 is a variable length code. Characters are encoded in 1 to 4 bytes.
• The first 128 characters in Unicode are ASCII, and UTF-8 encodes each of them in one byte.
• The code points between 128 and 255 cover some of the more common characters used in western Europe, such as ã, á, å, or ç. UTF-8 encodes them in two bytes.
• Two byte codes are also used for many other alphabets, such as Greek, Cyrillic, Hebrew, and Arabic.
• Three byte codes are used for most Asian ideographs.
• Four byte codes can handle any characters that are left, including rarer ideographs.
• Those using non-western languages should investigate other Unicode encodings, such as UTF-16.
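The variable length of UTF-8 is easy to observe. The sketch below, in Python, encodes one sample character from each size class; the chosen characters are arbitrary examples, not taken from the slides. UTF-8 as defined uses between one and four bytes per character.

```python
# One sample character for each UTF-8 length, written as escapes so the
# listing does not depend on the encoding of this file.
samples = {
    "A": "ASCII letter",
    "\u00e7": "c with cedilla, code point 231",
    "\u4e2d": "common CJK ideograph, U+4E2D",
    "\U00020000": "rare CJK ideograph, U+20000",
}

# Number of bytes each character occupies once encoded as UTF-8.
byte_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, description in samples.items():
    print(f"U+{ord(ch):04X} ({description}): {byte_lengths[ch]} byte(s)")
```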

Well-Formed Documents

• An XML document is said to be well-formed if it follows all the rules.
• An XML parser is used to check that all the rules have been obeyed.
• Recent browsers such as Internet Explorer 5 and Netscape 7 come with XML parsers.
• Parsers are also available for free download over the Internet. One is Xerces, from the Apache open-source project.
• Java 1.4 also supports an open-source parser.

XML Example Revisited

<?xml version="1.0"?>
<address>
  <name>Alice Lee</name>
  <email>[email protected]</email>
  <phone>212-346-1234</phone>
  <birthday>1985-03-22</birthday>
</address>

• Markup for the data aids understanding of its purpose.
• A flat text file is not nearly so clear.

Alice Lee
[email protected]
212-346-1234
1985-03-22

• The last line looks like a date, but what is it for?
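A parser not only rejects a document that breaks the well-formedness rules, it can also say where the problem is. The sketch below assumes Python's standard xml.etree.ElementTree module; first_error is a hypothetical helper written for this example, and the test documents are invented.

```python
import xml.etree.ElementTree as ET

def first_error(text):
    """Return (line, column) of the first well-formedness error, or None."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return err.position
    return None

# Tags are case sensitive, so </Address> does not close <address>.
case_mismatch = "<address>\n<name>Alice Lee</name>\n</Address>"
# A document may have only one root element.
two_roots = "<name>Alice Lee</name>\n<phone>212-346-1234</phone>"
well_formed = "<address>\n<name>Alice Lee</name>\n</address>"

for label, doc in [("case mismatch", case_mismatch),
                   ("two roots", two_roots),
                   ("well formed", well_formed)]:
    print(label, first_error(doc))
```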


Expanded Example

<?xml version="1.0"?>
<address>
  <name>
    <first>Alice</first>
    <last>Lee</last>
  </name>
  <email>[email protected]</email>
  <phone>123-45-6789</phone>
  <birthday>
    <year>1983</year>
    <month>07</month>
    <day>15</day>
  </birthday>
</address>

XML Files are Trees

address
  name
    first
    last
  email
  phone
  birthday
    year
    month
    day
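Since an XML file is a tree, a program can also go the other way: build the tree node by node and write it out as XML. A minimal sketch, assuming Python's standard xml.etree.ElementTree module; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Build the tree from the slide: one root, then its children in order.
address = ET.Element("address")

name = ET.SubElement(address, "name")
ET.SubElement(name, "first").text = "Alice"
ET.SubElement(name, "last").text = "Lee"

ET.SubElement(address, "email").text = "[email protected]"
ET.SubElement(address, "phone").text = "123-45-6789"

birthday = ET.SubElement(address, "birthday")
ET.SubElement(birthday, "year").text = "1983"
ET.SubElement(birthday, "month").text = "07"
ET.SubElement(birthday, "day").text = "15"

# Serialize the tree back to XML text (a string, without the declaration).
xml_text = ET.tostring(address, encoding="unicode")
print(xml_text)
```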

XML Trees

• An XML document has a single root node.
• The tree is a general ordered tree.
  • A parent node may have any number of children.
  • Child nodes are ordered, and may have siblings.
• Preorder traversals are usually used for getting information out of the tree.

Validity

• A well-formed document has a tree structure and obeys all the XML rules.
• A particular application may add more rules in either a DTD (document type definition) or in a schema. A document that also obeys these rules is said to be valid.
• Many specialized DTDs and schemas have been created to describe particular areas.
• These range from disseminating news bulletins (RSS) to chemical formulas.
• DTDs were developed first, so they are not as comprehensive as schemas.
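A preorder traversal visits a node first and then each of its child subtrees from left to right. The sketch below, assuming Python's standard xml.etree.ElementTree module, walks the expanded address example this way; the function name preorder is chosen for this illustration.

```python
import xml.etree.ElementTree as ET

EXPANDED_XML = """<address>
  <name><first>Alice</first><last>Lee</last></name>
  <email>[email protected]</email>
  <phone>123-45-6789</phone>
  <birthday><year>1983</year><month>07</month><day>15</day></birthday>
</address>"""

def preorder(node, depth=0):
    """Yield (depth, tag) pairs: the node itself, then each child subtree in order."""
    yield depth, node.tag
    for child in node:
        yield from preorder(child, depth + 1)

root = ET.fromstring(EXPANDED_XML)
visited = list(preorder(root))

# Print the tree as an indented outline.
for depth, tag in visited:
    print("  " * depth + tag)
```

ElementTree's own root.iter() visits elements in the same document order, so the hand-written function is only there to make the traversal explicit.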


Document Type Definitions

• A DTD describes the tree structure of a document and something about its data.
• There are two data types, PCDATA and CDATA.
  • PCDATA is parsed character data.
  • CDATA is character data, not usually parsed.
• A DTD determines how many times a node may appear, and how child nodes are ordered.

DTD for address Example

<!ELEMENT address (name, email, phone, birthday)>
<!ELEMENT name (first, last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT last (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
<!ELEMENT birthday (year, month, day)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT month (#PCDATA)>
<!ELEMENT day (#PCDATA)>
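To see what the DTD for the address example demands, its rules can be copied by hand into a small table and checked against a parsed document. This is only a sketch of the idea in Python, not a DTD validator: the table CONTENT_MODEL and the function conforms are written for this illustration, and they handle only the fixed child sequences used in this particular DTD.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for the DTD: each element maps to the exact ordered
# list of children it must have, or to None for #PCDATA (text only).
CONTENT_MODEL = {
    "address": ["name", "email", "phone", "birthday"],
    "name": ["first", "last"],
    "birthday": ["year", "month", "day"],
    "first": None, "last": None, "email": None, "phone": None,
    "year": None, "month": None, "day": None,
}

def conforms(node):
    """Check node and all its descendants against CONTENT_MODEL."""
    if node.tag not in CONTENT_MODEL:
        return False
    expected = CONTENT_MODEL[node.tag]
    children = list(node)
    if expected is None:
        return not children          # #PCDATA: no child elements allowed
    if [child.tag for child in children] != expected:
        return False                 # wrong children or wrong order
    return all(conforms(child) for child in children)

good = ET.fromstring(
    "<address><name><first>Alice</first><last>Lee</last></name>"
    "<email>[email protected]</email><phone>123-45-6789</phone>"
    "<birthday><year>1983</year><month>07</month><day>15</day></birthday>"
    "</address>")
# Same data, but <last> comes before <first>.
bad = ET.fromstring(
    "<address><name><last>Lee</last><first>Alice</first></name>"
    "<email>[email protected]</email><phone>123-45-6789</phone>"
    "<birthday><year>1983</year><month>07</month><day>15</day></birthday>"
    "</address>")

print(conforms(good), conforms(bad))
```

Both documents are well-formed; only the first one follows the order of children that the DTD requires.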

Schemas

• Schemas are themselves XML documents.
• They were standardized after DTDs and provide more information about the document.
• They have a number of data types including string, decimal, integer, boolean, date, and time.
• They divide elements into simple and complex types.
• They also determine the tree structure and how many children a node may have.

Schema for First address Example

<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="address">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="email" type="xs:string"/>
        <xs:element name="phone" type="xs:string"/>
        <xs:element name="birthday" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
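The extra information a schema carries, compared with a DTD, is the data type of each element. The sketch below imitates the address schema by hand in Python: it is not a schema validator, the names is_date, is_string, ADDRESS_SEQUENCE and matches_schema are invented for this example, and the date check covers only the plain yyyy-mm-dd form.

```python
import datetime
import re
import xml.etree.ElementTree as ET

def is_date(text):
    """Accept only yyyy-mm-dd text that names a real calendar date."""
    if text is None or not re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", text):
        return False
    year, month, day = (int(part) for part in text.split("-"))
    try:
        datetime.date(year, month, day)
        return True
    except ValueError:
        return False

def is_string(text):
    """Any text, including none at all, is acceptable for a string element."""
    return True

# Hand-written mirror of the xs:sequence in the schema: (name, type check).
ADDRESS_SEQUENCE = [
    ("name", is_string),
    ("email", is_string),
    ("phone", is_string),
    ("birthday", is_date),
]

def matches_schema(root):
    """Check element order and simple types for an <address> element."""
    children = list(root)
    if root.tag != "address" or len(children) != len(ADDRESS_SEQUENCE):
        return False
    return all(child.tag == name and check(child.text)
               for child, (name, check) in zip(children, ADDRESS_SEQUENCE))

valid = ET.fromstring(
    "<address><name>Alice Lee</name><email>[email protected]</email>"
    "<phone>212-346-1234</phone><birthday>1985-03-22</birthday></address>")
invalid = ET.fromstring(
    "<address><name>Alice Lee</name><email>[email protected]</email>"
    "<phone>212-346-1234</phone><birthday>22 March 1985</birthday></address>")

print(matches_schema(valid), matches_schema(invalid))
```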


Explanation of Example Schema

<?xml version="1.0" encoding="ISO-8859-1" ?>
• ISO-8859-1, Latin-1, is the same as UTF-8 in the first 128 characters.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
• http://www.w3.org/2001/XMLSchema is the namespace of the XML Schema standard.

<xs:element name="address">
<xs:complexType>
• This states that address is a complex type element.

<xs:sequence>
• This states that the following elements form a sequence and must come in the order shown.

<xs:element name="name" type="xs:string"/>
• This says that the element, name, must be a string.

<xs:element name="birthday" type="xs:date"/>
• This states that the element, birthday, is a date. Dates are always of the form yyyy-mm-dd.

XSLT
Extensible Stylesheet Language Transformations

• XSLT is used to transform one XML document into another, often an HTML document.
• The Transform classes are now part of Java 1.4.
• A program is used that takes as input one XML document and produces as output another.
• If the resulting document is in HTML, it can be viewed by a web browser.
• This is a good way to display XML data.
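The idea of a program that takes one XML document as input and produces another as output can be sketched without an XSLT processor. The code below is ordinary Python using the standard xml.etree.ElementTree module, not XSLT; the function address_to_html and the page title are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

def address_to_html(xml_text):
    """Turn an <address> document into a small HTML page, one field per line."""
    address = ET.fromstring(xml_text)

    html = ET.Element("html")
    head = ET.SubElement(html, "head")
    ET.SubElement(head, "title").text = "Address Book"
    body = ET.SubElement(html, "body")

    fields = ["name", "email", "phone", "birthday"]
    for index, field in enumerate(fields):
        value = address.findtext(field, default="")
        if index == 0:
            body.text = value
        else:
            # Text that follows a <br> is stored in that element's tail.
            ET.SubElement(body, "br").tail = value
    return ET.tostring(html, encoding="unicode", method="html")

page = address_to_html(
    "<address><name>Alice Lee</name><email>[email protected]</email>"
    "<phone>212-346-1234</phone><birthday>1985-03-22</birthday></address>")
print(page)
```

An XSLT style sheet expresses the same mapping declaratively, so the transformation can be changed without rewriting a program.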

A Style Sheet to Transform address.xml

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="address">
    <html><head><title>Address Book</title></head>
      <body>
        <xsl:value-of select="name"/>
        <br/><xsl:value-of select="email"/>
        <br/><xsl:value-of select="phone"/>
        <br/><xsl:value-of select="birthday"/>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

Parsers

• There are two principal models for parsers.
• SAX – Simple API for XML
  • Uses a call-back method
  • Similar to javax listeners
• DOM – Document Object Model
  • Creates a parse tree
  • Requires a tree traversal
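The two parser models, SAX and DOM, can be compared side by side. The sketch below uses Python's standard xml.sax and xml.dom.minidom modules as stand-ins for the Java parsers the slides have in mind; the class name TagCollector is invented for this example.

```python
import xml.sax
from xml.dom import minidom

ADDRESS_XML = (
    "<address><name>Alice Lee</name><email>[email protected]</email>"
    "<phone>212-346-1234</phone><birthday>1985-03-22</birthday></address>"
)

class TagCollector(xml.sax.ContentHandler):
    """SAX: the parser calls these methods back as it reads the document."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def startElement(self, name, attrs):
        # Called once for every start-tag, in document order.
        self.tags.append(name)

handler = TagCollector()
xml.sax.parseString(ADDRESS_XML.encode("utf-8"), handler)
print(handler.tags)

# DOM: the whole document is parsed into a tree first, then traversed.
document = minidom.parseString(ADDRESS_XML)
root = document.documentElement
dom_tags = [node.tagName for node in root.childNodes
            if node.nodeType == node.ELEMENT_NODE]
print(root.tagName, dom_tags)
```

SAX never holds the whole document in memory, which suits large files; DOM keeps the full tree, which suits programs that need to move around in it.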


Questions?

Practice:
Lab Exercise # 1, 2 and 3
