XML

XML is a meta markup language designed for representing text documents and data, offering advantages such as portability, readability, and flexibility. It follows specific rules for structure, including case sensitivity and proper nesting of tags, and can be validated against Document Type Definitions (DTDs). XSLT is used to transform XML documents into other formats, such as HTML, enhancing their usability in web applications.

XML

Dr. Koyel Datta Gupta


XML

● XML is a meta markup language for text documents and textual data.
● XML lets you define languages ("applications") for representing text documents and textual data.

Possible Advantages of Using XML

● Truly portable data
● Easily readable by human users
● Very expressive (semantics near data)
● Very flexible and customizable (no finite tag set)
● Easy to use from programs (libraries available)
● Easy to convert into other representations (XML transformation languages)
● Many additional standards and tools
● Widely used and supported

Example of an XML Document

<?xml version="1.0"?>
<address>
  <name>Alice Lee</name>
  <email>[email protected]</email>
  <phone>212-346-1234</phone>
  <birthday>1985-03-22</birthday>
</address>

Difference Between HTML and XML

● HTML tags have a fixed meaning, and browsers know what each tag means.
● XML tags differ from one application to another, and the people who define and use them know what they mean.
● HTML tags are used for display.
● XML tags are used to describe documents and data.

XML Rules

● Tags are enclosed in angle brackets.
● Tags come in pairs: a start-tag and an end-tag.
● Tags must be properly nested.
  ● <name><email>…</name></email> is not allowed.
  ● <name><email>…</email></name> is.
● Tags that do not have end-tags (empty elements) must be terminated by a '/'.
  ● <br /> is an HTML example.

More XML Rules

● Tags are case sensitive.
  ● <address> is not the same as <Address>.
● Names that begin with "xml" in any combination of cases are reserved and should not be used for your own tags.
● Text and attribute values may not contain a literal '<' or '&'. Write &lt; and &amp; instead.
● Tag names resemble Java identifiers, except that hyphens, periods, and a single colon (reserved for namespaces) are also allowed. They must begin with a letter or underscore and may not contain white space.
● Documents must have a single root element that encloses everything else in the document.
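
The rules above can be checked mechanically by handing small fragments to a parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The helper name is_well_formed is an assumption made for this illustration and does not come from the slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts `text` as well-formed XML.

    Hypothetical helper for illustration only.
    """
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Properly nested tags are accepted.
print(is_well_formed("<name><email>a</email></name>"))
# Overlapping tags are rejected.
print(is_well_formed("<name><email>a</name></email>"))
# Tags are case sensitive: </Address> does not close <address>.
print(is_well_formed("<address>x</Address>"))
# An empty element terminated by '/' is accepted.
print(is_well_formed("<p>line<br /></p>"))
# Two root elements are rejected.
print(is_well_formed("<a/><b/>"))
# A raw '&' is rejected, the escaped form is accepted.
print(is_well_formed("<t>A & B</t>"))
print(is_well_formed("<t>A &amp; B</t>"))
```

The parser only reports whether the fragment is well-formed. It does not say anything about whether the tag names make sense for a particular application.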

Encoding

● XML (like Java) uses Unicode to represent characters.
● Unicode can be encoded in several ways. The most common encoding is UTF-8, which is also the default for XML.
● UTF-8 is a variable-length code. Characters are encoded in 1, 2, 3, or 4 bytes.
● The first 128 characters in Unicode are ASCII. In UTF-8 each takes 1 byte.
● The code points between 128 and 255 cover some of the more common characters used in western Europe, such as ã, á, å, or ç. In UTF-8 each takes 2 bytes.
● Two-byte codes also cover alphabets such as Greek, Cyrillic, Hebrew, and Arabic.
● Three-byte codes cover the rest of the Basic Multilingual Plane, including most Asian ideographs.
● Four-byte codes handle everything that is left, such as rare ideographs.
● Those working mainly with non-western languages may want to investigate other Unicode encodings, such as UTF-16.
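
The byte counts are easy to verify by encoding a few characters. This is a small sketch in Python. The sample characters and the helper name utf8_length are chosen for this illustration. Note that UTF-8 also has a three-byte form, which is where most Asian ideographs fall.

```python
def utf8_length(ch):
    """Number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


# Illustrative sample characters, one per UTF-8 length class.
samples = {
    "ASCII letter A": "A",                  # U+0041
    "c with cedilla": "\u00e7",             # U+00E7
    "CJK ideograph": "\u4e2d",              # U+4E2D
    "rare CJK ideograph": "\U00020000",     # U+20000
}

for label, ch in samples.items():
    print(f"{label}: U+{ord(ch):04X} -> {utf8_length(ch)} byte(s)")
```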

Well-Formed Documents

● An XML document is said to be well-formed if it follows all the rules.
● An XML parser is used to check that all the rules have been obeyed.
● Recent browsers come with XML parsers.
● Java 1.4 also includes an open-source parser.

XML Example Revisited

<?xml version="1.0"?>
<address>
  <name>Alice Lee</name>
  <email>[email protected]</email>
  <phone>212-346-1234</phone>
  <birthday>1985-03-22</birthday>
</address>

● Markup for the data aids understanding of its purpose.
● A flat text file is not nearly so clear.

Alice Lee
[email protected]
212-346-1234
1985-03-22

● The last line looks like a date, but what is it for?

Expanded Example

<?xml version="1.0"?>
<address>
  <name>
    <first>Alice</first>
    <last>Lee</last>
  </name>
  <email>[email protected]</email>
  <phone>123-45-6789</phone>
  <birthday>
    <year>1983</year>
    <month>07</month>
    <day>15</day>
  </birthday>
</address>

XML Files are Trees

address
├── name
│   ├── first
│   └── last
├── email
├── phone
└── birthday
    ├── year
    ├── month
    └── day

XML Trees

● An XML document has a single root node.
● The tree is a general ordered tree.
  ● A parent node may have any number of children.
  ● Child nodes are ordered, and may have siblings.
● Preorder traversals are usually used for getting information out of the tree.
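
A preorder traversal visits a node first and then each of its children from left to right. The following is a minimal sketch in Python using the standard xml.etree.ElementTree module on the expanded address document. The function name preorder is an assumption for this illustration.

```python
import xml.etree.ElementTree as ET

# The expanded address example, with the email element left empty for brevity.
DOC = """<address>
  <name><first>Alice</first><last>Lee</last></name>
  <email/>
  <phone>123-45-6789</phone>
  <birthday><year>1983</year><month>07</month><day>15</day></birthday>
</address>"""


def preorder(node):
    """Yield tag names in preorder: the node itself, then its children in order."""
    yield node.tag
    for child in node:
        yield from preorder(child)


root = ET.fromstring(DOC)
tags = list(preorder(root))
print(tags)

# ElementTree's own iter() walks the tree in the same document order.
builtin_order = [element.tag for element in root.iter()]
print(tags == builtin_order)
```

Preorder is the natural choice here because it matches the order in which the start-tags appear in the file.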

Validity

● A well-formed document has a tree structure and obeys all the XML rules.
● A particular application may add more rules in either a DTD (document type definition) or a schema. A document that also obeys those rules is said to be valid.
● Many specialized DTDs and schemas have been created to describe particular areas.
● These range from disseminating news bulletins (RSS) to chemical formulas.
● DTDs were developed first, so they are not as comprehensive as schemas.
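
Document Type Definitions

● A DTD describes the tree structure of a document and something about its data.
● There are two data types, PCDATA and CDATA.
  ● PCDATA is parsed character data. It is the type used for the text content of elements.
  ● CDATA is character data, not usually parsed. It is the type used for plain-text attribute values.
● A DTD determines how many times a node may appear (with the suffixes ?, *, and +) and how child nodes are ordered (with ',' for sequence and '|' for choice).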
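
Defining Attributes in the DTD

● Attributes are declared with an ATTLIST declaration, which names the element and then lists each attribute with its type and default.
● #REQUIRED means the attribute must be present, #IMPLIED means it is optional, and a quoted value is the default used when the attribute is omitted.
● The following declarations define the attributes title, date, and author for the song element:

<!ELEMENT music (song)>
<!ATTLIST song
  title  CDATA #REQUIRED
  date   CDATA #IMPLIED
  author CDATA "unknown">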
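
The effect of the default value in the ATTLIST declaration above can be observed with a parser. This is a hedged sketch in Python using the standard xml.etree.ElementTree module. It places the declarations in an internal DTD subset so that the document is self-contained. The declaration <!ELEMENT song (#PCDATA)> and the sample song are assumptions added for this illustration and are not part of the slides.

```python
import xml.etree.ElementTree as ET

# Internal DTD subset with the ATTLIST from the slide.
# The song element declaration and the sample content are illustrative assumptions.
DOC = """<?xml version="1.0"?>
<!DOCTYPE music [
  <!ELEMENT music (song)>
  <!ELEMENT song (#PCDATA)>
  <!ATTLIST song
    title  CDATA #REQUIRED
    date   CDATA #IMPLIED
    author CDATA "unknown">
]>
<music><song title="Example Song">la la</song></music>"""

song = ET.fromstring(DOC).find("song")

print(song.get("title"))   # given explicitly in the document
print(song.get("author"))  # filled in from the declared default
print(song.get("date"))    # optional and absent, so None
```

The parser behind ElementTree is non-validating: it applies declared defaults but does not reject a song that lacks the required title. Enforcing #REQUIRED needs a validating parser.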

Defining Entities in the DTD

● XML has predefined entities such as &amp;, and an attribute value can reference an entity.
● You can also define entities of your own in the DTD and reference them in the document as &entity-name;.

<!ENTITY entity-name "entity-value">

DTD for address Example

<!ELEMENT address (name, email, phone, birthday)>
<!ELEMENT name (first, last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT last (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
<!ELEMENT birthday (year, month, day)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT month (#PCDATA)>
<!ELEMENT day (#PCDATA)>

ENTITY example

● subject_name.dtd

<!ELEMENT subject_name (#PCDATA)>
<!ENTITY WT "WEB TECHNOLOGY">

● subject.xml

<!DOCTYPE subject_name SYSTEM "subject_name.dtd">
<subject_name>&WT;</subject_name>
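
Entity expansion can be seen by parsing the document and reading back the element text. This is a minimal sketch in Python using the standard xml.etree.ElementTree module. ElementTree does not fetch external DTD files by default, so this sketch assumes the two declarations are moved into an internal DTD subset instead of subject_name.dtd.

```python
import xml.etree.ElementTree as ET

# Same declarations as subject_name.dtd, written as an internal subset
# (an assumption made so the example needs no external file).
DOC = """<?xml version="1.0"?>
<!DOCTYPE subject_name [
  <!ELEMENT subject_name (#PCDATA)>
  <!ENTITY WT "WEB TECHNOLOGY">
]>
<subject_name>&WT;</subject_name>"""

root = ET.fromstring(DOC)

# The parser replaces the reference &WT; with the entity's value.
print(root.tag, "->", root.text)
```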

XSLT
Extensible Stylesheet Language Transformations

● XSLT is used to transform one XML document into another, often an HTML document.
● The Transform classes (javax.xml.transform) have been part of Java since version 1.4.
● A program (an XSLT processor) takes one XML document as input and produces another as output.
● If the resulting document is in HTML, it can be viewed by a web browser.
● This is a good way to display XML data.

A Style Sheet to Transform address.xml

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="address">
    <html>
      <head><title>Address Book</title></head>
      <body>
        <xsl:value-of select="name"/>
        <br/><xsl:value-of select="email"/>
        <br/><xsl:value-of select="phone"/>
        <br/><xsl:value-of select="birthday"/>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
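
To make the mapping described by the style sheet concrete, the sketch below produces the same kind of HTML by hand in Python. It is not an XSLT processor: Python's standard library does not include one, so the template is hard-coded here as an illustration. The function names value_of and transform are assumptions, and value_of only imitates xsl:value-of for a simple child name.

```python
import xml.etree.ElementTree as ET
from html import escape

# The flat address example (email element omitted from the sample data).
ADDRESS_XML = """<address>
  <name>Alice Lee</name>
  <phone>212-346-1234</phone>
  <birthday>1985-03-22</birthday>
</address>"""


def value_of(context, child_name):
    """Imitate xsl:value-of: all text inside the first matching child, or ''."""
    child = context.find(child_name)
    return "" if child is None else "".join(child.itertext())


def transform(xml_text):
    """Hand-coded equivalent of the template that matches 'address'."""
    root = ET.fromstring(xml_text)
    if root.tag != "address":
        raise ValueError("expected an <address> root element")
    fields = ("name", "email", "phone", "birthday")
    body = "<br/>".join(escape(value_of(root, field)) for field in fields)
    return ("<html><head><title>Address Book</title></head>"
            "<body>" + body + "</body></html>")


result = transform(ADDRESS_XML)
print(result)
```

A missing element, such as email in this sample, simply contributes an empty string, which is also what xsl:value-of yields when its select expression matches nothing.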
