XML Basics Extensible Markup Language: Divya Panta 21109

XML (eXtensible Markup Language) is used to describe and store data in a structured format. It allows users to define their own tags to provide context and meaning to data. XML is often used to separate data from presentation and store data in a plain text, machine-readable format that can be processed by different applications. An XML document must follow specific rules to be considered "well-formed", including properly nested tags and a single root element.


XML BASICS

eXtensible Markup Language

DIVYA PANTA
21109
What is XML
XML stands for eXtensible Markup Language.
Markup languages are designed for the
processing, definition and presentation of text.
The language specifies code for formatting, both
the layout and style, within a text file.
Tags are added to the document to provide the
extra information.
HTML tags tell a browser how to display the
document.
XML tags give a reader some idea what some of
the data means.
What is XML Used For?
 XML is often used for distributing data over the Internet.
 XML is used in many aspects of web development.
 XML is often used to separate data from presentation.
 XML tags are not predefined like HTML tags are.
 XML was designed to carry data - with focus on what data is.
 XML stores data in plain text format. This provides a
software- and hardware-independent way of storing,
transporting, and sharing data.
 With XML, data can be available to all kinds of "reading
machines" like people, computers, voice machines, news
feeds, etc.
Advantages of XML
 Simplicity
Information coded in XML is easy to read and understand,
plus it can be processed easily by computers.
 Openness
XML is a W3C standard,
endorsed by software industry market leaders.
 Extensibility
There is no fixed set of tags.
New tags can be created as they are needed.
 Self-description
XML documents can be stored without schemas because they contain
metadata; an XML tag can carry any number of attributes,
such as author or version.
 Contains machine-readable context information
Tags, attributes and element structure provide context information ...
opening up new possibilities for highly efficient search engines,
intelligent data mining, agents, etc.
 Separates content from presentation
XML tags describe meaning not presentation.
The look and feel of an XML document can be controlled by XSL
stylesheets, allowing the look of a document (or of a complete Web
site) to be changed without touching the content of the document.
Multiple views or presentations of the same content are easily rendered.
 Facilitates the comparison and aggregation of data
The tree structure of XML documents allows documents to be
compared and aggregated efficiently element by element.
 Can embed multiple data types
XML documents can carry or reference many data types, from
multimedia data (image, sound, video) to active components (Java
applets, ActiveX). Binary data has to be text-encoded (for example
as Base64) or referenced by URI.
 Rapid adoption by industry
Software AG, IBM, Sun, Microsoft, Netscape, DataChannel, SAP ...
Example of an HTML Document
<html>
<head><title>Example</title></head>
<body>
<h1>This is an example of a page.</h1>
<h2>Some information goes here.</h2>
</body>
</html>
Example of an XML Document
<?xml version="1.0"?>
<address>
<name>Alice Lee</name>
<email>[email protected]</email>
<phone>212-346-1234</phone>
<birthday>1985-03-22</birthday>
</address>
Difference Between HTML and XML
HTML tags have a fixed meaning and browsers know
what it is.
XML tags are different for different applications, and
users know what they mean.
HTML tags are used for display.
XML tags are used to describe documents and data.
XML Rules
Tags are enclosed in angle brackets.
Tags come in pairs with start-tags and end-tags.
Tags must be properly nested.
 <name><email>…</name></email> is not allowed.
 <name><email>…</email></name> is.
Tags that do not have end-tags must be terminated by a
‘/’.
<br /> is an XHTML example.
More XML Rules
Tags are case sensitive.
 <address> is not the same as <Address>
Names beginning with "xml", in any combination of cases,
are reserved and may not be used as tag names.
Tag names may not contain ‘<’ or ‘&’.
Tag names resemble Java identifiers, except that hyphens,
periods and a colon are also allowed.
They must begin with a letter or an underscore and may not
contain white space.
Documents must have a single root tag that begins
the document.
Encoding
 XML (like Java) uses Unicode to encode characters.
 Unicode comes in many flavors. The most common one used
in the West is UTF-8.
 UTF-8 is a variable-length code. Characters are encoded in 1
to 4 bytes.
 The first 128 characters in Unicode are ASCII; UTF-8 encodes
each of them in 1 byte.
 The code points between 128 and 255 cover some of the more
common characters used in western Europe, such as ã, á, å, or ç.
UTF-8 encodes these in 2 bytes.
 Two-byte codes also cover Greek, Cyrillic, Hebrew and Arabic
(code points up to U+07FF).
 Three-byte codes cover the rest of the Basic Multilingual Plane,
including most Asian ideographs.
 Four-byte codes handle the remaining characters, such as rarer
ideographs and emoji.
 Those working mainly with non-western languages may also want
to look at other Unicode encodings, such as UTF-16.
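The variable-length behaviour of UTF-8 is easy to observe directly. The sketch below is only an illustration: the four sample characters are chosen here, not taken from the slides, and it counts how many bytes each one needs.

```python
# Count the UTF-8 bytes needed for a few sample characters.
# The samples are illustrative choices: an ASCII letter, an accented
# Latin letter, a CJK ideograph, and an emoji.
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]

utf8_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, n in utf8_lengths.items():
    print(f"U+{ord(ch):04X}: {n} byte(s)")
```

The four characters need 1, 2, 3 and 4 bytes respectively.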
Well-Formed Documents
An XML document is said to be well-formed if it
follows all the rules.
An XML parser is used to check that all the rules
have been obeyed.
Browsers have shipped with XML parsers since versions
such as Internet Explorer 5 and Netscape 7.
Parsers are also available for free download over
the Internet. One is Xerces, from the Apache
open-source project.
Java 1.4 and later also ship with an XML parser in the standard library (JAXP).
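As a rough illustration of using a parser to check well-formedness, the sketch below uses Python's standard xml.etree.ElementTree module rather than Xerces or a browser parser. The helper name is_well_formed is hypothetical, introduced only for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as a well-formed document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<name><email>a</email></name>"))  # properly nested
print(is_well_formed("<name><email>a</name></email>"))  # overlapping tags
print(is_well_formed("<address></Address>"))            # case mismatch
print(is_well_formed("<name/><email/>"))                # two root elements
```

Only the first document is accepted; the other three each break one of the rules listed earlier (nesting, case sensitivity, single root).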
XML Example Revisited
<?xml version="1.0"?>
<address>
<name>Alice Lee</name>
<email>[email protected]</email>
<phone>212-346-1234</phone>
<birthday>1985-03-22</birthday>
</address>
 Markup for the data aids understanding of its purpose.
 A flat text file is not nearly so clear.
Alice Lee
[email protected]
212-346-1234
1985-03-22
 The last line looks like a date, but what is it for?
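Because each value is labelled, a program can look fields up by name instead of by position. The following is a minimal sketch using Python's standard xml.etree.ElementTree; the email element is left out because the address is obscured in the source, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The address record from the slide (email omitted, see above).
doc = """<address>
<name>Alice Lee</name>
<phone>212-346-1234</phone>
<birthday>1985-03-22</birthday>
</address>"""

root = ET.fromstring(doc)

# Map each child tag to its text, so every value keeps its label.
record = {child.tag: child.text for child in root}

print(record["birthday"])
```

With the flat text file, the same program would have to assume that the date is always the last line.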
Expanded Example
<?xml version="1.0"?>
<address>
  <name>
    <first>Alice</first>
    <last>Lee</last>
  </name>
  <email>[email protected]</email>
  <phone>123-45-6789</phone>
  <birthday>
    <year>1983</year>
    <month>07</month>
    <day>15</day>
  </birthday>
</address>
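Splitting name and birthday into sub-elements lets a program reach each part by its path. The sketch below assumes Python's standard xml.etree.ElementTree and a trimmed copy of the expanded record (email and phone omitted); it is one possible way to read the nested values, not the only one.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Trimmed copy of the expanded example: only the nested elements.
doc = """<address>
  <name><first>Alice</first><last>Lee</last></name>
  <birthday><year>1983</year><month>07</month><day>15</day></birthday>
</address>"""

root = ET.fromstring(doc)

# findtext() follows a simple parent/child path from the root.
full_name = f"{root.findtext('name/first')} {root.findtext('name/last')}"
birthday = date(
    int(root.findtext("birthday/year")),
    int(root.findtext("birthday/month")),
    int(root.findtext("birthday/day")),
)

print(full_name, birthday.isoformat())
```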
XML Files are Trees

address
    name
        first
        last
    email
    phone
    birthday
        year
        month
        day

XML Trees
An XML document has a single root node.
The tree is a general ordered tree.
A parent node may have any number of children.
Child nodes are ordered, and may have siblings.
Preorder traversals are usually used for getting
information out of the tree.
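A preorder traversal visits a node first and then its children from left to right. The sketch below walks the address tree with a small recursive generator. The function name preorder and the empty-element document are illustrative assumptions, and Python's standard xml.etree.ElementTree stands in for whichever parser is actually used.

```python
import xml.etree.ElementTree as ET

# Same shape as the expanded address example, with the text removed.
doc = (
    "<address>"
    "<name><first/><last/></name>"
    "<email/><phone/>"
    "<birthday><year/><month/><day/></birthday>"
    "</address>"
)


def preorder(node, depth=0):
    """Yield (depth, tag) for the node, then for each child in order."""
    yield depth, node.tag
    for child in node:
        yield from preorder(child, depth + 1)


root = ET.fromstring(doc)
visited = [tag for _, tag in preorder(root)]

for depth, tag in preorder(root):
    print("  " * depth + tag)
```

The visit order matches the order in which the start-tags appear in the document, which is why preorder is the natural choice for reading data back out.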
