Chap 2 XML
XML
1) HOW TO CREATE AN XML FILE (name.xml) AND ORGANIZE ITS DATA IN A
TREE-LIKE STRUCTURE
2) HOW TO CREATE A DTD (Document Type Definition) FILE AND
VALIDATE THE XML DATA
3) HOW TO RETRIEVE THE STORED DATA USING XSLT
Introduction to XML
What is XML?
XML stands for eXtensible Markup Language and it is
used for storing and transferring data.
XML tags identify, store and organize the data. Unlike HTML
tags, which are used to display the data, they do not specify
how the data is displayed.
XML allows you to create your own self-descriptive tags,
or language, that suit your application.
XML is a markup language that defines a set of rules for
encoding documents in a format that is both human-readable
and machine-readable.
A markup language is a set of symbols that can be placed
in the text of a document to label the parts of that
document.
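As a rough illustration of defining your own tags, the sketch below builds a small document with Python's standard xml.etree.ElementTree module. The tag names student, name and course are hypothetical and chosen only for this example; XML itself does not predefine them.

```python
import xml.etree.ElementTree as ET

# Hypothetical, application-specific tags: XML does not predefine any of them.
student = ET.Element("student")
ET.SubElement(student, "name").text = "Alice"
ET.SubElement(student, "course").text = "SYCS"

# Serialize the element tree to a string of markup.
xml_text = ET.tostring(student, encoding="unicode")
print(xml_text)
# prints: <student><name>Alice</name><course>SYCS</course></student>
```

The printed markup is readable by a person, and the same text can be parsed back by a program.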
Differences between XML and HTML
XML and HTML were designed with different goals:
XML is designed to carry data, with the emphasis on what the data is.
HTML is designed to display data, with the emphasis on how the data
looks.
XML tags are not predefined like HTML tags; the author defines tags
that describe the data.
HTML is about displaying data, so it is often described as static;
XML is about carrying information between applications, so it is
described as dynamic.
HTML is a presentation language; XML is a data-description language.
HTML is a markup language; XML provides a framework for defining
markup languages.
HTML is case insensitive; XML is case sensitive.
An XML document has a tree structure.
Structure of an XML Document
An XML document must contain exactly one root element. This element is
the parent of all other elements.
The elements in an XML document form a document tree. The tree
starts at the root and branches to the lowest level of the tree.
All elements can have sub elements (child elements).
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <child attribute="value">
    <subchild>.....</subchild>
  </child>
</root>
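To make the document tree concrete, here is a minimal sketch that parses a root/child/subchild skeleton with Python's standard xml.etree.ElementTree and lists each element under its parent. The helper name walk and the placeholder content "text" are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

document = """<?xml version="1.0" encoding="UTF-8"?>
<root>
  <child attribute="value">
    <subchild>text</subchild>
  </child>
</root>"""

# Parse from bytes so the encoding declaration is honoured by the parser.
root = ET.fromstring(document.encode("utf-8"))

def walk(element, depth=0):
    """Return one indented line per element, parent before children."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(walk(child, depth + 1))
    return lines

outline = walk(root)
print("\n".join(outline))
print(root[0].attrib)  # attributes of the first child of the root
```

The outline starts at the root and branches down to subchild, and the attribute of child is available as a dictionary.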
XML Tags are Case Sensitive
XML tags are case sensitive. The tag <Letter> is different from the tag
<letter>.
Opening and closing tags must be written with the same case.
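The rule can be checked with a parser. The sketch below uses Python's standard xml.etree.ElementTree; the helper name is_well_formed is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text, False on a parse error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Same case in the opening and closing tag: accepted.
print(is_well_formed("<Letter>Hello</Letter>"))  # True
# Different case: the closing tag does not match the opening tag.
print(is_well_formed("<Letter>Hello</letter>"))  # False
```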
For example, the following document can be used for sending a message.
It holds the sender's name (the value inside the <from> tag), the
receiver's name (the value inside the <to> tag) and the message (the
value inside the <msg> tag).
<?xml version="1.0" encoding="UTF-8"?>
<message>
  <to>MyReader</to>
  <from>Alice</from>
  <msg>Welcome to SYCS</msg>
</message>
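As a sketch of how an application might read the values back out of the message document, the example below uses findtext from Python's standard xml.etree.ElementTree. The variable names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

message_xml = b"""<?xml version="1.0" encoding="UTF-8"?>
<message>
<to>MyReader</to>
<from>Alice</from>
<msg>Welcome to SYCS</msg>
</message>"""

message = ET.fromstring(message_xml)

# findtext returns the text content of the first matching child element.
receiver = message.findtext("to")
sender = message.findtext("from")
body = message.findtext("msg")

print(f"{sender} -> {receiver}: {body}")
# prints: Alice -> MyReader: Welcome to SYCS
```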
-----------------------------------------------------------------
<Book>
  <Author>
    <Title>ABC</Title>
  </Author>
</Book>
Entity References
Both HTML and XML reserve some symbols for their own use,
so these cannot be used directly as content in XML
code. For example, the < and > signs delimit XML tags.
To include these special characters in content, character
entities are used.
Some special characters or symbols cannot be typed
directly from the keyboard. Character entities can also
be used to represent those symbols.
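The following generates an XML error, because the parser reads
"<" as the start of a new element:
<message>salary < 1000</message>
Replace the "<" character with an entity reference:
<message>salary &lt; 1000</message>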
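The effect of the entity reference can be seen with a parser. This is a minimal sketch using Python's standard xml.etree.ElementTree and the escape helper from xml.sax.saxutils; the variable names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "salary < 1000"

# A bare "<" in element content is rejected by the parser.
try:
    ET.fromstring("<message>" + raw + "</message>")
    raw_parses = True
except ET.ParseError:
    raw_parses = False

# escape() replaces "<" with the entity reference "&lt;".
escaped = escape(raw)
element = ET.fromstring("<message>" + escaped + "</message>")

print(raw_parses)    # False
print(escaped)       # salary &lt; 1000
print(element.text)  # salary < 1000
```

After parsing, the application sees the original "<" character again; the entity reference exists only in the markup.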
There are 5 pre-defined entity references in XML: