WT Unit II

Uploaded by

Keerthi Talakoti
UNIT-2

XML: Introduction to XML, Defining XML tags, their attributes and values, Document Type Definition, XML Schemas, Document Object Model.

Parsing XML Data – DOM and SAX Parsers in Java.


Introduction to XML

What is XML?

•XML stands for EXtensible Markup Language

•XML was designed to store and transport data

•XML was designed to be self-descriptive

•XML was designed to be both human- and machine-readable.


Why XML?
• Extensible Markup Language (XML) lets you
define and store data in a shareable manner.
XML supports information exchange between
computer systems such as websites,
databases, and third-party applications.
The Difference Between XML and HTML
• XML and HTML were designed with different
goals:
– XML was designed to carry data - with focus on
what data is.
– HTML was designed to display data - with focus on
how data looks.
– XML tags are not predefined like HTML tags.
XML Tree
The XML Prolog
This line is called the XML prolog:
• <?xml version="1.0" encoding="UTF-8"?>
• UTF-8 (Unicode Transformation Format) is the default character encoding for XML documents.
XML Syntax
<root>
<child>
<subchild>.....</subchild>
</child>
</root>
XML Rules
• XML Tags are Case Sensitive
• XML Elements Must be Properly Nested
• XML Attribute Values Must Always be Quoted
• White-space is Preserved in XML
• In XML, it is illegal to omit the closing tag. All elements must have a closing tag.
• XML documents that conform to the syntax rules above are said to be "Well Formed" XML documents.
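The rules above can be checked mechanically: a parser rejects any document that is not well formed. The following is a minimal sketch, assuming the standard JAXP DOM parser that ships with the JDK; the class name WellFormedCheck and the sample strings are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original slides.
class WellFormedCheck {

    // Returns true when the text parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Replace the default error handler so fatal errors are not printed to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (Exception e) {
            throw new IllegalStateException("Parser setup or input reading failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<root><child>text</child></root>")); // properly nested
        System.out.println(isWellFormed("<root><child>text</root></child>")); // wrong nesting
        System.out.println(isWellFormed("<Root></root>"));                     // tag case mismatch
        System.out.println(isWellFormed("<root id=1/>"));                      // unquoted attribute value
    }
}
```

Only the first sample follows all the rules; each of the others breaks exactly one of them.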
XML Naming Rules
• XML elements must follow these naming rules:
– Element names are case-sensitive
– Element names must start with a letter or
underscore
– Element names cannot start with the letters xml
(or XML, or Xml, etc)
– Element names can contain letters, digits, hyphens, underscores, and periods.
– Element names cannot contain spaces
– Any name can be used, no words are reserved
(except xml).
Rules that must be followed while writing XML
1. In XML each start tag must have a matching end tag.
2. The elements in XML must be properly nested.
3. The syntax of writing comments in XML is similar to HTML comments.
<!-- comment line -->
<hello>
<!--
<body>Welcome</body>
-->
</hello>
5. XML does not truncate multiple white-spaces (HTML truncates multiple white-spaces to one single white-space).
Basic XML Document
•In XML we can create our own tags. XML documents are self-descriptive.
Example: first.xml
<Person>
<Personal-Info>
<name>My name is Swathi</name>
<city>Hyderabad</city>
</Personal-Info>
<Hobby>
<first>Reading</first>
<second>Programming</second>
<third>Movies</third>
</Hobby>
</Person>
Defining XML tags, their attributes
and values
An XML document contains XML Elements.
An XML element is everything from the element's start tag to
the element's end tag.

<price>29.99</price>

An element can contain:
• text
• attributes
• other elements
• or a mix of the above
XML Attributes

XML elements can have attributes, just like HTML.


Attributes are designed to contain data related to a specific element.
Attribute values must always be quoted. Either single or double quotes can
be used.
Eg: <person gender="female"> or <person gender='female'>

XML Elements vs. Attributes

gender as an attribute:
<person gender="female">
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>

gender as a child element:
<person>
<gender>female</gender>
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>
XML Namespace

XML Namespaces provide a method to avoid element name conflicts.


In XML, element names are defined by the developer. This often results in a
conflict when trying to mix XML documents from different XML applications.

An HTML table:
<table>
<tr>
<td>Apples</td>
<td>Bananas</td>
</tr>
</table>

A table as a piece of furniture:
<table>
<name>African Coffee Table</name>
<width>80</width>
<length>120</length>
</table>

If these XML fragments were added together, there would be a name conflict.
Both contain a <table> element, but the elements have different content and
meaning.

A user or an XML application will not know how to handle these differences.
Name prefix
Name conflicts in XML can easily be avoided using a name
prefix.

<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>

<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
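For a parser to accept the prefixes, each one has to be bound to a namespace URI with an xmlns:prefix attribute. The sketch below wraps the two fragments above in a root element and looks up each table by namespace; the URIs http://example.com/html and http://example.com/furniture, the root wrapper, and the class name NamespaceDemo are assumptions made for illustration only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; the namespace URIs are placeholders.
class NamespaceDemo {
    static final String HTML_NS = "http://example.com/html";
    static final String FURNITURE_NS = "http://example.com/furniture";

    static final String XML =
          "<root xmlns:h='" + HTML_NS + "' xmlns:f='" + FURNITURE_NS + "'>"
        + "<h:table><h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr></h:table>"
        + "<f:table><f:name>African Coffee Table</f:name>"
        + "<f:width>80</f:width><f:length>120</f:length></f:table>"
        + "</root>";

    static Document parse() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed so prefixes are resolved to URIs
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Counts <table> elements that belong to the given namespace.
    static int countTables(String namespaceUri) {
        return parse().getElementsByTagNameNS(namespaceUri, "table").getLength();
    }

    // Reads the furniture name without confusing it with anything in the HTML table.
    static String furnitureName() {
        return parse().getElementsByTagNameNS(FURNITURE_NS, "name").item(0).getTextContent();
    }

    public static void main(String[] args) {
        System.out.println("HTML tables: " + countTables(HTML_NS));
        System.out.println("Furniture tables: " + countTables(FURNITURE_NS));
        System.out.println("Furniture name: " + furnitureName());
    }
}
```

The two table elements share a local name but are kept apart by the namespace each prefix is bound to.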
Document Type Definition
•An XML document with correct syntax is called "Well Formed".
•An XML document validated against a DTD is both "Well Formed" and
"Valid".
•The document type definition is used to define the basic building block
of any xml document. Using DTD we can specify the various elements
types, attributes and their relationship with one another.
•DTD is used to specify the set of rules for structuring data in any XML
file.
•Various building blocks of XML are:
•1. Elements: used for defining tags.
•2. Attributes: used to provide additional information about an element.
•3. CDATA: character data, which is not parsed by the parser.
•4. PCDATA: parsed character data (i.e., text that the parser examines for markup and entities).
Types of DTD
1. Internal DTD

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE student [
<!ELEMENT student (name,address,std,marks)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT address (#PCDATA)>
<!ELEMENT std (#PCDATA)>
<!ELEMENT marks (#PCDATA)>
]>
<student>
<name>Anand</name>
<address>Hyderabad</address>
<std>Second</std>
<marks>70 percent</marks>
</student>
2. External DTD (student.dtd)
<!ELEMENT student (name,address,std,marks)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT address (#PCDATA)>
<!ELEMENT std (#PCDATA)>
<!ELEMENT marks (#PCDATA)>

DTDDemo.xml
<?xml version="1.0"?>
<!DOCTYPE student SYSTEM "student.dtd">

<student>
<name>Anand</name>
<address>Hyderabad</address>
<std>Second</std>
<marks>70 percent</marks>
</student>
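A document can be checked against its DTD from Java by switching on validation in the JAXP parser. This is a minimal sketch under a few assumptions: the student DTD is inlined as an internal subset so the example needs no files, and the class name DtdValidationDemo and the sample documents are invented for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical demo class; the DTD is the student DTD, inlined as an internal subset.
class DtdValidationDemo {
    static final String DTD =
          "<!DOCTYPE student [\n"
        + "<!ELEMENT student (name,address,std,marks)>\n"
        + "<!ELEMENT name (#PCDATA)>\n"
        + "<!ELEMENT address (#PCDATA)>\n"
        + "<!ELEMENT std (#PCDATA)>\n"
        + "<!ELEMENT marks (#PCDATA)>\n"
        + "]>\n";

    // Returns true when the element text is both well formed and valid against DTD.
    static boolean isValid(String body) {
        final boolean[] ok = {true};
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored in this sketch */ }
                public void error(SAXParseException e) { ok[0] = false; } // validity error
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(DTD + body)));
            return ok[0];
        } catch (SAXException e) {
            return false; // not even well formed
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String valid = "<student><name>Anand</name><address>Hyderabad</address>"
                     + "<std>Second</std><marks>70 percent</marks></student>";
        String missingMarks = "<student><name>Anand</name><address>Hyderabad</address>"
                            + "<std>Second</std></student>";
        System.out.println(isValid(valid));
        System.out.println(isValid(missingMarks));
    }
}
```

Validity problems are reported through the error callback rather than by an exception, which is why the sketch records them in a flag.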
XML Schemas
An XML Schema describes the structure of an XML document.
The XML Schema language is also referred to as XML Schema Definition
(XSD). The XML Schema became the World Wide Web
Consortium (W3C) recommendation in 2001.

The purpose of an XML Schema is to define the legal building blocks of an XML document:
• the elements and attributes that can appear in a document
• the number of (and order of) child elements
• data types for elements and attributes
• default and fixed values for elements and attributes

XML Schemas Support Data Types


One of the greatest strengths of XML Schemas is the support for data types.

• It is easier to describe allowable document content


• It is easier to validate the correctness of data
• It is easier to define data facets (restrictions on data)
• It is easier to define data patterns (data formats)
• It is easier to convert data between different data types
XML Schema has a lot of built-in data types. The most common types are:

• xs:string
• xs:decimal
• xs:integer
• xs:boolean
• xs:date
• xs:time
Example:
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="Student">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="address" type="xs:string"/>
<xs:element name="std" type="xs:string"/>
<xs:element name="marks" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
xs: the prefix used to identify the schema elements and types.
xs:schema: the root element. It takes the attribute xmlns:xs, whose value is the namespace http://www.w3.org/2001/XMLSchema.
Student is a complex type because it contains other elements.
SimpleXml.xml
<?xml version="1.0" encoding="UTF-8"?>
<Student>
<name>Anand</name>
<address>Hyderabad</address>
<std>Tenth</std>
<marks>90 percent</marks>
</Student>

student.dtd
<!ELEMENT Student (name,address,std,marks)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT address (#PCDATA)>
<!ELEMENT std (#PCDATA)>
<!ELEMENT marks (#PCDATA)>

DTDDemo.xml
<?xml version="1.0"?>
<!DOCTYPE Student SYSTEM "student.dtd">
<Student>
<name>Anand</name>
<address>Hyderabad</address>
<std>Tenth</std>
<marks>90 percent</marks>
</Student>
StudentSchema.xsd
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="Student">
<xs:complexType>
<xs:sequence>
<xs:element name="na...

MySchema.xml
<?xml version="1.0" encoding="UTF-8"?>
<Student xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="StudentSchema.xsd">
<name>Anand</name>
<address>Hyderabad</address>
<std>Tenth</std>
<marks>90 percent</marks>
</Student>
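Validation against a schema can also be driven from Java through the javax.xml.validation API. The sketch below is only an illustration: it assumes the Student schema consists of four xs:string elements in sequence (as in the earlier example), inlines both schema and document as strings instead of reading StudentSchema.xsd and MySchema.xml from disk, and uses an invented class name, XsdValidationDemo.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical demo class; the schema text is assumed to match the Student example.
class XsdValidationDemo {
    static final String XSD =
          "<?xml version='1.0'?>"
        + "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='Student'><xs:complexType><xs:sequence>"
        + "<xs:element name='name' type='xs:string'/>"
        + "<xs:element name='address' type='xs:string'/>"
        + "<xs:element name='std' type='xs:string'/>"
        + "<xs:element name='marks' type='xs:string'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    // Compiles the schema once per call; a real program would cache the Schema object.
    static Schema loadSchema() {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("The schema itself is not valid", e);
        }
    }

    // Returns true when the document conforms to the schema.
    static boolean isValid(String xml) {
        try {
            loadSchema().newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the validator throws on the first violation by default
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String good = "<Student><name>Anand</name><address>Hyderabad</address>"
                    + "<std>Tenth</std><marks>90 percent</marks></Student>";
        String wrongOrder = "<Student><address>Hyderabad</address><name>Anand</name>"
                          + "<std>Tenth</std><marks>90 percent</marks></Student>";
        System.out.println(isValid(good));
        System.out.println(isValid(wrongOrder));
    }
}
```

Because the children are declared inside xs:sequence, a document that lists them in a different order is rejected.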
Advantages of Schema:

1. Schemas are more specific.

2. Schemas provide support for data types.

3. Schemas are namespace aware.

4. A schema is written in XML itself and has a large number of built-in and derived types.

5. The XML schema is a W3C recommendation, hence it is supported by various XML validators and XML processors.

Disadvantages of Schema:

1. The XML schema is complex to design and hard to learn.

2. The XML document cannot be validated if the corresponding schema file is absent.

3. Maintaining the schema for large and complex operations sometimes slows down processing.
Advantages of Schema over DTD:

1. Both schemas and DTDs are useful for defining the structural components of XML, but DTDs are basic and cannot be very specific for complex operations. Schemas, on the other hand, are more specific.

2. Schemas provide support for defining the type of data. DTDs do not have this ability. Hence content definition is possible using a schema.

3. Schemas are namespace aware and DTDs are not.

4. The XML schema is written in XML itself and has a large number of built-in and derived types.

5. The schema is a W3C recommendation. Hence it is supported by various XML validators and XML processors, but there are some XML processors which do not support DTD.

6. Large number of web applications can be built using XML schema. On the other
hand
Document Object Model (DOM)
The Document Object Model defines a standard for accessing and manipulating XML.
It is a W3C recommendation for handling structured documents.
DOM provides a standard set of programming interfaces for working with XML and HTML.
The Document Object Model (DOM) is a set of platform-independent and language-neutral application programming interfaces (APIs) which describe how to access and manipulate the information stored in XML or HTML documents.

XML DOM is for
• loading the XML document
• accessing the elements of the XML document
• deleting the elements of the XML document
A DOM Document is a collection of nodes or pieces of information organized in a hierarchy. This hierarchy allows a developer to navigate through the tree looking for specific information. Because it is based on a hierarchy of information, the DOM is said to be tree based.

The XML DOM also provides an API that allows a developer to add, edit, move, or remove nodes in the tree at any point in order to create an application.
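The add, edit, and remove operations can be sketched with the org.w3c.dom API from the JDK. The small Student tree, the class name DomEditDemo, and the helper names below are assumptions chosen for illustration; the tree is built in memory so that no file is needed.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical demo class working on a tiny in-memory Student tree.
class DomEditDemo {

    // Builds <Student><Name>Anand</Name><Marks>100</Marks></Student>.
    static Document buildStudent() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element student = doc.createElement("Student");
            doc.appendChild(student);
            student.appendChild(textElement(doc, "Name", "Anand"));
            student.appendChild(textElement(doc, "Marks", "100"));
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Creates <tag>text</tag> owned by the given document.
    static Element textElement(Document doc, String tag, String text) {
        Element element = doc.createElement(tag);
        element.setTextContent(text);
        return element;
    }

    // Adds a node, edits a node, and removes a node.
    static void applyEdits(Document doc) {
        Element student = doc.getDocumentElement();
        student.appendChild(textElement(doc, "Subject", "WT"));             // add
        student.getElementsByTagName("Marks").item(0).setTextContent("95"); // edit
        Node name = student.getElementsByTagName("Name").item(0);
        student.removeChild(name);                                          // remove
    }

    // Lists the root's children as "tag=text;" pairs, in document order.
    static String describe(Document doc) {
        StringBuilder out = new StringBuilder();
        for (Node child = doc.getDocumentElement().getFirstChild();
             child != null; child = child.getNextSibling()) {
            out.append(child.getNodeName()).append('=').append(child.getTextContent()).append(';');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Document doc = buildStudent();
        System.out.println(describe(doc)); // before the edits
        applyEdits(doc);
        System.out.println(describe(doc)); // after the edits
    }
}
```

All changes happen on the in-memory tree; writing the result back to a file would need a separate serialization step, which this sketch leaves out.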
1. Loading an XML file

student.xml
<?xml version="1.0"?>
<Student>
<Roll_No>10</Roll_No>
<Personal_Info>
<Name>Anand</Name>
<Address>Hyderabad</Address>
<Phone>8978903739</Phone>
</Personal_Info>
<Class>B.Tech</Class>
<Subject>WT</Subject>
<Marks>100</Marks>
</Student>

HTML page that loads student.xml
<html>
<!-- Simple DOM example for loading xml file -->
<body>
<script type="text/javascript">
// Legacy loading code: ActiveXObject exists only in Internet Explorer, and
// load() on an XML document is not available in current browsers.
var xmlDocument;
try
{
xmlDocument=new ActiveXObject("Microsoft.XMLDOM");
}
catch(e)
{
try
{
xmlDocument=document.implementation.createDocument("","",null);
}
catch(e)
{ alert(e.message)}
}
try
{
xmlDocument.async=false;
xmlDocument.load("student.xml");
document.write("XML Document student.xml is loaded");
}
catch(e)
{alert(e.message)}
</script>
</body>
</html>
B. MADHURAVANI
Parsing XML Data
DOM and SAX Parsers

The primary goal of any XML processor is to parse the given XML document. Java has a rich set of built-in APIs for parsing XML documents. A document can be parsed in two ways:

•Tree based parsing

•Event based parsing

DOM parses the XML document into a tree structure. We can access the information of an XML document by interacting with the tree nodes.
Simple API for XML (i.e. SAX) allows us to access the information of an XML document through a sequence of events. Thus there are two methods of parsing an XML document:
•Parsing using DOM (tree based)
•Parsing using SAX (event based)
DOM
• DOM is a tree based parsing method used to parse the given XML document.
• In this method, the entire XML document is stored in memory before actual processing. Hence it requires more memory.
• The DOM approach is useful for smaller applications because it is simpler to use, but it is not suited to larger XML documents because it then requires a larger amount of memory.
• We can insert or delete a node.
• Traversing is done in any direction in the DOM approach.

SAX
• SAX is an event based parsing method used to parse the given XML document.
• In this method, the parsing is done by generating a sequence of events, for which the parser calls handler functions.
• Although SAX development is more challenging, it is useful for parsing large XML documents because the approach is event based: the XML is parsed node by node and does not require a large amount of memory.
• We cannot insert or delete a node; SAX gives read-only access to the document.
• Top to bottom traversing is done in this approach.
Using DOM API

The JDK ships with two built-in XML parsers, DOM and SAX; both have their pros and cons.

The DOM parser is the easiest Java XML parser to use. It parses an entire XML document and loads it into memory, modeling it as objects for easy node traversal. The DOM parser is slow and consumes a lot of memory when it loads an XML document that contains a lot of data.

Tutorials: http://www.mkyong.com/tutorials/java-xml-tutorials/

1. Read an XML file

2. Modify existing XML file

3. Create a new XML file

4. Count XML
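As an illustration of the DOM API, the following sketch parses a student record, reads values from the tree, and counts its elements. It is not taken from the tutorial linked above: the XML is the student.xml content inlined as a string so the example runs without files, and the class name DomReadDemo and its helper methods are made up for this sketch.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; STUDENT_XML mirrors the student.xml sample.
class DomReadDemo {
    static final String STUDENT_XML =
          "<?xml version=\"1.0\"?>"
        + "<Student><Roll_No>10</Roll_No>"
        + "<Personal_Info><Name>Anand</Name><Address>Hyderabad</Address>"
        + "<Phone>8978903739</Phone></Personal_Info>"
        + "<Class>B.Tech</Class><Subject>WT</Subject><Marks>100</Marks></Student>";

    // Parses the whole document into an in-memory tree.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the text of the first element with the given tag name.
    static String textOf(Document doc, String tag) {
        return doc.getElementsByTagName(tag).item(0).getTextContent();
    }

    // Counts every element in the document; "*" matches all tag names.
    static int countElements(Document doc) {
        return doc.getElementsByTagName("*").getLength();
    }

    public static void main(String[] args) {
        Document doc = parse(STUDENT_XML);
        System.out.println("Root: " + doc.getDocumentElement().getNodeName());
        System.out.println("Name: " + textOf(doc, "Name"));
        System.out.println("Marks: " + textOf(doc, "Marks"));
        System.out.println("Elements: " + countElements(doc));
    }
}
```

Because the tree stays in memory, any element can be looked up in any order after a single parse.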
SAX XML Parser
The SAX parser works differently from the DOM parser: it does not load the XML document into memory or create an object representation of it. Instead, the SAX parser uses callback functions (org.xml.sax.helpers.DefaultHandler) to inform clients of the XML document structure.
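The callback style can be sketched as follows. The handler extends org.xml.sax.helpers.DefaultHandler and records each event it receives; the class name SaxDemo, the "start:", "text:", and "end:" labels, and the sample document are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class that logs SAX events instead of building a tree.
class SaxDemo {

    // Parses the XML text and returns the events in the order the parser reported them.
    static List<String> events(String xml) {
        final List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                log.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // A parser may deliver long text in several chunks; this sketch logs each chunk.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("text:" + text);
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        String xml = "<Student><Name>Anand</Name><Marks>100</Marks></Student>";
        for (String event : events(xml)) {
            System.out.println(event);
        }
    }
}
```

The handler sees each element only once, in document order, so anything the program needs later has to be saved by the callbacks themselves.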
