WT - Unit Ii

The document provides an introduction to XML, describing what it is, how it differs from HTML, how tags and attributes work in XML, and common rules for using tags and attributes. XML is designed to store and transport data in a platform-independent way and allows for extensibility.

Uploaded by

Pavan Durga Rao
Copyright
© © All Rights Reserved

Introduction to XML

What is XML?
● XML stands for eXtensible Markup Language
● XML is a markup language much like HTML
● XML was designed to store and transport data
● XML was designed to be self-descriptive
● XML is a W3C Recommendation

The Difference Between XML and HTML


XML and HTML were designed with different goals:

● XML was designed to carry data - with focus on what data is


● HTML was designed to display data - with focus on how data looks
● XML tags are not predefined like HTML tags are

XML Does Not Use Predefined Tags


The XML language has no predefined tags.

Tags such as <to> and <from> in the note example used in this unit are not defined
in any XML standard. These tags are "invented" by the author of the XML document.

HTML works with predefined tags like <p>, <h1>, <table>, etc.

With XML, the author must define both the tags and the document structure.

XML is Extensible
Most XML applications will work as expected even if new data is added (or
removed).
Imagine an application designed to display the original version of note.xml
(<to> <from> <heading> <body>).

Then imagine a newer version of note.xml with added <date> and <hour>
elements, and a removed <heading>.

Because of the way XML is constructed, an older version of the application can still work with the newer document:

<note>

<date>2015-09-01</date>

<hour>08:30</hour>

<to>Tove</to>

<from>Jani</from>

<body>Don't forget me this weekend!</body>

</note>
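
The extensibility described above can be sketched in Java. This is only an illustration, assuming an application that looks elements up by tag name through the standard DOM API; the class name NoteReader and the helper firstText are hypothetical names, not part of the original notes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NoteReader {

    // Returns the text of the first element with the given tag name,
    // or null when the document has no such element.
    static String firstText(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName(tag);
            return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse note", e);
        }
    }

    public static void main(String[] args) {
        // The newer note: <date> and <hour> were added, <heading> was removed.
        String newNote = "<note><date>2015-09-01</date><hour>08:30</hour>"
                + "<to>Tove</to><from>Jani</from>"
                + "<body>Don't forget me this weekend!</body></note>";
        // The "old" application only knows these four tags.
        for (String tag : new String[] {"to", "from", "heading", "body"}) {
            String text = firstText(newNote, tag);
            System.out.println(tag + ": " + (text == null ? "(not present)" : text));
        }
    }
}
```

Because the lookup is by name, the added <date> and <hour> elements are simply ignored and the missing <heading> is reported as not present, rather than causing the program to fail.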

XML Simplifies Things


● It simplifies data sharing
● It simplifies data transport
● It simplifies platform changes
● It simplifies data availability

Many computer systems contain data in incompatible formats. Exchanging data


between incompatible systems (or upgraded systems) is a time-consuming task
for web developers. Large amounts of data must be converted, and
incompatible data is often lost.

XML stores data in plain text format. This provides a software- and
hardware-independent way of storing, transporting, and sharing data.

XML also makes it easier to expand or upgrade to new operating systems, new
applications, or new browsers, without losing data.

With XML, data can be available to all kinds of "reading machines" like people,
computers, voice machines, news feeds, etc.

XML is a W3C Recommendation
XML became a W3C Recommendation in February 1998.

XML - Tags
XML tags form the foundation of XML. They define the scope of an element in XML.
They can also be used to insert comments, declare settings required for parsing,
and insert special instructions.

We can broadly categorize XML tags as follows −

Start Tag
The beginning of every non-empty XML element is marked by a start-tag. Following is
an example of start-tag −

<address>

End Tag
Every element that has a start-tag must end with an end-tag. Following is an example
of end-tag −

</address>

Empty Tag
The text that appears between start-tag and end-tag is called content. An element
which has no content is termed as empty. An empty element can be represented in two
ways as follows −
A start-tag immediately followed by an end-tag as shown below −

<hr></hr>

A complete empty-element tag is as shown below −

<hr />

XML Tags Rules


Following are the rules that need to be followed to use XML tags −

Rule 1
XML tags are case-sensitive. The following line is an example of wrong syntax: the
end tag </Address> differs in case from the start tag <address>, which XML treats
as an error.

<address>This is wrong syntax</Address>

Following code shows a correct way, where we use the same case to name the start
and the end tag.

<address>This is correct syntax</address>

Rule 2
XML tags must be closed in an appropriate order, i.e., an XML tag opened inside
another element must be closed before the outer element is closed. For example −

<outer_element>

<internal_element>

This tag is closed before the outer_element

</internal_element>

</outer_element>
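
Both rules can be checked mechanically, since any XML parser rejects a document that breaks them. The following is a minimal sketch, assuming the JDK's built-in JAXP parser; WellFormedCheck and isWellFormed are hypothetical names chosen for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheck {

    // Returns true when the parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // mismatched case, bad nesting, unclosed tag, ...
        } catch (Exception e) {
            throw new IllegalStateException("parser could not be set up", e);
        }
    }

    public static void main(String[] args) {
        // Rule 1: the case of the start tag and end tag must match.
        System.out.println(isWellFormed("<address>This is wrong syntax</Address>"));
        System.out.println(isWellFormed("<address>This is correct syntax</address>"));
        // Rule 2: the inner element must be closed before the outer one.
        System.out.println(isWellFormed(
                "<outer_element><internal_element></outer_element></internal_element>"));
    }
}
```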

XML - Attributes
Attributes are part of XML elements. An element can have multiple unique attributes.
Attributes give more information about XML elements. To be more precise, they define
properties of elements. An XML attribute is always a name-value pair.
Syntax
An XML attribute has the following syntax −

<element-name attribute1 attribute2 >

....content..

</element-name>

where attribute1 and attribute2 have the following form −

name = "value"

value has to be in double (" ") or single (' ') quotes. Here, attribute1 and attribute2 are
unique attribute labels.
Attributes are used to add a unique label to an element, place the label in a category,
add a Boolean flag, or otherwise associate it with some string of data. Following
example demonstrates the use of attributes −

<?xml version = "1.0" encoding = "UTF-8"?>

<!DOCTYPE garden [

<!ELEMENT garden (plants)*>

<!ELEMENT plants (#PCDATA)>

<!ATTLIST plants category CDATA #REQUIRED>

]>

<garden>

<plants category = "flowers" />

<plants category = "shrubs">

</plants>

</garden>

Attributes are used to distinguish among elements of the same name, when you do not
want to create a new element for every situation. Hence, the use of an attribute can
add a little more detail in differentiating two or more similar elements.
In the above example, we have categorized the plants by including the attribute category
and assigning a different value to each of the elements. Hence, we have two categories
of plants, one flowers and the other shrubs. Thus, we have two plants elements with
different attribute values.
You can also observe that this attribute is declared in the DTD at the beginning of the document.
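
As a rough illustration of how a program sees these attributes, the sketch below reads the category attribute of each plants element through the DOM API. The class name GardenAttributes and the method categories are hypothetical, and the DOCTYPE is left out to keep the example short.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class GardenAttributes {

    // Collects the value of the "category" attribute of every <plants> element.
    static List<String> categories(String xml) {
        List<String> result = new ArrayList<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList plants = doc.getElementsByTagName("plants");
            for (int i = 0; i < plants.getLength(); i++) {
                Element plant = (Element) plants.item(i);
                result.add(plant.getAttribute("category"));
            }
        } catch (Exception e) {
            throw new IllegalStateException("could not parse garden", e);
        }
        return result;
    }

    public static void main(String[] args) {
        String garden = "<garden>"
                + "<plants category=\"flowers\"/>"
                + "<plants category=\"shrubs\"></plants>"
                + "</garden>";
        System.out.println(categories(garden));
    }
}
```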

Attribute Types
Following table lists the types of attributes −

Attribute Type      Description

StringType          It takes any literal string as a value. CDATA is a StringType.
                    CDATA is character data. This means any string of
                    non-markup characters is a legal part of the attribute.

TokenizedType       This is a more constrained type. The validity constraints
                    noted in the grammar are applied after the attribute value is
                    normalized. The TokenizedType attributes are given as −
                    ● ID − It is used to specify the element as unique.
                    ● IDREF − It is used to reference an ID that has been
                      named for another element.
                    ● IDREFS − It is used to reference one or more IDs,
                      separated by spaces.
                    ● ENTITY − It indicates that the attribute will represent
                      an unparsed external entity declared in the DTD.
                    ● ENTITIES − It indicates that the attribute will represent
                      one or more such entities, separated by spaces.
                    ● NMTOKEN − It is similar to CDATA with restrictions on
                      what data can be part of the attribute (a single name token).
                    ● NMTOKENS − It is a list of one or more NMTOKEN values,
                      separated by spaces.

EnumeratedType      This has a list of predefined values in its declaration, out of
                    which the attribute must be assigned one value. There are two
                    types of enumerated attribute −
                    ● NotationType − It declares that an element will be
                      referenced to a NOTATION declared somewhere else
                      in the XML document.
                    ● Enumeration − Enumeration allows you to define a
                      specific list of values that the attribute value must
                      match.

Element Attribute Rules


Following are the rules that need to be followed for attributes −
● An attribute name must not appear more than once in the same start-tag or
empty-element tag.
● For a document to be valid, each attribute must be declared in the Document Type
Definition (DTD) using an Attribute-List Declaration.
● Attribute values must not contain direct or indirect entity references to external
entities.
● The replacement text of any entity referred to directly or indirectly in an attribute
value must not contain a less than sign (<)

XML Values
In SQL, an XML value represents well-formed XML in the form of an XML document, XML
content, or an XML sequence.

An XML value that is stored in a table as a value of a column defined with the XML data
type must be a well-formed XML document. XML values are processed in an internal
representation that is not comparable to any string value including another XML value.
The only predicate that can be applied to the XML data type is the IS NULL predicate.

An XML value can be transformed into a serialized string value representing an XML
document using the XMLSERIALIZE function. Similarly, a string value that represents
an XML document can be transformed into an XML value using the XMLPARSE
function. An XML value can be implicitly parsed or serialized when exchanged with
application string and binary data types.

The XML data type has no defined maximum length. It does have an effective maximum
length when treated as a serialized string value that represents XML which is the same
as the limit for LOB data values. Like LOBs, there are also XML locators and XML file
reference variables.

Restrictions when using XML values: With a few exceptions, you can use XML
values in the same contexts in which you can use other data types. XML values are
valid in:

● CAST a parameter marker, XML, or NULL to XML


● XMLCAST a parameter marker, XML, or NULL to XML
● IS NULL predicate
● COUNT and COUNT_BIG aggregate functions
● COALESCE, IFNULL, HEX, LENGTH, CONTAINS, and SCORE scalar functions
● XML scalar functions
● A SELECT list without DISTINCT
● INSERT VALUES clause, UPDATE SET clause, and MERGE
● SET and VALUES INTO
● Procedure parameters
● User-defined function arguments and result
● Trigger correlation variables
● Parameter marker values for a dynamically prepared statement

XML values cannot be used directly in the following places. Where expressions are
allowed, an XML value can be used, for example, as the argument of XMLSERIALIZE:

● A SELECT list containing the DISTINCT keyword


● A GROUP BY clause
● An ORDER BY clause
● A subselect of a fullselect that is not UNION ALL
● A basic, quantified, BETWEEN, DISTINCT, IN, or LIKE predicate
● An aggregate function with the DISTINCT keyword
● A primary, unique, or foreign key
● A check constraint
● An index column

No host languages have a built-in data type for the XML data type.

For information on the XML data model and XML values, see SQL XML programming.

XML Document

An XML document is a document written according to the XML rules and to one of the markup languages defined using XML.

To develop a markup language, we need to do the following:

1. Declare all the elements of the language
a. i.e. tags (elements), attributes, entities, ...
2. Define the grammar rules for the declared elements
3. Provide an application which can put the document into action

To perform the first two operations we can use a DTD or an XML Schema (DTDs are defined in the XML
specification itself; XML Schema is a separate W3C Recommendation). To develop an XML application we can
use XML parsers, whose interfaces are also standardized (for example, DOM is a W3C specification).
Here an XML application is any application that uses an XML document; it can be developed in any
programming language, such as Java, JavaScript, C, C++, or C#.

UNIT-3

Lecture-20

DTD (Document Type Definition):

A DTD is used to declare the elements and give their type definitions. An XML document can then be
designed based on the type definition given by the DTD.

Using DTD we can declare and define:

I. Elements
II. Attributes
III. Entities
IV. Notations

i)Element

Definition:

Elements are used to describe the content which they enclose

Types of Elements:

i)Child only
ii)Text only
iii)Empty
iv)Mixed
v)ANY (a special type)

i)Child only:
These types of elements contain one or more child elements as their content
Syntax:
<!ELEMENT element_name (list of child element names)>
Example:
<account>
<name> </name>
<bal> </bal>
</account>
<!ELEMENT account (name,bal)>
Example2:
<bank>
<account> </account>
<account> </account>
</bank>
<!ELEMENT bank (account*)>

Occurrence Specifiers

* indicates 0 or more

+ indicates 1 or more

? indicates 0 or 1

No symbol indicates exactly once

Example:

<emps>
<emp>
<name> </name>
<sal> </sal>
</emp>

<emp>
<name> </name>
<wages> </wages>
</emp>
</emps>
<!ELEMENT emps (emp+)>
<!ELEMENT emp (name,(sal|wages))>
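
To see these content models enforced, a validating parser can be asked to check a document against the declarations. The sketch below assumes the DTD is placed in an internal subset and that name, sal and wages are text-only elements; the class DtdValidation and its validate method are hypothetical names for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class DtdValidation {

    // Internal DTD subset using the declarations from the emps example.
    static final String DTD = "<!DOCTYPE emps [\n"
            + "<!ELEMENT emps (emp+)>\n"
            + "<!ELEMENT emp (name,(sal|wages))>\n"
            + "<!ELEMENT name (#PCDATA)>\n"
            + "<!ELEMENT sal (#PCDATA)>\n"
            + "<!ELEMENT wages (#PCDATA)>\n"
            + "]>\n";

    // Returns the validity errors reported for the given document body.
    static List<String> validate(String body) {
        List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) { problems.add(e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(DTD + body)));
        } catch (Exception e) {
            throw new IllegalStateException("document is not well-formed", e);
        }
        return problems;
    }

    public static void main(String[] args) {
        // Valid: each emp has a name followed by either sal or wages.
        System.out.println(validate("<emps><emp><name>a</name><sal>1</sal></emp>"
                + "<emp><name>b</name><wages>2</wages></emp></emps>"));
        // Invalid: sal and wages together break the (sal|wages) choice.
        System.out.println(validate(
                "<emps><emp><name>a</name><sal>1</sal><wages>2</wages></emp></emps>"));
    }
}
```

Validity errors are collected rather than thrown here so that all problems in a document can be listed in one pass.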

ii)Text only:
These types of elements can take only text as content. Values such as char, string, int, float,
double and boolean are all treated as text, and are referred to with the type PCDATA.
PCDATA: Parsed Character DATA

Syntax:
<!ELEMENT element_name (#PCDATA)>
Example:
<name>cmrcet</name>
<!ELEMENT name (#PCDATA)>
<sal>1000</sal>
<!ELEMENT sal (#PCDATA)>
PCDATA allows all the characters of the document's encoding except markup characters such as < and &,
which must be written as entity references (&lt; and &amp;).

iii)Empty:
These types of elements do not take any content
Syntax:
<!ELEMENT element_name EMPTY>
Example:
<br></br>
or
<br/>
<!ELEMENT br EMPTY>

iv)Mixed:
These types of elements can contain child elements, text, or both child elements and text; they can
also be empty
Syntax:
<!ELEMENT element_name (#PCDATA|list of child elements with | as a separator)*>
Example:
<p>Welcome,<b>to CMRCET</b> and <i>B.Tech(CSE)</i><br/>Hello
</p>
<!ELEMENT p (#PCDATA|b|i|br)*>

v)ANY
These types of elements can take any type of content, i.e. text, any element declared in the DTD,
or nothing at all
Syntax:
<!ELEMENT element_name ANY>
Example:
<!ELEMENT MyElement ANY>
The above declaration states that the element MyElement can hold text and any element declared
in the DTD, and that it can also be empty

2. Attributes:
Attributes are used to give extra meaning to the content described by an element
● An attribute resides in the opening tag of the element
● One element can be declared with any number of attributes; the element name and each of
these attributes are separated by a space character.
● Each attribute consists of a name and a value separated by the '=' character, and the value
must be in quotes (single ' or double ")
● An attribute name cannot contain a space character
Example:
<emp empno="e101">
Syntax to declare an attribute:
<!ATTLIST element_name attribute_name type default_declaration>
where default_declaration is either one of the specifiers described below or a quoted default value.

Types:

1. CDATA (character data):
This type allows all characters, including numbers and the space character
2. NMTOKEN:
It accepts a single name token, made of letters, digits and the characters . - _ : with no
space character
3. NMTOKENS:
It accepts one or more tokens (where one token is a sequence of characters without a space
character); the space is taken as the separator between tokens
4. ID: The value of an ID type attribute must be unique within the document.
It must not start with a number, but it can contain numbers
5. IDREF: It takes the value of one ID type attribute
6. IDREFS: It can take one or more ID type attribute values, where space is the separator
7. enum: While declaring the attribute we specify the list of values, and the attribute can use
any one of the specified values.
8. ENTITY: It takes one entity name, where this entity must be an unparsed entity
9. ENTITIES: It takes one or more entity names, where space is the separator
Example:
<!ATTLIST emp working (yes|no) 'yes'>

Specifiers:
#REQUIRED --------------- Mandatory
#IMPLIED ---------------- Optional
#FIXED ------------------ Optional in the document, but if it is used it must be given the value
specified in the attribute declaration (i.e. its value is fixed, similar to final in Java)
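
The effect of a declared default value can be observed from a program. The sketch below is an illustration only, assuming the working attribute is declared on an emp element in an internal DTD subset; DefaultAttribute and its working method are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DefaultAttribute {

    // Parses the given <emp> tag together with an internal DTD that declares
    // the enumerated attribute "working" with the default value 'yes'.
    static String working(String empTag) {
        String xml = "<!DOCTYPE emp [\n"
                + "<!ELEMENT emp EMPTY>\n"
                + "<!ATTLIST emp working (yes|no) 'yes'>\n"
                + "]>\n" + empTag;
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getAttribute("working");
        } catch (Exception e) {
            throw new IllegalStateException("could not parse emp", e);
        }
    }

    public static void main(String[] args) {
        // The attribute is omitted, so the parser supplies the default.
        System.out.println(working("<emp/>"));
        // The attribute is given explicitly, so the default is not used.
        System.out.println(working("<emp working='no'/>"));
    }
}
```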

3)Entity:
An entity is a reference to some content, i.e. it is used to represent reusable content. Sometimes
the same content has to be used many times within XML documents, and in some cases content is
repeated within the DTD itself. Based on this requirement, entities are classified into 2 types:
1. General entities
2. Parameter entities

The overall classification is:

Entities
   Unparsed entities
   Parsed entities
      General entities (internal or external)
      Parameter entities (internal or external)
General Entities:
These are declared in the DTD and used in XML documents
Internal Entity:
In this case the content that replaces the entity wherever it is referred to is placed directly
in the declaration of the entity, i.e. in the DTD itself.
Syntax:

<!ENTITY entity_name "content which has to be substituted">

To use the entity (this is done in the XML document):
&entity_name;

Example:
<!ENTITY copyrights "copyrights Myshop 2013-2014">
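
A small sketch of how an internal general entity is expanded when the document is parsed. It assumes the entity is declared in an internal DTD subset and referenced inside a shop element; the class EntityExpansion and the method shopText are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EntityExpansion {

    // Wraps the given content in <shop> and parses it with a DTD that
    // declares the internal general entity "copyrights".
    static String shopText(String content) {
        String xml = "<!DOCTYPE shop [\n"
                + "<!ENTITY copyrights \"copyrights Myshop 2013-2014\">\n"
                + "]>\n"
                + "<shop>" + content + "</shop>";
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // The text content contains the replacement text of the entity.
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse shop", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(shopText("&copyrights;"));
    }
}
```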
External Entity:
Here the content which has to be substituted is placed in a separate file, and in the declaration
of the entity, instead of specifying the content, we provide the filename with its path.
Syntax:
<!ENTITY entity_name SYSTEM "filename with path">

Example:
<!ENTITY mylogo SYSTEM "shoplogo.gif">
(For binary files such as images, the declaration also needs NDATA, as described under Unparsed Entities.)
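
Parameter Entity:
These entities are declared and used in the DTD itself
Internal Entity:
Syntax:

<!ENTITY % entity_name "content">

To use it (i.e. in the DTD itself):
%entity_name;

External Entity:

Syntax:
<!ENTITY % entity_name SYSTEM "filename">
To use it:
%entity_name;
Example:

<!ENTITY % text "#PCDATA">

<!ELEMENT name (%text;)>

Note that a parameter entity reference inside another declaration, as in the example above, is
allowed only in an external DTD file, not in the internal subset.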

Unparsed Entities:

To refer to content that is not XML text (for example binary data), we have to use unparsed entities,

e.g. to refer to .gif, .bmp or .jpg files

Syntax:

<!ENTITY entity_name SYSTEM "filename with path" NDATA notation_name>

Notations:

These are used to give some additional description of the content, such as its
MIME/Content type

Syntax:

<!NOTATION notation_name SYSTEM "content">

Example:

<!NOTATION gif SYSTEM "image/gif">

<!ENTITY mylogo SYSTEM "shoplogo.gif" NDATA gif>

<!ATTLIST shop logo ENTITY #REQUIRED>

XML Document Structure

<? ?>        Processing instruction tag
<! >         Declaration (instruction) tag
<!-- -->     Comment tag
< >          Opening tag
</ >         Closing tag
< />         Self-ending (empty-element) tag

Structure of XML
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE …….>
<root_element_name>
</root_element_name>

● The XML declaration (<?xml ... ?>) is optional but recommended.
● If used, it must be the very first thing in the document (not even a comment is allowed before it).
● If used, the version attribute is mandatory; it takes the XML version, which is normally 1.0.
● encoding is optional; if it is not given, the parser assumes UTF-8 (or UTF-16 when a byte-order
mark is present).
● standalone is also optional; it takes yes or no, and if it is not given it defaults to no. This
attribute indicates whether the document depends on external markup declarations such as an
external DTD (if it depends on them it should be given as 'no', otherwise 'yes').
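
The values given in the XML declaration can be read back after parsing. The following is a minimal sketch using the DOM Level 3 methods getXmlVersion, getXmlEncoding and getXmlStandalone on org.w3c.dom.Document; the class name XmlDeclarationInfo and the parse helper are hypothetical, and the document is fed in as UTF-8 bytes purely for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class XmlDeclarationInfo {

    // Parses the text as a UTF-8 byte stream and returns the DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(
                "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?><root/>");
        System.out.println("version    : " + doc.getXmlVersion());
        System.out.println("encoding   : " + doc.getXmlEncoding());
        System.out.println("standalone : " + doc.getXmlStandalone());
    }
}
```

When the declaration is left out altogether, the parser still reports version 1.0 and standalone as false, which matches the defaults listed above.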

UNIT-3

Lecture-21

XML Schema:

An XML Schema is used to declare the elements of a markup language and its grammar rules, i.e. it is an
alternative to a DTD.

An XML Schema describes the structure of an XML document. The XML Schema language is also referred
to as XML Schema Definition (XSD).

An XML Schema:

● Defines elements that can appear in a document
● Defines attributes that can appear in a document
● Defines which elements are child elements
● Defines the number of child elements
● Defines whether an element is empty or can include text
● Defines data types for elements and attributes
● Defines default and fixed values for elements and attributes

Differences Between DTD and XML Schema

● A DTD uses its own small language to define the rules, whereas an XML Schema is itself an XML
document. XML Schema documents are more descriptive than DTDs.
● Both DTD and XML Schema let us declare complex types, but with a DTD the type name and the
element name must be the same, which is not required in XML Schema.
● A DTD cannot specify an exact number of occurrences for an element: the minimum can only be
0 or 1 and the maximum only 1 or unbounded (using ?, * and +). With XML Schema we can specify
the required minimum and maximum occurrences (minOccurs and maxOccurs).
● A DTD does not support the common data types (it treats numbers and everything else as text,
#PCDATA), whereas XML Schema lets us specify types such as xs:string, xs:integer, xs:decimal,
xs:double, xs:float, and xs:boolean.
● XML Schema supports namespaces. Since an XML Schema document is also an XML document, it can
be generated or written using any tool that supports XML.

XML Schemas are the Successors of DTDs

We think that very soon XML Schemas will be used in most Web applications as a replacement for DTDs.

Here are some reasons:

● XML Schemas are extensible to future additions
● XML Schemas are richer and more useful than DTDs
● XML Schemas are written in XML
● XML Schemas support data types
● XML Schemas support namespaces

XML Schema has Support for Data Types

One of the greatest strengths of XML Schema is the support for data types

With the support for data types:

● It is easier to describe permissible document content
● It is easier to validate the correctness of data
● It is easier to work with data from a database
● It is easier to define facets (restrictions on data)
● It is easier to define data patterns
● It is easier to convert data between different data types

XML Schemas Secure Data Communication:

When data is sent from a sender to a receiver, it is essential that both parties have the same "expectations"
about the content.

With XML Schemas, the sender can describe the data in a way that the receiver will understand.

Well-Formed is not enough

A well-formed XML document is a document that conforms to the XML syntax rules:

● Should begin with the XML declaration (optional in XML 1.0, but recommended)
● Must have one unique root element
● All start tags must match end tags
● XML tags are case sensitive
● All elements must be closed
● All elements must be properly nested
● All attribute values must be quoted
● XML entities must be used for special characters

Even if documents are well-formed they can still contain errors, and those errors can have serious
consequences. Think of this situation: you order 5 gross of laser printers, instead of 5 laser printers. With
XML Schema most of these errors can be caught by your validating software.

A simple XML Document

“note.xml”

<?xml version="1.0"?>
<note>
<to>Srinandhan</to>
<from>shashank</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note>

A simple DTD
This is a simple DTD file called "Note.dtd" that defines the elements of the XML document
above ("note.xml")

<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>

The <schema> Element:


The <schema> element is the root element of every XML Schema

Syntax:
<?xml version="1.0"?>
<xs:schema>
----
----
</xs:schema>

The <schema> element may contain some attributes. A schema declaration often looks something like
this:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com" xmlns="http://www.w3schools.com"
elementFormDefault="qualified">
--
---

</xs:schema>

A simple XML Schema


This is a simple XML Schema file called "Note.xsd" that defines the elements of the XML document
above ("note.xml")
"Note.xsd"
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com" xmlns="http://www.w3schools.com"
elementFormDefault="qualified">
<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>

A reference to an XML Schema:
<?xml version="1.0"?>
<note xmlns="http://www.w3schools.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3schools.com note.xsd">
<to>
Srinandhan
</to>
<from>Shashank</from>
<heading>Reminder</heading>
<body>Test</body>
</note>
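
Validation of a note against the schema can be sketched with the standard javax.xml.validation API. This is an illustration under assumptions: the Note.xsd content is inlined as a string instead of being read from a file, and the names NoteSchemaCheck and isValid are hypothetical.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class NoteSchemaCheck {

    static final String NS = "http://www.w3schools.com";

    // The Note.xsd schema from the text, inlined for a self-contained example.
    static final String XSD =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\""
            + " targetNamespace=\"" + NS + "\" xmlns=\"" + NS + "\""
            + " elementFormDefault=\"qualified\">"
            + "<xs:element name=\"note\"><xs:complexType><xs:sequence>"
            + "<xs:element name=\"to\" type=\"xs:string\"/>"
            + "<xs:element name=\"from\" type=\"xs:string\"/>"
            + "<xs:element name=\"heading\" type=\"xs:string\"/>"
            + "<xs:element name=\"body\" type=\"xs:string\"/>"
            + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Returns true when the document is valid against the schema above.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document (or the schema) was rejected
        } catch (Exception e) {
            throw new IllegalStateException("could not read input", e);
        }
    }

    public static void main(String[] args) {
        String good = "<note xmlns=\"" + NS + "\"><to>Srinandhan</to><from>Shashank</from>"
                + "<heading>Reminder</heading><body>Test</body></note>";
        // Well-formed, but <heading> is missing, so it is not valid.
        String bad = "<note xmlns=\"" + NS + "\"><to>Srinandhan</to><from>Shashank</from>"
                + "<body>Test</body></note>";
        System.out.println(isValid(good));
        System.out.println(isValid(bad));
    }
}
```

The second document illustrates the point that well-formed is not enough: it parses without error but is rejected by the schema.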

Namespace:

A namespace is used to make an element or attribute name unique.

This is most needed when elements from several markup languages are used in one document. If the
same element name exists in both markup languages, a small prefix can identify each element
uniquely by indicating which markup language it belongs to.

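
As a rough sketch of the idea, the document below uses the element name table from two different vocabularies and tells them apart by namespace. The two namespace URIs and the prefixes h and f are made-up values for illustration, as are the names NamespaceLookup and textOf.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NamespaceLookup {

    // Hypothetical namespace URIs for two vocabularies that both define <table>.
    static final String HTML_NS = "http://example.com/html";
    static final String FURNITURE_NS = "http://example.com/furniture";

    static final String XML = "<root xmlns:h=\"" + HTML_NS + "\""
            + " xmlns:f=\"" + FURNITURE_NS + "\">"
            + "<h:table>rows and cells</h:table>"
            + "<f:table>wooden table</f:table>"
            + "</root>";

    // Returns the text of the first element with the given namespace and local name.
    static String textOf(String xml, String namespace, String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for namespace lookups
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagNameNS(namespace, localName);
            return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(textOf(XML, HTML_NS, "table"));
        System.out.println(textOf(XML, FURNITURE_NS, "table"));
    }
}
```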
org.xml.sax:

Defines the basic SAX API

javax.xml.transform:

Defines the XSLT API that let you transform XML into other forms

DOM (Document Object Model)

● Is a specification of the W3C
● DOM parsers can work as validating parsers: they can validate against a DTD, and DOM Level 3
parsers also support XML Schema
● Is tree based, i.e. it follows a tree-based approach (it makes the complete object tree available to
the application)
● For each part of the XML document it prepares an object, and it constructs an object tree
representing the XML document; org.w3c.dom.Node is the topmost type for all the types
in the DOM specification
● These specifications are implemented by third-party vendors

You use the javax.xml.parsers.DocumentBuilderFactory class to get a
DocumentBuilder instance, and use that to produce a Document (a DOM) that conforms to the
DOM specification. The builder you get, in fact, is determined by the System property,
javax.xml.parsers.DocumentBuilderFactory, which selects the factory
implementation that is used to produce the builder. (The platform's default value can be
overridden from the command line.)

You can also use the DocumentBuilder newDocument() method to create an empty
Document that implements the org.w3c.dom.Document interface. Alternatively, you can use
one of the builder's parse methods to create a Document from existing XML data. The result
is a DOM tree like that shown in the diagram.
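
The newDocument() route can be sketched as follows. The example builds a small note in memory and writes it out with an identity transformer from javax.xml.transform; the class BuildNote, the method buildNote and the sample values are hypothetical choices for illustration.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class BuildNote {

    // Builds <note><to>..</to><from>..</from></note> in memory and
    // returns it serialized as a string.
    static String buildNote(String to, String from) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.newDocument(); // empty DOM document

            Element note = doc.createElement("note");
            doc.appendChild(note);

            Element toElement = doc.createElement("to");
            toElement.setTextContent(to);
            note.appendChild(toElement);

            Element fromElement = doc.createElement("from");
            fromElement.setTextContent(from);
            note.appendChild(fromElement);

            // A transformer created without a stylesheet copies the tree as-is.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not build note", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildNote("Tove", "Jani"));
    }
}
```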

Example:

Shop.dtd

<!ENTITY copyrights "copyrights Myshop 2012-2013">


<!NOTATION gif SYSTEM "image/gif">
<!ENTITY mylogo SYSTEM "shoplogo.gif" NDATA gif>
<!ELEMENT shop (item+,selected_items*,copy-rights)>
<!ELEMENT item (name,price,available_qtys)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ELEMENT available_qtys (#PCDATA)>
<!ELEMENT selected_items (discount?,gift*)>
<!ELEMENT discount (#PCDATA)>
<!ELEMENT gift EMPTY>
<!ELEMENT copy-rights (#PCDATA)>
<!ATTLIST shop logo ENTITY #IMPLIED>
<!ATTLIST item item_no ID #REQUIRED>
<!ATTLIST item type CDATA #IMPLIED>
<!ATTLIST discount units CDATA #REQUIRED>
<!ATTLIST price units (one|kg|meter) 'one' type NMTOKEN #IMPLIED>
<!ATTLIST selected_items item_no IDREFS #REQUIRED>
<!ATTLIST gift item IDREF #REQUIRED>
Shop.xml

<?xml version="1.0" encoding="utf-8"?>


<!DOCTYPE shop SYSTEM "shop.dtd">

<shop logo="mylogo">
<item item_no="i101" type="books">
<name>item1</name>
<price units="one" type="rs">400</price>
<available_qtys>20</available_qtys>
</item>
<selected_items item_no="i101">
<discount units="percentage">10</discount>
</selected_items>
<selected_items item_no="i102">
<gift item="i101"/>
</selected_items>
<copy-rights>&copyrights;</copy-rights>
</shop>

ReadShopXMLFile.java
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class ReadShopXMLFile {

    public static void main(String[] args) {
        try {
            // shop.xml and the shop.dtd it refers to must be in the working directory
            File fXmlFile = new File("shop.xml");
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
            Document doc = dBuilder.parse(fXmlFile);

            // Merge adjacent text nodes so that text content is read consistently
            doc.getDocumentElement().normalize();

            System.out.println("Root element :" + doc.getDocumentElement().getNodeName());

            // Collect every <item> element in the document
            NodeList nList = doc.getElementsByTagName("item");
            System.out.println("----------------------------");

            for (int temp = 0; temp < nList.getLength(); temp++) {
                Node nNode = nList.item(temp);
                System.out.println("\nCurrent Element :" + nNode.getNodeName());

                if (nNode.getNodeType() == Node.ELEMENT_NODE) {
                    Element eElement = (Element) nNode;

                    // item_no is an attribute; the other values are child elements
                    System.out.println("item_no : " + eElement.getAttribute("item_no"));
                    System.out.println("name : "
                            + eElement.getElementsByTagName("name").item(0).getTextContent());
                    System.out.println("price : "
                            + eElement.getElementsByTagName("price").item(0).getTextContent());
                    System.out.println("available_qtys : "
                            + eElement.getElementsByTagName("available_qtys").item(0).getTextContent());
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

XML Parsers
An XML parser is a software library or package that provides interfaces for client applications to
work with an XML document. The XML parser is designed to read the XML and create a way for
programs to use XML.

An XML parser checks that the document is well-formed; a validating parser also checks it
against a DTD or schema.

Let's understand the working of XML parser by the figure given below:

Types of XML Parsers


These are the two main types of XML Parsers:

1. DOM
2. SAX

DOM (Document Object Model)


A DOM document is an object which contains all the information of an XML document.
It is composed like a tree structure. The DOM Parser implements a DOM API. This API is
very simple to use.

Features of DOM Parser


A DOM Parser creates an internal structure in memory which is a DOM document object
and the client applications get information of the original XML document by invoking
methods on this document object.

DOM Parser has a tree based structure.

Advantages

1) It supports both read and write operations and the API is very simple to use.

2) It is preferred when random access to widely separated parts of a document is required.

Disadvantages

1) It is memory inefficient (it consumes more memory because the whole XML document
needs to be loaded into memory).

2) It is comparatively slower than other parsers.

SAX (Simple API for XML)


A SAX parser implements the SAX API. This API is event based and less intuitive than DOM.

Features of SAX Parser

It does not create any internal structure.

Clients do not call methods to fetch data; instead they override the callback methods of the API
and place their own code inside them.

It is an event-based parser; it works like an event handler in Java.

Advantages

1) It is simple and memory efficient.

2) It is very fast and works for huge documents.

Disadvantages
1) It is event-based so its API is less intuitive.

2) Clients never see the whole document at once, because the data is delivered in pieces as it is parsed.
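
The event-based style can be sketched with a small handler. This is an illustrative example only: it overrides the startElement callback of org.xml.sax.helpers.DefaultHandler to record element names in the order the parser reports them; the class SaxElementNames and the method elementNames are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxElementNames {

    // Returns the element names in the order their start tags are reported.
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        // The parser calls startElement for every start tag it reads;
        // no tree of the document is kept in memory.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                names.add(qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames(
                "<shop><item item_no=\"i101\"><name>item1</name><price>400</price></item></shop>"));
    }
}
```

Compared with the DOM example earlier in this unit, the handler only ever sees one event at a time, which is why SAX suits very large documents but makes it harder to relate widely separated parts of a document.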
