AdvancedJavaProgramming-SLIDES03-UNIT1-FP2005-Ver 1.0
(J2EE LC)
XML Parsers - Day 3
Course Objectives
Overview of XML
XML Document Type Definitions (DTDs)
XML Schemas
To understand the need for parsing XML documents
To understand types of XML Parsers
– Validating vs. Non-Validating Parsers
To understand different XML Parser Interfaces
– Tree Based Interface Standard : DOM
– Event Based Interface Standard : SAX
Evaluating Parsers
– Which parser to use?
ER/CORP/CRS/LA22/003
Copyright © 2005, Infosys 2
Technologies Ltd Version 1.00
Recap on XML
What is XML?
– eXtensible Markup Language (XML)
Uses of XML
– XML Data Buffers : Used to store the data
– Config Files : Describe the configuration of servers
– Example : The user configuration file for Tomcat Web Server (tomcat-users.xml)
Recap on XML
XML Document (address.xml)
<?xml version="1.0"?>                        <!-- XML declaration -->
<address>                                    <!-- root element: address -->
  <name>
    <first>John</first>                      <!-- nested elements: first, middle, last -->
    <middle>Fitzgerald Johansen</middle>
    <last>Doe</last>
  </name>
  <doornumber>2345</doornumber>
  <street>Kalidasa Road</street>
  <city>Mysore</city>
  <pin>570 002</pin>
  <telephone type="work">91-821-2404000</telephone>    <!-- attribute: type -->
  <telephone type="home">91-821-2404001</telephone>
  <telephone type="mobile">91-93424-
Namespaces in XML
Namespaces help to differentiate two objects (XML data) of the same name
Without a namespace prefix:
<table>
  <name>Coffee Table</name>
  <width>80</width>
  <length>120</length>
</table>
With the wood: namespace prefix:
<wood:table>
  <wood:name>Coffee Table</wood:name>
  <wood:width>80</wood:width>
  <wood:length>120</wood:length>
</wood:table>
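A minimal Java sketch of how a namespace-aware parser tells the two kinds of table element apart. The namespace URI bound to the wood: prefix and the enclosing shop element are assumptions made for this illustration; the slide does not show them.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class NamespaceSketch {
    // Hypothetical namespace URI, chosen only for this illustration.
    static final String WOOD_NS = "http://example.com/wood";

    // Hypothetical wrapper element <shop> holding one table of each kind.
    static final String XML =
        "<shop xmlns:wood='" + WOOD_NS + "'>"
      + "<wood:table><wood:name>Coffee Table</wood:name></wood:table>"
      + "<table><name>Coffee Table</name></table>"
      + "</shop>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Without this flag the parser treats "wood:table" as a plain name.
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Counts elements with local name "table" in the given namespace
    // ("*" matches every namespace, including none).
    static int countTables(Document doc, String namespaceUri) {
        return doc.getElementsByTagNameNS(namespaceUri, "table").getLength();
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        System.out.println("wood tables: " + countTables(doc, WOOD_NS));
        System.out.println("all tables:  " + countTables(doc, "*"));
    }
}
```

Both elements have the local name table, but only one of them belongs to the wood namespace, which is what lets an application pick the right one.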
Document Type Definitions (DTDs)
Describes syntax that explains
– which elements may appear in the XML document
– what are the element contents and attributes
Internal DTD
DTD embedded in the XML document
– The declarations appear between [ and ]
– E.g. AddressBook.xml
External DTD
Anatomy of DTD – Defining new XML tags
(Elements)
<!ELEMENT element_name content_specification>
– element_name: Specifies name of the XML tag
• Nested elements
Example:
– <!ELEMENT Street (#PCDATA)>
• element Street contains the parsed character Data
Anatomy of DTD – Attribute Declarations
– Attr-Name : Name of the attribute, the attribute is defined for element Tag-Name
Example
– <!ATTLIST Name salutation CDATA #REQUIRED>
Anatomy of DTD – Entity Declarations (1 of 2)
Example
– <State>Jammu &amp; Kashmir</State> (AddressBook1.xml)
Anatomy of DTD – Entity Declarations(2 of 2)
Example
– <!ENTITY MyCountry "India">
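A small Java sketch of what this entity declaration means to a parser: every &MyCountry; reference is replaced by the text India while parsing. The Address and Country element names are assumptions for this illustration only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class EntitySketch {
    // Internal DTD declaring the entity, followed by a document that uses it.
    // <Address> and <Country> are hypothetical element names.
    static final String XML =
        "<?xml version=\"1.0\"?>"
      + "<!DOCTYPE Address [ <!ENTITY MyCountry \"India\"> ]>"
      + "<Address><Country>&MyCountry;</Country></Address>";

    // Returns the text of <Country> after the parser has resolved the entity.
    static String country() {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(XML)));
            return doc.getElementsByTagName("Country").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(country());
    }
}
```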
XML Schema
What is XML Schema?
– An XML vocabulary for expressing your data's structure and business rules
– Validating parsers can use Schema to check whether XML data adheres to rules in schema
– More robust and extensive than DTD, can do even data type validations
– The Subject must be any valid subject from the list (PF, CHSSC, RDBMS, IWT, AOA)
– The Marks must be between 0 and 100 only and Grade can be either A or B or C
XML Schema : Validating the XML Document
Validating your Data (XML Document)
<Result>
  <Name>Kiran</Name>
  <EmpNo>45609</EmpNo>
  <Subject>
    <Name>CHSSC</Name>
    <Marks>80</Marks>
    <Grade>A</Grade>
  </Subject>
</Result>
[Diagram: XML Schema validation of the document above reports "Data is Ok!"]
How can XML schema help to accomplish this?
Answer
– It creates an XML vocabulary : it defines the following set of elements
• <Result>, <Subject>, <Marks>, <Grade>
• <Subject> must be one of the valid subjects (CHSSC, PF, RDBMS, AOA, IWT)
– It is not an actual URL, but uses URL syntax and should be a unique string
Example of referring to Schema (Result.xml)
<?xml version="1.0" encoding="UTF-8"?>
<res:Result xmlns:res="http://www.Results.com"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://www.Results.com Result.xsd">
  <res:Name>Kiran</res:Name>
  <res:EmpNo>45609</res:EmpNo>
  <res:Subject>
    <res:Name>CHSSC</res:Name>
    <res:Marks>80.70</res:Marks>
    <res:Grade>A</res:Grade>
  </res:Subject>
  <res:Subject>
    <res:Name>PF</res:Name>
    <res:Marks>78.30</res:Marks>
    <res:Grade>B+</res:Grade>
  </res:Subject>
</res:Result>
Schema example : Result.xsd
<xsd:simpleType name="NameType">
  <xsd:restriction base="xsd:string">
    <xsd:pattern value="CHSSC|PF|RDBMS|IWT|AOA"/>
  </xsd:restriction>
</xsd:simpleType>
[ Continued ... ]
Schema example : Result.xsd (Continued ……)
<xsd:complexType name="SubjectType">
  <xsd:sequence>
    <xsd:element name="Name" type="NameType"/>
    <!-- Reference to the element Marks -->
    <xsd:element ref="Marks"/>
    <xsd:element name="Grade">
      <xsd:simpleType>
        <xsd:restriction base="xsd:string">
          <!-- "+" is a regex quantifier, so the literal grade B+ is written B\+ -->
          <xsd:pattern value="A|B\+|B|C|D"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:element>
  </xsd:sequence>
</xsd:complexType>
<xsd:element name="Marks">
  <xsd:simpleType>
    <xsd:restriction base="xsd:float">
      <xsd:minInclusive value="0.0"/>
      <xsd:maxInclusive value="100.0"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:element>
</xsd:schema>
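The Marks and Grade rules can be tried out from Java with the javax.xml.validation API. The sketch below is an illustration, not the course's own code: it uses a cut-down schema with no target namespace, keeps only the Marks range and the Grade pattern, and wraps them in a Subject element. The plus sign is escaped in the pattern because + is a quantifier in XML Schema regular expressions.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class MarksSchemaSketch {
    // Cut-down schema for this sketch: Marks in [0, 100], Grade from a fixed list.
    static final String XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
      + "<xsd:element name='Subject'><xsd:complexType><xsd:sequence>"
      + "<xsd:element name='Marks'><xsd:simpleType>"
      + "<xsd:restriction base='xsd:float'>"
      + "<xsd:minInclusive value='0.0'/><xsd:maxInclusive value='100.0'/>"
      + "</xsd:restriction></xsd:simpleType></xsd:element>"
      + "<xsd:element name='Grade'><xsd:simpleType>"
      + "<xsd:restriction base='xsd:string'>"
      + "<xsd:pattern value='A|B\\+|B|C|D'/>"
      + "</xsd:restriction></xsd:simpleType></xsd:element>"
      + "</xsd:sequence></xsd:complexType></xsd:element>"
      + "</xsd:schema>";

    // Returns true when the document satisfies the schema above.
    static boolean isValid(String xml) {
        try {
            SchemaFactory sf =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = sf.newSchema(new StreamSource(new StringReader(XSD)));
            // With no error handler set, a validation error is thrown as SAXException.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<Subject><Marks>80.70</Marks><Grade>A</Grade></Subject>"));
        System.out.println(isValid("<Subject><Marks>120</Marks><Grade>A</Grade></Subject>"));
    }
}
```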
Result.xml : Understanding XML Declaration
XML Data
Result.xml : Understanding Structure of XML Data
  </res:Subject>
  <res:Subject>                     <!-- PF result -->
    <res:Name>PF</res:Name>
    <res:Marks>45.30</res:Marks>
    <res:Grade>D</res:Grade>
  </res:Subject>
</res:Result>
All elements prefixed with res: are defined in the www.Results.com namespace
Understanding XML Schema
<?xml version="1.0" encoding="UTF-8"?>
<!-- All the elements prefixed with xsd are defined in the www.w3.org/../... namespace -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.Results.com"
            xmlns="http://www.Results.com" elementFormDefault="qualified">
  <!-- Define element Result -->
  <xsd:element name="Result">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Name" type="xsd:string"/>
        <xsd:element name="EmpNo" type="xsd:int"/>
        <xsd:element name="Subject" type="SubjectType" maxOccurs="5"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <!-- All the elements defined here are part of this "targetNamespace" -->
  <xsd:complexType name="SubjectType">
    ...
    ...
  </xsd:complexType>
</xsd:schema>
DTD vs Schema
– Schema supports data types such as integer and string, while the DTD does not
Element Declarations: Simple Element
Syntax :
<xsd:element name="Element_name" type="Element_type" Occurrence/>
Example :
– <xsd:element name="Name" type="xsd:string"/>
• Defines the element Name of type string
– <xsd:element name="Marks" type="xsd:float" maxOccurs="5"/>
• Defines the element Marks of simple type float
• Marks may appear a maximum of 5 times
• And, by default, a minimum of 1 time
Element Declarations
Syntax :
<xsd:element name="Element_name">
  <xsd:complexType>
    <!-- Element Specification -->
  </xsd:complexType>
</xsd:element>
– Example
<xsd:element name="Subject">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="Name" type="xsd:string"/>
      <xsd:element name="Marks" type="xsd:float"/>
      <xsd:element name="Grade" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
• Defines a non-reusable complex element called 'Subject'
Element Declarations: Reusable Complex Type
Syntax
<xsd:complexType name="Type_name">
– Defines the reusable type Type_name
Example
<xsd:complexType name="SubjectType">
  <xsd:sequence>
    <xsd:element name="Name" type="xsd:string"/>
    <xsd:element name="Marks" type="xsd:int"/>
    <xsd:element name="Grade" type="xsd:string"/>
  </xsd:sequence>
</xsd:complexType>
– Defines the reusable complex element type SubjectType
– Comprises the following elements in the sequence specified (<xsd:sequence> tag)
• Name
• Marks
• Grade
This type can be used to define elements in your XML
<xsd:element name="Subject" type="SubjectType"/>
Defining the Attributes
Syntax : <xsd:attribute name="Attr_Name" type="Attr_Type"/>
– Example
<xsd:attribute name="Project" type="xsd:string"/>
Anatomy of XML Schema : Constraints specification
Controls occurrence of individual element or group of elements
Types of constraints
• <choice> : allows only one of the listed elements to appear
• <sequence> : elements must appear in the same order as they are declared
• <all> : elements can occur in any order, each at most once
<choice> constraint
– E.g.:
<xsd:choice>
  <xsd:element name="first"/>
  <xsd:element name="last"/>
</xsd:choice>
• Allows either the first or the last name to be used in the instance XML Document
<sequence> constraints
– E.g.:
<xsd:sequence>
  <xsd:element name="Name" type="xsd:string"/>
  <xsd:element name="EmpNo" type="xsd:int"/>
  <xsd:element name="Subject" type="SubjectType" maxOccurs="5"/>
</xsd:sequence>
• All elements must appear in the defined order only
Anatomy of XML Schema : Constraints specification
<all> constraints
– E.g. :
<xsd:all>
  <xsd:element name="invoice"/>
  <xsd:element name="purchaseOrder"/>
  <xsd:element name="mailingLabel"/>
</xsd:all>
• Each element appears at most once (an element is optional only when declared with minOccurs="0")
• Elements may appear in any order
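A hedged Java sketch that checks how an <all> group behaves under validation. The wrapper element documents, the xsd:string types and the minOccurs="0" on mailingLabel are assumptions added for this illustration; only the three element names come from the slide.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class AllGroupSketch {
    // Hypothetical <documents> wrapper: invoice and purchaseOrder are required
    // (default minOccurs is 1), mailingLabel is explicitly optional.
    static final String XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
      + "<xsd:element name='documents'><xsd:complexType><xsd:all>"
      + "<xsd:element name='invoice' type='xsd:string'/>"
      + "<xsd:element name='purchaseOrder' type='xsd:string'/>"
      + "<xsd:element name='mailingLabel' type='xsd:string' minOccurs='0'/>"
      + "</xsd:all></xsd:complexType></xsd:element>"
      + "</xsd:schema>";

    // Returns true when the document satisfies the <all> group above.
    static boolean isValid(String xml) {
        try {
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)))
                .newValidator()
                .validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // validation error
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Order does not matter inside <all>.
        System.out.println(isValid("<documents><purchaseOrder/><invoice/></documents>"));
        // A required member is missing.
        System.out.println(isValid("<documents><invoice/></documents>"));
    }
}
```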
XML Parsers
XML Parser : The Big Picture
[Fig. 1 : Usage of the XML Parser. The parser reads the XML document together with its DTD / Schema and passes the parsed data to the application through APIs.]
Need for Parser
• An application using a parser can access the data in XML by going through the hierarchy or by using tag names
Types of XML Parsers
Validating Parser
– a parser that verifies that the XML document adheres to the DTD or Schema
Non-Validating Parser
– a parser that does not verify the XML document against the DTD or Schema
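The difference can be seen with a document that is well-formed but breaks its own DTD. The sketch below is a minimal illustration using the JAXP DOM factory; the Name, First and Nickname elements are made up for the example, and the anonymous error handler simply collects what the parser reports.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

final class ValidatingSketch {
    // Well-formed XML, but <Nickname> is not allowed by the internal DTD.
    static final String XML =
        "<?xml version=\"1.0\"?>"
      + "<!DOCTYPE Name [ <!ELEMENT Name (First)> <!ELEMENT First (#PCDATA)> ]>"
      + "<Name><Nickname>JD</Nickname></Name>";

    // Parses XML and returns the warnings and errors reported by the parser.
    static List<String> parse(boolean validating) {
        final List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(validating); // DTD validation on or off
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) {
                    problems.add("warning: " + e.getMessage());
                }
                @Override public void error(SAXParseException e) {
                    problems.add("error: " + e.getMessage()); // validity errors land here
                }
                @Override public void fatalError(SAXParseException e) throws SAXParseException {
                    throw e; // not well-formed: parsing cannot continue
                }
            });
            builder.parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println("non-validating: " + parse(false));
        System.out.println("validating:     " + parse(true));
    }
}
```

The non-validating run accepts the document because it is well-formed; the validating run reports the DTD violations through error().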
XML Parser Interfaces
JAXP
– “Java API for XML Processing”
It supports both
– Tree Based Parser : DOM
– Event Based Parser : SAX
DOM Parser
<Result>
<Name>Kiran</Name>
<EmpNo>45609</EmpNo>
<Subject>
<Name>CHSSC</Name>
<Marks>80</Marks>
<Grade>A</Grade>
</Subject>
</Result>
DOM Parser
[Figure: DOM tree for the document]
Result (element node)
  Name (element node)
    Kiran (text node)
  EmpNo (element node)
    45609 (text node)
DOM Parser
An application can navigate through the tree to find the desired pieces of
document
Document Object Model (DOM) is the standard for Tree Based parsing of
XML document
Document Object Model (DOM)
The Document Object Model (DOM) is a set of interfaces defined by the W3C
DOM Working Group
DOM is the tree based interface used by the programmers to manipulate the
XML document
DOM Parser represents the logical Model of the XML document in the memory
All the entity references are expanded before the DOM tree is constructed
DOM Structure representing XML
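[Figure: DOM structure with the Document node at the root and the Result element node below it; the tree also contains a Comment node and the element nodes Name, Marks and Grade, whose text nodes hold CHSSC, 80.0 and A.]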
Document Object Model (DOM) : Overview
The child nodes of the Document node are : Element nodes, Comment nodes, etc.
– Example : Name, Subject, EmpNo, etc are all Child Nodes
All the nodes in the XML Document are derived from interface :
org.w3c.dom.Node
The Big picture : Parsing the XML Document
Document builder factory creates an instance of parser with required characteristics
– Whether the parser should be validating parser or not
– Whether namespace support required or not, Whether to ignore the white spaces between the elements or
not
Factory hides the implementation details of the parser and gives a standard DOM interface for
parsing XML
– (Analogous to JDBC driver)
DomApp.java : Parsing XML Document using DOM Parser
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class DomApp {
    public static void main(String[] argv) {
        // MyErrorHandler: application-specific org.xml.sax.ErrorHandler, defined separately
        MyErrorHandler hErr;
        Document hDocument;
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);
        factory.setNamespaceAware(true);
        try {
            hErr = new MyErrorHandler();
            DocumentBuilder hBuilder = factory.newDocumentBuilder();
            // Set the error handler
            hBuilder.setErrorHandler(hErr);
            hDocument = hBuilder.parse(new File("Result.xml"));
        } catch (Exception e) {
            // Handle exception if generated during parsing
            e.printStackTrace();
        }
    } // End of function main
}
Parsing the XML Document using DOM Parser
Step 1: Get the instance of document-builder factory.
This will be used to produce the DOM-parser (called DocumentBuilder)
DocumentBuilderFactory factory =
DocumentBuilderFactory.newInstance();
Step 2: Set the properties of the DOM parser to be produced
a. It should validate the XML Document against the Schema / DTD
b. It should be namespace aware
factory.setValidating(true);
factory.setNamespaceAware(true);
Step 3 : Obtain the instance of the MyErrorHandler class
This instance handles the error generated during parsing, in application specific way
hErr = new MyErrorHandler();
Step 4: Obtain the instance of DOM parser, and register the error handler
This will be used to parse the XML Document and creates the memory based tree
representation of the XML Document
DocumentBuilder hBuilder = factory.newDocumentBuilder();
hBuilder.setErrorHandler(hErr);
Step 5 : Parse the XML Document (Result.xml) using the parser created as above
hDocument = hBuilder.parse(new File("Result.xml"));
DOM : Exploring the org.w3c.dom.Node Interface
The Node interface is the root of DOM Core class hierarchy
This interface can be used to extract information from any DOM object without
knowing the actual type of the underlying node (e.g. Element node, Text node,
Attr node etc.)
DOM : Important Methods of Node interface
Methods to retrieve the various information from the XML DOM Tree
• Node getFirstChild() : Returns the first child of the current node
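A short Java sketch of navigating with the Node interface alone. It uses getFirstChild() together with getNextSibling(), getNodeType(), getNodeName() and getNodeValue(), all standard org.w3c.dom.Node methods; the helper name firstChildText is made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class NodeWalkSketch {
    static final String XML =
        "<Result><Name>Kiran</Name><EmpNo>45609</EmpNo></Result>";

    // Hypothetical helper: returns the text of the first child element of the
    // root whose tag name matches, or null if there is no such element.
    static String firstChildText(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Node root = doc.getDocumentElement();
            // Walk the children of the root through the generic Node interface.
            for (Node child = root.getFirstChild(); child != null;
                     child = child.getNextSibling()) {
                if (child.getNodeType() == Node.ELEMENT_NODE
                        && child.getNodeName().equals(tag)) {
                    Node text = child.getFirstChild(); // the text node, if any
                    return text == null ? "" : text.getNodeValue();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String sVal = firstChildText(XML, "Name");
        System.out.println("sVal = " + sVal);
    }
}
```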
Using Node Interface
[Figure: reading the text node under the Name element through the Node interface gives sVal = "Kiran"]
XML Parser Interfaces : Event Based Interface
– Example
<Result>
<Name>Kiran</Name>
<EmpNo>45609</EmpNo>
<Subject>
<Name>CHSSC</Name>
<Marks>80</Marks>
<Grade>A</Grade>
</Subject>
</Result>
XML Parser Interfaces : Event Generated
– startElement : Result
– startElement : Name
– contents : Kiran
– endElement : Name
– startElement : EmpNo
– contents : 45609
– endElement : EmpNo
– endElement : Result
XML Parser Interfaces : Event Based Interface
Your application intercepts these events, and handles them in any way you
want
Application does not wait till the entire document gets parsed
Application has to maintain the information from XML document within local
data-structures till it is processed completely
Simple API for XML (SAX) is the standard for Event Based parsing of XML
document
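A minimal Java sketch of an event handler that records the events a SAX parser generates, using the same labels as the event list for the Result document. The class and method names are made up for this illustration; text is buffered because a parser is allowed to deliver it in more than one characters() call.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class SaxEventSketch {
    // Parses the XML string and returns one log line per SAX event.
    static List<String> events(String xml) {
        final List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            // Emits buffered text as a single "contents" event, skipping whitespace.
            private void flushText() {
                String s = text.toString().trim();
                if (!s.isEmpty()) {
                    log.add("contents : " + s);
                }
                text.setLength(0);
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                flushText();
                log.add("startElement : " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                flushText();
                log.add("endElement : " + qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        String xml = "<Result><Name>Kiran</Name><EmpNo>45609</EmpNo></Result>";
        for (String event : events(xml)) {
            System.out.println(event);
        }
    }
}
```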
SAXApp.java : Parsing XML Document using SAX Parser
import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

public class SAXApp {
    public static void main(String[] argv) {
        // Get the instance of the parser event handling class
        DefaultHandler handler = new Handler();
        // Get the instance of SAXParserFactory
        SAXParserFactory factory = SAXParserFactory.newInstance();
        try {
            // Set the properties of the parser to be obtained
            factory.setValidating(true);
            factory.setNamespaceAware(true);
            // Get the new SAX Parser
            SAXParser saxParser = factory.newSAXParser();
            // Parse the file
            // handler : processes events generated during parsing
            saxParser.parse(new File("Result.xml"), handler);
        } catch (Throwable t) {
            // Handle any exceptions generated during parsing
            t.printStackTrace();
        }
    } // End of function main
}
SAXApp.java : Parsing XML Document using SAX Parser
class Handler extends DefaultHandler {
Understanding The Simple API for XML (SAX)
Step 1: Get the instance of SAXParserFactory
This instance is used to obtain the SAX Parser
SAXParserFactory factory = SAXParserFactory.newInstance();
Step 2: Get the instance of the event handler class
This class handles all the events generated by parser
DefaultHandler handler = new Handler();
Step 3: Set the properties of the parser to be obtained
a. It should validate the XML Document against the Schema / DTD
b. It should be namespace aware
factory.setValidating(true);
factory.setNamespaceAware(true);
Step 4 : Obtain the instance of the SAX Parser using the factory just obtained
SAXParser saxParser = factory.newSAXParser();
Step 5: Parse the Result.xml file using the SAX Parser obtained as above
Events generated during parsing will be handled by object handler
saxParser.parse(new File("Result.xml"), handler);
The Big picture : Parsing the XML Document using SAX
org.xml.sax Interfaces
org.xml.sax.helpers.DefaultHandler Class
– Provides default (do-nothing) implementations of all the event callbacks
org.xml.sax.ContentHandler Interface
– Receives notification of the logical content of a document
– Also defines the method characters(), which is invoked when the parser encounters
text in an XML element
org.xml.sax Interfaces
org.xml.sax.ErrorHandler Interface
– Allows SAX application to do customized error handling
– The parser will then report all errors and warnings through this interface
– Important Methods
• void error() : receives the notification of recoverable error
Evaluating Parsers : SAX vs. DOM
SAX
– Advantage
• It is good when serial processing of the document is required and document is very large
– Disadvantage
• Requires the application to keep parts of the XML document in its own data structures until processing is
finished; this extra work is rarely worthwhile for small XML documents.
DOM
– Advantage
• Supports DOM Tree Traversing methods
– Disadvantage
• For large XML documents (size in GBs) it requires much more memory than parsing the same document
with a SAX parser, because the whole tree is held in memory.
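To make the trade-off concrete, here is a hedged Java sketch of the SAX style: it adds up all Marks values in a single pass and keeps only a running total, never a tree of the document. The class name and the sample data are assumptions for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class MarksTotalSketch {
    // Sums the content of every <Marks> element while streaming through the XML.
    static double totalMarks(String xml) {
        final double[] total = {0.0}; // the only state kept across events
        DefaultHandler handler = new DefaultHandler() {
            private boolean inMarks;
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (qName.equals("Marks")) {
                    inMarks = true;
                    text.setLength(0);
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inMarks) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (qName.equals("Marks")) {
                    total[0] += Double.parseDouble(text.toString().trim());
                    inMarks = false;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return total[0];
    }

    public static void main(String[] args) {
        String xml = "<Result>"
            + "<Subject><Name>CHSSC</Name><Marks>80</Marks></Subject>"
            + "<Subject><Name>PF</Name><Marks>78</Marks></Subject>"
            + "</Result>";
        System.out.println("Total marks: " + totalMarks(xml));
    }
}
```

With DOM the same task would be a tree walk after parsing; the SAX version needs the inMarks flag and the text buffer instead, which is the kind of application-side bookkeeping the comparison refers to.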