WT U3 Key
UNIT III
PART B
1. (i) List and explain the XML syntax rules in detail. (7)
The basic XML syntax rules are as follows (these are covered only briefly
here, since they all should be familiar from our earlier study of XHTML):
FIGURE 7.29 Browser rendering of XML document from Figure 7.28 after XSLT
transformation has been applied
2. (i) Explain the role of XML name spaces with examples. (7)
An XML vocabulary is created by specifying a complete description of the elements
and attributes for a specific type of XML document. (An XML vocabulary is also
sometimes referred to as an XML application)
The W3C’s XML Namespace recommendation [W3C-XML-NAMESPACE-1.1]
provides a mechanism for identifying each element and attribute name
within a document with a specific XML vocabulary. An XML namespace is a
collection of element and attribute names associated with a particular XML
vocabulary (such as XHTML) through an absolute URI known as the
namespace name.
An example namespace name is http://www.w3.org/1999/xhtml, which is
specified as the namespace name for XHTML 1.0 by its recommendation
[W3C-XHTML-1.0]. We have seen this namespace name frequently in earlier
examples. In fact, XHTML 1.0 requires that every XHTML document be
associated with this namespace name by including an xmlns attribute
specification in the root element of the document:
<html xmlns="http://www.w3.org/1999/xhtml">
The meaning of the xmlns attribute is defined by the XML Namespace
recommendation.
When specified on a root element as shown, the xmlns attribute specifies a default
namespace for the entire document. So, in an XHTML document, the xmlns
specification indicates that all element type names within the document—including
html—belong by default to the XML namespace having namespace name
http://www.w3.org/1999/xhtml. Specifying a default namespace has no effect on
attributes, which belong to no namespace unless explicitly associated with a
namespace via the prefix mechanism described next.
Embedded elements, such as XHTML elements within an RSS document, must be
explicitly marked as belonging to their own namespace.
First, the document must associate a namespace prefix with the namespace
containing the embedded element types. This is done using a special form of
the xmlns attribute. For example, in an RSS document we might associate
the namespace prefix xhtml with the namespace name
http://www.w3.org/1999/xhtml as follows:
<rss version="0.91" xmlns:xhtml="http://www.w3.org/1999/xhtml">
An xmlns attribute specification of this form is called a namespace declaration.
Once a namespace has been associated with a namespace prefix through a
namespace declaration, we can mark any element (or attribute) name as
belonging to the namespace by preceding the name with the prefix. For
example, in an RSS document containing the given namespace declaration,
we can mark an a element as belonging to the XHTML namespace as follows
(notice that both the start and end tags contain the namespace prefix):
<item>
<title>Announcing a Sibling Site!</title>
<link>http://www.example.org/</link>
<description>Were you aware that
<xhtml:a href="example.com">example.com</xhtml:a>
is not the only site in the example family?</description>
</item>
The XML namespace concept was defined some time after XML itself was initially
defined and, as we have noted, is covered in a separate W3C recommendation.
Therefore, some software may fully comply with the XML recommendation and yet
not be namespace aware. In software that is namespace-aware, all element and
attribute names (except for xmlns attributes) are referred to as qualified names,
whether or not they are prefixed.
Associated with each qualified name is an expanded name, which is a pair
consisting of a namespace name and a local name. The local name is just the
qualified name with any namespace prefix removed; for example, in the qualified
name xhtml:a, a is the local name. In typical Java
XML software, if a qualified name is not in any namespace, then the namespace-
name component of its expanded name is represented by null.
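The relationship between qualified names and expanded names can be checked with a short Java sketch. It assumes the JAXP DOM parser bundled with the JDK; the class name ExpandedNameDemo and the shortened RSS fragment are illustrative and are not taken from the figures. The parser must be made namespace-aware explicitly, since JAXP factories are not namespace-aware by default.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

/** Prints expanded names (namespace name plus local name) for two elements. */
class ExpandedNameDemo {

    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    /** Shortened, illustrative RSS fragment with one embedded XHTML element. */
    static final String RSS =
        "<rss version=\"0.91\" xmlns:xhtml=\"" + XHTML_NS + "\">"
        + "<description>See <xhtml:a href=\"http://example.com/\">example.com</xhtml:a>"
        + "</description></rss>";

    /** Parses a string with a namespace-aware DOM parser. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default in JAXP
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    /** Formats an expanded name as {namespace-name}local-name. */
    static String expandedName(Element element) {
        return "{" + element.getNamespaceURI() + "}" + element.getLocalName();
    }

    public static void main(String[] args) {
        Document doc = parse(RSS);
        Element rss = doc.getDocumentElement();
        Element anchor = (Element) doc.getElementsByTagNameNS(XHTML_NS, "a").item(0);
        System.out.println(expandedName(rss));    // rss is in no namespace
        System.out.println(expandedName(anchor)); // a is in the XHTML namespace
    }
}
```

Run as a program, this should print {null}rss followed by {http://www.w3.org/1999/xhtml}a: the unprefixed rss element has a null namespace-name component, while the prefixed a element carries the XHTML namespace name.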
It should be noted that if you design an XML vocabulary and wish to conform with
the XML namespace recommendation, the element and attribute names you define
must not use colons (for obvious syntactic reasons). Similarly, an XML document
conforming with the XML Namespace recommendation may not contain colons in prefix names or in the
values specified for attributes of the XML data type ID.
3. Write XSLT code to display employee details, which are stored in XML, in table form.
(13)
<?xml version = "1.0"?>
<class>
<employee id = "001">
<firstname>Aryan</firstname>
<lastname>Gupta</lastname>
<nickname>Raju</nickname>
<salary>30000</salary>
</employee>
<employee id = "024">
<firstname>Sara</firstname>
<lastname>Khan</lastname>
<nickname>Zoya</nickname>
<salary>25000</salary>
</employee>
<employee id = "056">
<firstname>Peter</firstname>
<lastname>Symon</lastname>
<nickname>John</nickname>
<salary>10000</salary>
</employee>
</class>
Define an XSLT stylesheet document for the above XML document. You
should follow the criteria given below:
Create the XSLT document which satisfies the above requirements. Name
it as employee.xsl and save it in the same location of employee.xml.
Employee.xsl
Updated "employee.xml"
Output:
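As a hedged illustration of the task, the sketch below applies one possible stylesheet to the employee data through the JAXP transformation API and returns the resulting HTML table. The stylesheet text, the column headings, the shortened two-employee input, and the class name EmployeeTableDemo are all assumptions; this is not the original employee.xsl.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Renders employee records as an HTML table using XSLT. */
class EmployeeTableDemo {

    /** Shortened copy of the employee document (two of the three records). */
    static final String EMPLOYEE_XML =
        "<class>"
        + "<employee id=\"001\"><firstname>Aryan</firstname><lastname>Gupta</lastname>"
        + "<nickname>Raju</nickname><salary>30000</salary></employee>"
        + "<employee id=\"024\"><firstname>Sara</firstname><lastname>Khan</lastname>"
        + "<nickname>Zoya</nickname><salary>25000</salary></employee>"
        + "</class>";

    /** Hypothetical stylesheet: one table row per employee element. */
    static final String EMPLOYEE_XSL =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"html\" indent=\"no\"/>"
        + "<xsl:template match=\"/\">"
        + "<html><body><h2>Employees</h2><table border=\"1\">"
        + "<tr><th>ID</th><th>First Name</th><th>Last Name</th>"
        + "<th>Nick Name</th><th>Salary</th></tr>"
        + "<xsl:for-each select=\"class/employee\">"
        + "<tr><td><xsl:value-of select=\"@id\"/></td>"
        + "<td><xsl:value-of select=\"firstname\"/></td>"
        + "<td><xsl:value-of select=\"lastname\"/></td>"
        + "<td><xsl:value-of select=\"nickname\"/></td>"
        + "<td><xsl:value-of select=\"salary\"/></td></tr>"
        + "</xsl:for-each>"
        + "</table></body></html>"
        + "</xsl:template></xsl:stylesheet>";

    /** Applies the stylesheet to the document and returns the output text. */
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(EMPLOYEE_XML, EMPLOYEE_XSL));
    }
}
```

In a browser, the same effect is normally obtained by adding an xml-stylesheet processing instruction to employee.xml that points at employee.xsl.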
3. (i) Explain in detail about XSL.(7)
A different type of transformation extracts information from one XML text document and
uses that information to create another XML text document.
The “programming language” used to direct this type of transformation is the Extensible
Stylesheet Language (XSL). XSL is an XML vocabulary; that is, XSL documents are well-
formed XML documents. So the transformation “program” is an XML document. An XSL
document normally contains two types of information: template data, which is text that is
copied to the output XML text document with little or no change; and XSL markup, which
controls the transformation process.
An example XSL document is shown in Figure 7.12. The first thing to notice is that
it uses two namespaces. http://www.w3.org/1999/XSL/Transform is the namespace name
for the XSL namespace, and of course http://www.w3.org/1999/xhtml is the XHTML
namespace name. So, in this document, XHTML is the default namespace, and elements
prefixed with xsl belong to the XSL namespace. Elements belonging to the XSL namespace
are part of the XSL “programming language” and direct the transformation, while elements
in other namespaces form part of the template data of the XSL document.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">
  <xsl:template match="/">
    <html>
      <head>
        <title>
          HelloWorld.xsl (transformed)
        </title>
      </head>
      <body>
        <p><xsl:value-of select="child::message" /></p>
      </body>
    </html>
  </xsl:template>
</xsl:transform>
FIGURE 7.12 An example XSL document HelloWorld.xsl
The Java DOM API specifies a number of interfaces that correspond to DOM objects and
classes in the context of JavaScript, such as Node, Document.
Let’s consider the Java program of Figure 7.7. This program performs the following task:
input from a user-specified file an RSS document such as the one shown in Figure 7.1, and
output the number of link elements contained in the input document. So, for example, if the
example RSS document of Figure 7.1 was contained in a file named
ExampleContentFeed.xml and the Java program was named DOMCountLinks, then a run of
the program might look like this (user input is italicized):
$ java DOMCountLinks ExampleContentFeed.xml
Input document has 3 'link' elements.
The heart of this program consists of the three statements
Document document = parser.parse(new File(args[0]));
NodeList links = document.getElementsByTagName("link");
System.out.println("Input document has " +links.getLength() +" 'link' elements.");
// JAXP classes
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
// DOM classes
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
// JDK classes
import java.io.File;

/** Count the number of link elements in an RSS 0.91 document */
class DOMCountLinks {
    /** Main program does it all */
    public static void main(String[] args) {
        try {
            // JAXP-style initialization of DocumentBuilder
            // (XML parser that builds DOM from document)
            DocumentBuilderFactory docBuilderFactory =
                DocumentBuilderFactory.newInstance();
            DocumentBuilder parser = docBuilderFactory.newDocumentBuilder();
            // Parse XML document from file given by first argument
            // into a DOM Document object
            Document document = parser.parse(new File(args[0]));
            // Process the Document object using the Java API version of
            // the W3C DOM
            NodeList links = document.getElementsByTagName("link");
            System.out.println("Input document has " +
                               links.getLength() +
                               " 'link' elements.");
        }
        catch (Exception e) {
            e.printStackTrace();
        }
        return;
    }
}
FIGURE 7.7 DOM-based program for displaying the number of links in an RSS 0.91 document
The first of these statements opens the file specified by the first command-line argument and parses
the XML document contained in this file, producing a Document object that is the Java counterpart
to window.document in JavaScript. The second statement calls on the getElementsByTagName()
method (described in Table 5.5) to retrieve a list of Node objects corresponding to elements of type
link in the XML document. Finally, outputting the number of objects in this list completes the
program’s task.
4. Describe in detail the XML schema and its built-in and user-defined data types.
(13)
Any time XML is going to be used to communicate data between software applications,
questions arise about how data values are represented as strings. For a numeric value,
for example: are there limits on the number of digits the string can contain? Can
scientific notation (such as 4.34e4) be used? Are negative values only
represented using standard mathematical notation (by a leading minus sign, as in -23.43), or
can an accounting format (such as (23.43)) be used? To address such issues, the W3C has
developed an XML vocabulary known as XML Schema. A key contribution of XML Schema is its
definition of a collection of standard data types. Each data type definition includes a
specification of the range of values that can be represented by the data type (for example,
integers ranging from −32,768 to 32,767) and details on how to represent those values as strings.
TABLE 9.1 JAX-RPC Mappings between Supported Java Classes and XML Schema Built-in Data
Types
Java Class                     XML Schema Type
String                         string
java.math.BigDecimal           decimal
java.math.BigInteger           integer
java.util.Calendar             dateTime
java.util.Date                 dateTime
javax.xml.namespace.QName      QName
java.net.URI                   anyURI
<simpleType name="oddType">
<union memberTypes="memberType phoneNumType" />
</simpleType>
<simpleType name="intList">
<list itemType="int" />
</simpleType>
defines a data type with values that are white-space-separated lists of int values. This
provides a simple way to represent an array of values
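To see a list type in action, the sketch below validates small documents against a schema containing the intList definition, using the javax.xml.validation API from the JDK. The wrapping schema element, the xs prefix, the values element, and the class name IntListDemo are assumptions added for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

/** Validates documents whose root element holds a list of int values. */
class IntListDemo {

    /** Hypothetical schema wrapping the intList type in a values element. */
    static final String SCHEMA =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:simpleType name=\"intList\">"
        + "<xs:list itemType=\"xs:int\"/>"
        + "</xs:simpleType>"
        + "<xs:element name=\"values\" type=\"intList\"/>"
        + "</xs:schema>";

    /** Compiles the schema; a failure here means the schema text is wrong. */
    static Schema compileSchema() {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(SCHEMA)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema did not compile", e);
        }
    }

    /** Returns true if the document conforms to the schema. */
    static boolean isValid(String xml) {
        try {
            compileSchema().newValidator()
                .validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the validator reports invalid content by throwing
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<values>1 2 3</values>"));
        System.out.println(isValid("<values>1 two 3</values>"));
    }
}
```

The first document is a white-space-separated list of int values and is accepted; the second contains an item that is not an int and is rejected.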
For example, Figure 9.17 contains a complexType element defining an XML Schema user
defined complex type named ExchangeValues. An instance document containing an element
anExchangeValue conforming with this type definition might look like
<anExchangeValue xsi:type="ExchangeValues">
<dollars>1.0</dollars>
<euros>0.746826</euros>
<yen>102.56</yen>
</anExchangeValue>
Notice that a data value of a complex type will be represented in an XML document by
an element that (with certain exceptions, such as empty elements) has as its content other
elements specified as part of the type definition. In the example just given, the element of
type ExchangeValues contains three elements (dollars, euros, and yen) as content.
Although the syntax of complexType is different from that of the XML DTD ELEMENT tag,
their purposes are similar: to define an XML content specification.
In the case of complexType, the content is specified indirectly for elements belonging to the
defined complex data type, while in the case of ELEMENT the content is specified directly for
an element. The sequence element is the XML Schema analog of the sequence operator “,”
in an XML DTD content specification, so the markup
<complexType name="ExchangeValues">
<sequence>
<element name="dollars" type="double"/>
<element name="euros" type="double"/>
<element name="yen" type="double"/>
</sequence>
</complexType>
6. Describe in detail about the differences between DTD and XML schema for defining
XML document structures with appropriate examples. (13)
S.No.  DTD                                       XSD (XML Schema Definition)
1.     DTDs are the declarations that define     XSD describes the elements in an XML
       a document type for SGML.                 document.
2.     It doesn’t support namespaces.            It supports namespaces.
3.     It is comparatively harder than XSD.      It is relatively simpler than DTD.
4.     It doesn’t support data types.            It supports data types.
5.     SGML syntax is used for DTD.              XML is used for writing XSD.
6.     It is not extensible in nature.           It is extensible in nature.
7.     It doesn’t give us much control over      It gives us more control over the
       the structure of an XML document.         structure of an XML document.
DTD
The abstract syntax for each flavor of XHTML 1.0 is defined by a set of text files known
collectively as an XML document type definition (DTD). To introduce you to the basic
elements of a DTD, let’s begin with a simple example drawn from an XHTML DTD:
The first of these lines is an example of an element type declaration. Element type
declarations are used to specify the set of all valid elements in the language defined by the
DTD.
This example shows the actual element type declaration for the XHTML 1.0 html element.
The information following the element type name is known as the content specification
for the element; it provides information about the valid content of the element type being
declared. This particular declaration says that, in XHTML 1.0 Strict, the html element
must have two children, a head element followed by a body element. We’ll describe content
specifications in detail in Section 2.10.1.
The second line begins a tag that represents an XML attribute list declaration. As
you might guess, this provides information about the valid attributes for an element, in this
case for the html element. The attribute list declaration shown is equivalent to the actual
XHTML 1.0 Strict declaration for html and says that this element type has five attributes:
lang, xml:lang, dir, id, and xmlns. It also provides information such as the valid set of
values for each attribute and default value information. More will be said about this in
Section 2.10.2.
The final line is an example of an XML entity declaration. Such a tag is essentially
a macro definition, associating the name gt (an entity) with the string &#62;. We have
already learned how to “call” such macros using entity references such as &gt;. Now we
can see more clearly how they are processed: the application reading a document
containing an entity reference simply replaces the reference with the string represented by
the entity, and then recursively processes this string. In this case, the string is a character
reference, and the recursive processing will replace this reference with an appropriate
encoding of the Unicode Standard value for the greater-than character.
XML SCHEMA
A schema is much like an XML DTD: it is used to define the elements and attributes of an
XML vocabulary.
When a numeric value is represented as a string, several questions arise. Are there limits
on the number of digits the string can contain? Can scientific notation (such as 4.34e4)
be used? Are negative values only
represented using standard mathematical notation (by a leading minus sign, as in -23.43), or
can an accounting format (such as (23.43)) be used? To address such issues, the W3C has
developed an XML vocabulary known as XML Schema. A key contribution of XML Schema is its
definition of a collection of standard data types. Each data type definition includes a
specification of the range of values that can be represented by the data type (for example,
integers ranging from −32,768 to 32,767) and details on how to represent those values as strings.
<shiporder orderid="889923"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="shiporder.xsd">
<orderperson>John Smith</orderperson>
<shipto>
<name>Ola Nordmann</name>
<address>Langgt 23</address>
<city>4000 Stavanger</city>
<country>Norway</country>
</shipto>
<item>
<title>Empire Burlesque</title>
<note>Special Edition</note>
<quantity>1</quantity>
<price>10.90</price>
</item>
<item>
<title>Hide your heart</title>
<quantity>1</quantity>
<price>9.90</price>
</item>
</shiporder>
The XML schema for the above XML document is given below
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="shiporder">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="orderperson" type="xs:string"/>
        <xs:element name="shipto">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="name" type="xs:string"/>
              <xs:element name="address" type="xs:string"/>
              <xs:element name="city" type="xs:string"/>
              <xs:element name="country" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <xs:element name="item" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="title" type="xs:string"/>
              <xs:element name="note" type="xs:string" minOccurs="0"/>
              <xs:element name="quantity" type="xs:positiveInteger"/>
              <xs:element name="price" type="xs:decimal"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="orderid" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
7. (i) Demonstrate the building blocks of DOM. (7)
<<<<<<<REFER ANSWER FOR UNIT3 PART B Q.NO 3) ii) >>>>>>>>>>>
(ii) Classify the types of DTD. (6)
<<<<<<<REFER ANSWER FOR UNIT3 PART B Q.NO 4) >>>>>>>>>>>
10. Discover XML document to store voter ID, voter name, address and date of birth
details and validate the document with the help of DTD. (13)
Type the following in notepad and save it as vote.dtd
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT Voter_Information (Id, Name, Address, Date_of_birth)>
<!ELEMENT Id (#PCDATA)>
<!ELEMENT Name (#PCDATA)>
<!ELEMENT Address (#PCDATA)>
<!ELEMENT Date_of_birth (#PCDATA)>
Type the following in notepad and save it as vote.xml in the same directory where vote.dtd
is stored
<?xml version="1.0"?>
<!DOCTYPE Voter_Information SYSTEM "vote.dtd">
<Voter_Information>
<Id> Voter Id = FGK99567 </Id>
<Name> Voter Name = Mohan Kumar </Name>
<Address> Address = Assam </Address>
<Date_of_birth> Date of birth = 05-03-1991 </Date_of_birth>
</Voter_Information>
OUTPUT
This XML file does not appear to have any style information associated with it.
The document tree is shown below.
<Voter_Information>
<Id> Voter Id = FGK99567 </Id>
<Name> Voter Name = Mohan Kumar </Name>
<Address> Address = Assam </Address>
<Date_of_birth> Date of birth = 05-03-1991 </Date_of_birth>
</Voter_Information>
Thus the document follows the rules mentioned in the DTD and it is displayed in browser.
We have already seen some simple examples of XPath expressions in Figure 7.12. For
example, in the markup
<xsl:template match="/">
the value of the match attribute (/) is an XPath expression. This particular XPath expression
represents the XPath document root. An XPath expression such as this that represents one
or more nodes within an XPath parse tree is known as a location path. As another example,
in the markup
<xsl:value-of select="child::message" />
the value of the select attribute (child::message) is a location path. Unlike the / location
path, we cannot say which nodes are represented by this second location path without
knowing some additional information. In particular, this location path is defined relative to
some XPath parse tree node known as the context node for the location path.
In this second example, the location path consists of a single location step. More
generally, a location path can consist of multiple location steps separated by slash (/)
characters; we’ll see examples of such location paths later. Each location step consists of at
least two parts: an axis name followed by two colons (::) and a node test. In this example, child is the
axis name and message is the node test
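The dependence of a relative location path on its context node can be tried out with the javax.xml.xpath API in the JDK. The following is a minimal sketch under that assumption; the class name LocationPathDemo and the one-element input document are illustrative.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Evaluates location paths against different context nodes. */
class LocationPathDemo {

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    /** Returns the nodes selected by a location path for a given context node. */
    static NodeList select(String locationPath, Node contextNode) {
        try {
            return (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(locationPath, contextNode, XPathConstants.NODESET);
        } catch (Exception e) {
            throw new IllegalStateException("bad location path", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<message>Hello World!</message>");
        Node messageElement = doc.getDocumentElement();

        // "/" names the document root no matter which node is the context node.
        Node root = select("/", messageElement).item(0);
        System.out.println(root.getNodeType() == Node.DOCUMENT_NODE);

        // With the document root as context node, child::message finds the element.
        System.out.println(select("child::message", doc).getLength());

        // With the message element itself as context node, it finds nothing.
        System.out.println(select("child::message", messageElement).getLength());
    }
}
```

The same location path child::message selects one node or none depending only on the context node, which is why the template in Figure 7.12 first matches / before evaluating it.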
TABLE 7.2 Some XPath 1.0 Axis Names
Name Relationship with Context Node
Next, after the node list has been produced, the XSLT processor in effect calls the XPath
string() function on the node list. That is, all of the text in the first node (“first” here
meaning the first in the order of document appearance) and in its descendants is
concatenated (in order of document appearance) into a single string. This string becomes
the content of the text node added to the result tree. So, in our example, a text node
containing Hello World! is added to the result tree. Because the value-of element was a
child of a p element in the style-sheet tree, the text node generated will be a child of the
corresponding p element in the result tree, as shown in Figure 7.16.
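The string conversion described above can be observed outside an XSLT processor as well. The sketch below, which assumes the javax.xml.xpath API from the JDK, asks for a string result from a location path that selects two nodes; the class name ValueOfStringDemo and the input document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Shows the XPath string conversion applied to a node list. */
class ValueOfStringDemo {

    /** Evaluates a location path relative to the root element, as a string. */
    static String stringValue(String xml, String locationPath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            // Requesting a String result converts the node list the same way
            // the XPath string() function does.
            return XPathFactory.newInstance().newXPath()
                .evaluate(locationPath, doc.getDocumentElement());
        } catch (Exception e) {
            throw new IllegalStateException("evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<doc>"
            + "<message>Hello <em>World</em>!</message>"
            + "<message>Goodbye</message>"
            + "</doc>";
        // Only the first message contributes, with its descendant text joined.
        System.out.println(stringValue(xml, "child::message"));
    }
}
```

Only the first message element contributes to the result, and the text inside its em child is concatenated in document order, so the program should print Hello World!.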
13. (i) List out the data types of XML. (7)
XML Schema has a lot of built-in data types. The most common types are:
xs:string
xs:decimal
xs:integer
xs:boolean
xs:date
xs:time
For a person's gender, the <person> element can be written like this:
<person gender="female">
or like this:
<person gender='female'>
If the attribute value itself contains double quotes you can use single quotes, like in this
example:
<gangster name='George "Shotgun" Ziegler'>
or you can use character entities:
<gangster name="George &quot;Shotgun&quot; Ziegler">
<messages>
<note id="501">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
<note id="502">
<to>Jani</to>
<from>Tove</from>
<heading>Re: Reminder</heading>
<body>I will not</body>
</note>
</messages>
The id attributes above are for identifying the different notes. They are not a part of the note
itself.
Metadata (data about data) should be stored as attributes, and the data itself should be
stored as elements.
14. Summarize in detail the XML schema, built in and user defined data types. (13)
<<<<<<<REFER ANSWER FOR UNIT3 PART B Q.NO 4) >>>>>>>>>>>
15. Demonstrate how both internal and external DTDs can be used in an XML file.
Show with an example. (13)
In fact, you can use both internal and external DTDs at the same time by using these forms
of the <!DOCTYPE> element:
<!DOCTYPE rootname SYSTEM URL [ DTD ]> for private external DTDs, and
<!DOCTYPE rootname PUBLIC FPI URL [ DTD ]> for public external DTDs.
In this case, the external DTD is specified by URL and the internal one by DTD.
Here's an example in which I've removed the <PRODUCT> element from the external DTD
ch03_10.dtd:
Now I'll specify that I want to use this external DTD in the document's <!DOCTYPE> element
and then also add square brackets, [ and ] , to enclose an internal DTD:
Next, I add the declaration of the <PRODUCT> element to the internal part of the DTD, like
this:
</PRODUCT>
<NUMBER>8</NUMBER>
<PRICE>.25</PRICE>
</ITEM>
.
.
.
<ITEM>
<PRODUCT>
<PRODUCT_ID> 198348206 </PRODUCT_ID>
</PRODUCT>
<NUMBER>6</NUMBER>
<PRICE>.50</PRICE>
</ITEM>
</ORDERS>
</CUSTOMER>
</DOCUMENT>
And that's all it takes; now this DTD uses both internal and external parts.
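The combination of the two DTD parts can be exercised with a validating JAXP parser. In the sketch below the external subset is supplied from a string through an EntityResolver so that the example is self-contained; the file name order.dtd, the simplified ITEM content, and the class name MixedDtdDemo are assumptions and do not reproduce ch03_10.dtd.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

/** Validates a document whose DTD has an external and an internal part. */
class MixedDtdDemo {

    /** Stand-in for the external DTD file; PRODUCT is deliberately missing. */
    static final String EXTERNAL_DTD =
        "<!ELEMENT ITEM (PRODUCT, NUMBER, PRICE)>"
        + "<!ELEMENT NUMBER (#PCDATA)>"
        + "<!ELEMENT PRICE (#PCDATA)>";

    /** Declaration placed in the internal part of the DTD. */
    static final String INTERNAL_DTD = "<!ELEMENT PRODUCT (#PCDATA)>";

    /** Builds a document that names order.dtd and carries an internal subset. */
    static String document(String internalSubset) {
        return "<!DOCTYPE ITEM SYSTEM \"order.dtd\" [" + internalSubset + "]>"
            + "<ITEM><PRODUCT>Tomatoes</PRODUCT><NUMBER>8</NUMBER>"
            + "<PRICE>1.25</PRICE></ITEM>";
    }

    /** Returns true if the parser reports no validity errors. */
    static boolean isValid(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Serve the external DTD from memory instead of the file system.
            builder.setEntityResolver((publicId, systemId) ->
                new InputSource(new StringReader(EXTERNAL_DTD)));
            final boolean[] valid = {true};
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    valid[0] = false; // validity errors arrive here
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return valid[0];
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(document(INTERNAL_DTD))); // both parts present
        System.out.println(isValid(document("")));           // PRODUCT undeclared
    }
}
```

With the internal declaration present the document validates; with an empty internal subset the PRODUCT element is undeclared and the parser reports a validity error.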
16. Explain the procedure for validating the XML Documents. (13)
Validation is the process of checking an XML document against the rules declared
for it. An XML document is said to be valid if its contents match the elements,
attributes and associated document type declaration (DTD), and if the
document complies with the constraints expressed in it. An XML parser deals
with validation at two levels: whether the document is well-formed, and
whether it is valid.
<address>
<name>Tanmay Patil</name>
<company>TutorialsPoint</company>
<phone>(011) 123-4567</phone>
</address>
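The difference between a well-formed document and a valid one can be shown with a validating JAXP parser. The sketch below is one way to do this, assuming the DOM parser in the JDK; the DTD for the address element and the class name DtdValidationDemo are assumptions added for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

/** Classifies a document as valid, invalid, or not well-formed. */
class DtdValidationDemo {

    /** Hypothetical internal DTD for the address document. */
    static final String DTD =
        "<!DOCTYPE address ["
        + "<!ELEMENT address (name, company, phone)>"
        + "<!ELEMENT name (#PCDATA)>"
        + "<!ELEMENT company (#PCDATA)>"
        + "<!ELEMENT phone (#PCDATA)>"
        + "]>";

    /** Returns "valid", "well-formed but invalid", or "not well-formed". */
    static String check(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // validate against the DTD
            DocumentBuilder builder = factory.newDocumentBuilder();
            final boolean[] invalid = {false};
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    invalid[0] = true; // validity errors are recoverable
                }
                // fatalError is inherited and rethrows well-formedness errors
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return invalid[0] ? "well-formed but invalid" : "valid";
        } catch (SAXException e) {
            return "not well-formed";
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(check(DTD + "<address><name>Tanmay Patil</name>"
            + "<company>TutorialsPoint</company>"
            + "<phone>(011) 123-4567</phone></address>"));
        System.out.println(check(DTD + "<address><name>Tanmay Patil</name></address>"));
        System.out.println(check(DTD + "<address><name>Tanmay Patil</address>"));
    }
}
```

The three calls cover the three cases: a document that satisfies the DTD, one that is well-formed but omits required children, and one with a missing end tag that the parser cannot process at all.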
PART – C
1. Briefly Explain about MVC architecture in detail. (15)
Most real-world web applications are of course much larger and may contain a large
number of components (servlets and JSP documents) as well as numerous support files
(such as JavaBeans classes). While there are many possible ways of organizing the
components and support files for such an application, one approach, called the model-view-
controller paradigm, is widely used in one form or another in many web applications. In fact,
the MVC paradigm was elucidated long before the Web existed, and can be applied in any
system that has both data processing and data presentation requirements.
But our focus will be on its use with web applications.
A web application following the MVC organizational paradigm will typically have a single
controller that receives all incoming HTTP requests. Because the controller provides a single
point of entry to the application, application-wide tasks such as initialization, logging, and
controlling access to the application are often performed by the controller.
The controller may also interact with model components of the application. These (model
components) are software components that represent the persistent data of the application
and server-side processing performed on this data. Finally, when the controller and model
software have performed all necessary pre-processing of a request, the controller selects an
appropriate view(presentation) component and forwards the HTTP request to that
component. The view component will generally obtain data from the request and/or model
components and then generate an HTTP response that presents a formatted view of this
data.
In a web application written using JSP, the controller portion is often implemented as
one or more Java servlets. The model components will typically consist of JavaBeans
classes and/or a database (or other storage mechanism). Each view component is normally a JSP
document. Each such JSP document might also access JavaBeans objects and tag libraries
to obtain model data for inclusion in the response it generates. Furthermore, portions of
the response might be generated by calling on other servlets or JSP documents within the
web application.
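The division of responsibilities can be sketched in plain Java without a servlet container. This is only a schematic, not a servlet or JSP implementation: the class names MvcSketch, VisitCounter and Controller, the paths, and the response text are all hypothetical, and each view is reduced to a function that formats model data.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Schematic model-view-controller arrangement in plain Java. */
class MvcSketch {

    /** Model: application data and the processing performed on it. */
    static class VisitCounter {
        private int visits;

        int recordVisit() {
            return ++visits;
        }
    }

    /** Controller: the single entry point that receives every request. */
    static class Controller {
        private final VisitCounter model = new VisitCounter();

        // Views: each one formats model data into a response body.
        private final Map<String, Function<Integer, String>> views = new HashMap<>();

        Controller() {
            views.put("/home", visits -> "<p>Welcome, visitor " + visits + "</p>");
            views.put("/stats", visits -> "<p>Total visits: " + visits + "</p>");
        }

        /** Handles one request: update the model, then pick a view. */
        String handle(String path) {
            int visits = model.recordVisit();
            Function<Integer, String> view = views.get(path);
            if (view == null) {
                return "<p>No such page: " + path + "</p>";
            }
            return view.apply(visits);
        }
    }

    public static void main(String[] args) {
        Controller controller = new Controller();
        System.out.println(controller.handle("/home"));
        System.out.println(controller.handle("/stats"));
    }
}
```

Because every request passes through handle, application-wide work such as the visit count lives in one place, while the views only format the data they are given.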
2. Summarize about XML schema and XML Parsers and Validation. (15)
<<<<<<<REFER ANSWER FOR UNIT3 PART B
Q.No 6) for XML schema
Q.No 8) for XML Parsers
Q.No 16) for XML Validation >>>>>>>>>>>
3. Get the students’ details like name, register number and mark using a form. Generate
a DTD for this XML document. (15)
Name  Reg no  Mark
XYZ   1000    90
ABC   1001    80
RST   1002    89
PQR   1003    87
Generate the collected information in descending order of marks using XSLT. Results
should be displayed in the above format. Write the source code and explain it.
student.xml File
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type='text/xsl' href='cgpa.xsl'?>
<class>
<student>
<name> XYZ </name>
<regno> 1000 </regno>
<mark> 90</mark>
</student>
<student>
<name> ABC </name>
<regno> 1001 </regno>
<mark> 80</mark>
</student>
<student>
<name> RST </name>
<regno> 1002</regno>
<mark> 89 </mark>
</student>
<student>
<name> PQR </name>
<regno> 1003 </regno>
<mark> 87 </mark>
</student>
</class>
cgpa.xsl File
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <h2>Student list in descending order of marks</h2>
        <table border="1">
          <tr bgcolor="lightblue">
            <th>Name</th>
            <th>Reg no</th>
            <th>Mark</th>
          </tr>
          <xsl:for-each select="class/student">
            <xsl:sort select="mark" order="descending" data-type="number"/>
            <tr>
              <td><xsl:value-of select="name"/></td>
              <td><xsl:value-of select="regno"/></td>
              <td><xsl:value-of select="mark"/></td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
4. Explain how XSLT transforms the document from one (Word) type to other type
(HTML). (15)
<<<<<REFER ANSWER FOR UNIT3 PART B Q.No 12) >>>>>>>>>>>
Advantages of SAX
Low memory needs, since the XML file is never entirely in memory.
Uses callback procedures to identify and respond only to certain XML elements.
Disadvantages of SAX
The file has to be parsed entirely to access any node. Thus, getting the 10 nodes included in
Poor navigation abilities : There is no way to get easily the children of a given node or the
An alternative to the DOM approach is to have the parser interact with an application as it
reads an XML document. This is the approach taken by SAX (Simple API for XML). In
the SAX view of XML processing, as an XML parser is reading an XML document, certain
events occur. For example, reading an element start tag is an event, as is reading its end tag
or reading text contained within an element. SAX allows an application to register event
listeners with the XML parser, much as the DOM event model described in Section 5.6
allowed a JavaScript program to register event listeners with a browser. A SAX parser calls
these listeners as events occur and passes them information about the events.
There is no formal standard defining SAX, and in fact no one owns the SAX API
(it is in the public domain). However, SAX is, according to the official SAX Web site
(http://www.saxproject.org/), “the first widely adopted API for XML in Java, and is a
‘de facto’ standard.” The version described here is SAX 2.0.1, often referred to as SAX2
[SAX-2.0.1].
Figure 7.8 and Figure 7.9 contain a SAX2 recoding of the link-counting program
from Figure 7.7.
Once the parser has been obtained, a two-step process is followed to input an XML
document. First, the parser is passed—via a call to its setContentHandler() method—an
instance of a Java class that defines the event-handling methods to be called by the parser;
more on this class shortly. Second, the parse() method of the parser is called with an
argument representing the URL of the XML document to be parsed.
// JAXP classes
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.SAXParser;
// SAX classes
import org.xml.sax.XMLReader;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Count the number of link elements in an XML document */
class SAXCountLinks {
    /** Source for RSS feed */
    static String FEED_URL = "http://today.java.net/rss/21.rss";

    /** Initialize XMLReader and set up event handlers */
    static public void main(String args[]) {
        try {
            // JAXP-style initialization of SAX parser
            SAXParserFactory saxFactory = SAXParserFactory.newInstance();
            XMLReader parser = saxFactory.newSAXParser().getXMLReader();
            // SAX-style processing of RSS document at FEED_URL
            parser.setContentHandler(new CountElementsHelper());
            parser.parse(FEED_URL);
        }
        catch (Exception e) {
            e.printStackTrace();
        }
        return;
    }
FIGURE 7.8 Initial portion of SAX-based program for displaying the number of links in an
RSS 0.91 document.
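The event-handling class passed to setContentHandler() is defined in Figure 7.9. As a rough sketch of what such a handler could look like, the class below counts start-tag events for link elements. It is not the CountElementsHelper of Figure 7.9: the class name SaxLinkCounter is hypothetical, and the document is read from a string rather than from FEED_URL so that the example runs without network access.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** SAX content handler that counts link elements as they are read. */
class SaxLinkCounter extends DefaultHandler {

    private int linkCount = 0;

    /** Called by the parser each time it reads an element start tag. */
    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attributes) {
        if ("link".equals(qName)) {
            linkCount++;
        }
    }

    /** Parses the given XML text and returns the number of link elements. */
    static int countLinks(String xml) {
        try {
            XMLReader parser =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            SaxLinkCounter handler = new SaxLinkCounter();
            parser.setContentHandler(handler); // step 1: register the listener
            parser.parse(new InputSource(new StringReader(xml))); // step 2: parse
            return handler.linkCount;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        String rss = "<rss><channel><link>a</link>"
            + "<item><link>b</link></item>"
            + "<item><link>c</link></item></channel></rss>";
        System.out.println("Input document has " + countLinks(rss)
            + " 'link' elements.");
    }
}
```

Unlike the DOM version, no tree is built: the count is updated as each start tag is read, and the parser keeps nothing once the event has been delivered.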