Module 2 - Dynamic Document With Javascript - PDF

This document discusses dynamic documents with JavaScript. It introduces dynamic HTML, which allows tag attributes, contents, and styles to be changed after a document is displayed. It discusses using the DOM to address and modify document elements. It also summarizes CSS positioning and visibility attributes that are important for dynamic HTML, such as position, top, left, width, and height, which specify an element's position and size.

ACHARYA INSTITUTE OF GRADUATE STUDIES

(NAAC Re-Accredited ‘A’ & Affiliated to Bengaluru City University)


Soladevanahalli, Bengaluru-560107

CAC23T – Web Programming


Module 2

DYNAMIC DOCUMENTS WITH JAVASCRIPT

Introduction to dynamic documents


- Dynamic HTML is not a new markup language
- A dynamic HTML document is one whose tag attributes, tag contents, or
element style properties can be changed after the document has been, and is
still being, displayed by a browser
- We will discuss only W3C standard approaches
- All examples in this chapter, except the last, use the DOM 0 event model
and work with both IE6 and NS6
- To make changes in a document, a script must be able to address the
elements of the document using the DOM addresses of those elements
Positioning elements
For DHTML content developers, the most important feature of CSS is the
ability to use ordinary CSS style attributes to specify the visibility, size, and
precise position of individual elements of a document. In order to do
DHTML programming, it is important to understand how these style
attributes work. They are summarized in Table and documented in more
detail in the sections that follow.
Table. CSS positioning and visibility attributes

Attribute(s)     Description
position         Specifies the type of positioning applied to an element
top, left        Specifies the position of the top and left edges of an element
bottom, right    Specifies the position of the bottom and right edges of an element
width, height    Specifies the size of an element
z-index          Specifies the "stacking order" of an element relative to any overlapping elements; defines a third dimension of element positioning
display          Specifies how and whether an element is displayed
visibility       Specifies whether an element is visible
clip             Defines a "clipping region" for an element; only portions of the element within this region are displayed
overflow         Specifies what to do if an element is bigger than the space allotted for it
The Key to DHTML: The position Attribute

The CSS position attribute specifies the type of positioning applied to an
element. The four possible values for this attribute are:
static
This is the default value and specifies that the element is positioned
according to the normal flow of document content (for most Western
languages, this is left to right and top to bottom). Statically positioned
elements are not DHTML elements and cannot be positioned with the
top, left, and other attributes. To use DHTML positioning techniques
with a document element, you must first set its position
attribute to one of the other three values.
absolute
This value allows you to specify the position of an element relative to its
containing element. Absolutely positioned elements are
positioned independently of all other elements and are not part of
the flow of statically positioned elements.
An absolutely positioned element is positioned either relative to
the <body> of the document or, if it is nested within another
absolutely positioned element, relative to that element. This is the
most commonly used positioning type for DHTML.
fixed
This value allows you to specify an element's position with respect to
the browser window. Elements with fixed positioning do not scroll
with the rest of the document and thus can be used to achieve frame-
like effects. Like absolutely positioned elements, fixed-position
elements are independent of all others.
relative
When the position attribute is set to relative, an element is laid out
according to the normal flow, and its position is then adjusted relative
to its position in the normal flow. The space allocated for the element
in the normal document flow remains allocated for it, and the elements
on either side of it do not close up to fill in that space, nor are they
"pushed away" from the new position of the element. Relative
positioning can be useful for some static graphic design purposes, but
it is not commonly used for DHTML effects.

Specifying the Position and Size of Elements

Once you have set the position attribute of an element to something other
than static, you can specify the position of that element with some
combination of the left, top, right, and bottom attributes. The most common
positioning technique is to specify the left and top attributes, which specify
the distance from the left edge of the containing element (usually the
document itself) to the left edge of the element, and the distance from the
top edge of the container to the top edge of the element. For example, to
place an element 100 pixels from the left and 100 pixels from the top of the
document, you can specify CSS styles in a style attribute as follows:

<div style="position: absolute; left: 100px; top: 100px;">
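The same positioning can be applied from a script by writing to the element's style object. The following is a minimal sketch, not part of the original notes; the helper name placeAt is hypothetical, and a plain object stands in for element.style so that the sketch also runs outside a browser.

```javascript
// Hypothetical helper: position an element by writing to its style object.
// The unit suffix is added explicitly because CSS requires one.
function placeAt(style, left, top) {
  style.position = "absolute"; // anything other than static enables left/top
  style.left = left + "px";
  style.top = top + "px";
  return style;
}

// In a browser one might call (the id "banner" is an assumption):
//   placeAt(document.getElementById("banner").style, 100, 100);
// Here a plain object plays the role of element.style.
var demoStyle = placeAt({}, 100, 100);
```

After the call, demoStyle holds the same three values as the inline style attribute shown above.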


The containing element relative to which a dynamic element is positioned is
not necessarily the same as the containing element within which the element
is defined in the document source. Since dynamic elements are not part of
normal element flow, their positions are not specified relative to the static
container element within which they are defined. Most dynamic elements are
positioned relative to the document (the <body> tag) itself. The exception is
dynamic elements that are defined within other dynamic elements. In this
case, the nested dynamic element is positioned relative to its nearest dynamic
ancestor.

Although it is most common to specify the position of the upper-left corner
of an element with left and top, you can also use right and bottom to specify
the position of the bottom and right edges of an element relative to the
bottom and right edges of the containing element. For example, to position
an element so that its bottom-right corner is at the bottom-right of the
document (assuming it is not nested within another dynamic element), use the
following styles:
position: absolute; right: 0px; bottom: 0px;

To position an element so that its top edge is 10 pixels from the top of the
window and its right edge is 10 pixels from the right of the window, you can
use these styles:

position: fixed; right: 10px; top: 10px;

Note that the right and bottom attributes are newer additions to the CSS
standard and are not supported by fourth-generation browsers, as top and left
are.

In addition to the position of elements, CSS allows you to specify their size.
This is most commonly done by providing values for the width and height
style attributes. For example, the following HTML creates an absolutely
positioned element with no content. Its width, height, and background-color
attributes make it appear as a small blue square:
<div style="position: absolute; left: 10px; top: 10px; width: 10px;
height: 10px; background-color: blue">
</div>
Another way to specify the width of an element is to specify a value for both
the left and right attributes. Similarly, you can specify the height of an
element by specifying both top and bottom. If you specify a value for left,
right, and width, however, the width attribute overrides the right attribute; if
the height of an element is over-constrained, height takes priority over
bottom.
Bear in mind that it is not necessary to specify the size of every dynamic
element. Some elements, such as images, have an intrinsic size. Furthermore,
for dynamic elements that contain text or other flowed content, it is often
sufficient to specify the desired width of the element and allow the height to
be determined automatically by the layout of the element's content.
In the previous positioning examples, values for the position and size
attributes were specified with the suffix "px". This stands for pixels. The
CSS standard allows measurements to be done in a number of other units,
including inches ("in"), centimeters ("cm"), points ("pt"), and ems ("em",
a measure relative to the font size of the current font). Pixel units are
most commonly used with DHTML
programming. Note that the CSS standard requires a unit to be specified.
Some browsers may assume pixels if you omit the unit specification, but
you should not rely on this behavior.

Instead of specifying absolute positions and sizes using the units shown
above, CSS also allows you to specify the position and size of an element as
a percentage of the size of the containing element. For example, the
following HTML creates an empty element with a black border that is half
as wide and half as high as the containing element (or the browser window)
and centered within that element:

<div style="position: absolute; left: 25%; top: 25%; width: 50%;
height: 50%; border: 2px solid black">
</div>
Element size and position details

It is important to understand some details about how the left, right, width,
top, bottom, and height attributes work. First, width and height specify the
size of an element's content area only; they do not include any additional
space required for the element's padding, border, or margins. To determine
the full onscreen size of an element with a border, you must add the left and
right padding and left and right border widths to the element width, and you
must add the top and bottom padding and top and bottom border widths to
the element's height.
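The arithmetic described above can be sketched as a small function. This is an illustration only; the function name fullSize and the shape of the box object are assumptions, and all values are taken to be in pixels.

```javascript
// Hypothetical helper: full on-screen size of an element under the standard
// CSS box model, where width and height cover the content area only.
function fullSize(box) {
  var p = box.padding;
  var b = box.border;
  return {
    width: box.width + p.left + p.right + b.left + b.right,
    height: box.height + p.top + p.bottom + b.top + b.bottom
  };
}

// Convenience for the common case of the same value on all four sides.
function uniform(n) {
  return { top: n, right: n, bottom: n, left: n };
}

// Content area 100 x 50, 10px padding and 5px border all the way around.
var size = fullSize({ width: 100, height: 50, padding: uniform(10), border: uniform(5) });
// size.width is 130 and size.height is 80
```

Margins are deliberately left out, since the text counts only padding and border toward the on-screen size.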

Since width and height specify the element content area only, you might
think that left and top (and right and bottom) would be measured relative to
the content area of the containing element. In fact, the CSS standard specifies
that these values are measured relative to the outside edge of the containing
element's padding (which is the same as the inside edge of the element's
border).

Let's consider an example to make this clearer. Suppose you've created a
dynamically positioned container element that has 10 pixels of padding all
the way around its content area and a 5 pixel border all the way around the
padding. Now suppose you dynamically position a child element inside this
container. If you set the left attribute of the child to "0px", you'll
discover that the child is positioned with its left edge right up
against the inner edge of the container's border. With this setting, the child
overlaps the container's padding, which presumably was supposed to remain
empty (since that is the purpose of padding). If you want to position the child
element in the upper left corner of the container's content area, you should set
both the left and top attributes to "10px". Figure helps to clarify this.

Figure Dynamically positioned container and child elements with some CSS attributes

Now that you understand that width and height specify the size of an
element's content area only and that the left, top, right, and bottom attributes
are measured relative to the containing element's padding, there is one more
detail you must be aware of: Internet Explorer Versions 4 through 5.5 for
Windows (but not IE 5 for the Mac) implement the width and height
attributes incorrectly and include an element's border and padding (but not its
margins). For example, if you set the width of an element to 100 pixels and
place 10 pixels of padding and a 5-pixel border on the left and right, the content
area of the element ends up being only 70 pixels wide in these buggy
versions of Internet Explorer.
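The effect of the buggy box model can be checked with a line of arithmetic: padding and border on both sides are subtracted from the declared width instead of being added to it. A minimal sketch follows; the function name is hypothetical.

```javascript
// Hypothetical helper: content width under the old IE box model, in which the
// declared width already includes padding and border (but not margins).
function legacyIEContentWidth(declaredWidth, paddingEachSide, borderEachSide) {
  return declaredWidth - 2 * paddingEachSide - 2 * borderEachSide;
}

// Declared width 100px, 10px padding and 5px border on the left and right.
var contentWidth = legacyIEContentWidth(100, 10, 5); // 100 - 20 - 10 = 70
```

Under the standard model the same declaration gives a 100-pixel content area and a 130-pixel on-screen width.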

In IE 6, the CSS position and size attributes work correctly when the browser
is in standards mode and incorrectly (but compatibly with earlier versions)
when the browser is in compatibility mode. Standards mode, and hence
correct implementation of the CSS
"box model," is triggered by the presence of a <!DOCTYPE> tag at the start
of the document, declaring that the document adheres to the HTML 4.0 (or
later) standard or some version of the XHTML standards. For example, any
of the following three HTML document type declarations cause IE 6 to
display documents in standards mode:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Strict//EN">
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

Netscape 6 and the Mozilla browser handle the width and height attributes
correctly. But these browsers also have standards and compatibility modes,
just as IE does. The absence of a <!DOCTYPE> declaration puts the
Netscape browser in quirks mode, in which it mimics certain (relatively
minor) nonstandard layout behaviors of Netscape 4. The presence of
<!DOCTYPE> causes the browser to break compatibility with Netscape 4
and correctly implement the standards.

Moving elements

A rather rare, but still used, DHTML scenario is not only positioning an
element, but also moving and therefore animating an element. To do so,
window.setTimeout() and window.setInterval() come in handy. The
following code animates an ad banner diagonally over the page. The only
potential issue is how to animate the position. For the Internet Explorer
properties (posLeft, posTop), just adding a value suffices. For left and top,
you first have to determine the old position and then add a value to it. The
JavaScript function parseInt() extracts the numeric content from a string like
"123px". However, parseInt() returns NaN if no value is found in left or top.
Therefore, the following helper function takes care of this situation; in this
case, 0 is returned instead:

function myParseInt(s) {
  var ret = parseInt(s);
  return (isNaN(ret) ? 0 : ret);
}
Then, the following code animates the banner and stops after 50 iterations:
Animating an Element (animate.html; excerpt)
<script language="JavaScript" type="text/javascript">
var nr = 0;
var id = null;
function animate() {
  nr++;
  if (nr > 50) {
    window.clearInterval(id);
    document.getElementById("element").style.visibility = "hidden";
  } else {
    var el = document.getElementById("element");
    el.style.left = (myParseInt(el.style.left) + 5) + "px";
    el.style.posLeft += 5;
    el.style.top = (myParseInt(el.style.top) + 5) + "px";
    el.style.posTop += 5;
  }
}
window.onload = function() {
  id = window.setInterval("animate();", 100);
};
</script>
<h1>My Portal</h1>
<div id="element" style="position: absolute; background-color: #eee; border: 1px solid">
JavaScript Phrasebook
</div>
Element visibility
The visibility property of an element controls whether it is displayed
- The values are visible and hidden
- Suppose we want to toggle between hidden and visible, and the
element's DOM address is dom
if (dom.visibility == "visible")
  dom.visibility = "hidden";
else
  dom.visibility = "visible";
--> SHOW showHide.html
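The toggle can also be wrapped in a function. The following is a sketch under assumptions, not the showHide.html listing: the function name is hypothetical, and a plain object stands in for the element's style object. Unlike the fragment above, it treats an unset visibility as visible, since that is how the browser renders an element with no inline visibility value.

```javascript
// Hypothetical helper: flip the visibility property of a style object.
// Any value other than "hidden" (including unset) is treated as visible.
function toggleVisibility(style) {
  style.visibility = (style.visibility === "hidden") ? "visible" : "hidden";
  return style.visibility;
}

// In a browser: toggleVisibility(document.getElementById("someId").style);
var demo = { visibility: "visible" };
var first = toggleVisibility(demo);  // now hidden
var second = toggleVisibility(demo); // visible again
```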
Changing colors and fonts
In order to change text colors, you will need two things:

1. A command to change the text.
2. A color (hex) code.
Section 1: Changing Full-Page Text Colors

You have the ability to change full-page text colors over four levels:
<TEXT="######"> -- This denotes the full-page text color.
<LINK="######"> -- This denotes the color of the links on your page.
<ALINK="######"> -- This denotes the color the link will flash when clicked upon.
<VLINK="######"> -- This denotes the colors of the links after they have been visited.
These commands come right after the <TITLE> commands. Again, in that
position they affect everything on the page. Also... place them all together
inside the same command along with any background commands. Something
like this:

<BODY BGCOLOR="######" TEXT="######" LINK="######" VLINK="######">

<VLINK="#FFFFFF">

Section 2: Changing Specific Word Color

But I only want to change one word's color!
You'll use a color (hex) code to do the trick. Follow this formula:
<FONT COLOR="######">text text text text text</FONT>

It's a pain in the you-know-where, but it gets the job done. It works with all
H commands and text size commands. Basically, if it's text, it will work.

Using Cascading Style Sheets (CSS) to Change Text Colors

There isn't enough space to fully describe what CSS is capable of in this
article, but we have several articles here that can get you up to speed in no
time! For a great tutorial on using CSS to change color properties, check out
this article by Vincent Wright.

A quick intro to CSS is in order, so let's describe it a bit. CSS is used to
define different elements on your web page. These elements include text
colors, link colors, page background, tables, forms--just about every aspect
of the style of the web page. You can use CSS inline, much like the HTML
above, or you can, preferably, include the style sheet within the
HEAD tags on your page, as in this example:

<STYLE type="text/css">
A:link {
COLOR: red; /*The color of the link*/
}
A:visited {
COLOR: #800080; /*The color of the visited link*/
}
A:hover {
COLOR: green; /*The color of the mouseover or 'hover' link*/
}
BODY {
COLOR: #800080; /*The color of all the other text within the body of the page*/
}
</STYLE>
Alternately, you can include the CSS that is between the STYLE tags
above, and save it in a file that you could call "basic.css" which would be
placed in the root directory of your website. You would then refer to that
style sheet by using a link that goes between
Programming the Web 10CS73

the HEAD tags in your web page, like this:
<link type="text/css" rel="stylesheet" href="basic.css">
As you can see in the example above, you can refer to the colors using
traditional color names, or hex codes as described above.

The use of CSS is vastly superior to using inline FONT tags and such, as it
separates the content of your site from the style of your site, simplifying the
process as you create more pages or change the style of elements. If you are
using an external style sheet, you can make the change once in the style
sheet, and it will be applied to your entire site. If you choose to include the
style sheet itself within the HEAD tags as shown above, then you will have
to make those changes on every page on your site.

CSS is such a useful tool in your web developer arsenal, you should
definitely take the time to read more about it in our CSS Tutorials section.

Dynamic content
In the early days of the Web, once a Web page was loaded into a client's
browser, changing the information displayed on the page required another
call to the Web server.
The user interacted with the Web page by submitting forms or clicking on
links to other Web pages, and the Web server delivered a new page to the
client.

Any customization of information - such as building a Web page on the fly
from a template - required that the Web server spend additional time
processing the page request.

In short, dynamic content is about changing the content of a Web page by
inserting and deleting elements, or the content inside elements, before or
after a Web page has been loaded into a client's browser.

Stacking Elements
The top and left properties allow the placement of an element anywhere in the two
dimensions of the display of a document. Although the display is restricted to two
physical dimensions, the effect of a third dimension is possible through the simple
concept of stacked elements, such as that used to stack windows in graphical user
interfaces. Although multiple elements can occupy the same space in the document, one
is considered to be on top and is displayed. The top element hides the parts of the lower
elements on which it is superimposed. The placement of elements in this third
dimension is controlled by the z-index attribute of the element. An element whose
z-index is greater than that of an element in the same space will be displayed over the
other element, effectively hiding the element with the smaller z-index value. The
JavaScript style property associated with the z-index attribute is zIndex.
In the next example, three images are placed on the display so that they overlap. In the
XHTML description of this situation, each image tag includes an onclick attribute,
which is used to trigger the execution of a JavaScript handler function. First the
function defines DOM addresses for the last top element and the new top element. Then
the function sets the zIndex value of the two elements so that the old top element has a
value of 0 and the new top element has the value 10,
effectively putting it at the top. The script keeps track of which image is currently on
top with the global variable top, which is changed every time a new element is moved
to the top with the toTop function. Note that the zIndex value, as is the case with other
properties, is a string. This document, called stacking.html, and the associated
JavaScript file are as follows:
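The original listing is not reproduced in these notes. As a stand-in, the handler logic described above might be sketched as follows. All names here (makeStacker, getStyle, the plane ids) are hypothetical, and the variable that tracks the top element is kept in a closure instead of a global. The style lookup is passed in as a function so that the sketch also runs outside a browser.

```javascript
// Hypothetical sketch of the toTop logic: the old top element gets
// zIndex "0", the new one gets "10", and the tracker is updated.
function makeStacker(getStyle, initialTopId) {
  var currentTopId = initialTopId; // which element is on top right now
  return function toTop(newTopId) {
    var oldStyle = getStyle(currentTopId);
    var newStyle = getStyle(newTopId);
    oldStyle.zIndex = "0";  // zIndex values are strings
    newStyle.zIndex = "10";
    currentTopId = newTopId;
    return currentTopId;
  };
}

// In a browser the lookup could be:
//   function (id) { return document.getElementById(id).style; }
// Here plain objects stand in for the three images' style objects.
var styles = {
  plane1: { zIndex: "10" },
  plane2: { zIndex: "0" },
  plane3: { zIndex: "0" }
};
var toTop = makeStacker(function (id) { return styles[id]; }, "plane1");
toTop("plane3"); // plane3 now covers the others
```

Each image's onclick attribute would then call toTop with that image's own id.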

Locating the Mouse Cursor


Recall from Chapter 5 that every event that occurs while an XHTML document is being
displayed creates an event object. This object includes some information about the
event. A mouse-click event is an implementation of the MouseEvent interface, which
defines two pairs of properties that provide geometric coordinates of the position of the
mouse cursor when the event occurred. One of these pairs, clientX and clientY,
gives the coordinates relative to the upper-left corner of the browser
display window, in pixels. The other pair, screenX and screenY, also gives the
coordinates, but relative to the client computer's screen. Obviously, the former
pair is usually more useful than the latter.
In the next example, where.html, two pairs of text boxes are used to display these four
properties every time the mouse button is clicked. The handler is triggered by the
onclick attribute of the body element. An image is displayed just below the display of
the coordinates, but only to make the screen more interesting.
The call to the handler in this example sends event, which is a reference to the event
object just created in the element, as a parameter. This is a bit of magic, because the
event object is implicitly created. In the handler, the formal parameter is used to access
the properties of the coordinates. Note that the handling of the event object is not
implemented the same way in the popular browsers. The Firefox browsers send it as a
parameter to event handlers, whereas Microsoft browsers make it available as a global
property. The code in where.html works for both of these approaches by sending the
event object in the call to the handler. It is available in the call with Microsoft browsers
because it is visible there as a global variable. Of course, for a Microsoft browser, it
need not be sent at all. The where.html document and its associated JavaScript file are
as follows:
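The where.html listing is not reproduced in these notes. A minimal sketch of a handler of the kind described might look like this. The function name findIt and the shape of the boxes object are assumptions; in a browser, each entry would be a text box obtained with document.getElementById.

```javascript
// Hypothetical click handler: copy the four coordinate properties of the
// event object into four text boxes.
function findIt(evt, boxes) {
  // Older Microsoft browsers expose the event as a global property.
  if (!evt && typeof window !== "undefined") {
    evt = window.event;
  }
  boxes.clientX.value = evt.clientX;
  boxes.clientY.value = evt.clientY;
  boxes.screenX.value = evt.screenX;
  boxes.screenY.value = evt.screenY;
}

// Stand-ins for the text boxes and for a click event, so the sketch runs
// outside a browser.
var boxes = {
  clientX: { value: "" }, clientY: { value: "" },
  screenX: { value: "" }, screenY: { value: "" }
};
findIt({ clientX: 120, clientY: 80, screenX: 520, screenY: 300 }, boxes);
```

In the markup, the body element's onclick attribute would pass event as the first argument.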
Reacting to a Mouse Click


The next example is another one related to reacting to mouse clicks. In this case, the
mousedown and mouseup events are used, respectively, to show and hide the message
“Please don’t click here!” on the display under the mouse cursor whenever the mouse
button is clicked, regardless of where the cursor is at the time. The offsets (-130 for left
and -25 for top) modify the actual cursor position so that the message is approximately
centered over it. Here are the document and its associated JavaScript file:
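The listing itself is missing from these notes. The two handlers described above might be sketched as follows; the names displayIt and hideIt are assumptions, and a plain object stands in for the style object of the message element.

```javascript
// Hypothetical mousedown handler: place the message so that it is roughly
// centered over the cursor, then make it visible.
function displayIt(evt, style) {
  style.left = (evt.clientX - 130) + "px"; // offset from the text above
  style.top = (evt.clientY - 25) + "px";
  style.visibility = "visible";
}

// Hypothetical mouseup handler: hide the message again.
function hideIt(style) {
  style.visibility = "hidden";
}

// Stand-in for the message element's style object and a mousedown event.
var messageStyle = { visibility: "hidden" };
displayIt({ clientX: 300, clientY: 200 }, messageStyle);
// messageStyle.left is "170px" and messageStyle.top is "175px"
```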
Slow Movement of Elements
So far, only element movements that happen instantly have been considered. These
movements are controlled by changing the top and left properties of the element to be
moved. The only way to move an element slowly is to move it by small amounts many
times, with the moves separated by small amounts of time.
JavaScript has two Window methods that are capable of this task: setTimeout and
setInterval.
The setTimeout method takes two parameters: a string of JavaScript code to be
executed and a number of milliseconds of delay before executing the given code.
For example, the call
setTimeout("mover()", 20);
causes a 20-millisecond delay, after which the function mover is called.
The setInterval method has two forms. One form takes two parameters, exactly as does
setTimeout. It executes the given code repeatedly, using the second parameter as the
interval, in milliseconds, between executions. The second form of setInterval takes a
variable number of parameters. The first parameter is the name of a function to be
called, the second is the interval in milliseconds between the calls to the function, and
the remaining parameters are used as actual parameters to the function being called.
The example presented here, moveText.html, moves a string of text from one position
(100, 100) to a new position (300, 300). The move is accomplished by using
setTimeout to call a mover function every millisecond until the final position (300,
300) is reached. The initial position of the text is set in the span element that specifies
the text. The onload attribute of the body element is used to call a function, initText, to
initialize the x- and y-coordinates of the initial position to the left and top properties of
the element and call the mover function.
The mover function, named moveText, takes the current coordinates of the text as
parameters, moves them one pixel toward the final position, and then, using
setTimeout, calls itself with the new coordinates. The recomputation of the coordinates
is complicated by the fact that we want the code to work regardless of the direction of
the move.
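The direction-independent step can be sketched as a pure function. This is an illustration and not the moveTextfuns.js listing; the name stepToward and the returned object are assumptions.

```javascript
// Hypothetical helper: move each coordinate one pixel toward its target,
// whichever direction that is, and report whether the target is reached.
function stepToward(x, y, finalX, finalY) {
  if (x < finalX) { x++; } else if (x > finalX) { x--; }
  if (y < finalY) { y++; } else if (y > finalY) { y--; }
  return { x: x, y: y, done: x === finalX && y === finalY };
}

// Walk from (100, 100) to (300, 300). In a browser each step would be
// scheduled with setTimeout instead of running in a tight loop.
var pos = { x: 100, y: 100, done: false };
var steps = 0;
while (!pos.done) {
  pos = stepToward(pos.x, pos.y, 300, 300);
  steps++;
}
// steps is 200 and pos is (300, 300)
```

Because each coordinate is compared with its own target, the same function handles moves to the left or upward without changes.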
One consideration with this script is that the properties of the coordinates are stored as
strings with units attached. For example, if the initial position of an element is (100,
100), its left and top property values both have the string value “100px”. To change the
properties arithmetically, they must be numbers.
Therefore, the property values are converted to strings with just numeric digit
characters in the initText function by stripping the nondigit unit parts. This conversion
allows them to be coerced to numbers when they are used as operands in arithmetic
expressions. Before the left and top properties are set to the new coordinates, the units
abbreviation (in this case, “px”) is catenated back onto the coordinates.
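A short sketch of this conversion, assuming pixel values such as "100px"; the helper name digitsOf is hypothetical. Note that the result is still a string: it is coerced to a number by operators such as ++ and -, but the + operator would concatenate instead.

```javascript
// Hypothetical helper: keep only the leading run of digits from a CSS length
// string; an empty or non-numeric value is treated as "0".
function digitsOf(value) {
  var m = String(value).match(/\d+/);
  return m ? m[0] : "0";
}

var left = "100px";
var x = digitsOf(left);  // the string "100"
x++;                     // coerced to a number, now 101
var newLeft = x + "px";  // units catenated back on: "101px"
```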
It is interesting that, in this example, placing the event handler in a separate file avoids
a problem that would occur if the JavaScript were embedded in the markup. The
problem is the use of XHTML comments to hide JavaScript and having possible parts
of XHTML comments embedded in the JavaScript.
For example, if the JavaScript statement x--; is embedded in an XHTML comment, the
validator complains that the -- in the statement is an invalid comment declaration.
In the code file, moveTextfuns.js, note the complexity of the call to the moveText
function in the call to setTimeout. This level of complexity is required because the call
to moveText must be built from static strings with the values of the variables x and y
catenated in.
The moveText.html document and the associated JavaScript file, moveTextfuns.js, are
as follows:
Dragging and Dropping Elements
One of the more powerful effects of event handling is allowing the user to drag and
drop elements around the display screen. The mouseup, mousedown, and mousemove
events can be used to implement this feature. Changing the top and left properties of an
element, as seen earlier in the chapter, causes the element to move. To illustrate drag
and drop, an XHTML document and a JavaScript file that create a magnetic poetry
system are developed, showing two static lines of a poem and allowing the user to create
the last two lines from a collection of movable words.
This example uses a mixture of the DOM 0 and DOM 2 event models. The DOM 0
model is used for the call to the handler for the mousedown event. The rest of the
process is designed with the DOM 2 model. The mousedown event handler, grabber,
takes the Event object as its parameter. It gets the element to be moved from the
currentTarget property of the Event object and puts it in a global variable so that it is
available to the other handlers. Then it determines the coordinates of the current
position of the element to be moved and computes the difference between each of them
and the corresponding coordinates of the position of the mouse cursor. These two
differences, which are used by the handler for mousemove to actually move the
element, are also placed in global variables.
The grabber handler also registers the event handlers for mousemove and mouseup.
These two handlers are named mover and dropper, respectively. The dropper handler
disconnects mouse movements from the element-moving process by unregistering the
handlers mover and dropper. The following is the document we have just described,
called dragNDrop.html. Following it is the associated JavaScript file.
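The dragNDrop.html listing is not reproduced in these notes. The three handlers described above might be sketched as follows. This is a sketch under assumptions: the factory makeDragHandlers is hypothetical, the shared state lives in a closure instead of global variables, and the document object is passed in so that a stand-in can be used outside a browser.

```javascript
// Hypothetical sketch of grabber, mover, and dropper.
function makeDragHandlers(doc) {
  var theElement = null; // element being dragged
  var diffX = 0;         // cursor-to-element offsets, fixed at grab time
  var diffY = 0;

  function mover(evt) {
    theElement.style.left = (evt.clientX - diffX) + "px";
    theElement.style.top = (evt.clientY - diffY) + "px";
  }

  function dropper(evt) {
    // Unregister both handlers so mouse movement no longer moves the element.
    doc.removeEventListener("mousemove", mover, true);
    doc.removeEventListener("mouseup", dropper, true);
  }

  function grabber(evt) {
    theElement = evt.currentTarget;
    diffX = evt.clientX - parseInt(theElement.style.left, 10);
    diffY = evt.clientY - parseInt(theElement.style.top, 10);
    // DOM 2 registration of the other two handlers.
    doc.addEventListener("mousemove", mover, true);
    doc.addEventListener("mouseup", dropper, true);
  }

  return grabber;
}

// Stand-in document that records listeners, plus a stand-in word element.
var listeners = {};
var fakeDoc = {
  addEventListener: function (type, fn) { listeners[type] = fn; },
  removeEventListener: function (type, fn) {
    if (listeners[type] === fn) { delete listeners[type]; }
  }
};
var word = { style: { left: "50px", top: "60px" } };
var grabber = makeDragHandlers(fakeDoc);

grabber({ currentTarget: word, clientX: 55, clientY: 70 }); // mousedown
listeners.mousemove({ clientX: 155, clientY: 90 });         // drag
listeners.mouseup({});                                      // drop
// word.style.left is "150px" and word.style.top is "80px"
```

In a browser, grabber would be attached with the DOM 0 model, for example through each word's onmousedown attribute, with document passed to makeDragHandlers.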
XML
5.1 Introduction

_ SGML is a meta-markup language, that is, a language for defining markup
languages; it can describe a wide variety of document types.
_ Developed in the early 1980s; approved as an ISO standard in 1986.
_ HTML was developed using SGML in the early 1990s, specifically for
Web documents.
_ Three problems with HTML:
1. HTML is defined to describe the general form and layout of information
without considering its meaning.
2. Fixed set of tags and attributes. Given tags must fit every kind of
document. No way to find particular information
3. There are no restrictions on arrangement or order of tag appearance in
document. For example, an opening tag can appear in the content of an
element, but its corresponding closing tag can appear after the end of the
element in which it is nested.
Eg : <strong> Now <em> is </strong> the time </em>
_ One solution to the first two problems is to allow groups of users with
common needs to define their own tags and attributes and then use the
SGML standard to define a new markup language to meet those needs. Each
application area would have its own markup language.
_ Problems with using SGML:
1. It's too large and complex to use, and it is very difficult to build a parser
for it. SGML includes a large number of capabilities that are only rarely
used.
2. A program capable of parsing SGML documents would be very large and costly to
develop.
3. SGML requires that a formal definition be provided with each new
markup language. So although having area-specific markup languages is a good idea,
basing them on SGML is not.
_ A better solution: Define a simplified version of SGML and allow users to
define their own markup languages based on it. XML was designed to be
that simplified version of SGML.
_ XML is not a replacement for HTML. In fact, the two have different goals
_ HTML is a markup language used to describe the layout of any kind of
information
_ XML is a meta-markup language that provides a framework for defining
specialized markup languages
_ Instead of creating text file like
<html>
<head><title>name</title></head>…
… Syntax
<name>
<first> nandini </first>
<last> sidnal </last>
</name>
XML is much larger then text file but makes easier to write software that
accesses the information by giving structure to data.
_ XML is a very simple and universal way of storing and transferring any textual
kind
_ XML does not predefine any tags
- XML tag and its content, together with closing tag _ element
_ XML has no hidden specifications
_ XML based markup language as _ tag set
_ Document that uses XML based markup language _ XML document
_ An XML processor is a program that parses XML documents and provides
the parts to an application
_ Both IE7 and FX2 support basic XML XML is a meta language for
describing mark-up languages. It provides a facility to define tags and the
structural relationship between them
What is XML?
• XML stands for EXtensible Markup Language
• XML is a markup language much like HTML
• XML was designed to carry data, not to display data
• XML tags are not predefined. You must define your own tags
• XML is designed to be self-descriptive
• XML is a W3C Recommendation
• XML is not a replacement for HTML.
XML syntax rules:
The syntax of XML can be thought of at two distinct levels.
1. There is the general low-level syntax of XML that imposes its rules on all
XML documents.
2. The other syntactic level is specified by either document type definitions
or XML schemas. These two kinds of specifications impose structural
syntactic rules on documents written with specific XML tag sets.
- DTDs and XML schemas specify the set of tags and attributes that can
appear in a particular document or collection of documents, and also the orders
and various arrangements in which they can appear.
DTDs and XML schemas can be used to define an XML markup language.
An XML document can include several different kinds of statements:
• Data elements - the most common kind of statement
• Markup declarations - instructions to the XML parser
• Processing instructions - instructions for an application program that will
process the data described in the document
Every XML document should begin with an XML declaration, which identifies the
document as being XML and provides the version number of the XML standard
being used. It can also specify an encoding standard. It is the first line of the
XML document.

Comments in XML have the same form as in HTML:
<!-- This is a comment -->

XML names must begin with a letter or underscore and can include
digits, hyphens, and periods.
XML names are case sensitive; for example, the tag <Letter> is different from the tag
<letter>.
HTML truncates multiple white-space characters to one single
white-space:
HTML: Hello           my name is Tove
Output: Hello my name is Tove.
With XML, the white-space in a document is not truncated.
Every XML document defines a single root element, whose opening tag
must appear on the first line of XML code.
All other elements must be nested inside the root element. The root element
of every XHTML document is html.

Every XML element must have a closing tag. An element with no content can be
written as a single tag:
<element_name/> --- no content

In HTML, some elements do not need a closing tag:
<p>This is a paragraph
<p>This is another paragraph
In XML, all elements must have a closing tag:

<p>This is a paragraph</p>
<p>This is another paragraph</p>

XML tags are case sensitive, so opening and closing tags must be written with
the same case:
<Message>This is incorrect</message>
<message>This is correct</message>

In HTML, you might see improperly nested elements:
<b><i>This text is bold and italic</b></i>
In XML, all elements must be properly nested within each other:
<b><i>This text is bold and italic</i></b>
In the example above, "properly nested" simply means that since the
<i> element is opened inside the <b> element, it must be closed inside
the <b> element.
XML Documents Must Have a Root Element
XML documents must contain one element that is the parent of all other
elements.
This element is called the root element.
<root>
<child>
<subchild> </subchild>
</child>
</root>
XML tags can have attributes, which are specified with name/value
assignments. XML attribute values must be enclosed in single or double
quotation marks. An XML document that strictly adheres to these syntax rules is
considered well formed.
Example :
<?xml version = "1.0" encoding = "utf-8" ?>
<ad>
<year>1960</year>
<make>Cessna</make>
<model>Centurian</model>
<color>Yellow with white trim</color>
<location>
<city>Gulfport</city>
<state>Mississippi</state>
</location>
</ad>
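Well-formedness can be checked mechanically, because any XML parser rejects a document that breaks the low-level syntax rules. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the deliberately broken copy of the ad document are illustrative assumptions, not part of the original notes.

```python
import xml.etree.ElementTree as ET

# The ad document from the notes, as bytes so the encoding declaration is honored.
AD_XML = b"""<?xml version="1.0" encoding="utf-8"?>
<ad>
  <year>1960</year>
  <make>Cessna</make>
  <model>Centurian</model>
  <color>Yellow with white trim</color>
  <location>
    <city>Gulfport</city>
    <state>Mississippi</state>
  </location>
</ad>"""

# A copy with a misspelled closing tag, which violates the nesting rules.
BROKEN_XML = AD_XML.replace(b"</model>", b"</moel>")


def is_well_formed(document):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


print(is_well_formed(AD_XML))
print(is_well_formed(BROKEN_XML))
```

A well-formedness check says nothing about whether the tags are the right ones for an application; that is the job of a DTD or an XML schema.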
None of the tags in this document is defined in XHTML; all are designed for
the specific content of the document. When designing an XML document,
the designer is often faced with the choice between adding a new attribute to
an element or defining a nested element.
o In some cases there is no choice.
o In other cases, it may not matter whether an attribute or a nested element is used.
o Nested tags are used
• when tags might need to grow in structural complexity in the future
• if the data is subdata of the parent element's content
• if the data has substructure of its own
o Attributes are used
• for identifying numbers or names of elements
• if the data in question is one value from a given set of possibilities
• if there is no substructure
<!-- A tag with one attribute -->
<patient name = "Maggie Dee Magpie">
...
</patient>
<!-- A tag with one nested tag -->
<patient>
<name> Maggie Dee Magpie </name>
...
</patient>
<!-- A tag with one nested tag,
which contains three nested tags -->
<patient>
<name>
<first> Maggie </first>
<middle> Dee </middle>
<last> Magpie </last>
</name>
...
</patient>
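The choice between an attribute and a nested element also changes how a program reads the data. The sketch below, which assumes Python's standard xml.etree.ElementTree module and trims the "..." placeholders from the three patient forms, shows one way each form might be accessed.

```python
import xml.etree.ElementTree as ET

# The three patient forms from the notes, without the elided "..." content.
as_attribute = ET.fromstring('<patient name="Maggie Dee Magpie"></patient>')
as_element = ET.fromstring(
    "<patient><name> Maggie Dee Magpie </name></patient>")
structured = ET.fromstring(
    "<patient><name><first> Maggie </first><middle> Dee </middle>"
    "<last> Magpie </last></name></patient>")

# An attribute is read with get(); it is always a single flat string.
name_from_attribute = as_attribute.get("name")

# A nested element is located with find()/findtext(); its text keeps white space.
name_from_element = as_element.findtext("name").strip()

# A nested element with substructure can be walked child by child.
name_parts = [part.text.strip() for part in structured.find("name")]

print(name_from_attribute, name_from_element, name_parts)
```

Only the third form lets software pick out the last name without guessing how the full name is split, which is the practical reason for using nested tags when the data has substructure.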
Document structure

An XML document often uses two auxiliary files:
1. One specifies the tag set and the structural syntactic rules.
2. One contains a style sheet that describes how the content of the document is
to be printed.
XML documents (and HTML documents) are made up of the
following building blocks: Elements, Tags, Attributes, Entities, PCDATA,
and CDATA
Elements
Elements are the main building blocks of both XML and HTML
documents. Elements can contain text, other elements, or be empty.
Tags
Tags are used to mark up elements.
A starting tag like <element_name> marks the beginning of an element,
and an ending tag like </element_name> marks the end of an element.
Examples:
A body element: <body>body text in between</body>.
A message element: <message>some message in between</message>
Attributes
Attributes provide extra information about elements.
Attributes are placed inside the start tag of an element. Attributes come in
name/value pairs. The following "img" element has additional information
about a source file:
<img src="computer.gif" />
The name of the element is "img". The name of the attribute is "src".
The value of the attribute is "computer.gif". Since the element itself is
empty, it is closed by a "/".
PCDATA
PCDATA means parsed character data.
Think of character data as the text found between the start tag and the end tag
of an XML element.
PCDATA is text that will be parsed by a parser. Tags inside the text will be
treated as markup and entities will be expanded.
CDATA
CDATA also means character data. CDATA is text that will NOT be parsed
by a parser. Tags inside the text will NOT be treated as markup and entities
will not be expanded.
- An XML document often uses two auxiliary files:
• One to specify the structural syntactic rules (DTD / XML schema)
• One to provide a style specification (CSS / XSLT style sheets)
Entities
Most of you know the HTML entity reference "&nbsp;", which is used to
insert an extra space in an HTML document. Entities are expanded when a
document is parsed by an XML parser.
An XML document has a single root element, but it often consists of one or
more entities, which are logically related collections of information.
Entities range from a single special character to a book chapter
• An XML document has one document entity
• All other entities are referenced in the document entity
Reasons to break a document into multiple entities:
1. It is good to define a large document as a number of smaller parts, which are
easier to manage.
2. If the same data appears in more than one place in the document, defining
it as an entity allows any number of references to a single copy of the data.
3. Many documents include information that cannot be represented as text,
such as images. Such information units are usually stored as binary data.
Binary entities can only be referenced in the document entity.
Entity names:
• No length limitation
• Must begin with a letter, an underscore, or a colon
• Can include letters, digits, periods, dashes, underscores, or colons
- A reference to an entity has the form of its name with a prepended
ampersand and an appended semicolon: &entity_name; Eg. &apple_image;
- One common use of entities is to allow special characters that are normally
used as markup delimiters to appear as themselves in the document. XML
predefines five entities for this purpose: &lt; (<), &gt; (>), &amp; (&),
&quot; ("), and &apos; (').
- If several predefined entities must appear near each other in a
document, it is better to avoid entity references and use a character data
section instead. The content of a character data section is not parsed by the XML
parser, so any tags it includes are not treated as markup.
- The form of a character data section is as follows:
<![CDATA[content]]>
For example, instead of
Start &gt;&gt;&gt;&gt; HERE &lt;&lt;&lt;&lt;
use
<![CDATA[Start >>>> HERE <<<<]]>
The opening keyword of a character data section is not just CDATA;
it is, in effect, [CDATA[. There cannot be any spaces between [ and C or
between A and [.
- Because the content of a character data section is not parsed, entity
references inside it are not expanded. For example, the content of the line
<![CDATA[The form of a tag is &lt;tag name&gt;]]>
is as follows:
The form of a tag is &lt;tag name&gt;
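The difference between parsed text and a character data section can be observed with any XML parser. This is a small sketch, assuming Python's standard xml.etree.ElementTree module and a hypothetical wrapper element named note; it parses the three fragments discussed above and compares the resulting text.

```python
import xml.etree.ElementTree as ET

# Ordinary (parsed) character data: the entity references are expanded.
with_entities = ET.fromstring(
    "<note>Start &gt;&gt;&gt;&gt; HERE &lt;&lt;&lt;&lt;</note>")

# A character data section: the delimiters appear as themselves.
with_cdata = ET.fromstring(
    "<note><![CDATA[Start >>>> HERE <<<<]]></note>")

# Entity references inside a character data section are left untouched.
unparsed = ET.fromstring(
    "<note><![CDATA[The form of a tag is &lt;tag name&gt;]]></note>")

print(with_entities.text)
print(with_cdata.text)
print(unparsed.text)
```

The first two fragments yield the same text, while the third keeps the literal characters &lt; and &gt; because the parser never looks inside the section.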

Document Type definitions

A DTD is a set of structural rules called declarations. These rules specify a
set of elements, along with how and where they can appear in a document
• Purpose: provide a standard form for a collection of XML documents
and define a markup language for them.
• A DTD can also provide entity definitions.
• With a DTD, application development is simpler.
• Not all XML documents have or need a DTD
- Just as external style sheets are used to impose a uniform style over a
collection of documents, DTDs impose a uniform structure.
- When are DTDs used?
When the same tag set definition is used by a collection of documents or
a collection of users, and the documents must have a consistent and uniform
structure.
- A document can be tested against a DTD to determine whether it conforms
to the rules the DTD describes.
- Application programs that process the data in the collection of XML
documents can be written to assume the particular document form.
- Without such structural restrictions, developing such applications would be
difficult, if not impossible.
- The DTD for a document can be internal (embedded in the XML
document) or external (stored in a separate file); an external DTD can be used
with more than one document.
- A DTD with an incorrect or inappropriate declaration can have widespread
consequences.
- DTD declarations have the form: <!keyword … >
There are four possible declaration keywords:
ELEMENT, ATTLIST, ENTITY, and NOTATION
1. Declaring Elements:
• Element declarations are similar to BNF (the context-free grammar notation
used to define the syntactic structure of programming languages). A DTD
describes the syntactic structure of a particular set of documents, so its
rules are similar to BNF rules.
• An element declaration specifies the name of an element and its structure
• If the element is a leaf node of the document tree, its structure is
in terms of characters
• If it is an internal node, its structure is a list of children elements
(either leaf or internal nodes)
• General form:
<!ELEMENT element_name (list of child names)>
e.g.,
<!ELEMENT memo (from, to, date, re, body)>
This element declaration describes a document tree whose root, memo, has five
children: from, to, date, re, and body.

• Child elements can have modifiers:
- + -> One or more occurrences
- * -> Zero or more occurrences
- ? -> Zero or one occurrence
Ex: consider the DTD declaration below
<!ELEMENT person (parent+, age, spouse?, sibling*)>
- One or more parent elements
- One age element
- Possibly a spouse element
- Zero or more sibling elements
• Leaf nodes specify the data types of their content:
1. #PCDATA (parsable character data)
2. EMPTY (no content)
3. ANY (can have any content)
Example of a leaf declaration:
<!ELEMENT name (#PCDATA)>
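The +, *, and ? modifiers have the same meaning as in regular expressions, which suggests a simple way to see how a child list constrains a document. The following is a toy sketch only, not a real DTD validator: the functions content_model_to_regex and children_match are hypothetical helpers, they handle only flat comma-separated child lists like the person declaration above, and the sample person elements are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET


def content_model_to_regex(model):
    """Turn a flat DTD child list into a regex over space-terminated child names."""
    parts = []
    for item in model.strip("() ").split(","):
        item = item.strip()
        # The last character is the modifier if it is one of +, *, or ?.
        modifier = item[-1] if item[-1] in "+*?" else ""
        name = item.rstrip("+*?")
        parts.append("(?:" + re.escape(name) + " )" + modifier)
    return re.compile("".join(parts))


def children_match(element, model):
    """Check the order and count of an element's children against the model."""
    child_names = "".join(child.tag + " " for child in element)
    return content_model_to_regex(model).fullmatch(child_names) is not None


PERSON_MODEL = "(parent+, age, spouse?, sibling*)"

valid_person = ET.fromstring(
    "<person><parent/><parent/><age/><sibling/></person>")
no_parent = ET.fromstring("<person><age/></person>")
two_spouses = ET.fromstring(
    "<person><parent/><age/><spouse/><spouse/></person>")

print(children_match(valid_person, PERSON_MODEL))
print(children_match(no_parent, PERSON_MODEL))
print(children_match(two_spouses, PERSON_MODEL))
```

A validating parser performs this kind of check for every element in the document, together with the attribute and entity rules described next.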
2. Declaring Attributes:
• Attributes are declared separately from the element declarations
• General form:
<!ATTLIST element_name attribute_name attribute_type [default_value]>
• More than one attribute can be declared in a single declaration:
<!ATTLIST element_name attribute_name_1 attribute_type default_value_1
attribute_name_2 attribute_type default_value_2
…>
- Attribute type: there are ten different types, but we will consider only CDATA
- Possible default values for attributes:
A value - the value that is used if none is specified
#FIXED value - the value that every element has and that cannot be changed
#REQUIRED - no default value is given; every instance must specify a value
#IMPLIED - no default value is given; the value may or may not be specified
Example :
<!ATTLIST car doors CDATA "4">
<!ATTLIST car engine_type CDATA #REQUIRED>
<!ATTLIST car make CDATA #FIXED "Ford">
<!ATTLIST car price CDATA #IMPLIED>
<car doors = "2" engine_type = "V8">
...
</car>
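The effect of the four kinds of default values can be imitated in a few lines. This is a hypothetical sketch, not how a real parser is implemented: the table CAR_ATTRIBUTES mirrors the four ATTLIST declarations for car, and the function effective_attributes is an assumed helper that computes the attributes an application would see for a given start tag.

```python
# Each entry mirrors one ATTLIST declaration for the car element:
# attribute name -> (kind of default, declared value or None).
CAR_ATTRIBUTES = {
    "doors": ("default", "4"),
    "engine_type": ("#REQUIRED", None),
    "make": ("#FIXED", "Ford"),
    "price": ("#IMPLIED", None),
}


def effective_attributes(given, declarations):
    """Apply DTD-style defaults to the attributes written in a start tag."""
    result = {}
    for name, (kind, value) in declarations.items():
        if name in given:
            if kind == "#FIXED" and given[name] != value:
                raise ValueError(name + " is fixed and cannot be changed")
            result[name] = given[name]
        elif kind == "#REQUIRED":
            raise ValueError(name + " must be specified")
        elif kind in ("default", "#FIXED"):
            result[name] = value
        # An absent #IMPLIED attribute simply has no value.
    return result


# <car doors = "2" engine_type = "V8"> from the example above.
car = effective_attributes({"doors": "2", "engine_type": "V8"}, CAR_ATTRIBUTES)
print(car)
```

For the car element in the example, doors keeps the value 2 that was written, make is supplied by the #FIXED declaration, and price stays absent because it is #IMPLIED.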
3. Declaring Entities:
Two kinds:
• A general entity can be referenced anywhere in the content of an
XML document. Ex: predefined entities are all general entities.
• A parameter entity can be referenced only in a DTD.
• General form of an entity declaration:
<!ENTITY [%] entity_name "entity_value">
When the % is present, the declaration specifies a parameter entity.
Example :
<!ENTITY jfk "John Fitzgerald Kennedy">
- A reference to the entity declared above: &jfk;
• If the entity value is longer than a line, define it in a separate file
(an external text entity)
- General form of an external entity declaration:
<!ENTITY entity_name SYSTEM "file_location">
SYSTEM specifies that the definition of the entity is in a different file.
• Example of a parameter entity:
<!ENTITY % pat "(USN, Name)">
<!ELEMENT student %pat;>
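General entities declared in an internal DTD are expanded by the parser before the application sees the text. The sketch below assumes Python's standard xml.etree.ElementTree module, whose underlying parser expands internally declared general entities; the speech element wrapped around the jfk reference is a made-up example.

```python
import xml.etree.ElementTree as ET

# An internal DTD that declares the general entity jfk, followed by a
# document that references it.
DOCUMENT = """<!DOCTYPE speech [
  <!ENTITY jfk "John Fitzgerald Kennedy">
]>
<speech>Remarks by &jfk;</speech>"""

root = ET.fromstring(DOCUMENT)

# The application sees the replacement text, not the reference.
print(root.text)
```

External entities (those declared with SYSTEM) are a different matter: this parser does not fetch them, so the sketch applies only to entities whose value is given directly in the declaration.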
4. Sample DTD:
<?xml version = "1.0" encoding = "utf-8"?>
<!-- planes.dtd - a document type definition for the planes.xml document,
which specifies a list of used airplanes for sale -->
<!ELEMENT planes_for_sale (ad+)>
<!ELEMENT ad (year, make, model, color, description, price?, seller, location)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT make (#PCDATA)>
<!ELEMENT model (#PCDATA)>
<!ELEMENT color (#PCDATA)>
<!ELEMENT description (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ELEMENT seller (#PCDATA)>
<!ELEMENT location (city, state)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ATTLIST seller phone CDATA #REQUIRED>
<!ATTLIST seller email CDATA #IMPLIED>
<!ENTITY c "Cessna">
<!ENTITY p "Piper">
<!ENTITY b "Beechcraft">
5. Internal and External DTDs:

• Internal DTDs
<!DOCTYPE planes [
<!-- The DTD for planes -->
]>
• External DTDs
<!DOCTYPE XML_doc_root_name SYSTEM "DTD_file_name">
For example,
<!DOCTYPE planes_for_sale SYSTEM "planes.dtd">
The following document, planes.xml, uses this external DTD:
<?xml version = "1.0" encoding = "utf-8"?>
<!-- planes.xml - A document that lists ads for used airplanes -->
<!DOCTYPE planes_for_sale SYSTEM "planes.dtd">
<planes_for_sale>
<ad>
<year> 1977 </year>
<make> &c; </make>
<model> Skyhawk </model>
<color> Light blue and white </color>
<description> New paint, nearly new interior,
685 hours SMOH, full IFR King avionics </description>
<price> 23,495 </price>
<seller phone = "555-222-3333"> Skyway Aircraft </seller>
<location>
<city> Rapid City, </city>
<state> South Dakota </state>
</location>
</ad></planes_for_sale>

Namespaces

- XML provides benefits over binary formats, and we can now create well-formed
XML documents. But as applications become complex, we need to combine
elements from various document types into one XML document. This causes a
problem:
- Two documents may have elements with the same name but with different
meanings and semantics.
- Namespaces are a means by which you can differentiate elements and
attributes of different XML document types from each other when combining
them into other documents, or even when processing multiple
documents simultaneously.
- Why do we need namespaces?
- XML allows users to create their own tags, so naming collisions can occur. For
example, the element title:
- <title>Resume of John</title>
- <title>Sir</title><Fname>John</Fname>
- Namespaces provide a means for users to prevent naming collisions. The element
title becomes:
- <xhtml:title>Resume of John</xhtml:title>
- <person:title>Sir</person:title><Fname>John</Fname>
Here xhtml and person are the prefixes of two namespaces.
- A namespace is a collection of element and attribute names used in XML
documents. The name of a namespace has the form of a URI.
- A namespace for the elements and attributes of the hierarchy rooted at a
particular element is declared as the value of the attribute xmlns. The namespace
declaration for an element is
<element_name xmlns[:prefix] = URI>
• The square brackets indicate that what is within them is optional.
• The optional prefix specifies the name to be attached to names in the declared
namespace.
- Two reasons for the prefix:
1. It is shorthand for the URI, which is too long to be typed on every
occurrence of every name from the namespace.
2. The URI may include characters that are illegal in XML names.
- The element for which a namespace is declared is usually the root
of a document.
<html xmlns = "http://www.w3.org/1999/xhtml">
This declares the default namespace, from which names appear
without prefixes. As an example of a prefixed namespace declaration,
consider
<birds xmlns:bd = "http://www.audubon.org/names/species">
Within the birds element, including all of its children elements, the names
from the namespace must be prefixed with bd, as in the following.
<bd:lark>
An element can have more than one namespace declaration:
<birds xmlns:bd = "http://www.audubon.org/names/species"
xmlns:html = "http://www.w3.org/1999/xhtml">
Here we have added the standard XHTML namespace to the birds element.
One namespace declaration in an element can be used to declare a default
namespace. This can be done by not including the prefix in the declaration.
Names from the default namespace can then be used by omitting the prefix.
Consider the example in which two namespaces are
declared. The first is declared to be the default
namespace.
The second defines the prefix, cap.
<states
xmlns = "http://www.states-info.org/states"
xmlns:cap = "http://www.states-info.org/state-capitals">
<state>
<name> South Dakota </name>
<population> 75689 </population>
<capital>
<cap:name> Pierre </cap:name>
<cap:population>12429</cap:population>
</capital>
</state>
</states>
Each state element has name and population elements from both namespaces.
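A namespace-aware parser keeps the two name elements apart by attaching the namespace URI to each name. The following sketch assumes Python's standard xml.etree.ElementTree module; the prefix st used in the lookup table is an arbitrary choice made here for the default namespace and does not appear in the document itself.

```python
import xml.etree.ElementTree as ET

STATES_XML = """<states xmlns="http://www.states-info.org/states"
        xmlns:cap="http://www.states-info.org/state-capitals">
  <state>
    <name> South Dakota </name>
    <population> 75689 </population>
    <capital>
      <cap:name> Pierre </cap:name>
      <cap:population>12429</cap:population>
    </capital>
  </state>
</states>"""

# Prefixes used only for lookups in this program; "st" stands for the
# document's default namespace.
NS = {
    "st": "http://www.states-info.org/states",
    "cap": "http://www.states-info.org/state-capitals",
}

root = ET.fromstring(STATES_XML)
state = root.find("st:state", NS)

# Same local name, different namespaces, so the two lookups do not collide.
state_name = state.find("st:name", NS).text.strip()
capital_name = state.find("st:capital/cap:name", NS).text.strip()

print(root.tag)
print(state_name, capital_name)
```

The printed root tag shows the URI in braces in front of the local name, which is how this parser represents a qualified name internally; the prefix itself is only a shorthand.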

XML schemas

- A schema is any type of model document that defines the structure of
something, such as a database structure or documents. Here that something is
an XML document. DTDs are in fact a type of schema.
- An XML schema is an XML document, so it can be parsed with an XML parser.
- The term XML Schema refers specifically to the W3C XML Schema
technology.
- W3C XML Schemas, like DTDs, allow you to describe the structure of
an XML document.
- DTDs have several disadvantages:
• Their syntax is different from XML, so they cannot be parsed with an XML parser
• It is confusing to deal with two different syntactic forms
• DTDs do not allow restrictions on the form of data that can be the content of
an element. For example, for <quantity>5</quantity> a DTD can only specify
that the content is text, which could be anything; the same holds for an
element that represents a time. There is no data type for integers; all
content is treated as text.
- XML Schema is one of the alternatives to DTDs
• A schema is an XML document, so it can be parsed with an XML parser
• It also provides far more control over data types than DTDs do
• Users can define new types with constraints on existing data types
1. Schema Fundamentals:

• Schemas are related to the idea of a class and an object in an OOP language
- Schema -> class definition
- XML document conforming to the schema structure -> object
• Schemas have two primary purposes
- Specify the structure of its instance XML documents
- Specify the data type of every element and attribute of its instance XML
documents
2. Defining a schema:

Schemas are written using names from a namespace (the schema of schemas):
http://www.w3.org/2001/XMLSchema
element, schema, sequence, and string are some names from this namespace.
- Every XML schema has a single root element, schema.
• The schema element must specify the namespace for the schema of
schemas from which the schema's elements and its attributes will be drawn.
• It often specifies a prefix that will be used for the names in the schema.
This namespace specification appears as
xmlns:xsd = "http://www.w3.org/2001/XMLSchema"
- Every XML schema itself defines a tag set, like a DTD, which must be named
with the targetNamespace attribute of the schema element. The target namespace
is specified by assigning a namespace to the targetNamespace attribute, as in
the following:
targetNamespace = "http://cs.uccs.edu/planeSchema"
Every top-level element places its name in the target namespace. If we want
to include nested elements, we must set the elementFormDefault attribute to
qualified:
elementFormDefault = "qualified"
- The default namespace, which is the source of the unprefixed names in the
schema, is given with another xmlns specification:
xmlns = "http://cs.uccs.edu/planeSchema"
- A complete example of a schema element:
<!-- xmlns:xsd - namespace for the schema itself -->
<!-- targetNamespace - namespace where elements defined here will be placed -->
<!-- xmlns - default namespace for this document -->
<!-- elementFormDefault - specify non-top-level elements to be in the
target namespace -->
<xsd:schema
xmlns:xsd = "http://www.w3.org/2001/XMLSchema"
targetNamespace = "http://cs.uccs.edu/planeSchema"
xmlns = "http://cs.uccs.edu/planeSchema"
elementFormDefault = "qualified" >

3. Defining a schema instance:

• An instance of a schema must specify the namespaces it uses
• These are given as attribute assignments in the tag for its root element
1. Define the default namespace
<planes
xmlns = "http://cs.uccs.edu/planeSchema"
…>
2. Specify the standard namespace for instances (XMLSchema-instance), which
is needed for the schemaLocation attribute
xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance"
3. Specify the location where the default namespace is defined, using the
schemaLocation attribute, which is assigned two values: the namespace and the
filename.
xsi:schemaLocation = "http://cs.uccs.edu/planeSchema planes.xsd" >
4. Schema Data types: Two categories of data types

1. Simple (strings only, no attributes and no nested elements)
2. Complex (can have attributes and nested elements)
• XML Schema defines 44 data types
• Primitive: string, boolean, float, …
• Derived: byte, long, positiveInteger, …
• User-defined (derived) data types specify constraints on
an existing type (then called the base type)
• Constraints are given in terms of facets of the
base type. Ex: the integer data type has eight facets: totalDigits,
maxInclusive, …
- Both simple and complex types can be either named or anonymous
- DTDs define global elements (the context of reference is irrelevant), but the
context of reference is essential in XML schemas
- Data declarations in an XML schema can be
1. Local, which appear inside an element that is a child of schema
2. Global, which appear as a child of schema
5. Defining a simple type:

• Use the element tag and set the name and type attributes
<xsd:element name = "bird" type = "xsd:string" />
- The instance could be:
<bird> Yellow-bellied sap sucker </bird>
• An element can be given a default value using the default attribute
<xsd:element name = "bird" type = "xsd:string" default = "Eagle" />
• An element can have a constant value, using the fixed attribute
<xsd:element name = "bird" type = "xsd:string" fixed = "Eagle" />
Declaring User-Defined Types:
• A user-defined type is described in a simpleType element, using facets
• Facets must be specified in the content of a restriction element
• Facet values are specified with the value attribute
For example, the following declares two user-defined types, firstName and
phoneNumber:
<xsd:simpleType name = "firstName" >
<xsd:restriction base = "xsd:string" >
<xsd:maxLength value = "20" />
</xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name = "phoneNumber" >
<xsd:restriction base = "xsd:decimal" >
<xsd:totalDigits value = "10" />
</xsd:restriction>
</xsd:simpleType>
6. Declaring Complex Types:

• There are several categories of complex types, but we discuss just one,
element-only elements
• Element-only elements are defined with the complexType element
• Use the sequence tag for nested elements that must be in a particular order
• Use the all tag if the order is not important
• Nested elements can include attributes that give the allowed number of
occurrences (minOccurs and maxOccurs, whose value can be unbounded)
• For ex:
<xsd:complexType name = "sports_car" >
<xsd:sequence>
<xsd:element name = "make" type = "xsd:string" />
<xsd:element name = "model" type = "xsd:string" />
<xsd:element name = "engine" type = "xsd:string" />
<xsd:element name = "year" type = "xsd:string" />
</xsd:sequence>
</xsd:complexType>
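Because a schema is itself an XML document, an ordinary XML parser can read it, which is one of the advantages over DTDs noted earlier. The sketch below assumes Python's standard xml.etree.ElementTree module and wraps the sports_car type in a minimal xsd:schema root (an addition made here so that the fragment is a complete document). It only reads the declarations; the standard library does not validate documents against a schema.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# The sports_car type from the notes, wrapped in a schema root element.
SCHEMA_FRAGMENT = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="sports_car">
    <xsd:sequence>
      <xsd:element name="make" type="xsd:string" />
      <xsd:element name="model" type="xsd:string" />
      <xsd:element name="engine" type="xsd:string" />
      <xsd:element name="year" type="xsd:string" />
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>"""

schema = ET.fromstring(SCHEMA_FRAGMENT)

# Qualified names are written as {namespace URI}local-name.
complex_type = schema.find(f"{{{XSD_NS}}}complexType")

# Collect the nested element declarations in document order.
declared = [
    (element.get("name"), element.get("type"))
    for element in complex_type.iter(f"{{{XSD_NS}}}element")
]

print(complex_type.get("name"), declared)
```

The order of the collected pairs matters here because the type uses sequence; with the all tag, an instance could list the same children in any order.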
7. Validating Instances of Schemas:

• An XML schema provides a definition of a category of XML documents.
• However, developing a schema is of limited value unless there is some
mechanical way to determine whether a given XML instance document
conforms to the schema.
• Several XML schema validation tools are available, e.g. xsv (XML
Schema Validator), which can also be used to validate online.
• The output of xsv is an XML document. When xsv is run from the command line,
the output appears without being formatted.
Output of xsv when run on planes.xml:
<?xml version = '1.0' encoding = 'utf-8'?>
<xsv docElt = '{http://cs.uccs.edu/planesSchema} planes'
instanceAccessed = 'true'
instanceErrors = '0'
schemaErrors = '0'
schemaLocs = 'http://cs.uccs.edu/planesSchema -> planes.xsd'
target = 'file:/c:/wbook2/xml/planes.xml'
validation = 'strict'
version = 'XSV 1.197/1.101 of 2001/07/07 12:01:19'
xmlns = 'http://www.w3.org/2000/05/xsv'>
<importAttempt URI = 'file:/c:wbook2/xml/planes.xsd'
namespace = 'http://cs.uccs.edu/planesSchema'
outcome = 'success' />
</xsv>
If schema is not in the correct format, the validator will report that it could not
find the specified schema.
Displaying RAW XML Documents

- An XML-enabled browser, or any other system that can deal with XML
documents, cannot possibly know how to format the tags defined in the document.
- Without a style sheet that defines presentation styles for the document's
tags, the XML document cannot be displayed in a formatted manner.
- Some browsers, like FX2, have default style sheets that are used when no style
sheets are defined.
- Example: the planes.xml document.

Displaying XML Documents with CSS

- Style sheet information can be provided to the browser for an XML
document in two ways.
• First, a CSS file that has style information for the elements in the XML
document can be developed.
• Second, the XSLT style sheet technology can be used.
- Although using CSS is effective, XSLT provides far more power over the
appearance of the document's display.
- A CSS style sheet for an XML document is just a list of its tags and associated
styles
- The display property is used to specify whether an element is to be displayed
inline or in a separate block.
- The connection between an XML document and its style sheet is made
through an xml-stylesheet processing instruction:
<?xml-stylesheet type = "text/css" href = "planes.css"?>
For example: planes.css
/* planes.css - a style sheet for the planes.xml document */
ad { display: block; margin-top: 15px; color: blue;}
year, make, model { color: red; font-size: 16pt;}
color {display: block; margin-left: 20px; font-size: 12pt;}
description {display: block; margin-left: 20px; font-size: 12pt;}
seller { display: block; margin-left: 15px; font-size: 14pt;}
location {display: block; margin-left: 40px; }
city {font-size: 12pt;}
state {font-size: 12pt;}
<?xml version = "1.0" encoding = "utf-8"?>
<!-- planes.xml - A document that lists ads for used airplanes -->
<planes_for_sale>
<ad>
<year> 1977 </year>
<make> Cessna </make>
<model> Skyhawk </model>
<color> Light blue and white </color>
<description> New interior </description>
<seller phone = "555-222-3333"> Skyway Aircraft </seller>
<location>
<city> Rapid City, </city>
<state> South Dakota </state>
</location>
</ad>
</planes_for_sale>
With planes.css, the display of planes.xml is as follows: 1977 Cessna
Skyhawk Light blue and white New interior Skyway Aircraft Rapid City,
South Dakota

Web Services:
- The ultimate goal of Web services: allow different software in different
places, written in different languages and resident on different platforms, to
connect and interoperate
- The Web began as a provider of markup documents, served through the HTTP
methods GET and POST
- A Web service is closely related to an information service
  - The server provides services through server-resident software
  - The same Web server can provide both documents and services
- The original Web services were provided via Remote Procedure Call
(RPC), through two technologies, DCOM and CORBA. DCOM and
CORBA use different protocols, which defeats the goal of universal
component interoperability
- There are three roles required to provide and use Web services:
1. Service providers
2. Service requestors
3. A service registry
• Service providers
  - Must develop and deploy software that provides a service
  - Must describe the service with the Web Services Description Language (WSDL)
    * WSDL is used to describe the available services, as well as the message
      protocols for their use
    * Such descriptions reside on the Web server
• Service requestors
  - Use WSDL to query a Web registry
• Service registry
  - Created using Universal Description, Discovery, and Integration Service
    (UDDI)
    * UDDI provides a way to create a service registry
    * UDDI provides a method to query a Web service registry
• Simple Object Access Protocol (SOAP)
  - An XML-based specification that defines the forms of messages and RPCs
  - The root element of a SOAP document is Envelope
  - The envelope contains SOAP messages (descriptions of Web services)
  - Supports the exchange of information among distributed systems
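Since a SOAP message is just an XML document whose root is Envelope, one can be assembled with ordinary XML tools. The following is a rough sketch, assuming Python's standard xml.etree.ElementTree module and the SOAP 1.1 envelope namespace; the request element getPlanePrice and its model child are hypothetical names invented for illustration and do not belong to any real service.

```python
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 Envelope and Body elements.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
ET.register_namespace("soap", SOAP_ENV)

# The root element of every SOAP document is Envelope.
envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")

# Hypothetical remote procedure call carried in the body.
request = ET.SubElement(body, "getPlanePrice")
ET.SubElement(request, "model").text = "Skyhawk"

message = ET.tostring(envelope, encoding="unicode")
print(message)
```

In practice such a message would be sent to the service provider over HTTP, and the operation names and parameters would come from the provider's WSDL description rather than being made up as they are here.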


XSLT Style Sheets
The eXtensible Stylesheet Language (XSL) is a family of recommendations for
defining the presentation and transformations of XML documents. It consists
of three related standards: XSL Transformations (XSLT), XML Path Language
(XPath), and XSL Formatting Objects (XSL-FO). Each of these has an
importance and use of its own. Together, they provide a powerful means of
formatting XML
documents. Because XSL-FO is not yet widely used, it is not discussed in this
book.
XSLT style sheets are used to transform XML documents into different forms
or formats, perhaps using different DTDs. One common use for XSLT is to
transform XML documents into XHTML documents, primarily for display. In
the transformation of an XML document, the content of elements can be
moved, modified, sorted, and converted to attribute values, among other things.
XSLT style sheets are XML documents, so they can be validated against
DTDs. They can even be transformed with the use of other XSLT style sheets.
The XSLT standard is given at http://www.w3.org/TR/xslt. XSLT style sheets
and their uses are the primary topics of this section.
XPath is a language for expressions, which are often used to identify parts of
XML documents, such as specific elements that are in specific positions in the
document or elements that have particular attribute values. XSLT requires such
expressions to specify transformations. XPath is also used for XML document
querying languages, such as XQL, and to build new XML document structures
with XPointer. The XPath standard is given at http://www.w3.org/TR/xpath.
This chapter uses simple XPath expressions in the discussion of XSLT and
does not explore them further.
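
A few simple XPath expressions can be tried out with Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath. The planes document below is a hypothetical example, not one taken from this chapter. The sketch selects elements by position in the tree, by attribute value, and by index.

```python
import xml.etree.ElementTree as ET

# Hypothetical input document.
XML = """<planes>
  <plane year="1977"><make>Cessna</make><model>Skyhawk</model></plane>
  <plane year="1975"><make>Piper</make><model>Apache</model></plane>
  <plane year="1977"><make>Piper</make><model>Cherokee</model></plane>
</planes>"""

root = ET.fromstring(XML)

# Every <make> element that is a child of a <plane> child of the root.
makes = [e.text for e in root.findall("plane/make")]

# The <model> of each <plane> whose year attribute has the value 1977.
models_1977 = [e.text for e in root.findall("plane[@year='1977']/model")]

# The second <plane> element (XPath positions start at 1, not 0).
second = root.find("plane[2]")

print(makes)
print(models_1977)
print(second.get("year"))
```

An XSLT processor uses expressions of the same kind in the match and select attributes of its elements.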
Overview of XSLT
XSLT is actually a simple functional-style programming language. Included in
XSLT are functions, parameters, names to which values can be bound, selection
constructs, and conditional expressions for multiple selection. The syntactic
structure of XSLT is XML, so each statement is specified with an element. This
approach makes XSLT documents appear very different from programs in a typical
imperative programming language, but not completely different from programs
written in the LISP-based functional languages COMMON LISP and Scheme.
XSLT processors take both an XML document and an XSLT document as input.
The XSLT document is the program to be executed; the XML document is the
input data to the program. Parts of the XML document are selected, possibly
modified, and merged with parts of the XSLT document to form a new document,
which is sometimes called an XSL document. Note that the XSL document is also
an XML document, which could be again the input to an XSLT processor.
The output document can be stored for future use by applications, or it may be
immediately displayed by an application, often a browser. Neither the XSLT
document nor the input XML document is changed by the XSLT processor.
The transformation process used by an XSLT processor is shown in Figure 7.5.

Figure 7.5 XSLT processing

An XSLT document consists primarily of one or more templates, which use XPath
to describe element–attribute patterns in the input XML document. Each template
has associated with it a section of XSLT “code,” which is “executed” when a
match to the template is found in the XML document. So, each template
describes a function that is executed whenever the XSLT processor finds a match
to the template’s pattern.
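
Because an XSLT document is itself an XML document, any XML parser can read it. The following sketch assumes a small hypothetical style sheet for a planes document and uses Python's standard xml.etree.ElementTree module to list the pattern of each template. It does not perform a transformation (Python's standard library has no XSLT processor); it only shows that a style sheet is a collection of templates, each with its own match pattern.

```python
import xml.etree.ElementTree as ET

# Namespace that identifies XSLT elements.
XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# Hypothetical style sheet with two templates: one for the document root
# and one for each <plane> element.
STYLESHEET = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html><body><xsl:apply-templates select="planes/plane"/></body></html>
  </xsl:template>
  <xsl:template match="plane">
    <p><xsl:value-of select="make"/> <xsl:value-of select="model"/></p>
  </xsl:template>
</xsl:stylesheet>"""

root = ET.fromstring(STYLESHEET)

# Collect the XPath pattern of every template in the style sheet.
patterns = [t.get("match") for t in root.findall(f"{{{XSL_NS}}}template")]
print(patterns)
```

When an XSLT processor finds a node that matches one of these patterns, it executes the content of the corresponding template.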
An XSLT processor sequentially examines the input XML document, searching for
parts that match one of the templates in the XSLT document. XML documents
consist of nodes—elements, attributes, comments, text, and processing
instructions. If a template matches an element, the element is not processed until
the
closing tag is found. When a template matches an element, the child elements of
that element may or may not be processed.
One XSLT model of processing XML data is called the template-driven model,
which works well when the data consists of multiple instances of highly regular
data collections, as with files containing records. XSLT can also deal with irregular
and recursive data, using template fragments in what is called the data-driven
model. A single XSLT style sheet can include the mechanisms for both the
template- and data-driven models. The discussion of XSLT in this chapter is
restricted to the template-driven model.
To keep the complexity of the discussion manageable, the focus is on
transformations that are related to presentation. The examples in this section were
processed with the XSLT processor that is part of IE8.

XML Processors
So far in this chapter, we have discussed the structure of XML documents, the
rules for writing them, the DTD and XML Schema approaches to specifying the
particular tag sets and structure of collections of XML documents, and the CSS
and XSLT methods of displaying the contents of XML documents. That is
tantamount to telling a long story about how data can be stored and displayed,
without providing any hint on how it may be processed. Although we do not
discuss processing data stored in XML documents in this section, we do introduce
approaches to making that data conveniently available to application programs that
process the data.

The Purposes of XML Processors


Several purposes of XML processors have already been discussed. First, the
processor must check the basic syntax of the document for well-formedness.
Second, the processor must replace all references to entities in an XML document
with their definitions. Third, attributes in DTDs and elements in XML schemas can
specify that
their values in an XML document have default values, which must be copied into
the XML document during processing. Fourth, when a DTD or an XML schema is
specified and the processor includes a validating parser, the structure of the XML
document must be checked to ensure that it is legitimate.
One simple way to check the well-formedness of an XML document is with a
browser that has an XML parser. Information about Microsoft’s MSXML XML
parser (part of IE8), which checks for well-formedness and validation against
either DTDs or XML schemas, is available at
http://msdn2.microsoft.com/enUS/xml/bb291077.aspx.
Information on the XML parsers in other browsers can be found at
http://www.w3.org/XML/Schema.
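
Well-formedness can also be checked from a program. The sketch below uses the non-validating parser in Python's standard xml.etree.ElementTree module, so it checks syntax only, not validity against a DTD or XML schema. The function name is_well_formed is an illustrative choice.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # The parser rejects the document at the first syntax error.
        return False


# Properly nested tags.
print(is_well_formed("<a><b>text</b></a>"))
# Closing tags in the wrong order.
print(is_well_formed("<a><b>text</a></b>"))
```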
Although an XML document exhibits a regular and elegant structure, that structure
does not provide applications with convenient access to the document’s data. It
was recognized early on that, because the process of the initial syntactic analysis
required to expose the embedded data must be repeated for every application that
processes XML documents, standard syntax analyzers for XML documents were
needed. Actually, the syntax analyzers themselves need not be standard; rather,
they should expose the data of XML documents in a standard application
programmer interface (API). This need led to the development of two different
standard APIs for
XML processors. Because there are different needs and uses of XML applications,
having two standards is not a negative. The two APIs parallel the two kinds of
output that are produced by the syntax analyzers of compilers for programming
languages. Some of these syntax analyzers produce a stream of the syntactic
structures
of an input program. Others produce a parse tree of the input program that shows
the hierarchical structure of the program in terms of its syntactic structures.

The SAX Approach


The Simple API for XML (SAX) standard, which was released in May 1998, was
developed by an XML users group, XML-DEV. Although not developed or
supported by any standards organization, SAX has been widely accepted as a de
facto standard and is now widely supported by XML processors.
The SAX approach to processing is called event processing. The processor scans
the XML document from beginning to end. Every time a syntactic structure of the
document is recognized, the processor signals an event to the application by calling
an event handler for the particular structure that was found. The syntactic
structures of interest naturally include opening tags, attributes, text, and closing
tags. The interfaces that describe the event handlers form the SAX API.
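
Event processing can be illustrated with the SAX support in Python's standard library (the xml.sax module). In this sketch, the EventLogger class and the one-plane input document are hypothetical; the handler method names startElement, characters, and endElement are those of Python's ContentHandler interface. The parser calls the handlers as it scans the document, and the application simply records each event.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Record every event the parser signals, in document order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Called for each opening tag; attrs holds the tag's attributes.
        self.events.append(("start", name, dict(attrs.items())))

    def characters(self, content):
        # Called for text; whitespace-only text is ignored here.
        if content.strip():
            self.events.append(("text", content.strip()))

    def endElement(self, name):
        # Called for each closing tag.
        self.events.append(("end", name))


XML = b'<plane year="1977"><make>Cessna</make></plane>'
handler = EventLogger()
xml.sax.parseString(XML, handler)
for event in handler.events:
    print(event)
```

Note that the application never sees the whole document at once; it sees only one event at a time, and must keep any state it needs itself.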

The DOM Approach


The natural alternative to the SAX approach to XML document parsing is to build
a hierarchical syntactic structure of the document. Given the use of DOM
representations of XHTML documents to create dynamic documents in Chapter 6,
“Dynamic Documents with JavaScript,” this is a familiar idea. In the case of
XHTML, the browser parses the document and builds the DOM tree. In the case of
XML, the parser part of the XML processor builds the DOM tree. In both cases,
the nodes of the tree are represented as objects that can be accessed and processed
or modified by the application. When parsing is complete, the complete DOM
representation of the document is in memory and can be accessed in a number of
different ways, including tree traversals of various kinds as well as random
accesses.
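
The DOM approach can be sketched with the xml.dom.minidom module of Python's standard library. The planes document and the helper function element_names are hypothetical. The sketch shows the three kinds of use mentioned above: random access to a node, a tree traversal, and modification of a node.

```python
from xml.dom import minidom

# Hypothetical input document.
XML = """<planes>
  <plane year="1977"><make>Cessna</make></plane>
  <plane year="1975"><make>Piper</make></plane>
</planes>"""

# The parser builds the complete DOM tree in memory.
doc = minidom.parseString(XML)

# Random access: go directly to the second <plane> and read its <make>.
planes = doc.getElementsByTagName("plane")
second_make = planes[1].getElementsByTagName("make")[0].firstChild.data


def element_names(node, depth=0):
    """Depth-first traversal returning the indented element names."""
    names = []
    if node.nodeType == node.ELEMENT_NODE:
        names.append("  " * depth + node.tagName)
    for child in node.childNodes:
        names.extend(element_names(child, depth + 1))
    return names


# Modification: the nodes are objects, so they can be changed in place.
planes[0].setAttribute("year", "1978")

print(second_make)
print("\n".join(element_names(doc.documentElement)))
print(planes[0].getAttribute("year"))
```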
The DOM representation of an XML document has several advantages over the
sequential listing provided by SAX parsers. First, it has an obvious advantage if
any part of the document must be accessed more than once by the application.
Second, if the application must perform any rearrangement of the elements of the
document, that can most easily be done if the whole document is accessible at the
same time. Third, accesses to random parts of the document are possible.
Finally, because the parser sees the whole document before any processing takes
place, this approach avoids any processing of a document that is later found to be
invalid (according to a DTD or XML schema).
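
The second advantage, rearrangement, is easy to see in a small sketch. Using Python's xml.dom.minidom module and a hypothetical planes document, the following code sorts the plane elements by year. It relies on the whole tree being in memory: every plane element must be available before the correct order can be determined.

```python
from xml.dom import minidom

# Hypothetical input document with the planes out of order.
doc = minidom.parseString(
    '<planes><plane year="1977"/><plane year="1960"/>'
    '<plane year="1975"/></planes>'
)
root = doc.documentElement

# Collect the element children, then re-attach them in sorted order.
planes = [n for n in root.childNodes if n.nodeType == n.ELEMENT_NODE]
for plane in sorted(planes, key=lambda p: int(p.getAttribute("year"))):
    # appendChild moves a node that is already in the tree to the end.
    root.appendChild(plane)

years = [p.getAttribute("year") for p in root.getElementsByTagName("plane")]
print(years)
print(root.toxml())
```

With a SAX parser, the application would have to collect and store the elements itself before it could sort them.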
In some situations, the SAX approach has advantages over the DOM method. The
DOM structure is stored entirely in memory, so large documents require a great
deal of memory. In fact, because there is no limit on the size of an XML document,
some documents cannot be parsed with the DOM method. This is not a problem
with the SAX approach. Another advantage of the SAX method is speed: It is
faster than the DOM approach.
The process of building the DOM structure of an XML document requires some
syntactic analysis of the document, similar to that done by SAX parsers. In fact,
most DOM parsers include a SAX parser as a front end.
