HTML Dom

The document discusses the Document Object Model (DOM), which is a W3C specification that defines a programming interface for XML and HTML documents. It allows programs and scripts to dynamically access and update the content, structure and style of documents. The summary provides an overview of DOM levels 1-3, core interfaces like Node, Document and Element, and how to navigate and manipulate DOM objects in different languages. It also gives an example of using the DOM API in Java to build an XML document by creating elements, setting attributes, and saving the file.
SDPL 2002 Notes 3.2: Document Object Model 1


3.2 Document Object Model (DOM)
How to provide uniform access to structured
documents in diverse applications (parsers,
browsers, editors, databases)?
Overview of W3C DOM Specification
second one in the XML-family of
recommendations
Level 1, W3C Rec, Oct. 1998
Level 2, W3C Rec, Nov. 2000
Level 3, W3C Working Draft (January 2002)
What does DOM specify, and how to use it?
DOM: What is it?
An object-based, language-neutral API for
XML and HTML documents
allows programs and scripts to build documents,
navigate their structure, add, modify or delete
elements and content
Provides a foundation for developing
querying, filtering,
transformation, rendering etc.
applications on top of DOM implementations
In contrast to "Serial Access XML" (SAX), DOM could be thought of
as "Directly Obtainable in Memory"
DOM structure model
Based on O-O concepts:
methods (to access or change an object's state)
interfaces (declaration of a set of methods)
objects (encapsulation of data and methods)
Roughly similar to the XSLT/XPath data
model (to be discussed later)
a parse tree
Tree-like structure implied by the abstract relationships
defined by the programming interfaces;
Does not necessarily reflect data structures used by an
implementation (but probably does)

DOM structure model
[Figure: the invoice document below drawn as a DOM tree; the legend distinguishes Document, Element, NamedNodeMap and Text nodes]
<invoice>
  <invoicepage form="00"
               type="estimatedbill">
    <addressee>
      <addressdata>
        <name>Leila Laskuprintti</name>
        <address>
          <streetaddress>Pyynpolku 1</streetaddress>
          <postoffice>70460 KUOPIO</postoffice>
        </address>
      </addressdata>
    </addressee> ...
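To make the tree view concrete, here is a minimal Java sketch that parses a shortened, well-formed copy of the invoice fragment and prints one line per DOM node. The class name InvoiceTreeDemo, the helper names and the shortened sample string are assumptions made for illustration; only the org.w3c.dom and JAXP calls are standard.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class InvoiceTreeDemo {
    // Shortened copy of the invoice fragment, closed off so that it is well-formed
    // (an assumption: the slide elides the rest of the document).
    static final String SAMPLE =
        "<invoice><invoicepage form=\"00\" type=\"estimatedbill\">"
        + "<addressee><addressdata><name>Leila Laskuprintti</name>"
        + "</addressdata></addressee></invoicepage></invoice>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Appends one line per node, indented by its depth in the tree.
    static void outline(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) out.append("  ");
        out.append(node.getNodeName());
        NamedNodeMap attrs = node.getAttributes(); // null unless node is an Element
        if (attrs != null && attrs.getLength() > 0) {
            out.append(" [").append(attrs.getLength()).append(" attributes]");
        }
        if (node.getNodeType() == Node.TEXT_NODE) {
            out.append(" \"").append(node.getNodeValue()).append('"');
        }
        out.append('\n');
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            outline(c, depth + 1, out);
        }
    }

    static String outline(String xml) {
        StringBuilder out = new StringBuilder();
        outline(parse(xml), 0, out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(outline(SAMPLE));
    }
}
```

The attributes are reported only as a count because they live in a NamedNodeMap beside the element, not among its children, and their order is not defined.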
Structure of DOM Level 1
I: DOM Core Interfaces
Fundamental interfaces
basic interfaces to structured documents
Extended interfaces
XML specific: CDATASection, DocumentType,
Notation, Entity, EntityReference,
ProcessingInstruction
II: DOM HTML Interfaces
more convenient to access HTML documents
(we ignore these)
DOM Level 2
Level 1: basic representation and manipulation of
document structure and content
(No access to the contents of a DTD)
DOM Level 2 adds
support for namespaces
accessing elements by ID attribute values
optional features
interfaces to document views and style sheets
an event model (for, say, user actions on elements)
methods for traversing the document tree and manipulating
regions of document (e.g., selected by the user of an editor)
Loading and writing of docs not specified (-> Level 3)
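As an illustration of the Level 2 namespace support, the following Java sketch creates one element in a namespace and one without, then looks them up with the namespace-aware and the plain methods. The class name NamespaceDemo, the namespace URI and the prefix "inv" are hypothetical; createElementNS and getElementsByTagNameNS are the standard Level 2 methods.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class NamespaceDemo {
    // Hypothetical namespace URI, used only for this sketch.
    static final String INV_NS = "http://example.org/invoice";

    static Document build() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();
            Element root = doc.createElementNS(INV_NS, "inv:invoice");
            doc.appendChild(root);
            // Same local name twice: once in the namespace, once in no namespace.
            root.appendChild(doc.createElementNS(INV_NS, "inv:name"));
            root.appendChild(doc.createElementNS(null, "name"));
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Summarises the namespace-aware lookup and the plain tag-name lookup.
    static String demo() {
        Document doc = build();
        NodeList qualified = doc.getElementsByTagNameNS(INV_NS, "name");
        Element e = (Element) qualified.item(0);
        int plain = doc.getElementsByTagName("name").getLength();
        return qualified.getLength() + " " + e.getTagName() + " "
                + e.getLocalName() + " " + e.getPrefix() + " " + plain;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The namespace-aware lookup matches on namespace URI plus local name, while getElementsByTagName compares the full tag name, so each call finds exactly one of the two elements.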
DOM Language Bindings
Language-independence:
DOM interfaces are defined using the OMG Interface
Definition Language (IDL; defined in the CORBA
specification)
Language bindings (implementations of DOM
interfaces) defined in the Recommendation
for
Java and
ECMAScript (standardised JavaScript)
Core Interfaces: Node & its variants
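Node
  Document
  DocumentFragment
  Element
  Attr
  CharacterData
    Text
      CDATASection (extended interface)
    Comment
  DocumentType (extended interface)
  Notation (extended interface)
  Entity (extended interface)
  EntityReference (extended interface)
  ProcessingInstruction (extended interface)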
DOM interfaces: Node

[Figure: DOM tree of the invoice example]
Node
getNodeType
getNodeValue
getOwnerDocument
getParentNode
hasChildNodes getChildNodes
getFirstChild
getLastChild
getPreviousSibling
getNextSibling
hasAttributes getAttributes
appendChild(newChild)
insertBefore(newChild,refChild)
replaceChild(newChild,oldChild)
removeChild(oldChild)

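The navigation and update methods of Node can be tried out with a small Java sketch like the one below. The class name NodeEditDemo and the element names "city" and "country" are made up for the example; the address fragment is a whitespace-free variant of the invoice data, so that no text nodes appear between the elements.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class NodeEditDemo {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Names of the children of parent in document order, joined by commas.
    static String childNames(Node parent) {
        StringBuilder sb = new StringBuilder();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (sb.length() > 0) sb.append(',');
            sb.append(c.getNodeName());
        }
        return sb.toString();
    }

    static String demo() {
        Document doc = parse("<address><streetaddress/><postoffice/></address>");
        Node address = doc.getDocumentElement();
        Node street = address.getFirstChild();
        Node post = address.getLastChild();

        address.insertBefore(doc.createElement("name"), street); // name,streetaddress,postoffice
        address.replaceChild(doc.createElement("city"), post);   // name,streetaddress,city
        address.removeChild(street);                             // name,city
        address.appendChild(doc.createElement("country"));       // name,city,country
        return childNames(address);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```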
Object Creation in DOM
Each DOM object X lives in the context of a
Document: X.getOwnerDocument()
Objects implementing interface X are created
by factory methods
D.createX() ,
where D is a Document object. E.g:
createElement("A"),
createAttribute("href"),
createTextNode("Hello!")
Creation and persistent saving of Documents
left to be specified by implementations
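A minimal Java sketch of the factory-method pattern, using the three calls named above. The class name FactoryDemo and the URL are assumptions for illustration. Obtaining the Document itself goes through JAXP here, since DOM Level 1 and 2 leave that step to the implementation.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class FactoryDemo {
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    static String demo() {
        Document doc = newDocument();
        Element a = doc.createElement("A");
        Attr href = doc.createAttribute("href");
        href.setValue("http://example.org/"); // hypothetical URL
        a.setAttributeNode(href);
        a.appendChild(doc.createTextNode("Hello!"));

        // A created node already belongs to its Document,
        // but has no parent until it is inserted into the tree.
        boolean ownedByDoc = a.getOwnerDocument() == doc;
        boolean detached = a.getParentNode() == null;
        return a.getTagName() + " " + a.getAttribute("href") + " "
                + a.getFirstChild().getNodeValue() + " " + ownedByDoc + " " + detached;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```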

DOM interfaces: Document
[Figure: DOM tree of the invoice example]
Document (extends Node)
getDocumentElement
createAttribute(name)
createElement(tagName)
createTextNode(data)
getDoctype()
getElementById(IdVal)
DOM interfaces: Element

[Figure: DOM tree of the invoice example]
Element (extends Node)
getTagName
getAttributeNode(name)
setAttributeNode(attr)
removeAttribute(name)
getElementsByTagName(name)
hasAttribute(name)
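The Element methods listed above could be exercised roughly as follows. This is a sketch only: the class name ElementDemo, the attribute "lang" and the input string (a simplified invoicepage element with two name descendants) are assumptions for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class ElementDemo {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static String demo() {
        Document doc = parse(
            "<invoicepage form=\"00\" type=\"estimatedbill\">"
            + "<name>A</name><addressee><name>B</name></addressee></invoicepage>");
        Element page = doc.getDocumentElement();
        StringBuilder sb = new StringBuilder(page.getTagName());

        // Attributes are nodes of their own (Attr), reachable by name.
        Attr form = page.getAttributeNode("form");
        sb.append(' ').append(form.getName()).append('=').append(form.getValue());

        Attr lang = doc.createAttribute("lang"); // hypothetical attribute
        lang.setValue("fi");
        page.setAttributeNode(lang);
        page.removeAttribute("type");
        sb.append(" lang=").append(page.hasAttribute("lang"));
        sb.append(" type=").append(page.hasAttribute("type"));

        // getElementsByTagName searches all descendants, not only children.
        sb.append(" names=").append(page.getElementsByTagName("name").getLength());
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```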
Accessing properties of a Node
Node.getNodeName()
for an Element = getTagName()
for an Attr: the name of the attribute
for Text = "#text" etc.
Node.getNodeValue()
content of a text node, value of an attribute, ...;
null for an Element (!!)
(in XSLT/XPath: the full textual content)
Node.getNodeType(): numeric constants
(1, 2, 3, ..., 12) for ELEMENT_NODE,
ATTRIBUTE_NODE, TEXT_NODE, ...,
NOTATION_NODE
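The three property accessors can be compared across node kinds with a sketch like this one. The class name NodePropsDemo and the sample element are assumptions. getTextContent() is a DOM Level 3 addition, shown only because it is the closest counterpart to the XPath string value mentioned above.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class NodePropsDemo {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // "type name value" for a single node.
    static String describe(Node n) {
        return n.getNodeType() + " " + n.getNodeName() + " " + n.getNodeValue();
    }

    static String demo() {
        Document doc = parse("<name id=\"n1\">Leila</name>");
        Element name = doc.getDocumentElement();
        return describe(name)                              // 1 name null
                + "|" + describe(name.getAttributeNode("id")) // 2 id n1
                + "|" + describe(name.getFirstChild())        // 3 #text Leila
                + "|" + describe(doc)                         // 9 #document null
                + "|" + name.getTextContent();                // Leila (DOM Level 3)
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```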
Content and element manipulation
Manipulating CharacterData D:
D.substringData(offset, count)
D.appendData(string)
D.insertData(offset, string)
D.deleteData(offset, count)
D.replaceData(offset, count, string)
(= delete + insert)
Accessing attributes of an Element object E:
E.getAttribute(name)
E.setAttribute(name, value)
E.removeAttribute(name)
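A short Java sketch of these calls, with the intermediate text content given in comments. The class name CharDataDemo and the sample strings are assumptions for illustration. Offsets count characters from 0.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Text;

public class CharDataDemo {
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Text is a CharacterData, so all the editing methods apply to it.
    static String editText() {
        Text t = newDocument().createTextNode("Pyynpolku 1");
        String street = t.substringData(0, 9); // "Pyynpolku" (the node is unchanged)
        t.appendData(" A");                    // "Pyynpolku 1 A"
        t.insertData(0, "c/o ");               // "c/o Pyynpolku 1 A"
        t.deleteData(0, 4);                    // "Pyynpolku 1 A"
        t.replaceData(10, 1, "2");             // "Pyynpolku 2 A"
        return street + "|" + t.getData();
    }

    static String editAttributes() {
        Element e = newDocument().createElement("invoicepage");
        e.setAttribute("form", "00");
        String set = e.getAttribute("form");     // "00"
        e.removeAttribute("form");
        String removed = e.getAttribute("form"); // "" (empty string, not null)
        return set + "|" + removed + "|" + e.hasAttribute("form");
    }

    public static void main(String[] args) {
        System.out.println(editText());
        System.out.println(editAttributes());
    }
}
```

Because getAttribute returns an empty string for a missing attribute, hasAttribute is the reliable way to tell "absent" from "present but empty".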
Additional Core Interfaces (1)
NodeList for ordered lists of nodes
e.g. from Node.getChildNodes() or
Element.getElementsByTagName("name")
all descendant elements of type "name" in document
order (wild-card "*" matches any element type)
Accessing a specific node, or iterating over all
nodes of a NodeList:
E.g. Java code to process all children:
NodeList children = node.getChildNodes();
for (int i = 0; i < children.getLength(); i++)
    process(children.item(i));
Additional Core Interfaces (2)
NamedNodeMap for unordered sets of nodes
accessed by their name:
e.g. from Node.getAttributes()
NodeLists and NamedNodeMaps are "live":
changes to the document structure reflected to
their contents
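The "live" behaviour is easy to overlook, so here is a small Java sketch of it. The class name LiveListDemo and the db/person sample are assumptions for illustration. The same NodeList object reports a new length after the tree changes, which also means that deleting nodes while indexing forward through a list would skip entries.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class LiveListDemo {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns {length before append, length after append, length after removing all}.
    static int[] demo() {
        Document doc = parse("<db><person/><person/></db>");
        Element root = doc.getDocumentElement();
        NodeList persons = root.getElementsByTagName("person");

        int before = persons.getLength();              // 2
        root.appendChild(doc.createElement("person"));
        int after = persons.getLength();               // 3, without asking for a new list

        // Safe removal with a live list: always take the first remaining item.
        NodeList children = root.getChildNodes();
        while (children.getLength() > 0) {
            root.removeChild(children.item(0));
        }
        return new int[] { before, after, persons.getLength() };
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```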
DOM: Implementations
Java-based parsers
e.g. IBM XML4J, Apache Xerces, Apache
Crimson
MS IE5 browser: COM programming interfaces for
C/C++ and MS Visual Basic, ActiveX object
programming interfaces for script languages
XML::DOM (Perl implementation of DOM Level 1)
Others? Non-parser-implementations?
(Participation of vendors of different kinds of systems
in DOM WG has been active.)
A Java-DOM Example
A stand-alone toy application BuildXml
either creates a new db document with two
person elements, or adds them to an existing db
document
based on the example in Sect. 8.6 of Deitel et al:
XML - How to program
Technical basis
DOM support in Sun JAXP
native XML document initialisation and storage
methods of the JAXP 1.1 default parser (Apache
Crimson)
Code of BuildXml (1)
Begin by importing necessary packages:

import java.io.*;
import org.w3c.dom.*;
import org.xml.sax.*;
import javax.xml.parsers.*;
// Native (parse and write) methods of the
// JAXP 1.1 default parser (Apache Crimson):
import org.apache.crimson.tree.XmlDocument;
Code of BuildXml (2)
Class for modifying the document in file fileName:

public class BuildXml {
    private Document document;

    public BuildXml(String fileName) {
        File docFile = new File(fileName);
        Element root = null; // doc root element
        // Obtain a factory for (SAX-based) DOM parsers:
        DocumentBuilderFactory factory =
            DocumentBuilderFactory.newInstance();

Code of BuildXml (3)
        try { // to get a new DocumentBuilder:
            DocumentBuilder builder =
                factory.newDocumentBuilder();

            if (!docFile.exists()) { // create new doc
                document = builder.newDocument();
                // add a comment:
                Comment comment =
                    document.createComment(
                        "A simple personnel list");
                document.appendChild(comment);
                // Create the root element:
                root = document.createElement("db");
                document.appendChild(root);
Code of BuildXml (4)
or if docFile already exists:

            } else { // access an existing doc
                try { // to parse docFile
                    document = builder.parse(docFile);
                    root = document.getDocumentElement();
                } catch (SAXException se) {
                    System.err.println("Error: " +
                        se.getMessage());
                    System.exit(1);
                }
                /* A similar catch for a possible IOException */
Code of BuildXml (5)
Create and add two child elements to root:

            Node personNode =
                createPersonNode(document, "1234",
                    "Pekka", "Kilpeläinen");
            root.appendChild(personNode);
            personNode =
                createPersonNode(document, "5678",
                    "Irma", "Könönen");
            root.appendChild(personNode);
Code of BuildXml (6)
Finally, store the result document:

            try { // to write the
                  // XML document to file fileName
                ((XmlDocument) document).write(
                    new FileOutputStream(fileName));
            } catch (IOException ioe) {
                ioe.printStackTrace();
            }
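The write() call above is specific to Apache Crimson's XmlDocument class. As an alternative that is not part of the original BuildXml example, the following sketch serialises a DOM tree with the identity transformer of the standard javax.xml.transform API. The class name SaveWithTransformer and its helper methods are assumptions made for illustration.

```java
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class SaveWithTransformer {
    // Copies the DOM tree to the writer using an identity transformation.
    static void write(Document doc, Writer out) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            t.transform(new DOMSource(doc), new StreamResult(out));
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds a tiny db document like the one in BuildXml and serialises it.
    static String demo() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("db");
            doc.appendChild(root);
            Element person = doc.createElement("person");
            person.setAttribute("idnum", "1234");
            root.appendChild(person);

            StringWriter sw = new StringWriter();
            write(doc, sw);
            return sw.toString();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

To store the result in a file instead, a StreamResult can be constructed from a java.io.File or an OutputStream.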
Subroutine to create person elements
    public Node createPersonNode(Document document,
            String idNum, String fName, String lName) {
        Element person =
            document.createElement("person");
        person.setAttribute("idnum", idNum);
        Element firstName =
            document.createElement("first");
        person.appendChild(firstName);
        firstName.appendChild(
            document.createTextNode(fName));
        /* similarly for a lastName */
        return person;
    }
The main routine for BuildXml
    public static void main(String args[]) {
        if (args.length > 0) {
            String fileName = args[0];
            BuildXml buildXml = new BuildXml(fileName);
        } else {
            System.err.println(
                "Give filename as argument");
        }
    } // main

Summary of XML APIs
XML processors make the structure and
contents of XML documents available to
applications through APIs
Event-based APIs
notify the application through parsing events
e.g., the SAX call-back interfaces
Object-model (or tree) based APIs
provide a full parse tree
e.g., DOM, a W3C Recommendation
more convenient, but may require too many
resources for very large documents
Major parsers support both SAX and DOM
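The difference between the two API styles can be sketched by counting elements both ways in Java. The class name SaxVsDomDemo and the sample document are assumptions for illustration. The SAX version reacts to startElement events as they arrive and keeps only a counter, while the DOM version builds the whole tree first and then queries it.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxVsDomDemo {
    static final String XML =
        "<db><person><first>Pekka</first></person>"
        + "<person><first>Irma</first></person></db>";

    // Event-based: the parser calls back for each start tag; no tree is kept.
    static int countWithSax(String xml, String tag) {
        final int[] count = {0};
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes atts) {
                        if (qName.equals(tag)) count[0]++;
                    }
                });
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return count[0];
    }

    // Tree-based: parse everything into memory, then query the tree.
    static int countWithDom(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(tag).getLength();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countWithSax(XML, "person") + " " + countWithDom(XML, "person"));
    }
}
```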
