What Is The Document Object Model
Editors
Jonathan Robie, Texcel Research
Introduction
The Document Object Model (DOM) is a programming API for HTML and XML
documents. It defines the logical structure of documents and the way a document is
accessed and manipulated. In the DOM specification, the term "document" is used in
the broad sense - increasingly, XML is being used as a way of representing many
different kinds of information that may be stored in diverse systems, and much of this
would traditionally be seen as data rather than as documents. Nevertheless, XML
presents this data as documents, and the DOM may be used to manage this data.
With the Document Object Model, programmers can create and build documents,
navigate their structure, and add, modify, or delete elements and content. Anything
found in an HTML or XML document can be accessed, changed, deleted, or added
using the Document Object Model, with a few exceptions - in particular, the DOM
interfaces for the internal subset and external subset have not yet been specified.
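The operations named above (create, navigate, add, modify, delete) can be sketched with any DOM implementation. The example below is a minimal illustration that assumes Python's standard `xml.dom.minidom` module rather than the Java or ECMAScript bindings; the element name `inventory`, the attribute `sku`, and the text values are invented for the illustration.

```python
from xml.dom.minidom import getDOMImplementation

# Create: an empty document whose root element is <inventory> (hypothetical name).
impl = getDOMImplementation()
doc = impl.createDocument(None, "inventory", None)
root = doc.documentElement

# Add: build an <item> element with an attribute and text content, then attach it.
item = doc.createElement("item")
item.setAttribute("sku", "A-100")
item.appendChild(doc.createTextNode("widget"))
root.appendChild(item)

# Navigate: reach the new element through the document structure.
first = root.firstChild

# Modify: change the attribute value and the text content in place.
first.setAttribute("sku", "A-200")
first.firstChild.data = "gadget"
modified = root.toxml()

# Delete: detach the element from its parent.
root.removeChild(first)
emptied = root.toxml()
```

Every structure here is created through the document object (`createElement`, `createTextNode`) and never by calling a constructor directly.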
As a W3C specification, one important objective for the Document Object Model is to
provide a standard programming interface that can be used in a wide variety of
environments and applications. The Document Object Model can be used with any
programming language. In order to provide a precise, language-independent
specification of the Document Object Model interfaces, we have chosen to define the
specifications in OMG IDL, as defined in the CORBA 2.2 specification. In addition to
the OMG IDL specification, we provide language bindings for Java and ECMAScript
(an industry-standard scripting language based on JavaScript and
JScript). Note: OMG IDL is used only as a language-independent and
implementation-neutral way to specify interfaces. Various other IDLs could have been
used; the use of OMG IDL does not imply a requirement to use a specific object
binding runtime.
In the Document Object Model, documents have a logical structure which is very
much like a tree; to be more precise, it is like a "forest" or "grove" which can contain
more than one tree. However, the Document Object Model does not specify that
documents be implemented as a tree or a grove, nor does it specify how the
relationships among objects are to be implemented. In other words, the object
model specifies the logical model for the programming interface, and this logical
model may be implemented in any way that a particular implementation finds
convenient. In this specification, we use the term structure model to describe the
tree-like representation of a document; we specifically avoid terms like "tree" or
"grove" in order to avoid implying a particular implementation. One important property of DOM
structure models is structural isomorphism: if any two Document Object Model
implementations are used to create a representation of the same document, they will
create the same structure model, with precisely the same objects and relationships.
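As a rough illustration of a structure model, the sketch below walks a small document and records one line per node, indented by depth. It assumes Python's `xml.dom.minidom`; the helper `outline` and the sample markup are hypothetical. It then builds the same document through the API and compares the two outlines. This uses a single implementation, so it only hints at structural isomorphism, which the text defines across different implementations.

```python
from xml.dom import Node
from xml.dom.minidom import getDOMImplementation, parseString


def outline(node, depth=0):
    """Return one line per node, indented by its depth in the structure model."""
    if node.nodeType == Node.ELEMENT_NODE:
        label = "element " + node.tagName
    elif node.nodeType == Node.TEXT_NODE:
        label = "text " + repr(node.data)
    else:
        label = node.nodeName
    lines = ["  " * depth + label]
    for child in node.childNodes:
        lines.extend(outline(child, depth + 1))
    return lines


# The same logical document, obtained in two different ways.
parsed = parseString("<table><row><cell>A</cell><cell>B</cell></row></table>")

built = getDOMImplementation().createDocument(None, "table", None)
row = built.documentElement.appendChild(built.createElement("row"))
for text in ("A", "B"):
    cell = row.appendChild(built.createElement("cell"))
    cell.appendChild(built.createTextNode(text))

parsed_outline = outline(parsed.documentElement)
built_outline = outline(built.documentElement)
```

A conformance test for two independent implementations might compare outlines of this kind, since both are expected to expose the same objects and relationships for the same document.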
The name "Document Object Model" was chosen because it is an "object model" in
the traditional object oriented design sense: documents are modeled using
objects, and the model encompasses not only the structure of a document, but also the
behavior of a document and the objects of which it is composed. In other words, the
nodes of the structure model do not represent a data structure; they represent objects,
which have functions and identity. As an object model, the Document Object Model
identifies:
The Document Object Model currently consists of two parts, DOM Core and DOM
HTML. The DOM Core represents the functionality used for XML documents, and
also serves as the basis for DOM HTML. All DOM implementations must support the
interfaces listed as "fundamental" in the Core specification; in addition, XML
implementations must support the interfaces listed as "extended" in the Core
specification. The Level 1 DOM HTML specification defines additional functionality
needed for HTML documents.
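An application can ask an implementation which parts of the DOM it supports through the `hasFeature(feature, version)` method of the Core `DOMImplementation` interface. The sketch below assumes Python's `xml.dom.minidom`, which implements the Core and XML interfaces but not DOM HTML; the variable names are illustrative only.

```python
from xml.dom.minidom import getDOMImplementation

impl = getDOMImplementation()

# Ask whether this implementation claims the Level 1 XML and HTML feature sets.
supports_xml = impl.hasFeature("XML", "1.0")
supports_html = impl.hasFeature("HTML", "1.0")
```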
1. Attributes defined in the IDL do not imply concrete objects which must have
specific data members - in the language bindings, they are translated to a pair
of get()/set() functions, not to a data member. (Read-only attributes have only a
get() function in the language bindings.)
2. DOM applications may provide additional interfaces and objects not found in
this specification and still be considered DOM compliant.
3. Because we specify interfaces and not the actual objects that are to be created,
the DOM cannot know what constructors to call for an implementation. In
general, DOM users call the createXXX() methods on the Document class to
create document structures, and DOM implementations create their own
internal representations of these structures in their implementations of the
createXXX() functions.
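Points 1 and 3 above can be sketched in miniature. The interfaces and classes below are hypothetical and are not part of the DOM; they only show, in Python, how an attribute can be exposed as accessor functions and how a factory method on the document hides the implementation's own representation.

```python
from abc import ABC, abstractmethod


class Element(ABC):
    """Hypothetical interface: read-only 'tagName', read-write 'title'."""

    @abstractmethod
    def get_tag_name(self): ...   # read-only attribute: getter only

    @abstractmethod
    def get_title(self): ...      # read-write attribute: getter ...

    @abstractmethod
    def set_title(self, value): ...   # ... and setter


class Document(ABC):
    """Hypothetical interface offering a factory method instead of constructors."""

    @abstractmethod
    def create_element(self, tag_name): ...


class _ListElement(Element):
    # Internal representation chosen by this implementation: a two-item list.
    def __init__(self, tag_name):
        self._record = [tag_name, ""]

    def get_tag_name(self):
        return self._record[0]

    def get_title(self):
        return self._record[1]

    def set_title(self, value):
        self._record[1] = value


class SimpleDocument(Document):
    def create_element(self, tag_name):
        # Only the implementation knows which concrete class to instantiate.
        return _ListElement(tag_name)


doc = SimpleDocument()
para = doc.create_element("p")
para.set_title("intro")
```

Callers depend only on `Element` and `Document`; another implementation could store its data quite differently and still satisfy the same interfaces.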
1. A structure model for the internal subset and the external subset.
2. Validation against a schema.
3. Control for rendering documents via stylesheets.
4. Access control.
5. Thread-safety.