DTD and Namespaces
(DTD)
Unnat-e 08/07/24
Definition
XML is static data. It has to be used by applications.
An application can work only with a specific format
of data. For example, an editor expects a specific
set of elements, say text, bold, italic etc. However, XML is
by nature extensible, and a non-validating parser only
checks that an XML document is well formed.
There is a need to constrain XML optionally, when
required.
It is easier to build applications that use XML if the
XML can easily be validated.
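A minimal sketch of the point above, in Python (the language and the sample documents are assumptions for illustration, not part of the original material). A non-validating parser such as the standard library's xml.etree.ElementTree accepts any well-formed document, even one containing an element no application expects, and rejects only documents that are not well formed.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML; no DTD is consulted."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Well formed, although shoe_size is a made-up element nobody asked for.
unexpected = "<person><shoe_size>9</shoe_size></person>"
# Not well formed: the closing tags are in the wrong order.
broken = "<person><name>Bill</person></name>"

print(is_well_formed(unexpected))
print(is_well_formed(broken))
```

The first document passes because well-formedness says nothing about which elements are allowed. That gap is what a DTD fills.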
Definition
DTDs allow us to validate XML. A DTD
lets us specify
– which elements are allowed in a document
– the relationships between the elements
(children of an element, one-to-one, one-to-
many relationships etc.)
– the attributes that an element can have
– the type of data that can be present in an attribute
(but not in an element)
A simple DTD example
Consider the following XML
<person>
  <name>
    <first_name>Bill</first_name>
    <last_name>Gates</last_name>
  </name>
  <profession>CEO</profession>
</person>
The DTD for this XML is given below
<!ELEMENT person (name,profession*)>
<!ELEMENT name (first_name,last_name)>
<!ELEMENT first_name (#PCDATA)>
<!ELEMENT last_name (#PCDATA)>
<!ELEMENT profession (#PCDATA)>
DTD, explained
Each line is an element declaration. The
first line says that a person element
consists of one name element followed by
zero or more profession elements. The second
line says that name consists of one first_name
followed by one last_name. The last three lines
say that first_name, last_name, and profession
contain PCDATA (parsed character data).
In other words, these elements contain text and no
child elements.
Referencing DTDs
Referencing an external DTD from the XML file
– <!DOCTYPE person SYSTEM "person.dtd">
– This means that the DTD for the document is person.dtd.
The system identifier is a relative URI, so the DTD is looked
up in the same directory/URL as the XML file.
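To see what a parser actually receives from such a declaration, the sketch below uses Python's standard xml.parsers.expat module (Python is an assumption here; it is not used elsewhere in this material). expat is non-validating and, as configured here, does not fetch person.dtd. It only reports the identifiers found in the DOCTYPE.

```python
import xml.parsers.expat


def doctype_of(xml_text):
    """Collect the name and identifiers from the document's DOCTYPE."""
    found = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        found["name"] = name
        found["system_id"] = system_id
        found["public_id"] = public_id
        found["has_internal_subset"] = bool(has_internal_subset)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)
    return found


info = doctype_of('<!DOCTYPE person SYSTEM "person.dtd"><person/>')
print(info)
```

Resolving the relative system identifier against the location of the XML file, and actually loading the DTD, is left to a validating parser.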
Some DTDs are well known and are bundled along
with the XML parser. In this case, a PUBLIC
ID can be given as well. The system identifier still
follows it, as a fallback location.
– <!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN"
"http://my.netscape.com/publish/formats/rss-0.91.dtd">
Internal DTDs
DTD definitions can be embedded in the
XML file itself
– <!DOCTYPE person [
<!ELEMENT person (name,profession*)>
<!ELEMENT name (first_name,last_name)>
<!ELEMENT first_name (#PCDATA)>
<!ELEMENT last_name (#PCDATA)>
<!ELEMENT profession (#PCDATA)>
]>
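Mixed DTDs
<!DOCTYPE person SYSTEM "name.dtd" [
<!ELEMENT person (name,profession*)>
<!ELEMENT profession (#PCDATA)>
]>
Here the internal subset declares person and profession, while
name.dtd supplies the declarations for name, first_name and last_name.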
Validating Documents using DTD
Parsers may or may not validate.
When parsing APIs are invoked from a program,
validation usually has to be turned on by means of
a flag provided by the API.
Browsers do not check XML files against DTDs.
Validation can be tested using a sample utility
from the Xerces open source parser.
– java sax.SAXCount -v person.xml
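The "flag in the API" idea can be sketched with Python's standard xml.sax module (an assumption for illustration; the slides use the Java Xerces tools). SAX defines a validation feature that a program asks the parser to turn on. The default Python parser is built on expat, which is non-validating, so it refuses the request instead of silently ignoring it.

```python
import xml.sax
from xml.sax.handler import feature_validation


def supports_validation(parser):
    """Try to switch on DTD validation; report whether the parser agrees."""
    try:
        parser.setFeature(feature_validation, True)
        return True
    except (xml.sax.SAXNotSupportedException,
            xml.sax.SAXNotRecognizedException):
        return False


parser = xml.sax.make_parser()  # the expat-based parser on a standard CPython
print(supports_validation(parser))
```

A validating parser would accept the same request and then report validity errors through its error handler.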
DTD constructs
Element declaration
Every element used in the XML file must
be declared using the statement
– <!ELEMENT element_name (content_model)>
Here element_name is the name of the
element. The content model can either be
text (#PCDATA) or a combination of
other elements.
DTD constructs
Element types
– #PCDATA
This says that the element may contain any parsed character data,
but not any child elements.
– Child Element
If an element has exactly one child, the content model names
that child element
– <!ELEMENT fax (phone_number)>
– Sequences
The simplest sequence is one where an element has
more than one child in a fixed order, with exactly one of
each.
– <!ELEMENT name (first_name,last_name)>
DTD Constructs
The previous example indicates that
a name must contain both first_name and last_name
if either of them is missing, the document is invalid
if last_name precedes first_name, the document is invalid
Specifying the number of children
? - zero or one child
* - zero or more children
+ - one or more children
Exercise – What do the following declarations
mean?
<!ELEMENT name (first_name,middle_name?,last_name)>
<!ELEMENT person (name,profession+,hobbies*)>
DTD Constructs
Choices
– If an element can contain either one or the other
element as a child, then the choice symbol | is used.
<!ELEMENT circle (centre,(radius|diameter))>
Empty elements
– If an XML element is going to contain neither data nor child
elements, it is declared as EMPTY
<image src="photo.jpg" width="10" height="10"/>
This will have a type EMPTY
<!ELEMENT image EMPTY>
– An element declared as EMPTY can still have attributes
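Sequences, the ? * + indicators and choices behave like a regular expression over the names of an element's children. The Python sketch below makes that concrete. It is a toy under stated assumptions, not how a real validating parser works: the function names are hypothetical, and only content models made of child elements are handled (not #PCDATA, EMPTY or ANY).

```python
import re
import xml.etree.ElementTree as ET

NAME = r"[A-Za-z_][\w.\-]*"


def model_to_regex(content_model):
    """Translate e.g. '(name,profession*)' into a compiled regex."""
    compact = re.sub(r"\s+", "", content_model)
    # Each element name becomes a group matching that name plus a ';' separator.
    pattern = re.sub(NAME,
                     lambda m: "(?:" + re.escape(m.group(0)) + ";)",
                     compact)
    # A comma just means "followed by", which is plain concatenation in a regex.
    return re.compile(pattern.replace(",", ""))


def children_match(element, content_model):
    """Check the direct children of element against the content model."""
    child_names = "".join(child.tag + ";" for child in element)
    return model_to_regex(content_model).fullmatch(child_names) is not None


person = ET.fromstring("<person><name/><profession/><profession/></person>")
circle = ET.fromstring("<circle><centre/><radius/><diameter/></circle>")

print(children_match(person, "(name,profession+,hobbies*)"))
print(children_match(circle, "(centre,(radius|diameter))"))
```

The person element satisfies its model (one name, at least one profession, no hobbies), while the circle does not, because a choice admits radius or diameter but not both.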
DTD Constructs
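ANY
– Some elements can be declared with a type of
ANY. This means that there are no constraints on the
content of this element. It can contain text and any
child elements, as long as those children are themselves
declared in the DTD. Its attributes must still be declared
with ATTLIST.
– <!ELEMENT page ANY>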
DTD Constructs
Attribute Declarations
– The valid attributes of an element can be declared as follows
<!ATTLIST person
born CDATA #REQUIRED
died CDATA #IMPLIED>
This can also be written as
<!ATTLIST person born CDATA #REQUIRED>
<!ATTLIST person died CDATA #IMPLIED>
#REQUIRED means the attribute must be present; #IMPLIED means it is optional.
Attribute Types
– Unlike elements, attributes can be given different types. Some of the valid attribute types
are
– CDATA – character data
– NMTOKEN – a name token: only the characters allowed in XML names, but any of them may
come first (so 2024 and -x are valid name tokens although they are not valid names)
– Enumerations – this is not a keyword. Instead it is a set of tokens separated by the | sign
which form the valid values for an attribute
– ex: <!ATTLIST date month (January|February|March|April) #REQUIRED>
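The sketch below shows how #REQUIRED and #IMPLIED could be checked by hand, using Python's standard xml.parsers.expat module (Python and the function name missing_required are assumptions for illustration). expat reads the ATTLIST declarations in the internal subset but does not enforce them, so the check itself is written out.

```python
import xml.parsers.expat

DOC = """<!DOCTYPE person [
<!ELEMENT person EMPTY>
<!ATTLIST person
  born CDATA #REQUIRED
  died CDATA #IMPLIED>
]>
<person died="1900"/>"""


def missing_required(xml_text):
    """List required attributes that are declared but absent."""
    declared = {}   # (element name, attribute name) -> is it #REQUIRED?
    problems = []

    def on_attlist(elname, attname, att_type, default, required):
        declared[(elname, attname)] = bool(required)

    def on_start(name, attrs):
        for (elname, attname), required in declared.items():
            if elname == name and required and attname not in attrs:
                problems.append(name + ": missing required attribute " + attname)

    parser = xml.parsers.expat.ParserCreate()
    parser.AttlistDeclHandler = on_attlist
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return problems


print(missing_required(DOC))
```

Only born is reported. died is #IMPLIED, so leaving it out would not be an error either.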
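Including a DTD within another DTD
<!ENTITY % names SYSTEM "names.dtd">
%names;
Parameter Entity References
– A parameter entity lets you define a token that stands for
some other text inside a DTD. For example, an entity can be defined as
<!ENTITY % pcd "(#PCDATA)">
– It can be used as follows
<!ELEMENT name %pcd;>
– References to a parameter entity are enclosed within % and ;. Wherever
the reference is encountered, it is replaced with the corresponding
text. Note the % in the declaration: without it, the declaration defines a
general entity, which is referenced as &pcd; in document content and cannot
be used inside a declaration.
– Parameter entity references inside a declaration are allowed only in an
external DTD, not in an internal subset.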
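As a rough illustration of that substitution, the Python sketch below expands %name; references by plain text replacement. It is a simplification under stated assumptions: the function name is hypothetical, only internal parameter entities with double-quoted values are handled, and a real parser applies additional rules (for example, it pads the replacement text with spaces).

```python
import re

ENTITY_DECL = re.compile(r'<!ENTITY\s+%\s+([\w.\-]+)\s+"([^"]*)"\s*>')
ENTITY_REF = re.compile(r"%([\w.\-]+);")


def expand_parameter_entities(dtd_text, max_passes=10):
    """Replace %name; references with the declared replacement text."""
    entities = dict(ENTITY_DECL.findall(dtd_text))
    for _ in range(max_passes):  # repeat so nested references are resolved too
        expanded = ENTITY_REF.sub(
            lambda m: entities.get(m.group(1), m.group(0)), dtd_text)
        if expanded == dtd_text:
            break
        dtd_text = expanded
    return dtd_text


dtd = '<!ENTITY % pcd "(#PCDATA)">\n<!ELEMENT name %pcd;>'
print(expand_parameter_entities(dtd))
```

After expansion the second declaration reads like an ordinary element declaration with a (#PCDATA) content model.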
Exercise
Write a DTD for the organization chart created in
the previous exercise
Write a DTD for the library xml file created in
the previous exercise
Test the xml files with the DTDs.
Namespaces
Let's begin with an exercise!
– In the previous exercises, an organization chart XML file and a library XML
file were created, along with a DTD for each. In this exercise:
Create an XML file called organization.xml that contains both the
organization chart XML and the library XML. The file should be
validated, so ensure that a DTD is created for it. Since the DTDs
for the two XML files have been created already, ensure that these
are reused.
Namespaces – the need
XML elements can be reused and assembled to create
other XML files.
When such assembling is done, there can be name
clashes. Common terms like name, age etc. tend to appear
in different applications. Sometimes the same term has
different meanings in different documents. For example,
table could mean an HTML table, a multiplication table or a
piece of furniture, depending on the application.
Applications need to identify each element
uniquely to perform operations such as validation,
display etc. An HTML table should be displayed and
validated differently from a multiplication table.
Namespace Definition
Namespaces can be defined using the following syntax
<og:employee og:name="Bill Gates" xmlns:og="http://abc.com">
  <og:designation>CEO</og:designation>
  <og:subordinates>
    <og:employee og:name="ABC">
      <og:designation>VP</og:designation>
    </og:employee>
  </og:subordinates>
</og:employee>
<prefix:tag xmlns:prefix="uri">
The syntax consists of two steps
Define the prefix
– xmlns:prefix="http://www.abc.com"
Use the prefix
– <og:employee>
– The syntax looks a little odd because both appear in the same start tag, so the
prefix appears to be used before it is declared! This is legal: a declaration applies
to the whole element it appears on, including the element's own name.
Namespace definition (contd)
The URI can be any valid URI; it need not point to anything
that actually exists. It only serves as a unique identifier, so it
must differ from the URI of every other namespace used in the
document.
Namespaces are always declared on an element. The
declaration is in scope for that element and everything inside it,
but an element or attribute belongs to the namespace only if it
carries the prefix. Subordinates must therefore use the same
prefix to indicate that they belong to the namespace.
Sub-elements inside a main element can belong to a
different namespace, in which case the namespace for the
sub-element is declared explicitly with its own prefix.
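A namespace-aware parser replaces each prefix with the URI it stands for. The sketch below shows this with Python's standard xml.etree.ElementTree, which writes qualified names as {uri}local-name (Python is an assumption here, and the unprefixed note element is made up for illustration).

```python
import xml.etree.ElementTree as ET

doc = """<og:employee og:name="Bill Gates" xmlns:og="http://abc.com">
  <og:designation>CEO</og:designation>
  <note>no prefix, so no namespace</note>
</og:employee>"""

root = ET.fromstring(doc)

# The prefix og is gone; the URI it was bound to identifies the element.
print(root.tag)
# Prefixed attributes are qualified in the same way.
print(root.attrib)
# The declaration is in scope for note, but without the prefix it is unqualified.
print([child.tag for child in root])
```

The prefix itself is only a local abbreviation. Two documents that bind different prefixes to the same URI are talking about the same elements.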
Declaring a default namespace
If an element and all of its sub-elements come from the
same namespace, then a default namespace can be
declared as follows
– <employee xmlns="http://www.abc.com">
<subordinates>
</subordinates>
</employee>
Unprefixed elements inside employee then belong to the default
namespace. Unprefixed attributes do not.
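A short Python sketch (an assumption for illustration, using the standard xml.etree.ElementTree module) suggests that a default namespace and a prefixed namespace produce the same qualified names when they use the same URI. The og prefix in the second document is chosen only for the comparison.

```python
import xml.etree.ElementTree as ET

with_default = ET.fromstring(
    '<employee xmlns="http://www.abc.com"><subordinates/></employee>')
with_prefix = ET.fromstring(
    '<og:employee xmlns:og="http://www.abc.com">'
    '<og:subordinates/></og:employee>')

# iter() walks the element itself and then its descendants.
tags_default = [element.tag for element in with_default.iter()]
tags_prefix = [element.tag for element in with_prefix.iter()]

print(tags_default)
print(tags_default == tags_prefix)
```

The default namespace is purely a convenience that saves repeating the prefix on every element.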
Namespaces and DTDs
Namespaces and DTDs are orthogonal. XML documents
can have DTDs or namespaces or both.
DTDs know nothing about namespaces: the prefix is simply
part of the element name. If an element is qualified by a prefix
in the XML document, then the DTD must declare an element
with the same prefix and name. Example
<!ELEMENT dc:title (#PCDATA)>
The xmlns attributes themselves must also be declared in an
ATTLIST for the document to be valid.
Sometimes two DTDs may use the same prefix. In
such a case, the prefix in one of the DTDs should be changed. It is
easier to change a DTD if the prefix is defined as a
parameter entity
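Namespace prefixes in DTD
– <!ENTITY % dc-prefix "dc">
– <!ENTITY % dc-colon ":">
One way to use these entities (the entity name dc-title is illustrative)
– <!ENTITY % dc-title "%dc-prefix;%dc-colon;title">
– <!ELEMENT %dc-title; (#PCDATA)>
Changing the prefix then means editing only the dc-prefix entity.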
Exercise
Create an XML file that shows both the organization chart and the
library. Ensure that they are validated using proper DTDs.