Introduction Structure XML DTD NameSpace XML Summary

Semi-Structured Data
Chapter 1: XML, DTD, and Namespace

Amel Boustil
Computer Science Department, FS, University M'Hamed Bougara of Boumerdes,

35000, Algeria.

30 January 2024


Agenda

• Introduction
    • Databases and XML
    • HTML and XML
    • Using XML
    • XML Galaxy
    • XML Dialects
• Structure XML
• DTD
    • Declaration of Elements
    • Declaration of Attributes
    • Entities
    • Notations
• NameSpace XML
    • Namespace Declaration
    • Removing a Namespace
    • Applying a Namespace to an Attribute
    • Namespaces and DTD
• Summary


Databases and XML


• A database (DB) : is a set of data stored in files and accessible
on demand.
• A DB is managed by a DBMS : Database Management
System.
• A DB is structured according to a data model.
• A data definition language is used to define the DB, and a
data manipulation language is used to operate on the DB.
• A relational DB consists of a set of relations.
• A relation is a set of tuples or n-tuples.
• A relation is defined by a schema giving the name of the
relation, the list of attributes with their domain, the primary
key, and any foreign keys.

Structured vs. Semi-Structured Data

Types of data :
• Structured Data : is organized in a tabular format with a
predefined schema.
Examples : Relational databases, Excel spreadsheets, CSV files.
• Unstructured Data : does not follow a predefined schema or
organization. It can take the form of free text, images, videos,
audio files, etc.
Examples : Text documents, JPEG images, MP3 audio files.
• Semi-Structured Data : is not as strictly organized as
structured data but has some structure or a partial schema.
Examples : JSON, XML formats.


Structured vs. Semi-Structured Data

XML documents are files containing semi-structured data.

• XML documents can follow a general hierarchical structure
while varying in their details.
• XML structures data according to a schema, but an XML document
remains a text file that is optimized neither for storage space
nor for database operations.
• XML is seen as a volatile part of an information system and
solves information flow problems.
• However, XML databases exist, with powerful query
languages that allow working on the client side.
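
To make "a general hierarchical structure that varies in its details" concrete, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The library document and its element names (library, book, title, author, year) are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: both <book> records share a general shape,
# but only the second one has a <year> and two <author> children.
DOC = """
<library>
  <book><title>XML Basics</title><author>A. Author</author></book>
  <book><title>Data on the Web</title>
        <author>B. Writer</author><author>C. Editor</author>
        <year>2000</year></book>
</library>
"""


def describe(book):
    """Return a dict that tolerates missing or repeated child elements."""
    return {
        "title": book.findtext("title"),
        "authors": [a.text for a in book.findall("author")],
        "year": book.findtext("year"),  # None when the element is absent
    }


root = ET.fromstring(DOC)
records = [describe(b) for b in root.findall("book")]
```

A relational table would force every row to have the same columns. Here the code has to cope with optional and repeated fields, which is typical of semi-structured data.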


HTML and XML

• HTML (HyperText Markup Language) is the language used to
create documents whose tags indicate how the content should be
displayed.
• Users perceive the Web through a Web browser.
• A URI (Uniform Resource Identifier) is the mechanism that
identifies and references a document or resource on the Web.
• Hypertext structures information by allowing
documents to reference other documents or parts of
documents.
• Information on the Web can be stored in files or, increasingly,
in a database, in XML, etc.


The Web : HTML, SGML, XML


• XML is also a markup language, distinguishing itself from
HTML by the separation between the structure of a
document's content and its presentation.
• XML is derived from SGML (Standard Generalized Markup
Language) and builds on the experience gained with HTML.
• XML is a simple language that combines the advantages of SGML
and HTML.

Comparison with SGML :
• a metalanguage with a rigorous structure, like SGML
• not as complex as SGML

Comparison with HTML :
• simple, like HTML
• unlike HTML, neither fault-tolerant nor presentation-oriented


Using XML

The use of XML has expanded to :


• Providing a framework for storage and processing of
semi-structured data
• Exchange of data between applications


XML : First Example


XML Galaxy

Basic standards built on XML


• XML Schema : for describing the data model
• XPath : language for navigating XML trees
• XSL : for writing style sheets
• XQuery : for querying XML databases
• XLink and XPointer : for links
• DOM and SAX : for programming
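
As a small illustration of navigating an XML tree with path expressions, the sketch below uses Python's xml.etree.ElementTree, which supports only a limited subset of XPath. The catalog document and its element and attribute names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog used only for illustration.
DOC = """
<catalog>
  <course code="DSS"><title>Semi-Structured Data</title></course>
  <course code="DB"><title>Databases</title></course>
</catalog>
"""
root = ET.fromstring(DOC)

# "./course/title" follows child steps starting from the root element.
titles = [t.text for t in root.findall("./course/title")]

# A predicate in square brackets filters on an attribute value.
dss = root.find("./course[@code='DSS']/title").text
```

A full XPath or XQuery engine offers many more axes and functions than this subset. The point here is only the idea of addressing nodes by their position in the tree.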


XML Dialects

XML is more of a metalanguage that allows defining other languages
(dialects) : XHTML, SVG, XSLT, SMIL, MathML, etc.

• These dialects are defined to apply XML to various domains.
• These dialects share the same basic syntax (XML), and all
XML tools can be used to specify and manipulate these
documents.


XML Dialects
• RSS (Really Simple Syndication) : Subscription to data feeds
• SVG (Scalable Vector Graphics) : Description of vector drawings
• SMIL (Synchronized Multimedia Integration Language) :
Description of multimedia content
• MathML (Mathematical Markup Language) : Description of
mathematical formulas
• WSDL (Web Services Description Language) : Description of Web
services
• XUL (XML User Interface Language) : Language for
describing graphical interfaces.
• XML Signature : Format for electronic signatures
• SAML (Security Assertion Markup Language) : Language for
exchanging authentication and security information...

XML Dialects : Example

MathML
MathML is a W3C recommendation (http://www.w3.org/Math/)
used to write mathematical expressions.

• mo denotes an operator,
• mrow denotes an expression (a row of sub-expressions),
• msup denotes an expression with an exponent,
• mi denotes a variable (e.g., x),
• mn denotes a number.
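
The following Python sketch shows one plausible way these elements could combine to encode the expression x^2 + 4 * x + 4. The markup is an assumption written for illustration, not the original slide's example, and the helper to_text is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed MathML encoding of x^2 + 4 * x + 4 (illustrative only).
MATHML = """
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <mrow>
    <msup><mi>x</mi><mn>2</mn></msup>
    <mo>+</mo>
    <mrow><mn>4</mn><mo>*</mo><mi>x</mi></mrow>
    <mo>+</mo>
    <mn>4</mn>
  </mrow>
</math>
"""
NS = "{http://www.w3.org/1998/Math/MathML}"


def to_text(node):
    """Render a small subset of MathML (mi, mn, mo, mrow, msup) as text."""
    tag = node.tag.replace(NS, "")
    if tag in ("mi", "mn", "mo"):
        return node.text          # leaves: variable, number, operator
    parts = [to_text(child) for child in node]
    if tag == "msup":
        base, exponent = parts    # msup has exactly two children
        return f"{base}^{exponent}"
    return " ".join(parts)        # mrow and math: children in order


expression = to_text(ET.fromstring(MATHML))
```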


XML Dialects : Example

Here, we have described the expression : $x^2 + 4 * x + 4$.


Summary

XML
XML (eXtensible Markup Language) is a markup metalanguage,
recommended by the W3C since 1998 to facilitate data exchange
via the Web.
• XML represents and does not present (separation between
content and presentation) ;
• It is a text-oriented language, built from tags that organize data in
a structured way ;
• It is used for storage, sharing of documents, and exchange of data
between applications ;
• It is simple, flexible, and easy to use, standardized and open.


XML is...

• A tag (or label) is a start or end mark delimiting a (textual) data
element.
• Form : <name> and </name> (a start tag and its matching end tag)
• Tags indicate the meaning of the sections they mark (the data elements).
• All text is enclosed by one start tag and one end tag (the root of the
document).
• Data elements can be nested, forming a tree.
• Attribute : name="value", qualifying an element in its start tag.


XML Structure

An XML document can contain :


• A prologue
• A root element
• A tree of elements and their attributes
• Comments

The prologue can contain :


• A declaration
• Processing instructions
• A DTD


XML Example


Prologue

The prologue begins with the XML declaration :

<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>

• version : version of XML used in the document ;
• encoding : the character encoding used. By default,
encoding has the value UTF-8.
• standalone : indicates whether the document depends on an external DTD ;
• standalone = yes : the processor does not expect
any DTD external to the document.
• standalone = no : the document may depend on an external document type
declaration. The default value is no.
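
The role of the encoding pseudo-attribute can be sketched in Python: the document below is stored as ISO-8859-1 bytes, and the parser relies on the declaration to decode it. The city element and its content are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# The document is stored as bytes in ISO-8859-1, where "è" is the
# single byte 0xE8 (in UTF-8 it would take two bytes).
raw = (
    '<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>\n'
    "<city>Boumerdès</city>"
).encode("iso-8859-1")

root = ET.fromstring(raw)  # the parser honours the declared encoding
city = root.text
```

Note that the XML declaration, when present, must be the very first thing in the document, with nothing before it.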


Processing Instructions

• Processing instructions are intended for the applications that process
XML documents.
• They are delimited by the strings <? and ?>.
• Example : An XSLT style sheet can be attached to an XML
document through a processing instruction named
xml-stylesheet.
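
A minimal sketch of such a processing instruction, and of how an application might read it, using Python's xml.dom.minidom. The file name style.xsl and the report document are hypothetical.

```python
from xml.dom import minidom

# Hypothetical document that attaches an XSLT style sheet.
DOC = (
    '<?xml version="1.0"?>'
    '<?xml-stylesheet type="text/xsl" href="style.xsl"?>'
    "<report><title>Demo</title></report>"
)

document = minidom.parseString(DOC)

# Processing instructions placed before the root element appear as
# children of the Document node; each has a target and a data string.
instructions = [
    (node.target, node.data)
    for node in document.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]
```

The XML declaration looks like a processing instruction but is not one, so it does not appear in the list.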


Tree of Elements

• Every XML document is represented as a tree of elements.


• Like any tree, it has a root, branches, and leaves.
• The tree consists of elements nested within each other (having
a parent-child relationship) and adjacent elements.
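
The tree view can be sketched with a short recursive traversal in Python. The course document is a hypothetical example, and outline is an illustrative helper rather than a library function.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <course> is the root, <title> and <chapter>
# are adjacent children, and the <section> elements are leaves.
DOC = "<course><title>XML</title><chapter><section/><section/></chapter></course>"


def outline(element, depth=0):
    """Return one indented line per element, in document order."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(ET.fromstring(DOC))
```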


Example of Tree of Elements


Attributes

• Any element can carry one or more attributes.
• An attribute consists of a name and a value.
• Syntax of an element with attributes : <Element-Name
attribute1 attribute2 ...> (attributes are separated by whitespace, not commas)
• Syntax of attribute i : name="value"
Example :
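
A minimal sketch in Python, assuming a hypothetical book element with two attributes:

```python
import xml.etree.ElementTree as ET

# Hypothetical element: two attributes separated by whitespace.
element = ET.fromstring('<book isbn="12345" lang="en">XML Basics</book>')

attributes = dict(element.attrib)          # all name/value pairs
isbn = element.get("isbn")                 # value of one attribute
missing = element.get("price", "unknown")  # fallback for an absent attribute
```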


Predefined Attributes

There are four special attributes that are part of the XML
namespace (see Section 4).
• xml:lang,
• xml:space,
• xml:base,
• xml:id.


Predefined Attributes

• The xml:lang attribute describes the language of the
content of the element. Its value is a language code of two or
three letters from the ISO 639 standard (e.g., en, fr, es).
• The xml:id attribute associates an identifier with any
element independently of any DTD or schema.
• The xml:space attribute tells an application
how to handle whitespace characters. The two possible values
of this attribute are default and preserve. If the xml:space
attribute has the value preserve, the application must preserve
all whitespace characters.
• The xml:base attribute : each element in an XML document
is associated with a base URI, which xml:base sets explicitly.
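
The sketch below reads xml:lang with Python's xml.etree.ElementTree. The xml prefix is bound by definition to the namespace http://www.w3.org/XML/1998/namespace, so no declaration is needed. The greetings document is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace name to which the reserved "xml" prefix is always bound.
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical document: the root declares English, and the second
# greeting overrides the language and carries an xml:id.
DOC = """
<greetings xml:lang="en">
  <greeting>Hello</greeting>
  <greeting xml:lang="fr" xml:id="g2">Bonjour</greeting>
</greetings>
"""
root = ET.fromstring(DOC)

root_lang = root.get(f"{{{XML_NS}}}lang")
# ElementTree reports only attributes written on the element itself;
# resolving the inherited language is left to the application.
languages = [g.get(f"{{{XML_NS}}}lang") for g in root.findall("greeting")]
identifier = root.findall("greeting")[1].get(f"{{{XML_NS}}}id")
```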


Well-Formed Document and Valid Document

• Well-Formed Document : A well-formed document must
adhere to the syntactic rules specific to XML. It is, in a sense,
the spelling of XML.
• Valid Document : A valid document must adhere to a
document model that rigorously describes how the document
should be organized. A document model can be seen as a
grammar for XML documents.
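
Well-formedness can be checked with any XML parser. The Python sketch below uses xml.etree.ElementTree, which checks syntax only and does not validate against a DTD. The helper name is_well_formed and the sample documents are illustrative.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True when the text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


ok = is_well_formed("<note><to>Ali</to></note>")
crossed = is_well_formed("<note><to>Ali</note></to>")  # tags overlap
two_roots = is_well_formed("<a/><b/>")                 # more than one root
```

Validity is a stronger property. A well-formed document may still violate its DTD, and checking that requires a validating parser, which this sketch does not use.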


DTD (Document Type Definition)

DTD
A DTD defines a standard structure for an XML document.
It is a grammar that describes how to construct valid XML
documents.
• It can be
    • internal :
      <!DOCTYPE RootName [DEFINITION]>
    • external (SYSTEM or PUBLIC) :
      <!DOCTYPE RootName SYSTEM "path">
• An element is defined by <!ELEMENT ...>, and an attribute is
defined by <!ATTLIST ...>


DTD (Document Type Definition)

Declaration of elements in a DTD :

<!ELEMENT tag (content)>

• The content can be :
    • simple :
        • empty : EMPTY (written without parentheses)
        • any : ANY (written without parentheses)
        • textual : (#PCDATA)
    • a composition :
        • sequence of elements : (a,b,c)
        • alternative choices of elements : (a|b|c)
        • hierarchical mix : (a,(b|c),d)
• For each element, the occurrence indicator can be :
    • ? : zero or one
    • * : zero or more
    • + : one or more
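
Content models read much like regular expressions over child element names. As a rough sketch, the Python code below translates an element-only content model into a regular expression and checks an element's children against it. It is not a DTD validator: #PCDATA, EMPTY, and ANY are not handled, and the function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def content_model_to_regex(model):
    """Translate an element-only content model such as (a,(b|c),d)."""
    parts = []
    for token in re.findall(r"[A-Za-z_][\w.-]*|[()|,?*+]", model):
        if token == "(":
            parts.append("(?:")          # grouping only, no capture
        elif token == ",":
            continue                     # sequence is plain concatenation
        elif token in ")|?*+":
            parts.append(token)          # same meaning as in a regex
        else:
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(parts))


def children_match(element, model):
    """Check the sequence of child tags of element against the model."""
    names = "".join(child.tag + " " for child in element)
    return content_model_to_regex(model).fullmatch(names) is not None


valid = children_match(ET.fromstring("<m><a/><c/><d/></m>"), "(a,(b|c),d)")
invalid = children_match(ET.fromstring("<m><a/><d/></m>"), "(a,(b|c),d)")
```

Each child name is followed by a space in the encoded sequence, so that a name such as a cannot accidentally match the beginning of a longer name such as ab.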

DTD (Document Type Definition)

Declaration of attributes in a DTD :

<!ATTLIST tag [attribute type mode]*>

• type : CDATA (raw character data), ID (identifier), IDREF (identifier
reference), IDREFS (list of identifier references), NOTATION (a
notation), NMTOKEN (a single word), NMTOKENS (a list of
words), an enumeration of values (each value separated by the
| character), ENTITY (entity)
• mode : #REQUIRED (mandatory), #IMPLIED (optional),
#FIXED "value" (fixed to a value), or simply "value" (default value ;
there is no #DEFAULT keyword)
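
The effect of default and fixed values can be observed even with a non-validating parser. In the Python sketch below, the parser fills in attributes declared in the internal DTD subset when they are absent from the element. The message element and its attributes are hypothetical, and xml.etree.ElementTree does not enforce #REQUIRED or the enumeration.

```python
import xml.etree.ElementTree as ET

# Hypothetical internal DTD: one required ID, one enumerated attribute
# with a default value, and one attribute fixed to "1.0".
DOC = """<!DOCTYPE message [
  <!ELEMENT message (#PCDATA)>
  <!ATTLIST message
            id       ID                    #REQUIRED
            priority (low | normal | high) "normal"
            version  CDATA                 #FIXED "1.0">
]>
<message id="m1">Hello</message>"""

message = ET.fromstring(DOC)

# Only id is written in the document; priority and version come
# from the attribute-list declaration.
attributes = dict(message.attrib)
```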


DTD Example 1


DTD Example 2 : XML Document


DTD Example 2 : DTD Document message.dtd


Entities in DTDs

Entity : Definition and Role

An entity associates a name with a string of characters, and
this name can then be used in place of that string.
Entities thus allow reuse in a DTD.

Several types of entities are distinguished in XML :


• General entities
• Parameter entities
• Character entities


General Entities

General Entities
General entities allow defining replacement text that can be substituted in
the body of the XML document (although they are defined within
the DTD and not in the XML document itself).
• Internal declaration : <!ENTITY entity-name "entity-value">
• External declaration : <!ENTITY entity-name SYSTEM
"entity-URL">
To reference it :

&entity-name;

Examples
• Entity (internal):
<!ENTITY website "http://www.mysite.com">
• In an XML document, if we write:
<url>The website for the DSS course is: &website;</url>
• It will be evaluated to:
<url>The website for the DSS course is:
http://www.mysite.com</url>
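The substitution above can be checked with a short Python sketch. It assumes the entity is declared in an internal DTD subset of the same document and uses the standard library's xml.etree.ElementTree, whose underlying parser expands internal general entities.

```python
import xml.etree.ElementTree as ET

# The entity is declared in the internal DTD subset and referenced in the body.
DOC = """<!DOCTYPE url [
  <!ENTITY website "http://www.mysite.com">
]>
<url>The website for the DSS course is: &website;</url>"""

url = ET.fromstring(DOC)

# By the time the tree is built, &website; has been replaced by its value.
print(url.text)
```

The application never sees the reference itself, only the replacement text.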


Parameter Entities

• Parameter entities are entities that are used within the DTD itself.
• Declaration syntax:
<!ENTITY % entity-name definition>
• To reference it:
%entity-name;
Declaration
• External parameter entity:
<!ENTITY % entity-name SYSTEM "filename">
• Internal parameter entity:
<!ENTITY % entity-name "definition">


Parameter Entities: Example
[Figure: a DTD that uses a parameter entity, and the equivalent DTD fragment after substitution]

Character Entities

Character entities are predefined general entities that stand for
characters reserved in XML markup, so that these characters can be
inserted into the XML document as data.
List of main character entities:

Character entity    Representation
&amp;               &
&lt;                <
&gt;                >
&apos;              '
&quot;              "

Note:
Any character can be inserted into a document through a character
reference of the form &#decimal_code; or
&#xhexadecimal_code;, where the code is the character's Unicode code point.

Example
The reference &#960; or &#x03C0;
inserts the Greek letter π
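A small Python sketch, using the standard library's xml.etree.ElementTree, illustrates both the predefined entities and the numeric character references. The p element is a hypothetical name used only for this illustration.

```python
import xml.etree.ElementTree as ET

# Decimal and hexadecimal references to the same character (U+03C0),
# plus three of the predefined entities for reserved characters.
p = ET.fromstring("<p>&#960; = &#x03C0; and 1 &lt; 2 &amp; 3 &gt; 2</p>")

# After parsing, the text contains the actual characters.
print(p.text)

# When serializing, the reserved characters are escaped again.
serialized = ET.tostring(p, encoding="unicode")
print(serialized)
```

Parsing turns the references into characters, and serialization re-escapes only those characters that would otherwise be read as markup.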


Notations

• Notations identify by name the format of entities that are not
parsed by the XML parser (unparsed entities). They define the format
of the data and the applications that can process it.
• For example, it is possible to associate GIF images with the
view.exe program using a declaration of the form:
<!NOTATION gif SYSTEM "view.exe">
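As a hedged illustration, the sketch below assumes a notation named gif whose system identifier is view.exe, declared in the internal DTD subset of a hypothetical catalog document. It uses the NotationDeclHandler callback of Python's xml.parsers.expat to show that the parser only reports the declaration; deciding what to do with view.exe is left to the application.

```python
import xml.parsers.expat

# Assumed declaration: notation "gif" handled by the program "view.exe".
DOC = """<!DOCTYPE catalog [
  <!NOTATION gif SYSTEM "view.exe">
]>
<catalog/>"""

notations = {}

def on_notation(name, base, system_id, public_id):
    # Called once for each <!NOTATION ...> declaration in the DTD.
    notations[name] = system_id

parser = xml.parsers.expat.ParserCreate()
parser.NotationDeclHandler = on_notation
parser.Parse(DOC, True)

print(notations)
```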


Definition

Namespace
Namespaces were introduced in XML to allow mixing multiple
vocabularies within a single document.

XML Vocabulary
An XML vocabulary is a set of element (tag) names and attribute names
with a given meaning.


XML Namespace: Introductory Example

[Figure: two XML documents, XML1 and XML2]

XML Namespace: Problem

Confusion on the author element in XML3, which reuses both XML1 and
XML2.


XML Namespaces
Solution: Namespaces
• Distinguish elements and attributes from different XML
applications that have the same name.
• Assign elements and attributes to a URI.
• Assign a prefix to this URI.


XML Namespaces: Declarations

Declaration
A namespace is declared by an attribute on an element, and
two forms exist:
• xmlns="uri" defines the default namespace.
• xmlns:prefix="uri" defines a prefix bound to the namespace, used to
qualify names.
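The two declaration forms can be sketched in Python with xml.etree.ElementTree, which exposes each qualified name as {uri}local-name. The URIs, the art prefix, and the library and author elements below are hypothetical and chosen only to mirror the author collision discussed earlier.

```python
import xml.etree.ElementTree as ET

# Hypothetical URIs and element names, chosen only for illustration.
DOC = """<library xmlns="http://example.org/books"
         xmlns:art="http://example.org/articles">
  <author>Book author</author>
  <art:author>Article author</art:author>
</library>"""

root = ET.fromstring(DOC)
tags = [child.tag for child in root]

# Both children are called "author", but they live in different namespaces.
print(root.tag)
print(tags)

# Lookups are done by namespace URI; the prefix used here ("a") is local
# to this call and need not match the prefix used in the document.
article_author = root.find("a:author", {"a": "http://example.org/articles"})
print(article_author.text)
```

The prefix is only an abbreviation: what identifies the element is the pair formed by the namespace URI and the local name.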


Some Remarks on Namespaces

• The scope of a namespace declaration is the element that carries
it, together with all of that element's descendants (unless a
descendant redeclares it).
• Namespaces can be declared on the elements where they are
used or on the root element of the document:
Example: [figure]

• Defining a default namespace on an element avoids having to
prefix that element and its descendants.
Example: [figure]

• The default namespace can be changed, even in child elements:
Example: [figure]
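A minimal Python sketch of these two remarks, assuming hypothetical URIs and element names (book, title, review, rating): the default namespace declared on the root applies to its unprefixed descendants until a child element declares a different default.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: a book description embedding a review.
DOC = """<book xmlns="http://example.org/books">
  <title>XML Basics</title>
  <review xmlns="http://example.org/reviews">
    <rating>5</rating>
  </review>
</book>"""

root = ET.fromstring(DOC)

# iter() walks the tree in document order, starting with the root itself.
tags = [element.tag for element in root.iter()]
for tag in tags:
    print(tag)
```

The review element and everything inside it belong to the second namespace, while title keeps the default inherited from book.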


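Removing a Namespace

• An element belongs to no namespace when it has no prefix and no
default namespace is in scope.
• To cancel the default namespace, declare it with an empty
value, xmlns="", which is equivalent to having no default namespace.
In Namespaces in XML 1.0, a prefix cannot be undeclared in this way.
Example: [figure]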
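The effect of an empty default namespace can be sketched as follows in Python; the URI and the element names (book, title, note, text) are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <note> cancels the default namespace of <book>.
DOC = """<book xmlns="http://example.org/books">
  <title>XML Basics</title>
  <note xmlns="">
    <text>No namespace here</text>
  </note>
</book>"""

root = ET.fromstring(DOC)

# Elements in no namespace appear with their bare local name.
tags = [element.tag for element in root.iter()]
for tag in tags:
    print(tag)
```

The note element and its child text are reported without any {uri} part, which is how ElementTree represents names in no namespace.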


Applying a Namespace to an Attribute

• A namespace can be applied via a prefix to an attribute (or to a
qualified name used as an attribute value), which makes it possible to
have several attributes with the same local name on one element. An
unprefixed attribute belongs to no namespace: the default namespace
does not apply to attributes.

• The prefix xml is bound to the XML namespace identified by
http://www.w3.org/XML/1998/namespace. This
namespace does not need to be declared. The four special
attributes xml:lang, xml:space, xml:base, and xml:id are
part of this namespace.
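The following Python sketch, built on hypothetical URIs and a hypothetical item element, shows two attributes that share the local name unit in different namespaces, an unprefixed attribute that stays outside the default namespace, and the predeclared xml prefix.

```python
import xml.etree.ElementTree as ET

# Hypothetical element with namespaced and non-namespaced attributes.
DOC = """<item xmlns="http://example.org/shop"
      xmlns:a="http://example.org/a"
      xmlns:b="http://example.org/b"
      id="i1" a:unit="kg" b:unit="lb" xml:lang="en"/>"""

item = ET.fromstring(DOC)

# Attribute names are reported as {uri}local-name, or as the bare name
# when the attribute is in no namespace.
for name, value in item.attrib.items():
    print(name, "=", value)
```

Note that id is not placed in http://example.org/shop even though that is the default namespace of the element, and that xml:lang resolves without any xmlns:xml declaration.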


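Namespaces and DTD

• DTDs are not namespace-aware: they treat a prefixed name as a plain
name. However, it is possible to validate, with a DTD, a document
with namespaces, provided the DTD declares the names exactly as they
appear in the document (prefix included).
• An xmlns declaration can be omitted in the document and
declared in the DTD as an attribute with a fixed value, for example
<!ATTLIST tag xmlns CDATA #FIXED "uri">:
Example: [figure]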
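A possible illustration in Python, assuming a hypothetical catalog vocabulary and URI: the document body carries no xmlns attribute, and the namespace comes from a #FIXED attribute declared in the internal DTD subset. The sketch relies on the expat parser behind xml.etree.ElementTree applying DTD attribute defaults before namespace processing; other parsers, or DTDs kept in external files, may behave differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary; the namespace is declared only in the DTD.
DOC = """<!DOCTYPE catalog [
  <!ELEMENT catalog (item*)>
  <!ELEMENT item (#PCDATA)>
  <!ATTLIST catalog xmlns CDATA #FIXED "http://example.org/catalog">
]>
<catalog><item>first</item></catalog>"""

root = ET.fromstring(DOC)

# The defaulted xmlns attribute acts as a default namespace declaration
# on <catalog>, so it also covers the <item> child.
print(root.tag)
print(root[0].tag)
```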


Some Famous Namespaces

• XHTML: xmlns:xhtml="http://www.w3.org/1999/xhtml"
• MathML:
xmlns:MathML="http://www.w3.org/1998/Math/MathML"
• SVG: xmlns:svg="http://www.w3.org/2000/svg"
• XSLT: xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
• Schema: xmlns:xs="http://www.w3.org/2001/XMLSchema"
• RDF: xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
• Dublin Core: xmlns:dc="http://purl.org/dc/elements/1.1/"


Example of Using an Existing SVG Namespace

• SVG namespace: [figure]
• Display result in the browser: [figure]


Summary

• XML is a metalanguage for markup. It facilitates data
exchange, sharing, and transfer via the Web.
• XML represents and does not present.
• It is a simple, flexible, and text-oriented language.
• An XML document consists of a prologue and a tree of
elements.
• A tree contains elements and attributes.
• A DTD describes the structure of the XML document.
• In a DTD, entities and notations can be described.
• Namespaces are URIs abbreviated by prefixes, defined to mix
XML documents without confusion.


