Chapter 1
Semi-Structured Data
Chapter 1: XML, DTD, and Namespace
Amel Boustil
Computer Science Department, FS, University M'Hamed Bougara of Boumerdes,
35000, Algeria.
January 30, 2024
Introduction Structure XML DTD NameSpace XML Summary
Agenda
• Introduction
  • Databases and XML
  • HTML and XML
  • Using XML
  • XML Galaxy
  • XML Dialects
• Structure XML
• DTD
  • Declaration of Elements
  • Declaration of Attributes
  • Entities
  • Notations
• NameSpace XML
  • Namespace Declaration
  • Removing a Namespace
  • Applying a Namespace to an Attribute
  • Namespaces and DTD
• Summary
Types of data:
• Structured data: organized in a tabular format with a
predefined schema.
Examples: relational databases, Excel spreadsheets, CSV files.
• Unstructured data: does not follow a predefined schema or
organization. It can take the form of free text, images, videos,
audio files, etc.
Examples: text documents, JPEG images, MP3 audio files.
• Semi-structured data: not as strictly organized as
structured data, but it has some structure or a partial schema.
Examples: JSON and XML formats.
Using XML
XML Galaxy
XML Dialects
XML is more of a metalanguage that allows defining other languages
(dialects): XHTML, SVG, XSLT, SMIL, MathML, etc.
• RSS (Really Simple Syndication): subscription to data feeds
• SVG (Scalable Vector Graphics): description of vector drawings
• SMIL (Synchronized Multimedia Integration Language):
description of multimedia content
• MathML (Mathematical Markup Language): description of
mathematical formulas
• WSDL (Web Services Description Language): description of web
services
• XUL (XML-based User Interface Language): language for
describing graphical interfaces
• XML Signature: format for electronic signatures
• SAML (Security Assertion Markup Language): language for
exchanging authentication and security information
MathML
MathML is a W3C recommendation (http://www.w3.org/Math/)
used to write mathematical expressions.
• mo denotes an operator,
• mrow groups an expression,
• msup denotes an expression with an exponent,
• mi denotes a variable (e.g., x),
• mn denotes a number.
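To make the five elements concrete, here is a minimal sketch that writes the expression x^2 + 1 in MathML and walks through its elements with Python's standard xml.etree.ElementTree module. The expression is an illustrative assumption, not taken from the slides; the namespace URI is the standard MathML one.

```python
import xml.etree.ElementTree as ET

# MathML markup for the expression x^2 + 1 (illustrative example).
MATHML = """
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <mrow>
    <msup>
      <mi>x</mi>
      <mn>2</mn>
    </msup>
    <mo>+</mo>
    <mn>1</mn>
  </mrow>
</math>
"""

NS = "{http://www.w3.org/1998/Math/MathML}"
root = ET.fromstring(MATHML)

# ElementTree reports tags as "{namespace-uri}local-name";
# strip the namespace part to recover the local element names.
local_names = [el.tag.replace(NS, "") for el in root.iter()]
print(local_names)
```

The elements are listed in document order, so msup appears before its two children (the base mi and the exponent mn).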
Summary
XML
XML (eXtensible Markup Language) is a markup metalanguage,
recommended by the W3C since 1998 to facilitate data exchange
via the Web.
• XML represents data and does not present it (separation between
content and presentation);
• The language is text-oriented and uses tags to organize data in
a structured way;
• It is used for storing and sharing documents and for exchanging
data between applications;
• It is simple, flexible, easy to use, standardized, and open.
XML is...
• A tag (or label) is a start or end mark that identifies a
(textual) data element.
• Form: <start_tag> and </start_tag> (the end tag repeats the name
of the start tag).
• Tags indicate the meaning of the sections they mark (the data
elements).
• The whole content is enclosed in a single pair of start and end
tags (the root of the document).
• Data elements can be nested, forming a tree.
• Attribute: name="value", qualifying a tag.
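The points above can be sketched with a small document parsed by Python's standard xml.etree.ElementTree module. The library/book vocabulary and its values are hypothetical names chosen for illustration only.

```python
import xml.etree.ElementTree as ET

# A single root element (library) encloses everything; elements are
# nested to form a tree, and attributes qualify the book tag.
DOC = """<library>
  <book id="b1" lang="en">
    <title>Learning XML</title>
    <author>Jane Doe</author>
  </book>
</library>"""

root = ET.fromstring(DOC)
book = root.find("book")

root_tag = root.tag                        # name of the root element
children = [child.tag for child in book]   # elements nested in <book>
book_id = book.get("id")                   # value of the id attribute
title_text = book.find("title").text       # text enclosed by <title>

print(root_tag, children, book_id, title_text)
```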
XML Structure
XML Example
Prolog
Processing Instructions
Tree of Elements
Attributes
Predefined Attributes
There are four special attributes that are part of the XML
namespace (see Section 4):
• xml:lang,
• xml:space,
• xml:base,
• xml:id.
• The xml:base attribute: each element in an XML document
is associated with a base URI, which xml:base sets explicitly.
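As a rough illustration of how these attributes look to a program, the sketch below reads them with Python's xml.etree.ElementTree. The parser exposes the reserved xml prefix through the fixed URI http://www.w3.org/XML/1998/namespace; the document content and the example.org URL are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is always bound to this namespace URI.
XML_NS = "{http://www.w3.org/XML/1998/namespace}"

DOC = """<doc xml:lang="en" xml:base="http://example.org/docs/">
  <p xml:lang="fr" xml:space="preserve">  Bonjour  </p>
  <p xml:id="intro">Hello</p>
</doc>"""

root = ET.fromstring(DOC)
first, second = root.findall("p")

doc_lang = root.get(XML_NS + "lang")      # language of the whole document
doc_base = root.get(XML_NS + "base")      # base URI set on the root
first_lang = first.get(XML_NS + "lang")   # overridden on this element
first_space = first.get(XML_NS + "space")
second_id = second.get(XML_NS + "id")

print(doc_lang, doc_base, first_lang, first_space, second_id)
```

Note that ElementTree only reports the attribute values; it does not itself resolve relative URIs against xml:base or propagate xml:lang to descendants.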
DTD
A DTD (Document Type Definition) defines a standard structure for
an XML document. It is a grammar that describes how to construct
valid XML documents.
• It can be
• internal:
<!DOCTYPE RootName [DEFINITION]>
• external (SYSTEM or PUBLIC):
<!DOCTYPE RootName SYSTEM "path">
• An element is declared with <!ELEMENT ...>, and its attributes
are declared with <!ATTLIST ...>.
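A minimal sketch of an internal DTD, loaded with Python's standard xml.dom.minidom module. The note vocabulary is a hypothetical example. One caveat worth stating: the standard-library parsers read the DOCTYPE but do not validate the document against it, so this only shows the syntax and how the declared root name is exposed; validation would need a validating parser.

```python
import xml.dom.minidom

# Internal DTD: the declarations sit between [ and ] in the DOCTYPE.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
  <!ATTLIST note lang CDATA #IMPLIED>
]>
<note lang="en">
  <to>Students</to>
  <body>Read chapter 1.</body>
</note>"""

doc = xml.dom.minidom.parseString(DOCUMENT)

doctype_name = doc.doctype.name          # RootName given in the DOCTYPE
root_name = doc.documentElement.tagName  # actual root element

# For a valid document the two names must match.
print(doctype_name, root_name)
```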
DTD Example 1
Entities in DTDs
General Entities
General entities define named pieces of content that are
substituted in the body of the XML document (although they are
declared within the DTD and not in the XML document itself).
• Internal declaration: <!ENTITY entity-name "entity-value">
• External declaration: <!ENTITY entity-name SYSTEM
"entity-URL">
To reference it:
&entity-name;
Examples
• Entity (internal):
<!ENTITY website "http://www.mysite.com">
• In an XML document, if we write:
<url>The website for the DSS course is: &website;</url>
• It will be evaluated to:
<url>The website for the DSS course is:
http://www.mysite.com</url>
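The substitution above can be checked with a short sketch using Python's xml.etree.ElementTree, which expands internal general entities declared in the internal DTD subset. The url root element and the entity come from the example; placing the declaration in an internal DOCTYPE is an assumption made to keep the snippet self-contained.

```python
import xml.etree.ElementTree as ET

# The entity is declared in the internal DTD and referenced in the body.
DOC = """<!DOCTYPE url [
  <!ENTITY website "http://www.mysite.com">
]>
<url>The website for the DSS course is: &website;</url>"""

root = ET.fromstring(DOC)

# The parser replaces &website; with the entity value.
expanded = root.text
print(expanded)
```

Because entity expansion happens inside the parser, documents from untrusted sources deserve care: deeply nested entity definitions can be used to exhaust memory.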
Parameter Entities
Character Entities
&amp;    &
&lt;     <
&gt;     >
&apos;   '
&quot;   "
Note:
Any character can be inserted into a document through a character
reference written as &#decimal_code; or &#xhexadecimal_code;.
Example
The code &#960; or &#x3C0; inserts the Greek letter π.
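A quick check of these references, sketched with Python's xml.etree.ElementTree. The element name m and the sentence are arbitrary choices for illustration.

```python
import xml.etree.ElementTree as ET

# Decimal and hexadecimal character references for π (U+03C0),
# mixed with three of the predefined entities.
root = ET.fromstring("<m>&#960; = &#x3C0; and 1 &lt; 2 &amp; 3 &gt; 2</m>")

text = root.text
print(text)

# 960 in decimal and 3C0 in hexadecimal are the same code point.
same_code_point = (960 == 0x3C0)
```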
Notations
Definition
Namespace
Namespaces were introduced in XML to allow mixing multiple
vocabularies within a single document.
XML Vocabulary
An XML vocabulary is a set of tag names and attributes with a
given meaning.
XML Namespaces
Solution: Namespaces
• Distinguish elements and attributes from different XML
applications that have the same name.
• Associate elements and attributes with a URI.
• Assign a prefix to this URI.
Declaration
A namespace is declared by an attribute attached to an element, and
two forms exist:
• xmlns="uri" defines the default namespace.
• xmlns:prefix="uri" defines a prefix that represents a qualified
namespace.
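Both declaration forms can be seen in the sketch below, which uses Python's xml.etree.ElementTree. The catalog vocabulary and the http://example.org/catalog URI are hypothetical; the XHTML URI is the standard one. The parser reports each name as {uri}local-name, which shows that two elements called title remain distinct.

```python
import xml.etree.ElementTree as ET

CATALOG_NS = "http://example.org/catalog"   # hypothetical vocabulary
XHTML_NS = "http://www.w3.org/1999/xhtml"

# xmlns="..." sets the default namespace; xmlns:h="..." binds prefix h.
DOC = """<catalog xmlns="http://example.org/catalog"
         xmlns:h="http://www.w3.org/1999/xhtml">
  <title>Course list</title>
  <h:title>Page title</h:title>
</catalog>"""

root = ET.fromstring(DOC)

# Unprefixed elements fall in the default namespace,
# h:title falls in the XHTML namespace.
tags = [el.tag for el in root.iter()]
print(tags)

# Searching by prefix requires a prefix-to-URI mapping.
prefixes = {"c": CATALOG_NS, "h": XHTML_NS}
catalog_title = root.find("c:title", prefixes).text
xhtml_title = root.find("h:title", prefixes).text
```

The prefix itself is only a local shorthand: the search above uses c for the catalog namespace even though the document declares it as the default, because only the URI identifies the namespace.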
Removing a Namespace
Summary