CH4 WEB Lecture

XML (eXtensible Markup Language) is a markup language designed to describe data rather than display it, allowing users to define their own tags. It is based on SGML and serves as a standard for data exchange on the web, with a focus on self-descriptiveness and extensibility. The document also covers authoring guidelines, Document Type Definitions (DTD), XML query languages, and the concept of mediators for data integration.

Uploaded by teddy haile

CHAPTER 4: eXtensible Markup Language (XML)

What is XML?
• XML stands for eXtensible Markup Language
• XML is a markup language much like HTML
• XML was designed to describe data, not to display data
• XML tags are not predefined. You must define your own tags
• XML is designed to be self-descriptive
• XML is a W3C Recommendation

XML (cont’d)
• Based on Standard Generalized Markup Language (SGML)
• Version 1.0 introduced by the World Wide Web Consortium (W3C) in 1998
• Bridge for data exchange on the Web

Comparisons

XML                                                    | HTML
-------------------------------------------------------+-----------------------------------------------------
Extensible set of tags                                 | Fixed set of tags
Content oriented                                       | Presentation oriented
Standard data infrastructure                           | No data validation capabilities
Allows multiple output forms                           | Single presentation
Designed to describe data, with focus on what data is  | Designed to display data, with focus on how data looks
Designed for how to store data                         | Designed for how to display data
Tags are not predefined; you must define your own tags | Tags are predefined

Authoring XML Elements
• An XML element is made up of a start tag, an end tag, and data in between.
• Example:
    <director> Matthew Dunn </director>
• Example of another element with the same value:
    <actor> Matthew Dunn </actor>
• XML tags are case-sensitive, so these are three different tags:
    <CITY> <City> <city>
• XML can abbreviate empty elements, for example:
    <married></married> can be abbreviated to <married/>
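
The points above can be checked with a short Python sketch. It uses the standard library module xml.etree.ElementTree, which is an assumption of this example and not something the lecture prescribes.

```python
import xml.etree.ElementTree as ET

# Two elements with different names but the same character data.
director = ET.fromstring("<director> Matthew Dunn </director>")
actor = ET.fromstring("<actor> Matthew Dunn </actor>")
print(director.tag, actor.tag)
print(director.text == actor.text)

# Tag names are case-sensitive: these are three distinct names.
tags = {ET.fromstring(src).tag for src in ("<CITY/>", "<City/>", "<city/>")}
print(sorted(tags))

# A start tag and an end tag that differ only in case do not match.
try:
    ET.fromstring("<City>Emeryville</city>")
    case_mismatch_accepted = True
except ET.ParseError:
    case_mismatch_accepted = False
print(case_mismatch_accepted)

# The long and the abbreviated form of an empty element parse the same way.
empty_long = ET.fromstring("<married></married>")
empty_short = ET.fromstring("<married/>")
print(empty_long.text, empty_short.text, len(empty_long), len(empty_short))
```

Whitespace inside an element is part of its data, which is why the two text values above compare equal only because both include the same surrounding spaces.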

Authoring XML Elements (cont’d)
• An attribute is a name-value pair separated by an equal sign (=).
• Example:
    <City ZIP="94608"> Emeryville </City>
• Attributes are used to attach additional, secondary information to an element.
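
As a small illustration, the attribute from the example can be read back in Python. The use of xml.etree.ElementTree is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

city = ET.fromstring('<City ZIP="94608"> Emeryville </City>')

# Attributes are exposed as a name -> value mapping; values are always strings.
print(city.attrib)
print(city.get("ZIP"))

# The element's own data is kept separately from its attributes.
print(city.text.strip())

# Attribute names are case-sensitive, just like tag names.
print(city.get("zip"))
```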

Authoring XML Documents
• A basic XML document is an XML element that can, but need not, include nested XML elements.
• Example:
    <books>
      <book isbn="123">
        <title> Second Chance </title>
        <author> Matthew Dunn </author>
      </book>
    </books>
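
A minimal sketch of walking the nested elements of this document in Python follows. The helper name list_books is hypothetical, and xml.etree.ElementTree is assumed as the parser.

```python
import xml.etree.ElementTree as ET

DOC = """<books>
  <book isbn="123">
    <title> Second Chance </title>
    <author> Matthew Dunn </author>
  </book>
</books>"""

def list_books(root):
    """Return (isbn, title, author) for every book nested under the root."""
    rows = []
    for book in root.findall("book"):
        title = book.findtext("title", default="").strip()
        author = book.findtext("author", default="").strip()
        rows.append((book.get("isbn"), title, author))
    return rows

books = ET.fromstring(DOC)
print(books.tag)
print(list_books(books))
```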

XML Data Model: Example
    <BOOKS>
      <book id="123" loc="library">
        <author>Hull</author>
        <title>California</title>
        <year> 1995 </year>
      </book>
      <article id="555" ref="123">
        <author>Su</author>
        <title> Purdue</title>
      </article>
    </BOOKS>
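
In this example the article's ref value equals the book's id value. Assuming ref is meant as a reference to the element carrying that id (the slide does not say so explicitly), the link can be followed with a small index. The names by_id and target are hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """<BOOKS>
  <book id="123" loc="library">
    <author>Hull</author>
    <title>California</title>
    <year> 1995 </year>
  </book>
  <article id="555" ref="123">
    <author>Su</author>
    <title> Purdue</title>
  </article>
</BOOKS>"""

root = ET.fromstring(DOC)

# Index every element that carries an id attribute.
by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}

# Follow the article's ref attribute to the element it points at.
article = by_id["555"]
target = by_id[article.get("ref")]
print(target.tag, target.findtext("title"))
```

With such references the data is better viewed as a graph than as a plain tree, since an element can be reached both through its parent and through a reference.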

XML Data Model: Example (cont’d)
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <customer>
      <firstname>Michael</firstname>
      <lastname>Smith</lastname>
      <gender>male</gender>
      <address>
        <street>197 West Park Ave.</street>
        <city>New York</city>
        <state>NY</state>
        <zip>11375</zip>
        <country>US</country>
      </address>
      <phone>718-235-5670</phone>
      <email>[email protected]</email>
    </customer>
The corresponding element tree:
    customer
      firstname
      lastname
      gender
      address
        street
        city
        state
        zip
        country
      phone
      email
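
The element tree of the customer document can be derived mechanically. The sketch below is one way to do it in Python; the function name outline is hypothetical, and the XML declaration is left out because the document is passed as an already decoded string.

```python
import xml.etree.ElementTree as ET

DOC = """<customer>
  <firstname>Michael</firstname>
  <lastname>Smith</lastname>
  <gender>male</gender>
  <address>
    <street>197 West Park Ave.</street>
    <city>New York</city>
    <state>NY</state>
    <zip>11375</zip>
    <country>US</country>
  </address>
  <phone>718-235-5670</phone>
  <email>[email protected]</email>
</customer>"""

def outline(element, depth=0):
    """Return one indented line per element, in document order."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring(DOC)
print("\n".join(outline(root)))
```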

Authoring XML Documents (cont’d)
• Authoring guidelines:
    • All elements must have an end tag.
    • All elements must be cleanly nested (overlapping elements are not allowed).
    • All attribute values must be enclosed in quotation marks.
    • Each document must have a unique first element, the root node.
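
Each guideline corresponds to an error that an XML parser reports. The sketch below feeds one violating snippet per guideline to Python's xml.etree.ElementTree; the helper is_well_formed and the sample snippets are made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as a well-formed document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

samples = {
    "missing end tag": "<a><b></a>",
    "overlapping elements": "<a><b><c></b></c></a>",
    "unquoted attribute value": "<a x=1/>",
    "more than one root element": "<a/><b/>",
    "follows all guidelines": '<a x="1"><b/></a>',
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the last sample is accepted. Well-formedness is a purely syntactic check and says nothing about which elements are allowed where.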

Document Type Definitions (DTD)
• An XML document may have an optional DTD.
• A DTD serves as a grammar for the underlying XML document, and it is part of the XML language.
• DTDs are somewhat unsatisfactory, but no consensus exists so far beyond the basic DTDs.
• A DTD has the form:
    <!DOCTYPE name [markupdeclaration]>

DTD (cont’d)
• Consider an XML document:
    <db>
      <person>
        <name>Alan</name>
        <age>42</age>
        <email>[email protected]</email>
      </person>
      <person>………</person>
      ……….
    </db>

DTD (cont’d)
• DTD for it might be:
    <!DOCTYPE db [
      <!ELEMENT db (person*)>
      <!ELEMENT person (name, age, email)>
      <!ELEMENT name (#PCDATA)>
      <!ELEMENT age (#PCDATA)>
      <!ELEMENT email (#PCDATA)>
    ]>
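
To make the grammar concrete, the sketch below checks a document against these five declarations by hand. It is not a DTD parser or a general validator: the rules are copied into Python manually, the function name conforms is hypothetical, and the e-mail address is a placeholder because the one on the slide is not legible.

```python
import xml.etree.ElementTree as ET

# Children required by <!ELEMENT person (name, age, email)>, in this order.
PERSON_CHILDREN = ["name", "age", "email"]

def conforms(db):
    """Check an element tree against the five declarations above."""
    if db.tag != "db":
        return False
    for person in db:                       # db (person*): zero or more persons
        if person.tag != "person":
            return False
        if [child.tag for child in person] != PERSON_CHILDREN:
            return False
        if any(len(child) for child in person):
            return False                    # (#PCDATA): text only, no child elements
    return True

good = ET.fromstring(
    "<db><person><name>Alan</name><age>42</age>"
    "<email>alan@example.org</email></person></db>"  # placeholder address
)
wrong_order = ET.fromstring(
    "<db><person><age>42</age><name>Alan</name>"
    "<email>alan@example.org</email></person></db>"
)
print(conforms(good), conforms(wrong_order))
```

A document can be well-formed and still fail this check, as the second sample does: every tag is closed and nested correctly, but the children of person appear in the wrong order.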

DTD (cont’d)
Occurrence Indicator:

Indicator        Occurrence             Meaning
(no indicator)   Required               One and only one
?                Optional               None or one
*                Optional, repeatable   None, one, or more
+                Required, repeatable   One or more
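
The three indicators behave like the regular-expression quantifiers of the same name. The sketch below translates a flat, comma-separated content model into a Python regular expression and matches it against the sequence of child tags. It handles only that simple case (no nested groups, no choice with |, no #PCDATA), and the book model in the usage lines is a made-up example.

```python
import re
import xml.etree.ElementTree as ET

def model_to_regex(model):
    """Translate a flat model such as '(title, author+, year?)' to a regex."""
    parts = [part.strip() for part in model.strip().strip("()").split(",")]
    pieces = []
    for part in parts:
        indicator = part[-1] if part[-1] in "?*+" else ""
        name = part[:-1] if indicator else part
        # Each child tag is written as "name " so that names cannot run together.
        pieces.append("(?:" + re.escape(name) + " )" + indicator)
    return re.compile("".join(pieces))

def matches(element, model):
    """Check the element's direct children against the content model."""
    children = "".join(child.tag + " " for child in element)
    return model_to_regex(model).fullmatch(children) is not None

MODEL = "(title, author+, year?)"  # hypothetical content model
two_authors = ET.fromstring("<book><title/><author/><author/></book>")
no_author = ET.fromstring("<book><title/></book>")
two_years = ET.fromstring("<book><title/><author/><year/><year/></book>")
print(matches(two_authors, MODEL), matches(no_author, MODEL), matches(two_years, MODEL))
```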

XML Query Languages
• The first XML query languages:
    • LOREL (Stanford)
    • XQL
• Several other query languages have been developed (e.g. UNQL, XPath)
• XML-QL was considered by W3C for standardization
• Currently W3C is considering and working on a new query language: XQuery

A Query Language for XML: XML-QL
• Developed at AT&T Labs
• Extracts data from the input XML data
• Has variables to which data is bound, and templates that show how the output XML data is to be constructed
• Uses XML syntax
• Based on a where/construct syntax:
    • where combines the from and where parts of SQL
    • construct corresponds to SQL’s select

XML-QL Query: Example 1
• Retrieve all authors of books published by Morgan Kaufmann:
    where <book>
            <publisher>
              <name> Morgan Kaufmann</name>
            </publisher>
            <title> $T </title>
            <author> $A </author>
          </book> in "www.a.b.c/bib.xml"
    construct <result> $A </result>
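
XML-QL engines are not widely available, so the sketch below imitates the query in Python: the loop plays the role of the where pattern (binding $A for each matching book) and the ET.Element call plays the role of construct. The inline bibliography, its titles and author names, and the function authors_by_publisher are all made up for illustration.

```python
import xml.etree.ElementTree as ET

BIB = """
<bib>
  <book>
    <publisher><name> Morgan Kaufmann</name></publisher>
    <title>Book One</title>
    <author>Author A</author>
    <author>Author B</author>
  </book>
  <book>
    <publisher><name>Other Press</name></publisher>
    <title>Book Two</title>
    <author>Author C</author>
  </book>
</bib>
"""

def authors_by_publisher(root, publisher):
    """Return one <result> element per author of a book from the publisher."""
    results = []
    for book in root.iter("book"):
        names = [(n.text or "").strip() for n in book.findall("publisher/name")]
        if publisher not in names:
            continue                        # the where pattern does not match
        for author in book.findall("author"):
            result = ET.Element("result")   # the construct template
            result.text = author.text
            results.append(result)
    return results

root = ET.fromstring(BIB)
for result in authors_by_publisher(root, "Morgan Kaufmann"):
    print(ET.tostring(result, encoding="unicode"))
```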

XML-QL Query: Example 2
• XML-QL query asking for all bookstores that sell The Java Programming Language for under $25:
    where <store>
            <name> $N </name>
            <book>
              <title> The Java Programming Language </title>
              <price> $P </price>
            </book>
          </store> in "www.store/bib.xml"
          $P < 25
    construct <result> $N </result>
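
The same idea extends to a query with a condition on a bound variable. In the Python sketch below the price test stands in for $P < 25; the store names, prices, and the function stores_selling are hypothetical.

```python
import xml.etree.ElementTree as ET

STORES = """
<stores>
  <store>
    <name>Store A</name>
    <book><title> The Java Programming Language </title><price>19.95</price></book>
  </store>
  <store>
    <name>Store B</name>
    <book><title> The Java Programming Language </title><price>29.50</price></book>
  </store>
  <store>
    <name>Store C</name>
    <book><title>Another Title</title><price>9.99</price></book>
  </store>
</stores>
"""

def stores_selling(root, title, max_price):
    """Return names of stores with a book of that title priced below max_price."""
    names = []
    for store in root.iter("store"):
        for book in store.findall("book"):
            book_title = book.findtext("title", default="").strip()
            price = book.findtext("price")
            if book_title == title and price is not None and float(price) < max_price:
                names.append(store.findtext("name", default="").strip())
    return names

root = ET.fromstring(STORES)
print(stores_selling(root, "The Java Programming Language", 25))
```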

Semistructured Data and Mediators
• Semistructured data is often encountered in data exchange and integration
• At the sources the data may be structured (e.g. from relational databases)
• We model the data as semistructured to facilitate exchange and integration
• Users see an integrated semistructured view that they can query
• Queries are eventually reformulated into queries over the structured resources (e.g. SQL)
• Only results need to be materialized

What is a mediator?
• A complex software component that integrates and transforms data from one or several sources using a declarative specification
• Two main contexts:
    • Data conversion: converts data between two different models, e.g. by translating data from a relational database into XML
    • Data integration: integrates data from different sources into a common view

Converting Relational Database to XML
Example: Export the following data into XML and group books by store
• Relational Database:
    Store (sid, name, phone)
    Book (bid, title, authors)
    StoreBook (sid, bid, price, stock)
[Figure: entity-relationship diagram in which Store (sid, name, phone) and Book (bid, title, authors) are linked by StoreBook (price, stock)]

Converting Relational Database to XML (cont’d)
• XML:
    <store>
      <name> … </name>
      <phone> … </phone>
      <book>
        <title> … </title>
        <authors> … </authors>
        <price> … </price>
      </book>
      <book>…</book>
    </store>
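
A minimal sketch of this export in Python follows. The three relations are represented as lists of tuples with made-up rows, the function name export is hypothetical, and the wrapping stores root element is an assumption added so that the output has a single root.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows for Store(sid, name, phone), Book(bid, title, authors)
# and StoreBook(sid, bid, price, stock).
stores = [(1, "Store A", "555-0100"), (2, "Store B", "555-0101")]
books = [(10, "Title X", "Author P"), (11, "Title Y", "Author Q")]
store_books = [(1, 10, 20.0, 3), (1, 11, 35.0, 1), (2, 10, 22.5, 5)]

def export(stores, books, store_books):
    """Build one <store> element per store, with its books nested inside."""
    book_by_id = {bid: (title, authors) for bid, title, authors in books}
    root = ET.Element("stores")  # assumed wrapper element
    for sid, name, phone in stores:
        store = ET.SubElement(root, "store")
        ET.SubElement(store, "name").text = name
        ET.SubElement(store, "phone").text = phone
        for store_id, bid, price, _stock in store_books:
            if store_id != sid:
                continue
            title, authors = book_by_id[bid]
            book = ET.SubElement(store, "book")
            ET.SubElement(book, "title").text = title
            ET.SubElement(book, "authors").text = authors
            ET.SubElement(book, "price").text = str(price)
    return root

root = export(stores, books, store_books)
print(ET.tostring(root, encoding="unicode"))
```

Grouping by store means a book sold in two stores appears twice in the output, once under each store, which the flat relational form avoids through the StoreBook table.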
