10 XML
Well-Formed and Valid XML
Well-formed XML allows you to invent your own tags.
Valid XML conforms to a certain DTD.
Well-Formed XML
Start the document with a declaration, surrounded by <?xml … ?>.
The normal declaration is:
<?xml version = "1.0" standalone = "yes" ?>
standalone = "yes" means “no DTD provided.”
The balance of the document is a root tag surrounding nested tags.
Tags
Tags are normally matched pairs, as <FOO> … </FOO>.
Unmatched tags are also allowed, as <FOO/>.
Tags may be nested arbitrarily.
XML tags are case-sensitive.
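As a small illustration of the tag rules above, the following sketch uses hypothetical tag names (ORDER, ITEM, SKU, GIFT, item) that do not come from the slides. It is only meant to show matched pairs, an unmatched tag, nesting, and case sensitivity in one well-formed document.

```xml
<?xml version="1.0" standalone="yes"?>
<!-- Hypothetical tag names, chosen only for illustration. -->
<ORDER>
  <ITEM>                   <!-- matched pair: <ITEM> ... </ITEM> -->
    <SKU>A-17</SKU>        <!-- nested inside ITEM -->
    <GIFT/>                <!-- unmatched (empty) tag -->
  </ITEM>
  <item>lowercase</item>   <!-- a different tag from ITEM: names are case-sensitive -->
</ORDER>
```

Closing <ITEM> with </item> would make the document ill-formed, because the two names differ in case.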
Example: Well-Formed XML
<?xml version = "1.0" standalone = "yes" ?>
<BARS>
  <BAR><NAME>Joe’s Bar</NAME>
    <BEER><NAME>Bud</NAME>
      <PRICE>2.50</PRICE></BEER>
    <BEER><NAME>Miller</NAME>
      <PRICE>3.00</PRICE></BEER>
  </BAR>
  <BAR> …
</BARS>
<BARS> is the root tag.
Each <NAME> … </NAME> is a NAME subelement.
Each <BEER> … </BEER> is a BEER subelement; <BEER> and </BEER> are the tags surrounding a BEER element.
DTD Structure
<!DOCTYPE <root tag> [
<!ELEMENT <name>(<components>)>
. . . more elements . . .
]>
DTD Elements
The description of an element consists of its name (tag) and a parenthesized description of any nested tags.
This includes the order of subtags and their multiplicity.
Leaves (text elements) have #PCDATA (Parsed Character DATA) in place of nested tags.
Example: DTD
<!DOCTYPE BARS [
  <!ELEMENT BARS (BAR*)>
  <!ELEMENT BAR (NAME, BEER+)>
  <!ELEMENT NAME (#PCDATA)>
  <!ELEMENT BEER (NAME, PRICE)>
  <!ELEMENT PRICE (#PCDATA)>
]>
A BARS object has zero or more BAR’s nested within.
A BAR has one NAME and one or more BEER subobjects.
Element Descriptions
Subtags must appear in the order shown.
A tag may be followed by a symbol to indicate its multiplicity.
* = zero or more.
+ = one or more.
? = zero or one.
The symbol | can connect alternative sequences of tags.
Example: Element Description
A name is an optional title (e.g., “Prof.”), a first name, and a last name, in that order, or it is an IP address:
<!ELEMENT NAME (
  (TITLE?, FIRST, LAST) | IPADDR
)>
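To make the NAME description concrete, here is a sketch of a small document that should validate against it. The PEOPLE root, the declarations of TITLE, FIRST, LAST, and IPADDR as text elements, and the sample values are assumptions added for illustration.

```xml
<?xml version="1.0" standalone="no"?>
<!DOCTYPE PEOPLE [
  <!-- PEOPLE and the four leaf declarations are assumed, not from the slides. -->
  <!ELEMENT PEOPLE (NAME*)>
  <!ELEMENT NAME ((TITLE?, FIRST, LAST) | IPADDR)>
  <!ELEMENT TITLE (#PCDATA)>
  <!ELEMENT FIRST (#PCDATA)>
  <!ELEMENT LAST (#PCDATA)>
  <!ELEMENT IPADDR (#PCDATA)>
]>
<PEOPLE>
  <!-- Title present -->
  <NAME><TITLE>Prof.</TITLE><FIRST>Ada</FIRST><LAST>Lovelace</LAST></NAME>
  <!-- Title omitted, allowed by ? -->
  <NAME><FIRST>Alan</FIRST><LAST>Turing</LAST></NAME>
  <!-- The alternative after | -->
  <NAME><IPADDR>192.0.2.1</IPADDR></NAME>
</PEOPLE>
```

A NAME with LAST before FIRST, or with both FIRST and IPADDR, would not be valid under this description.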
Use of DTD’s
1. Set standalone = "no".
2. Either:
a) Include the DTD as a preamble of the XML document, or
b) Follow DOCTYPE and the <root tag> by SYSTEM and a path to the file where the DTD can be found.
Example: (a)
<?xml version = "1.0" standalone = "no" ?>
<!DOCTYPE BARS [
  <!ELEMENT BARS (BAR*)>
  <!ELEMENT BAR (NAME, BEER+)>
  <!ELEMENT NAME (#PCDATA)>
  <!ELEMENT BEER (NAME, PRICE)>
  <!ELEMENT PRICE (#PCDATA)>
]>
<BARS>
  <BAR><NAME>Joe’s Bar</NAME>
    <BEER><NAME>Bud</NAME> <PRICE>2.50</PRICE></BEER>
    <BEER><NAME>Miller</NAME> <PRICE>3.00</PRICE></BEER>
  </BAR>
  <BAR> …
</BARS>
The lines from <!DOCTYPE BARS [ through ]> are the DTD.
The lines from <BARS> through </BARS> are the document.
Example: (b)
Assume the BARS DTD is in file bar.dtd.
<?xml version = "1.0" standalone = "no" ?>
<!DOCTYPE BARS SYSTEM "bar.dtd">
<BARS>
  <BAR><NAME>Joe’s Bar</NAME>
    <BEER><NAME>Bud</NAME>
      <PRICE>2.50</PRICE></BEER>
    <BEER><NAME>Miller</NAME>
      <PRICE>3.00</PRICE></BEER>
  </BAR>
  <BAR> …
</BARS>
The DOCTYPE line says to get the DTD from the file bar.dtd.
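For reference, one plausible layout of bar.dtd is sketched below, assuming it carries the same element declarations as the DTD in Example (a). An external DTD file holds only the declarations; the <!DOCTYPE BARS … > line stays in the XML document.

```xml
<!-- bar.dtd: declarations only, no DOCTYPE wrapper (contents assumed from Example (a)) -->
<!ELEMENT BARS (BAR*)>
<!ELEMENT BAR (NAME, BEER+)>
<!ELEMENT NAME (#PCDATA)>
<!ELEMENT BEER (NAME, PRICE)>
<!ELEMENT PRICE (#PCDATA)>
```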
Attributes
Opening tags in XML can have attributes.
In a DTD,
<!ATTLIST E . . . >
declares attributes for element E, along with their datatypes.
Example: Attributes
Bars can have an attribute kind, a character string describing the bar.
<!ELEMENT BAR (NAME, BEER*)>
<!ATTLIST BAR kind CDATA #IMPLIED>
CDATA is the character string type; no tags.
#IMPLIED means the attribute is optional; the opposite is #REQUIRED.
Example: Attribute Use
In a document that allows BAR tags, we might see:
<BAR kind = "sushi">
  <NAME>Homma’s</NAME>
  <BEER><NAME>Sapporo</NAME>
    <PRICE>5.00</PRICE></BEER>
  ...
</BAR>
ID’s and IDREF’s
Attributes can be pointers from one object to another.
Compare to HTML’s NAME = "foo" and HREF = "#foo".
This allows the structure of an XML document to be a general graph, rather than just a tree.
Creating ID’s
Give an element E an attribute A of type ID.
When using tag <E> in an XML document, give its attribute A a unique value.
Example:
<E A = "xyz">
Creating IDREF’s
To allow elements of type F to refer to another element with an ID attribute, give F an attribute of type IDREF.
Or, let the attribute have type IDREFS, so the F-element can refer to any number of other elements.
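The two steps (an ID on E, an IDREF or IDREFS on F) can be sketched in one small document. E, F, and A follow the slides; the ROOT element and the attribute names one and many are hypothetical, chosen only for this illustration.

```xml
<?xml version="1.0" standalone="no"?>
<!DOCTYPE ROOT [
  <!-- ROOT, one, and many are assumed names. -->
  <!ELEMENT ROOT (E*, F*)>
  <!ELEMENT E EMPTY>
  <!ATTLIST E A ID #REQUIRED>
  <!ELEMENT F EMPTY>
  <!ATTLIST F one IDREF #IMPLIED>
  <!ATTLIST F many IDREFS #IMPLIED>
]>
<ROOT>
  <E A="xyz"/>
  <E A="abc"/>
  <F one="xyz"/>        <!-- refers to exactly one E -->
  <F many="xyz abc"/>   <!-- refers to both E elements; values are separated by spaces -->
</ROOT>
```

A validating parser should reject a second E with A="xyz" (ID values must be unique) and an F whose reference matches no ID in the document.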
Example: ID’s and IDREF’s
A new BARS DTD includes both BAR and BEER subelements.
BAR and BEER elements each have an ID attribute, name.
BAR elements have SELLS subelements, each consisting of a number (the price of one beer) and an IDREF theBeer leading to that beer.
BEER elements have an attribute soldBy, which is an IDREFS leading to all the bars that sell it.
The DTD
<!DOCTYPE BARS [
  <!ELEMENT BARS (BAR*, BEER*)>
  <!ELEMENT BAR (SELLS+)>
  <!ATTLIST BAR name ID #REQUIRED>
  <!ELEMENT SELLS (#PCDATA)>
  <!ATTLIST SELLS theBeer IDREF #REQUIRED>
  <!ELEMENT BEER EMPTY>
  <!ATTLIST BEER name ID #REQUIRED>
  <!ATTLIST BEER soldBy IDREFS #IMPLIED>
]>
Bar elements have name as an ID attribute and have one or more SELLS subelements.
SELLS elements have a number (the price) and one reference to a beer.
EMPTY is explained under Empty Elements.
Beer elements have an ID attribute called name, and a soldBy attribute that is a set of Bar names.
Example: A Document
<BARS>
  <BAR name = "JoesBar">
    <SELLS theBeer = "Bud">2.50</SELLS>
    <SELLS theBeer = "Miller">3.00</SELLS>
  </BAR> …
  <BEER name = "Bud" soldBy = "JoesBar SuesBar …" /> …
</BARS>
Empty Elements
We can do all the work of an element in its attributes.
Like BEER in the previous example.
Another example: SELLS elements could have an attribute price rather than a value that is a price.
Example: Empty Element
In the DTD, declare:
<!ELEMENT SELLS EMPTY>
<!ATTLIST SELLS theBeer IDREF #REQUIRED>
<!ATTLIST SELLS price CDATA #REQUIRED>
Example use:
<SELLS theBeer = "Bud" price = "2.50" />
Note the exception to the “matching tags” rule.
XML Schema
A more powerful way to describe the structure of XML documents.
XML-Schema declarations are themselves XML documents.
They describe “elements,” and the things doing the describing are also “elements.”
Structure of an XML-Schema Document
<?xml version = … ?>
<xs:schema xmlns:xs = "http://www.w3.org/2001/XMLSchema">
. . .
</xs:schema>
Example: xs:element
<xs:element name = "NAME" type = "xs:string" />
Describes elements such as
<NAME>Joe’s Bar</NAME>
Complex Types
To describe elements that consist of subelements, we use xs:complexType.
Attribute name gives a name to the type.
A typical subelement of a complex type is xs:sequence, which itself has a sequence of xs:element subelements.
Use the minOccurs and maxOccurs attributes to control the number of occurrences of an xs:element.
Example: a Type for Beers
<xs:complexType name = "beerType">
  <xs:sequence>
    <xs:element name = "NAME"
      type = "xs:string"
      minOccurs = "1" maxOccurs = "1" />
    <xs:element name = "PRICE"
      type = "xs:float"
      minOccurs = "0" maxOccurs = "1" />
  </xs:sequence>
</xs:complexType>
NAME: exactly one occurrence.
PRICE: zero or one occurrence, like ? in a DTD.
An Element of Type beerType
<xxx>
  <NAME>Bud</NAME>
  <PRICE>2.50</PRICE>
</xxx>
Example: a Type for Bars
<xs:complexType name = "barType">
  <xs:sequence>
    <xs:element name = "NAME"
      type = "xs:string"
      minOccurs = "1" maxOccurs = "1" />
    <xs:element name = "BEER"
      type = "beerType"
      minOccurs = "0" maxOccurs = "unbounded" />
  </xs:sequence>
</xs:complexType>
BEER: zero or more occurrences, like * in a DTD.
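Putting the two types together, a complete schema might look like the sketch below. The BARS root element and its anonymous complex type are assumptions; the slides show the types but not how they are attached to a root element.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:complexType name="beerType">
    <xs:sequence>
      <xs:element name="NAME" type="xs:string" minOccurs="1" maxOccurs="1"/>
      <xs:element name="PRICE" type="xs:float" minOccurs="0" maxOccurs="1"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="barType">
    <xs:sequence>
      <xs:element name="NAME" type="xs:string" minOccurs="1" maxOccurs="1"/>
      <xs:element name="BEER" type="beerType" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Assumed root: a BARS element holding any number of BAR elements. -->
  <xs:element name="BARS">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="BAR" type="barType" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Under this sketch, the element name for each type is fixed by the xs:element that references it: barType is used by BAR, and beerType by BEER.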
xs:attribute
xs:attribute elements can be used within a complex type to indicate attributes of elements of that type.
Attributes of xs:attribute:
name and type, as for xs:element.
use = "required" or "optional".
Example: xs:attribute
<xs:complexType name = "beerType">
  <xs:attribute name = "name"
    type = "xs:string"
    use = "required" />
  <xs:attribute name = "price"
    type = "xs:float"
    use = "optional" />
</xs:complexType>
An Element of This New Type beerType
We still don’t know the element name.
The element is empty, since there are no declared subelements.
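An instance of the attribute-based beerType might look like the following sketch. The tag xxx is the same placeholder used for the earlier beerType element, and the attribute values are assumed for illustration.

```xml
<!-- xxx is a placeholder tag; the values are assumed. -->
<xxx name="Bud" price="2.50"/>
```

Since price is declared with use = "optional", an element carrying only the name attribute should also conform to the type.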
Restricted Simple Types
xs:simpleType can describe enumerations and range-restricted base types.
name is an attribute.
xs:restriction is a subelement.
Restrictions
Attribute base gives the simple type to be restricted, e.g., xs:integer.
xs:{min, max}{Inclusive, Exclusive} are four subelements that, through their attribute value, can give a lower or upper bound on a numerical range.
xs:enumeration is a subelement with attribute value that allows enumerated types.
Example: license Attribute for BAR
<xs:simpleType name = "license">
  <xs:restriction base = "xs:string">
    <xs:enumeration value = "Full" />
    <xs:enumeration value = "Beer only" />
    <xs:enumeration value = "Sushi" />
  </xs:restriction>
</xs:simpleType>
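The slide defines the license type but does not show it attached to BAR. A minimal sketch of one way to do so follows, assuming BAR is declared with an anonymous complex type, has a single NAME subelement, and treats the attribute as optional.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:simpleType name="license">
    <xs:restriction base="xs:string">
      <xs:enumeration value="Full"/>
      <xs:enumeration value="Beer only"/>
      <xs:enumeration value="Sushi"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Assumed declaration: BAR with a NAME subelement and a license attribute. -->
  <xs:element name="BAR">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="NAME" type="xs:string"/>
      </xs:sequence>
      <!-- Attribute declarations follow the content model. -->
      <xs:attribute name="license" type="license" use="optional"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

With this sketch, a BAR element with license = "Sushi" should be accepted, while a value outside the three enumerated strings should be rejected.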
Example: Prices in Range [1,5)
<xs:simpleType name = "price">
  <xs:restriction base = "xs:float">
    <xs:minInclusive value = "1.00" />
    <xs:maxExclusive value = "5.00" />
  </xs:restriction>
</xs:simpleType>
Keys in XML Schema
An xs:element can have an xs:key subelement.
Meaning: within this element, all subelements reached by a certain selector path will have unique values for a certain combination of fields.
Example: within one BAR element, the name attribute of a BEER element is unique.
Example: Key
<xs:element name = "BAR" … >
. . .
<xs:key name = "barKey">
In a path, @ indicates an attribute rather than a tag.
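The key declaration above is cut off in this copy of the slides. As a hedged sketch only, a complete declaration matching the earlier description (within one BAR element, the name attribute of a BEER element is unique) could look like this. The anonymous complex types, the selector path BEER, and the field path @name are assumptions, not the slide's original text.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="BAR">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="BEER" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="name" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
    <!-- Assumed key: selector picks the BEER children, field picks their name attribute. -->
    <xs:key name="barKey">
      <xs:selector xpath="BEER"/>
      <xs:field xpath="@name"/>
    </xs:key>
  </xs:element>
</xs:schema>
```

Under this sketch, a BAR containing two BEER elements that both have name = "Bud" should fail validation, while the same beer name may appear in different BAR elements.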
Example: Foreign Key
Suppose that we have declared that subelement NAME of BAR is a key for BARS.
The name of the key is barKey.
We wish to declare DRINKER elements that have FREQ subelements.
An attribute bar of FREQ is a foreign key, referring to the NAME of a BAR.
Example: Foreign Key in XML Schema
<xs:element name = "DRINKERS">
  . . .
  <xs:keyref name = "barRef" refer = "barKey">
    <xs:selector xpath = "DRINKER/FREQ" />
    <xs:field xpath = "@bar" />
  </xs:keyref>
</xs:element>