4-Schemas

XML


An Introduction to XML and Web Technologies
Schema Languages

Anders Møller & Michael I. Schwartzbach
© 2006 Addison-Wesley

Objectives

• The purpose of using schemas
• The schema languages DTD and XML Schema (and DSD2 and RELAX NG)
• Regular expressions – a commonly used formalism in schema languages

Motivation

• We have designed our Recipe Markup Language
• ...but so far only informally described its syntax
• How can we make tools that check that an XML document is a syntactically correct Recipe Markup Language document (and thus meaningful)?
• Implementing a specialized validation tool for Recipe Markup Language is not the solution...

XML Languages

• XML language: a set of XML documents with some semantics
• schema: a formal definition of the syntax of an XML language
• schema language: a notation for writing schemas
Validation

[Figure: an instance document and a schema are given to a schema processor. If the document is valid, the result is a normalized instance document; if it is invalid, the result is an error message.]

Why use Schemas?

• Formal but human-readable descriptions
• Data validation can be performed with existing schema processors

General Requirements

• Expressiveness
• Efficiency
• Comprehensibility

Regular Expressions

• Commonly used in schema languages to describe sequences of characters or elements
• Σ: an alphabet (typically Unicode characters or element names)
• σ∈Σ matches the string σ
• α? matches zero or one α
• α* matches zero or more α's
• α+ matches one or more α's
• α β matches any concatenation of an α and a β
• α | β matches the union of α and β
Examples

• A regular expression describing integers:

  0|-?(1|2|3|4|5|6|7|8|9)(0|1|2|3|4|5|6|7|8|9)*

• A regular expression describing the valid contents of table elements in XHTML:

  caption? ( col* | colgroup* ) thead? tfoot? ( tbody+ | tr+ )

DTD – Document Type Definition

• Defined as a subset of the DTD formalism from SGML
• Specified as an integral part of XML 1.0
• A starting point for development of more expressive schema languages
• Considers elements, attributes, and character data – processing instructions and comments are mostly ignored
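The two expressions on the Examples slide can be tried out directly. The following is a minimal sketch in Python, assuming the standard `re` module as the regular expression engine; the function names and the trick of encoding each element name as "name " (name plus a space) are illustrative choices, not part of the slides.

```python
import re

# The integer expression from the Examples slide, in Python regex syntax.
INTEGER = re.compile(r"0|-?(1|2|3|4|5|6|7|8|9)(0|1|2|3|4|5|6|7|8|9)*")


def is_integer(text):
    """True if the whole string matches the integer expression."""
    return INTEGER.fullmatch(text) is not None


# The XHTML table content model. The alphabet is element names rather than
# characters, so each child name is encoded as "name " (assumption of this
# sketch) to keep names such as col and colgroup apart.
TABLE = re.compile(
    r"(caption )?((col )*|(colgroup )*)(thead )?(tfoot )?((tbody )+|(tr )+)"
)


def valid_table_children(names):
    """True if the sequence of child element names matches the content model."""
    encoded = "".join(name + " " for name in names)
    return TABLE.fullmatch(encoded) is not None
```

Note that `fullmatch` is used because a schema regular expression must describe the entire value, not just a prefix of it.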

Document Type Declarations

• Associates a DTD schema with the instance document

• <?xml version="1.1"?>
  <!DOCTYPE collection SYSTEM "http://www.brics.dk/ixwt/recipes.dtd">
  <collection>
    ...
  </collection>

• <!DOCTYPE html
    PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

• <!DOCTYPE collection [ ... ]>

Element Declarations

<!ELEMENT element-name content-model >

Content models:
• EMPTY
• ANY
• mixed content: (#PCDATA|e1|e2|...|en)*
• element content: regular expression over element names (concatenation is written with ",")

Example:
<!ELEMENT table
  (caption?,(col*|colgroup*),thead?,tfoot?,(tbody+|tr+)) >
Attribute-List Declarations

<!ATTLIST element-name attribute-definitions >

Each attribute definition consists of
• an attribute name
• an attribute type
• a default declaration

Example:
<!ATTLIST input maxlength CDATA #IMPLIED
                tabindex CDATA #IMPLIED>

Attribute Types

• CDATA: any value
• enumeration: (s1|s2|...|sn)
• ID: must have unique value
• IDREF (/ IDREFS): must match some ID attribute(s)
• ...

Examples:
<!ATTLIST p align (left|center|right|justify) #IMPLIED>
<!ATTLIST recipe id ID #IMPLIED>
<!ATTLIST related ref IDREF #IMPLIED>

Attribute Default Declarations

• #REQUIRED
• #IMPLIED (= optional)
• "value" (= optional, but default provided)
• #FIXED "value" (= the attribute always has this value; if it is written in the document, it must match)

Examples:
<!ATTLIST form
  action CDATA #REQUIRED
  onsubmit CDATA #IMPLIED
  method (get|post) "get"
  enctype CDATA "application/x-www-form-urlencoded" >
<!ATTLIST html
  xmlns CDATA #FIXED "http://www.w3.org/1999/xhtml">

Entity Declarations (1/3)

• Internal entity declarations – a simple macro mechanism

Example:
• Schema:
  <!ENTITY copyrightnotice "Copyright &#169; 2005 Widgets'R'Us.">
• Input:
  A gadget has a medium size head and a big gizmo subwidget.
  &copyrightnotice;
• Output:
  A gadget has a medium size head and a big gizmo subwidget.
  Copyright &#169; 2005 Widgets'R'Us.
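The macro expansion described for internal entities can be observed with an ordinary XML parser. This is a small sketch using Python's standard `xml.etree.ElementTree`; the wrapper element name `text` and the use of an internal DTD subset are assumptions made only so that the example is a complete document.

```python
import xml.etree.ElementTree as ET

# The entity declaration from the slide, placed in an internal DTD subset.
# The root element name "text" is hypothetical.
DOC = """<!DOCTYPE text [
  <!ENTITY copyrightnotice "Copyright &#169; 2005 Widgets'R'Us.">
]>
<text>A gadget has a medium size head and a big gizmo subwidget.
&copyrightnotice;</text>"""

root = ET.fromstring(DOC)

# The parser replaces the entity reference with its replacement text;
# the character reference &#169; becomes the copyright sign.
expanded = root.text
```

After parsing, `expanded` holds the sentence followed by the expanded copyright notice, with no entity reference left in it.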
Entity Declarations (2/3)

• Internal parameter entity declarations – apply to the DTD, not the instance document

Example:
• Schema:
  <!ENTITY % Shape "(rect|circle|poly|default)">
• <!ATTLIST area shape %Shape; "rect">
  corresponds to
  <!ATTLIST area shape (rect|circle|poly|default) "rect">

Entity Declarations (3/3)

• External parsed entity declarations – references to XML data in other files
  Example:
  • <!ENTITY widgets
      SYSTEM "http://www.brics.dk/ixwt/widgets.xml">
  not widely used!

• External unparsed entity declarations – references to non-XML data
  Example:
  • <!ENTITY widget-image
      SYSTEM "http://www.brics.dk/ixwt/widget.gif"
      NDATA gif >
  • <!NOTATION gif
      SYSTEM "http://www.iana.org/assignments/media-types/image/gif">
  • <!ATTLIST thing img ENTITY #REQUIRED>

Conditional Sections

• Allow parts of schemas to be enabled/disabled by a switch

Example:
• <![%person.simple; [
  <!ELEMENT person (firstname,lastname)>
  ]]>
  <![%person.full; [
  <!ELEMENT person (firstname,lastname,email+,phone?)>
  <!ELEMENT email (#PCDATA)>
  <!ELEMENT phone (#PCDATA)>
  ]]>
  <!ELEMENT firstname (#PCDATA)>
  <!ELEMENT lastname (#PCDATA)>
• <!ENTITY % person.simple "INCLUDE" >
  <!ENTITY % person.full "IGNORE" >

Checking Validity with DTD

A DTD processor (also called a validating XML parser)
• parses the input document (includes checking well-formedness)
• checks the root element name
• for each element, checks its contents and attributes
• checks uniqueness and referential constraints (ID/IDREF(S) attributes)
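The four steps of a DTD processor can be sketched in a few lines. The code below is not a real DTD processor: it assumes the content models of the RecipeML DTD have been copied by hand into Python regular expressions over child element names (each name followed by a space), it checks only the `id` and `ref` attributes, and it ignores character data. The table `CONTENT_MODELS` and the function `validate` are hypothetical names.

```python
import re
import xml.etree.ElementTree as ET

# Hand-translated content models (assumption of this sketch): "," becomes
# plain concatenation and every element name is followed by a space.
CONTENT_MODELS = {
    "collection": r"description (recipe )*",
    "recipe": r"title date (ingredient )*preparation (comment )?nutrition (related )*",
    "ingredient": r"((ingredient )*preparation )?",
    "preparation": r"(step )*",
    # (#PCDATA) and EMPTY elements have no child elements.
    "description": r"", "title": r"", "date": r"", "step": r"",
    "comment": r"", "nutrition": r"", "related": r"",
}


def validate(xml_text, root_name):
    """Return a list of error messages; an empty list means none were found."""
    # Step 1: parsing also checks well-formedness (raises ParseError if not).
    root = ET.fromstring(xml_text)
    errors = []
    # Step 2: check the root element name.
    if root.tag != root_name:
        errors.append(f"root element is {root.tag}, expected {root_name}")
    ids, refs = set(), []
    # Step 3: check the contents of every element.
    for elem in root.iter():
        model = CONTENT_MODELS.get(elem.tag)
        if model is None:
            errors.append(f"undeclared element {elem.tag}")
            continue
        children = "".join(child.tag + " " for child in elem)
        if re.fullmatch(model, children) is None:
            errors.append(f"invalid contents of {elem.tag}")
        # Collect ID and IDREF values for step 4.
        if elem.tag == "recipe" and "id" in elem.attrib:
            if elem.attrib["id"] in ids:
                errors.append(f"duplicate ID {elem.attrib['id']}")
            ids.add(elem.attrib["id"])
        if elem.tag == "related" and "ref" in elem.attrib:
            refs.append(elem.attrib["ref"])
    # Step 4: every IDREF must match some ID.
    errors.extend(f"dangling IDREF {ref}" for ref in refs if ref not in ids)
    return errors
```

A call such as `validate(document_text, "collection")` returns an empty list for a document that passes these checks and one message per problem otherwise.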
RecipeML with DTD (1/2)

<!ELEMENT collection (description,recipe*)>
<!ELEMENT description (#PCDATA)>
<!ELEMENT recipe
  (title,date,ingredient*,preparation,comment?,
   nutrition,related*)>
<!ATTLIST recipe id ID #IMPLIED>
<!ELEMENT title (#PCDATA)>
<!ELEMENT date (#PCDATA)>
<!ELEMENT ingredient (ingredient*,preparation)?>
<!ATTLIST ingredient name CDATA #REQUIRED
                     amount CDATA #IMPLIED
                     unit CDATA #IMPLIED>

RecipeML with DTD (2/2)

<!ELEMENT preparation (step*)>
<!ELEMENT step (#PCDATA)>
<!ELEMENT comment (#PCDATA)>
<!ELEMENT nutrition EMPTY>
<!ATTLIST nutrition calories CDATA #REQUIRED
                    carbohydrates CDATA #REQUIRED
                    fat CDATA #REQUIRED
                    protein CDATA #REQUIRED
                    alcohol CDATA #IMPLIED>
<!ELEMENT related EMPTY>
<!ATTLIST related ref IDREF #REQUIRED>

Problems with the DTD description

• calories should contain a non-negative number
• protein should contain a value of the form N% where N is between 0 and 100
• comment should be allowed to appear anywhere in the contents of recipe
• unit should only be allowed in an element where amount is also present
• nested ingredient elements should only be allowed when amount is absent

– our DTD schema permits in some cases too much and in other cases too little!

Limitations of DTD

1. Cannot constrain character data
2. Specification of attribute values is too limited
3. Element and attribute declarations are context insensitive
4. Character data cannot be combined with the regular expression content model
5. The content models lack an "interleaving" operator
6. The support for modularity, reuse, and evolution is too primitive
7. The normalization features lack content defaults and proper whitespace control
8. Structured embedded self-documentation is not possible
9. The ID/IDREF mechanism is too simple
10. It does not itself use an XML syntax
11. No support for namespaces
Requirements for XML Schema

- W3C's proposal for replacing DTD

Design principles:
• More expressive than DTD
• Use XML notation
• Self-describing
• Simplicity

Technical requirements:
• Namespace support
• User-defined datatypes
• Inheritance (OO-like)
• Evolution
• Embedded documentation
• ...

Types and Declarations

• Simple type definition: defines a family of Unicode text strings
• Complex type definition: defines a content and attribute model
• Element declaration: associates an element name with a simple or complex type
• Attribute declaration: associates an attribute name with a simple type

Example (1/3)

Instance document:

<b:card xmlns:b="http://businesscard.org">
  <b:name>John Doe</b:name>
  <b:title>CEO, Widget Inc.</b:title>
  <b:email>[email protected]</b:email>
  <b:phone>(202) 555-1414</b:phone>
  <b:logo b:uri="widget.gif"/>
</b:card>

Example (2/3)

Schema:

<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:b="http://businesscard.org"
        targetNamespace="http://businesscard.org">

  <element name="card" type="b:card_type"/>
  <element name="name" type="string"/>
  <element name="title" type="string"/>
  <element name="email" type="string"/>
  <element name="phone" type="string"/>
  <element name="logo" type="b:logo_type"/>
  <attribute name="uri" type="anyURI"/>
Example (3/3)

  <complexType name="card_type">
    <sequence>
      <element ref="b:name"/>
      <element ref="b:title"/>
      <element ref="b:email"/>
      <element ref="b:phone" minOccurs="0"/>
      <element ref="b:logo" minOccurs="0"/>
    </sequence>
  </complexType>

  <complexType name="logo_type">
    <attribute ref="b:uri" use="required"/>
  </complexType>
</schema>

Connecting Schemas and Instances

<b:card xmlns:b="http://businesscard.org"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://businesscard.org
                            business_card.xsd">
  <b:name>John Doe</b:name>
  <b:title>CEO, Widget Inc.</b:title>
  <b:email>[email protected]</b:email>
  <b:phone>(202) 555-1414</b:phone>
  <b:logo b:uri="widget.gif"/>
</b:card>

Element and Attribute Declarations

Examples:

• <element name="serialnumber"
           type="nonNegativeInteger"/>

• <attribute name="alcohol"
             type="r:percentage"/>

Simple Types (Datatypes) – Primitive

string         any Unicode string
boolean        true, false, 1, 0
decimal        3.1415
float          6.02214199E23
double         42E970
dateTime       2004-09-26T16:29:00-05:00
time           16:29:00-05:00
date           2004-09-26
hexBinary      48656c6c6f0a
base64Binary   SGVsbG8K
anyURI         http://www.brics.dk/ixwt/
QName          rcp:recipe, recipe
...
Derivation of Simple Types – Restriction

Constraining facets:
• length
• minLength
• maxLength
• pattern
• enumeration
• whiteSpace
• maxInclusive
• maxExclusive
• minInclusive
• minExclusive
• totalDigits
• fractionDigits

Examples

<simpleType name="score_from_0_to_100">
  <restriction base="integer">
    <minInclusive value="0"/>
    <maxInclusive value="100"/>
  </restriction>
</simpleType>

<simpleType name="percentage">
  <restriction base="string">
    <pattern value="([0-9]|[1-9][0-9]|100)%"/>
  </restriction>
</simpleType>

(the pattern value is a regular expression)
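To see which values the two restricted types accept, the facets can be mimicked in ordinary code. The following sketch uses Python's `re` module; it assumes that a `pattern` facet must match the entire value (hence `fullmatch`) and that an integer is written as an optional sign followed by digits. The function names are illustrative.

```python
import re

# The pattern facet of the percentage type, copied from the slide.
PERCENTAGE = re.compile(r"([0-9]|[1-9][0-9]|100)%")

# Assumed lexical form of an integer: optional sign, then digits.
INTEGER = re.compile(r"[+-]?[0-9]+")


def is_percentage(value):
    """Mimics the pattern facet: the whole value must match."""
    return PERCENTAGE.fullmatch(value) is not None


def is_score_from_0_to_100(value):
    """Mimics minInclusive=0 and maxInclusive=100 on an integer base."""
    value = value.strip()
    if INTEGER.fullmatch(value) is None:
        return False
    return 0 <= int(value) <= 100
```

The two types illustrate the difference between a lexical constraint (`pattern` works on the characters) and a value constraint (`minInclusive` and `maxInclusive` work on the number the characters denote).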

Simple Type Derivation – List

<simpleType name="integerList">
  <list itemType="integer"/>
</simpleType>

matches whitespace separated lists of integers

Simple Type Derivation – Union

<simpleType name="boolean_or_decimal">
  <union>
    <simpleType>
      <restriction base="boolean"/>
    </simpleType>
    <simpleType>
      <restriction base="decimal"/>
    </simpleType>
  </union>
</simpleType>
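List and union derivation have simple operational readings: a list value is split on whitespace and every item is checked against the item type, and a union value is accepted if any member type accepts it. The sketch below assumes simplified lexical forms for integer, decimal, and boolean; the regular expressions and function names are illustrative, not taken from the slides.

```python
import re

# Assumed lexical forms of the member and item types.
INTEGER = re.compile(r"[+-]?[0-9]+")
DECIMAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")
BOOLEAN = {"true", "false", "1", "0"}


def is_integer_list(value):
    """integerList: every whitespace-separated item must be an integer."""
    return all(INTEGER.fullmatch(item) for item in value.split())


def is_boolean_or_decimal(value):
    """boolean_or_decimal: accepted if either member type accepts it."""
    value = value.strip()
    return value in BOOLEAN or DECIMAL.fullmatch(value) is not None
```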
Built-In Derived Simple Types

• normalizedString
• token
• language
• Name
• NCName
• ID
• IDREF
• integer
• nonNegativeInteger
• unsignedLong
• long
• int
• short
• byte
• ...

Complex Types with Complex Contents

• Content models as regular expressions:
  • Element reference: <element ref="name"/>
  • Concatenation: <sequence> ... </sequence>
  • Union: <choice> ... </choice>
  • All: <all> ... </all>
  • Element wildcard: <any namespace="..." processContents="..."/>
• Attribute reference: <attribute ref="..."/>
• Attribute wildcard: <anyAttribute namespace="..." processContents="..."/>

Cardinalities: minOccurs, maxOccurs, use

Mixed content: mixed="true"

Example

<element name="order" type="n:order_type"/>

<complexType name="order_type" mixed="true">
  <choice>
    <element ref="n:address"/>
    <sequence>
      <element ref="n:email"
               minOccurs="0" maxOccurs="unbounded"/>
      <element ref="n:phone"/>
    </sequence>
  </choice>
  <attribute ref="n:id" use="required"/>
</complexType>

Complex Types with Simple Content

<complexType name="category">
  <simpleContent>
    <extension base="integer">
      <attribute ref="r:class"/>
    </extension>
  </simpleContent>
</complexType>

<complexType name="extended_category">
  <simpleContent>
    <extension base="n:category">
      <attribute ref="r:kind"/>
    </extension>
  </simpleContent>
</complexType>

<complexType name="restricted_category">
  <simpleContent>
    <restriction base="n:category">
      <totalDigits value="3"/>
      <attribute ref="r:class" use="required"/>
    </restriction>
  </simpleContent>
</complexType>
Derivation with Complex Content

<complexType name="basic_card_type">
  <sequence>
    <element ref="b:name"/>
  </sequence>
</complexType>

<complexType name="extended_type">
  <complexContent>
    <extension base="b:basic_card_type">
      <sequence>
        <element ref="b:title"/>
        <element ref="b:email" minOccurs="0"/>
      </sequence>
    </extension>
  </complexContent>
</complexType>

<complexType name="further_derived">
  <complexContent>
    <restriction base="b:extended_type">
      <sequence>
        <element ref="b:name"/>
        <element ref="b:title"/>
        <element ref="b:email"/>
      </sequence>
    </restriction>
  </complexContent>
</complexType>

Note: restriction is not the opposite of extension!

Global vs. Local Descriptions

Global (toplevel) style:

<element name="card" type="b:card_type"/>
<element name="name" type="string"/>
<complexType name="card_type">
  <sequence>
    <element ref="b:name"/>
    ...
  </sequence>
</complexType>

Local (inlined) style (the complexType and the name element are inlined):

<element name="card">
  <complexType>
    <sequence>
      <element name="name" type="string"/>
      ...
    </sequence>
  </complexType>
</element>

Global vs. Local Descriptions

• Local type definitions are anonymous
• Local element/attribute declarations can be overloaded – a simple form of context sensitivity (particularly useful for attributes!)
• Only globally declared elements can be starting points for validation (e.g. roots)
• Local definitions permit an alternative namespace semantics (explained later...)

Requirements to Complex Types

• Two element declarations that have the same name and appear in the same complex type must have identical types. The following definition violates this requirement:

  <complexType name="some_type">
    <choice>
      <element name="foo" type="string"/>
      <element name="foo" type="integer"/>
    </choice>
  </complexType>

  • This requirement makes efficient implementation easier
• all can only contain element (e.g. not sequence!)
  • so we cannot use all to solve the problem with comment in RecipeML
• ...
Namespaces

• <schema targetNamespace="..." ...>
• Prefixes are also used in certain attribute values!
• Unqualified Locals:
  • if enabled, the name of a locally declared element or attribute in the instance document must have no namespace prefix (i.e. the empty namespace URI)
  • such an attribute or element "belongs to" the element declared in the surrounding global definition
  • always change the default behavior using elementFormDefault="qualified"

Derived Types and Subsumption

• Assume that
  • T is some type
  • T- is derived from T by restriction
  • T+ is derived from T by extension
• Subsumption: Whenever a T instance is required,
  • a T- instance may be used instead (trivial)
  • a T+ instance may be used instead – if the instance has xsi:type="T+"
    (with xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance")
• Derivation, instantiation, and subsumption can be constrained using final, abstract, and block

Substitution Groups

• Assume D is (in some number of steps) derived from B, ED is an element declaration of type D, and EB is an element declaration of type B
• If ED is in the substitution group of EB, then an ED element may be used whenever an EB is required
• (This is subsumption based on element declarations, not on types)

Uniqueness, Keys, References

<element name="w:widget" xmlns:w="http://www.widget.org">
  <complexType>
    ...
  </complexType>
  <key name="my_widget_key">
    <selector xpath="w:components/w:part"/>
    <field xpath="@manufacturer"/>
    <field xpath="w:info/@productid"/>
  </key>
  <keyref name="annotation_references" refer="w:my_widget_key">
    <selector xpath=".//w:annotation"/>
    <field xpath="@manu"/>
    <field xpath="@prod"/>
  </keyref>
</element>

• key: in every widget, each part must have unique (manufacturer, productid)
• keyref: in every widget, for each annotation, (manu, prod) must match a my_widget_key
• only a "downward" subset of XPath is used
• unique: as key, but fields may be absent
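The meaning of the `key` and `keyref` definitions on the widget slide can be spelled out procedurally. The sketch below uses Python's `xml.etree.ElementTree` with the selector and field paths from the slide; the function name `check_widget_keys` and the decision to skip a reference with an absent field are assumptions of this sketch, not statements about what a schema processor does.

```python
import xml.etree.ElementTree as ET

NS = {"w": "http://www.widget.org"}


def check_widget_keys(widget):
    """Check my_widget_key and annotation_references on one widget element."""
    errors = []
    keys = set()
    # key: selector w:components/w:part, fields @manufacturer and
    # w:info/@productid. Both fields must be present and the pair unique.
    for part in widget.findall("w:components/w:part", NS):
        info = part.find("w:info", NS)
        productid = info.get("productid") if info is not None else None
        key = (part.get("manufacturer"), productid)
        if None in key:
            errors.append("key field missing")
        elif key in keys:
            errors.append(f"duplicate key {key}")
        else:
            keys.add(key)
    # keyref: selector .//w:annotation, fields @manu and @prod.
    for annotation in widget.findall(".//w:annotation", NS):
        ref = (annotation.get("manu"), annotation.get("prod"))
        if None in ref:
            continue  # incomplete references are skipped in this sketch
        if ref not in keys:
            errors.append(f"dangling keyref {ref}")
    return errors
```

Passing the root of a parsed widget document, for example `check_widget_keys(ET.fromstring(text))`, returns an empty list when both constraints hold.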
Other Features in XML Schema

• Groups
• Nil values
• Annotations
• Defaults and whitespace
• Modularization

– read the book chapter

RecipeML with XML Schema (1/5)

<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:r="http://www.brics.dk/ixwt/recipes"
        targetNamespace="http://www.brics.dk/ixwt/recipes"
        elementFormDefault="qualified">

  <element name="collection">
    <complexType>
      <sequence>
        <element name="description" type="string"/>
        <element ref="r:recipe" minOccurs="0" maxOccurs="unbounded"/>
      </sequence>
    </complexType>
    <unique name="recipe-id-uniqueness">
      <selector xpath=".//r:recipe"/>
      <field xpath="@id"/>
    </unique>
    <keyref name="recipe-references" refer="r:recipe-id-uniqueness">
      <selector xpath=".//r:related"/>
      <field xpath="@ref"/>
    </keyref>
  </element>

RecipeML with XML Schema (2/5)

  <element name="recipe">
    <complexType>
      <sequence>
        <element name="title" type="string"/>
        <element name="date" type="string"/>
        <element ref="r:ingredient" minOccurs="0" maxOccurs="unbounded"/>
        <element ref="r:preparation"/>
        <element name="comment" type="string" minOccurs="0"/>
        <element ref="r:nutrition"/>
        <element ref="r:related" minOccurs="0" maxOccurs="unbounded"/>
      </sequence>
      <attribute name="id" type="NMTOKEN"/>
    </complexType>
  </element>

RecipeML with XML Schema (3/5)

  <element name="ingredient">
    <complexType>
      <sequence minOccurs="0">
        <element ref="r:ingredient" minOccurs="0" maxOccurs="unbounded"/>
        <element ref="r:preparation"/>
      </sequence>
      <attribute name="name" use="required"/>
      <attribute name="amount" use="optional">
        <simpleType>
          <union>
            <simpleType>
              <restriction base="r:nonNegativeDecimal"/>
            </simpleType>
            <simpleType>
              <restriction base="string">
                <enumeration value="*"/>
              </restriction>
            </simpleType>
          </union>
        </simpleType>
      </attribute>
      <attribute name="unit" use="optional"/>
    </complexType>
  </element>
RecipeML with XML Schema (4/5)

  <element name="preparation">
    <complexType>
      <sequence>
        <element name="step" type="string" minOccurs="0" maxOccurs="unbounded"/>
      </sequence>
    </complexType>
  </element>

  <element name="nutrition">
    <complexType>
      <attribute name="calories" type="r:nonNegativeDecimal" use="required"/>
      <attribute name="protein" type="r:percentage" use="required"/>
      <attribute name="carbohydrates" type="r:percentage" use="required"/>
      <attribute name="fat" type="r:percentage" use="required"/>
      <attribute name="alcohol" type="r:percentage" use="optional"/>
    </complexType>
  </element>

  <element name="related">
    <complexType>
      <attribute name="ref" type="NMTOKEN" use="required"/>
    </complexType>
  </element>

RecipeML with XML Schema (5/5)

  <simpleType name="nonNegativeDecimal">
    <restriction base="decimal">
      <minInclusive value="0"/>
    </restriction>
  </simpleType>

  <simpleType name="percentage">
    <restriction base="string">
      <pattern value="([0-9]|[1-9][0-9]|100)%"/>
    </restriction>
  </simpleType>

</schema>

Problems with the XML Schema description

• calories should contain a non-negative number (solved)
• protein should contain a value of the form N% where N is between 0 and 100 (solved)
• comment should be allowed to appear anywhere in the contents of recipe
• unit should only be allowed in an element where amount is also present
• nested ingredient elements should only be allowed when amount is absent

– even XML Schema has insufficient expressiveness!

Limitations of XML Schema

1. The details are extremely complicated (and the spec is unreadable)
2. Declarations are (mostly) context insensitive
3. It is impossible to write an XML Schema description of XML Schema
4. With mixed content, character data cannot be constrained
5. Unqualified local elements are bad practice
6. Cannot require specific root element
7. Element defaults cannot contain markup
8. The type system is overly complicated
9. xsi:type is problematic
10. Simple type definitions are inflexible
Strengths of XML Schema

• Namespace support
• Data types (built-in and derivation)
• Modularization
• Type derivation mechanism

Document Structure Description 2.0

– read the book chapter

RELAX NG

• OASIS + ISO competitor to XML Schema
• Validation only (no normalization)
• Designed for simplicity and expressiveness, solid mathematical foundation

Processing Model

• For a valid instance document, the root element must match a designated pattern
• A pattern may match elements, attributes, or character data
• Element patterns can contain sub-patterns that describe contents and attributes
Patterns – Regular Hedge Expressions

• <element name="..."> ... </element>
• <attribute name="..."> ... </attribute>
• <text/>
• <group> ... </group> (concatenation)
• <optional> ... </optional>
• <zeroOrMore> ... </zeroOrMore>
• <oneOrMore> ... </oneOrMore>
• <choice> ... </choice> (union)
• <empty/>
• <interleave> ... </interleave>
• <mixed> ... </mixed>

Example

<element name="card">
  <element name="name"><text/></element>
  <element name="title"><text/></element>
  <element name="email"><text/></element>
  <optional>
    <element name="phone"><text/></element>
  </optional>
  <optional>
    <element name="logo">
      <attribute name="uri"><text/></attribute>
    </element>
  </optional>
</element>

Grammars

• Pattern definitions and references allow description of recursive structures

• <grammar ...>
    <start>
      ...
    </start>
    <define name="...">
      ...
    </define>
    ...
  </grammar>

Other Features in RELAX NG

• Name classes
• Datatypes (based on XML Schema's datatypes)
• Modularization
• An alternative compact, non-XML syntax

– read the book chapter
RecipeML with RELAX NG (1/5)

<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         ns="http://www.brics.dk/ixwt/recipes"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <element name="collection">
      <element name="description"><text/></element>
      <zeroOrMore><ref name="element-recipe"/></zeroOrMore>
    </element>
  </start>

  <define name="element-recipe">
    <element name="recipe">
      <optional><attribute name="id">
        <data datatypeLibrary="http://relaxng.org/..." type="ID"/>
      </attribute></optional>

RecipeML with RELAX NG (2/5)

      <interleave>
        <group>
          <element name="title"><text/></element>
          <element name="date"><text/></element>
          <zeroOrMore><ref name="element-ingredient"/></zeroOrMore>
          <ref name="element-preparation"/>
          <element name="nutrition">
            <ref name="attributes-nutrition"/>
          </element>
          <zeroOrMore><ref name="element-related"/></zeroOrMore>
        </group>
        <optional><element name="comment"><text/></element></optional>
      </interleave>
    </element>
  </define>
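The `interleave` pattern in the recipe definition is what lets `comment` appear anywhere among the other children. For this particular case, an ordered group interleaved with one optional element, the check can be sketched as follows in Python: remove at most one `comment` and match what remains against the group. This shortcut is an assumption that works for this pattern only, not a general interleave algorithm; the names are illustrative and namespaces are ignored.

```python
import re

# The <group> part of the recipe pattern as a regular expression over
# child element names, each followed by a space (assumption of this sketch).
GROUP = re.compile(r"title date (ingredient )*preparation nutrition (related )*")


def valid_recipe_children(names):
    """True if names match the group with at most one comment interleaved."""
    names = list(names)
    if names.count("comment") > 1:
        return False  # <optional> allows zero or one comment
    rest = [name for name in names if name != "comment"]
    return GROUP.fullmatch("".join(name + " " for name in rest)) is not None
```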

RecipeML with RELAX NG (3/5)

  <define name="element-ingredient">
    <element name="ingredient">
      <attribute name="name"/>
      <choice>
        <group>
          <attribute name="amount">
            <choice><value>*</value><ref name="NUMBER"/></choice>
          </attribute>
          <optional><attribute name="unit"/></optional>
        </group>
        <group>
          <zeroOrMore><ref name="element-ingredient"/></zeroOrMore>
          <ref name="element-preparation"/>
        </group>
      </choice>
    </element>
  </define>

RecipeML with RELAX NG (4/5)

  <define name="element-preparation">
    <element name="preparation">
      <zeroOrMore><element name="step"><text/></element></zeroOrMore>
    </element>
  </define>

  <define name="attributes-nutrition">
    <attribute name="calories"><ref name="NUMBER"/></attribute>
    <attribute name="protein"><ref name="PERCENTAGE"/></attribute>
    <attribute name="carbohydrates"><ref name="PERCENTAGE"/></attribute>
    <attribute name="fat"><ref name="PERCENTAGE"/></attribute>
    <optional>
      <attribute name="alcohol"><ref name="PERCENTAGE"/></attribute>
    </optional>
  </define>
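Because RELAX NG patterns may mix attributes and elements inside a `choice`, the ingredient definition can express the two constraints that DTD and XML Schema could not: `unit` only together with `amount`, and nested ingredients only without `amount`. The sketch below restates that choice in Python over an `xml.etree.ElementTree` element. The function names, the regular expression assumed for decimals, and the omission of namespaces are assumptions of this sketch.

```python
import re
import xml.etree.ElementTree as ET
from decimal import Decimal

# Assumed lexical form of a decimal: optional sign, digits, optional fraction.
DECIMAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")
# Second branch of the choice: nested ingredients, then a preparation.
NESTED = re.compile(r"(ingredient )*preparation ")


def is_number(value):
    """NUMBER: a decimal with minInclusive 0."""
    value = value.strip()
    return DECIMAL.fullmatch(value) is not None and Decimal(value) >= 0


def valid_ingredient(elem):
    """Check one ingredient element against the two branches of the choice."""
    attrs = elem.attrib
    children = "".join(child.tag + " " for child in elem)
    if "name" not in attrs or set(attrs) - {"name", "amount", "unit"}:
        return False
    if "amount" in attrs:
        # First branch: amount, optionally unit, and no nested elements.
        amount = attrs["amount"]
        return children == "" and (amount.strip() == "*" or is_number(amount))
    # Second branch: no amount, no unit, nested contents required.
    return "unit" not in attrs and NESTED.fullmatch(children) is not None
```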
RecipeML with RELAX NG (5/5)

  <define name="element-related">
    <element name="related">
      <attribute name="ref">
        <data datatypeLibrary="http://relaxng.org/..." type="IDREF"/>
      </attribute>
    </element>
  </define>

  <define name="PERCENTAGE">
    <data type="string">
      <param name="pattern">([0-9]|[1-9][0-9]|100)%</param>
    </data>
  </define>

  <define name="NUMBER">
    <data type="decimal"><param name="minInclusive">0</param></data>
  </define>

</grammar>

Summary

• schema: formal description of the syntax of an XML language
• DTD: simple schema language
  • elements, attributes, entities, ...
• XML Schema: more advanced schema language
  • element/attribute declarations
  • simple types, complex types, type derivations
  • global vs. local descriptions
  • ...

Essential Online Resources

• http://www.w3.org/TR/xml11/
• http://www.w3.org/TR/xmlschema-1/
• http://www.w3.org/TR/xmlschema-2/