Unit 2 XML-DTD

DTD stands for Document Type Definition. It defines the structure and elements that can be used in an XML document. DTDs can be internal, defined within the XML document, or external, defined in a separate file. Elements have a name and content model that defines what child elements they can contain. Attributes provide additional information about elements.

Uploaded by chitra devi

Document Type Definition

DTDs
What is a DTD
• A DTD defines the structure of an XML document
• Only the elements declared in the DTD can be used in a document that is valid against it
• A DTD can be internal (inside the document) or external (in a separate file)
• A “valid” XML document is one that is well formed and conforms to its DTD
• Processing overhead is incurred when validating XML with a DTD
An internal DTD
<?xml version="1.0"?>

<!DOCTYPE invoice [
<!ELEMENT invoice (sku, qty, desc, price) >
<!ELEMENT sku (#PCDATA) >
<!ELEMENT qty (#PCDATA) >
<!ELEMENT desc (#PCDATA) >
<!ELEMENT price (#PCDATA) >
]>

<invoice>
<sku>12345</sku>
<qty>55</qty>
<desc>Left handed monkey wrench</desc>
<price>14.95</price>
</invoice>
A referenced external DTD

<?xml version="1.0"?>

<!DOCTYPE invoice SYSTEM "invoice.dtd">

<invoice>
<sku>12345</sku>
<qty>55</qty>
<desc>Left handed monkey wrench</desc>
<price>14.95</price>
</invoice>
An external DTD (invoice.dtd)

<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT invoice (sku, qty, desc, price) >
<!ELEMENT sku (#PCDATA) >
<!ELEMENT qty (#PCDATA) >
<!ELEMENT desc (#PCDATA) >
<!ELEMENT price (#PCDATA) >
Content Model

• An element declaration identifies the name of the element and the nature of that element’s content
• The example declares the element note and gives the content model that describes what it contains

<!ELEMENT note (to, from, subject, body)>

In this element declaration, note is the name and (to, from, subject, body) is the content model.
Document Type Declarations

• There are four types of declarations:
– Element type declarations
• http://www.w3.org/TR/REC-xml#elemdecls
– Attribute list declarations
• http://www.w3.org/TR/REC-xml#attdecls
– Entity declarations
• http://www.w3.org/TR/REC-xml#sec-entity-decl
– Notation declarations
• http://www.w3.org/TR/REC-xml#Notations
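As a rough illustration, the internal DTD below uses each of the four declaration types once. The names logo, vendor, logoImage and png, and the file logo.png, are made up for this sketch and do not come from the slides.

```xml
<?xml version="1.0"?>
<!DOCTYPE invoice [
  <!-- Element type declarations -->
  <!ELEMENT invoice (desc, logo)>
  <!ELEMENT desc (#PCDATA)>
  <!ELEMENT logo EMPTY>
  <!-- Attribute list declaration -->
  <!ATTLIST logo file ENTITY #REQUIRED>
  <!-- Notation declaration: names the format of non-XML data -->
  <!NOTATION png SYSTEM "image/png">
  <!-- Entity declarations: one internal text entity, one unparsed entity -->
  <!ENTITY vendor "Acme Tools">
  <!ENTITY logoImage SYSTEM "logo.png" NDATA png>
]>
<invoice>
  <desc>Wrench from &vendor;</desc>
  <logo file="logoImage"/>
</invoice>
```

The text entity is expanded where &vendor; appears, while the unparsed entity is only referred to by name from the ENTITY attribute.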
Element Type Declarations

• Four kinds of element content
– EMPTY elements
– ANY elements
– MIXED elements
– Elements with element content (child elements only)
Empty Elements
• An element that cannot contain any content
• An image tag like the one in HTML would typically be empty in XML, written as <image></image> or <image/>
• Empty elements are more useful when they carry attributes

<!ELEMENT test EMPTY>


<!ELEMENT image EMPTY>
<!ELEMENT br EMPTY>
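A minimal sketch of an empty element that carries its information in attributes. The attribute names src and alt and the file name wrench.png are assumptions chosen for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE image [
  <!-- image may not have content, so all of its information is in attributes -->
  <!ELEMENT image EMPTY>
  <!ATTLIST image
    src CDATA #REQUIRED
    alt CDATA #IMPLIED>
]>
<image src="wrench.png" alt="Left handed monkey wrench"/>
```

Putting any text or child element between <image> and </image> would make this document invalid.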
ANY Element
• An element that can contain any content: text and any declared elements, in any order
• It is recommended not to get into the habit of declaring elements with the ANY keyword
• Useful when transferring a lot of mixed or unknown data

<!ELEMENT test ANY >
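One possible document using an ANY element is sketched below. The child elements name and qty are illustrative; the point is that children of an ANY element must still be declared somewhere in the DTD.

```xml
<?xml version="1.0"?>
<!DOCTYPE test [
  <!ELEMENT test ANY>
  <!-- Child elements used inside an ANY element must still be declared -->
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT qty (#PCDATA)>
]>
<test>Order for <name>Peter Brenner</name>: <qty>55</qty> units, shipped <name>today</name>.</test>
```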


Mixed Element
• Elements that can contain character data mixed with any of the listed child elements
• Separate the options with the “or” symbol “|”; #PCDATA must come first and the group must end with *

<!ELEMENT test (#PCDATA | name)*>
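A short sketch of mixed content in use, assuming a test element that may hold text and name children. The sentence in the instance is invented for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE test [
  <!-- Mixed content: #PCDATA comes first and the group ends with * -->
  <!ELEMENT test (#PCDATA | name)*>
  <!ELEMENT name (#PCDATA)>
]>
<test>Ordered by <name>Peter Brenner</name> and approved by <name>Dick Steflik</name>.</test>
```

With mixed content the DTD can restrict which child elements appear, but not their order or how many times they occur.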


Data Types
• Parsed Character Data
– #PCDATA
• <!ELEMENT firstname (#PCDATA)>
• <!ELEMENT lastname (#PCDATA)>

• Unparsed Character Data
– CDATA
• <firstname><![CDATA[<b>Jim</b>]]></firstname>
• <lastname><![CDATA[<b>Peters</b>]]></lastname>
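The difference can be sketched in one document: both elements are declared as #PCDATA, and a CDATA section is just an alternative way of writing text that would otherwise need escaping. The person wrapper element is an assumption added so the example is a complete document.

```xml
<?xml version="1.0"?>
<!DOCTYPE person [
  <!ELEMENT person (firstname, lastname)>
  <!ELEMENT firstname (#PCDATA)>
  <!ELEMENT lastname (#PCDATA)>
]>
<person>
  <!-- Parsed: markup characters must be written as entity references -->
  <firstname>&lt;b&gt;Jim&lt;/b&gt;</firstname>
  <!-- CDATA section: the parser passes the text through untouched -->
  <lastname><![CDATA[<b>Peters</b>]]></lastname>
</person>
```

In both cases the application receives literal text containing the characters of a b tag, not a b child element.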
Structure Symbols
• Parentheses (samp1, samp2) - Group items; here the element must contain the sequence samp1 then samp2
• Comma (samp1,samp2,samp3) - The element must contain samp1, samp2 and samp3 in that order
• Or (samp1|samp2|samp3) - The element must contain exactly one of samp1, samp2 or samp3
• ? samp1? - The element might contain samp1; if it does, it can only do so once
• * samp1* - The element can contain samp1 zero or more times
• + samp1+ - The element must contain samp1 at least once
• none samp1 - The element must contain samp1 exactly once
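The symbols can be combined in a single content model. The sketch below uses a hypothetical order element (none of these element names appear in the slides) to show each symbol at least once.

```xml
<?xml version="1.0"?>
<!DOCTYPE order [
  <!-- customer: exactly once; item: one or more; note: zero or more;
       then either cash or card; coupon: optional -->
  <!ELEMENT order (customer, item+, note*, (cash | card), coupon?)>
  <!ELEMENT customer (#PCDATA)>
  <!ELEMENT item (#PCDATA)>
  <!ELEMENT note (#PCDATA)>
  <!ELEMENT cash EMPTY>
  <!ELEMENT card EMPTY>
  <!ELEMENT coupon (#PCDATA)>
]>
<order>
  <customer>Peter Brenner</customer>
  <item>Left handed monkey wrench</item>
  <item>Right handed monkey wrench</item>
  <card/>
</order>
```

This instance is valid with no note and no coupon; it would stop being valid if it had no item, or if it had both cash and card.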


Elements with more structure
<!ELEMENT email (to+ , from , subject? , body)>

to: required, and can appear more than once
from: must appear exactly once
subject: optional, but if included can only appear once
body: must appear exactly once
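A sketch of a document that satisfies this email content model, assuming each child element holds plain text. The sender name is invented for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE email [
  <!ELEMENT email (to+, from, subject?, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT subject (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<!-- Valid: two to elements, one from, no subject, one body -->
<email>
  <to>Peter Brenner</to>
  <to>Dick Steflik</to>
  <from>Jim Peters</from>
  <body>The exam is a week from today</body>
</email>
```

Leaving out body, or placing from before the first to, would cause a validating parser to reject the document.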
XML Element Attributes

• XML tags can contain attributes similar to attributes in HTML tags

HTML Examples:
<h1 align="center">An XML Example</h1>
<table width=page> </table>

• Attributes are usually used to provide processing information to the XML application (the application that is going to consume the XML)
Attribute Rules

• Attribute values must be placed in quotes, either " " or ' '
– in HTML this is only required if the attribute value contains the space character
• The parser does not look inside an attribute value for structure
– this means the format of the value (a date, for example) can’t be automatically checked by the parser
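The quoting rules can be sketched with a small well-formed document (no DTD is needed for this). The attribute names clerk, note and label are assumptions made up for the example.

```xml
<?xml version="1.0"?>
<!-- Both quote styles are legal as long as they match -->
<invoice date="7/22/2002" clerk='Peter Brenner'>
  <!-- Inside a value, < and & are written as entity references -->
  <desc note="size &lt; 10 &amp; left handed">Left handed monkey wrench</desc>
  <!-- Single quotes let the value contain a double quote, and vice versa -->
  <price label='14.95 "list" price'>14.95</price>
</invoice>
```

An unquoted value such as width=page, which HTML tolerates, makes an XML document not well formed.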
Attributes or Elements?
• Is it better to use attributes or to just make additional XML elements?
– there are no set rules for when to use one over the other
• experience is the best teacher
– but to help you decide:
• attribute values are flat strings that are not parsed into child markup
– they still cannot contain a raw < or &, which must be written as &lt; and &amp;
• drawback - their internal format cannot be validated by the parser
– it must be validated by additional code in the application
An Example
Date as child elements (this can be validated):

<?xml version="1.0" ?>
<invoice>
<date>
<month>12</month>
<day>22</day>
<year>2002</year>
</date>
<sku>12345</sku>
<qty>55</qty>
<desc>Left handed monkey wrench</desc>
<price>14.95</price>
</invoice>

Date as an attribute (this can’t):

<?xml version="1.0" ?>
<invoice date="7/22/2002">
<sku>12345</sku>
<qty>55</qty>
<desc>Left handed monkey wrench</desc>
<price>14.95</price>
</invoice>


Attribute Declarations
Employee Element Declaration:
<?xml version="1.0" ?>
<!ELEMENT employee (#PCDATA)>

General form of an attribute list declaration:
<!ATTLIST ElementName AttributeName Type Default >

<!ATTLIST employee type (FullTime | PartTime) "FullTime" >

Usage in XML file:

<?xml version="1.0" ?>
<employee type="PartTime"/>
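Put together as one document with an internal DTD, the declaration and its usage might look like the sketch below. The staff wrapper element is an assumption added so that two employees can be shown side by side.

```xml
<?xml version="1.0"?>
<!DOCTYPE staff [
  <!ELEMENT staff (employee+)>
  <!ELEMENT employee (#PCDATA)>
  <!ATTLIST employee type (FullTime | PartTime) "FullTime">
]>
<staff>
  <employee type="PartTime">Peter Brenner</employee>
  <!-- No type given: the parser supplies the default, type="FullTime" -->
  <employee>Dick Steflik</employee>
</staff>
```

A value outside the enumerated list, such as type="Contract", would be reported as a validity error.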
Other Attribute Declarations
• CDATA
– CDATA attributes are strings; any text is allowed
• ID
– The value of an ID attribute must be a name. All of the ID values used in a document must be unique. IDs uniquely identify individual elements in a document. An element can only have a single ID attribute.
• IDREF or IDREFS
– An IDREF attribute’s value must be the value of a single ID attribute on some element in the document. The value of an IDREFS attribute may contain multiple IDREF values separated by white space.
• ENTITY or ENTITIES
– An ENTITY attribute’s value must be the name of a single unparsed entity declared in the DTD. The value of an ENTITIES attribute may contain multiple entity names separated by white space.
• NMTOKEN or NMTOKENS
– Name token attributes are a restricted form of string attribute: the value must be a single word made of name characters (such as letters, digits, and the characters . - _ :), but there are no other restrictions on the word. An NMTOKENS value is a list of such words separated by white space.
• Enumerated (list of names)
– You can specify that the value of an attribute must be taken from a specific list of names. This is frequently called an enumerated type because each of the possible values must be explicitly enumerated in the declaration.
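Several of these attribute types can be sketched on one element. The staff and employee elements and every attribute name below (id, manager, backups, badge, type, note) are hypothetical and chosen only to show one attribute per type.

```xml
<?xml version="1.0"?>
<!DOCTYPE staff [
  <!ELEMENT staff (employee+)>
  <!ELEMENT employee (#PCDATA)>
  <!ATTLIST employee
    id      ID                    #REQUIRED
    manager IDREF                 #IMPLIED
    backups IDREFS                #IMPLIED
    badge   NMTOKEN               #IMPLIED
    type    (FullTime | PartTime) "FullTime"
    note    CDATA                 #IMPLIED>
]>
<staff>
  <employee id="e1" badge="B-1001">Dick Steflik</employee>
  <employee id="e2" manager="e1" type="PartTime" note="any text at all">Peter Brenner</employee>
  <employee id="e3" manager="e1" backups="e1 e2">Jim Peters</employee>
</staff>
```

A validating parser would reject a second id="e1", a manager value that matches no ID in the document, or a badge value containing a space.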
Attribute Defaults
• #REQUIRED
– The attribute must have an explicitly specified value for every occurrence of the element in the document.
• #IMPLIED
– The attribute value is not required and no default value is provided. If a value is not specified, the XML processor must proceed without one.
• “value”
– An attribute can be given any legal value as a default. The attribute value is not required on each element in the document, and if it is not present it will appear to be the specified default.
• #FIXED “value”
– An attribute declaration may specify that an attribute has a fixed value. In this case, the attribute is not required, but if it occurs, it must have the specified value. If it is not present, it will appear to be the specified default.
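The four kinds of default can be sketched on a single element. The attribute names number, clerk, currency and version are assumptions invented for this example.

```xml
<?xml version="1.0"?>
<!DOCTYPE invoice [
  <!ELEMENT invoice (#PCDATA)>
  <!ATTLIST invoice
    number   CDATA #REQUIRED
    clerk    CDATA #IMPLIED
    currency CDATA "USD"
    version  CDATA #FIXED "1.0">
]>
<!-- number must be given; clerk is simply absent; currency defaults to USD;
     version is reported as 1.0 whether or not it is written -->
<invoice number="12345">Left handed monkey wrench</invoice>
```

Omitting number, or writing version="2.0", would make the document invalid.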
A Code sample
<?xml version="1.0" ?>
<!DOCTYPE email [
<!ELEMENT email (to, from, subject, message)>
<!ATTLIST email
language (english | french | spanish) "english"
priority (normal | high | low) "normal" >
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA) >
<!ELEMENT subject (#PCDATA) >
<!ELEMENT message (#PCDATA) >
]>
<email language="spanish" priority="high">
<to>Peter Brenner</to>
<from>Dick Steflik</from>
<subject> Test Reminder</subject>
<message>The exam is a week from today</message>
</email>
Attribute Summary
• Attributes
– cannot contain multiple values
– cannot have their internal format validated
– cannot describe structures like child elements can
• It is recommended to use attributes sparingly
• The following code would not be good form:

<?xml version="1.0" ?>
<email language="english" priority="high"
to="you" from="me" subject="Reminder"
message="The test is a week from today !" />
