
eXtensible Markup Language Unit-3: Basic XML, DTD, XML Schema, DOM vs SAX, Presenting XML

XML is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. It allows users to define their own tags to structure documents. XML documents must have a root element and properly nested elements. Elements can have attributes to provide additional information. Documents are validated using DTDs or XML schemas to ensure they follow the defined structure and syntax.

 eXtensible Markup Language

Unit-3 Contents
Basic XML
DTD
XML schema
DOM Vs SAX
Presenting XML
XML stands for Extensible Markup Language. It is a
text-based markup language derived from Standard
Generalized Markup Language (SGML).

XML tags identify the data and are used to store and organize
the data.

1. XML is extensible − XML allows you to create your own
self-descriptive tags, or language, that suit your application.
2. XML carries the data − XML allows you to store the data irrespective
of how it will be presented.
3. XML is a public standard − XML was developed by an organization
called the World Wide Web Consortium (W3C) and is available as an
open standard.
 HTML is derived from SGML
 XML is also derived from SGML, as a simplified subset
 SGML (Standard Generalized Markup Language)
 XML allows document authors to describe data
more precisely by creating new tags
 HTML has a lenient syntax (browsers tolerate many errors)
 XML has a strict syntax (parsers reject documents that are not well formed)
 XML and HTML were designed with different
goals:
◦ XML was designed to transport and store
data, with focus on what data is
◦ HTML was designed to display data, with
focus on how data looks
XML is case sensitive
XML is a markup language much like HTML
XML was designed to carry data,
not to display data
XML tags are not predefined.
You must define your own tags
XML is designed to be self-descriptive
XML is a W3C Recommendation
 XML makes documents both human-readable
and computer-manipulable
 XML is more flexible and extensible than HTML
because its tags are not predefined
 Data independence: content is stored separately
from its presentation
<?xml version = "1.0"?>

<!-- intro.xml -->
<!-- Simple introduction to XML markup -->

<myMessage>
  <message>Welcome to XML!</message>
</myMessage>
 XML Does Not DO Anything
 XML by itself does not DO anything: it is
information wrapped in tags.
 XML was created to structure, store, and
transport information.
 The following example is a note to Tony,
from Jani, stored as XML:
<note>
  <to>Tony</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me</body>
</note>
<?xml version = "1.0"?>
<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>
 All XML Elements Must Have a Closing Tag
◦ <p>This is a paragraph (incorrect)
◦ <p>This is a paragraph</p> (correct)
 XML Tags are Case Sensitive
◦ <Message>This is incorrect</message>
◦ <message>This is correct</message>
 XML Elements Must be Properly Nested
◦ <b><i>This text is bold and italic</b></i> (incorrect)
◦ <b><i>This text is bold and italic</i></b> (correct)
 XML Documents Must Have a Root Element
◦ <root>
  <child>
    <subchild>.....</subchild>
  </child>
</root>
 XML Attribute Values Must be Quoted
◦ <note date=12/11/2007> (incorrect)
◦ <note date="12/11/2007"> (correct)
 Entity References
◦ &lt;    <   less than
◦ &gt;    >   greater than
◦ &amp;   &   ampersand
◦ &apos;  '   apostrophe
◦ &quot;  "   quotation mark
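The five entity references can be produced and reversed programmatically. The following is a minimal sketch using Python's standard library; the helper names `to_xml_text` and `from_xml_text` and the sample string are illustrative assumptions, not part of the original slides.

```python
from xml.sax.saxutils import escape, unescape
import xml.etree.ElementTree as ET

# escape() already handles &, < and >; the two quote entities are passed
# in as extra mappings (hypothetical constant names).
QUOTE_ENTITIES = {"'": "&apos;", '"': "&quot;"}
UNQUOTE_ENTITIES = {"&apos;": "'", "&quot;": '"'}


def to_xml_text(raw: str) -> str:
    """Replace the five special characters with their entity references."""
    return escape(raw, QUOTE_ENTITIES)


def from_xml_text(escaped: str) -> str:
    """Turn the five entity references back into plain characters."""
    return unescape(escaped, UNQUOTE_ENTITIES)


raw = 'if salary < 1000 & name == "Tove"'
escaped = to_xml_text(raw)

# A parser expands the references again when it reads element content.
parsed = ET.fromstring("<message>" + escaped + "</message>").text

print(escaped)
print(parsed)
```

Without the escaping step, the `<` and `&` in the sample string would make the document fail to parse.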


 XML elements must follow these naming
rules:
◦ Names can contain letters, numbers, and other
characters such as hyphens, underscores, and periods
◦ Names must start with a letter or underscore, not
with a number or other punctuation character
◦ Names cannot start with the letters xml (or XML, or
Xml, etc)
◦ Names cannot contain spaces
 Any name that follows these rules can be used.
 XML elements can have attributes, which
provide additional information about an
element.
◦ <file type="gif">computer.gif</file>
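 XML Attribute Values Must be Quoted
 Either double or single quotes can be used:
◦ <person sex="female">
◦ <person sex='female'>
 If the value itself contains double quotes, use single quotes:
◦ <gangster name='George "Shotgun" Ziegler'>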
 The same information can be stored as an attribute:
<person sex="female">
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>
 or as a child element:
<person>
  <sex>female</sex>
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>
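The difference between the two forms shows up when a program reads the data. This is a small sketch using Python's `xml.etree.ElementTree`; the variable names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Form 1: sex stored as an attribute of <person>.
as_attribute = ET.fromstring(
    '<person sex="female">'
    "<firstname>Anna</firstname><lastname>Smith</lastname>"
    "</person>"
)

# Form 2: sex stored as a child element of <person>.
as_element = ET.fromstring(
    "<person><sex>female</sex>"
    "<firstname>Anna</firstname><lastname>Smith</lastname>"
    "</person>"
)

# Attributes are read with get(); child element text with findtext().
sex_from_attribute = as_attribute.get("sex")
sex_from_element = as_element.findtext("sex")

print(sex_from_attribute, sex_from_element)
```

Both forms carry the same value, but only the element form could later grow child elements of its own.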
 Some of the problems with using attributes
are:
◦ attributes cannot contain multiple values (elements
can)
◦ attributes cannot contain tree structures (elements
can)
◦ attributes are not easily expandable (for future
changes)
 Attributes are difficult to read and maintain.
Use elements for data. Use attributes for
information that is not relevant to the data.
 XML with correct syntax is "Well Formed"
XML.

 XML validated against a DTD is "Valid" XML.


 A "Well Formed" XML document has correct
XML syntax.
◦ XML documents must have a root element
◦ XML elements must have a closing tag
◦ XML tags are case sensitive
◦ XML elements must be properly nested
◦ XML attribute values must be quoted
<?xml version="1.0"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
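Well-formedness can be checked by handing a document to any XML parser and seeing whether it is rejected. The sketch below uses Python's standard `xml.etree.ElementTree`; the function name `is_well_formed` is a hypothetical helper, and the parser checks syntax only, not validity against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: str) -> bool:
    """Return True if the parser accepts the document's syntax."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


samples = {
    "well formed note": "<note><to>Tove</to><from>Jani</from></note>",
    "missing closing tag": "<p>This is a paragraph",
    "case mismatch": "<Message>text</message>",
    "improper nesting": "<b><i>text</b></i>",
    "unquoted attribute": "<note date=12/11/2007></note>",
    "no single root": "<to>Tove</to><from>Jani</from>",
}

for label, text in samples.items():
    print(label, "->", is_well_formed(text))
```

Each failing sample breaks exactly one of the five rules listed above.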
 A "Valid" XML document is a "Well Formed"
XML document, which also conforms to the
rules of a Document Type Definition (DTD):
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "Note.dtd">
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
 The purpose of a DTD is to define the structure of
an XML document.
 Using a DTD we can specify the various element
types, attributes and their relationships.
 A DTD specifies the set of rules for
structuring data in an XML file.
 A DTD declaration in the XML document begins with
<!DOCTYPE and ends with >.
 Element
◦ Elements are used for defining the tags. An element
consists of an opening tag and a closing tag.
 Attribute
◦ Attributes are used to hold additional values that
describe the elements.
 CDATA (Character Data)
◦ CDATA is text that will NOT be parsed by a parser.
Tags inside the text will NOT be treated as markup
and entities will not be expanded.
 PCDATA (Parsed Character Data, i.e. text)
◦ PCDATA is text that WILL be parsed by a
parser. The text will be examined by the
parser for entities and markup.
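The parsed versus unparsed distinction can be observed with a CDATA section in a document. This sketch uses Python's `xml.etree.ElementTree`; the element name `example` and the sample text are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Inside a CDATA section the parser leaves tags and entity references alone.
unparsed = ET.fromstring(
    "<example><![CDATA[<b>bold</b> &amp; more]]></example>"
)

# Ordinary (parsed) character data is examined for markup and entities.
parsed = ET.fromstring("<example><b>bold</b> &amp; more</example>")

print(repr(unparsed.text))   # the raw characters, tags included
print(len(unparsed))         # number of child elements found
print(parsed[0].tag, repr(parsed[0].tail))
```

In the first document `<b>` is just text; in the second it becomes a real child element and `&amp;` is expanded to `&`.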
A DTD can be declared in two ways:
◦ Internal DTD
◦ External DTD
 Syntax (internal DTD):
<!DOCTYPE rootnode [list of elements]>
EX:
<!DOCTYPE student [
<!ELEMENT student (sid,sname,savg,sgrade)>
<!ELEMENT sid (#PCDATA)>
<!ELEMENT sname (#PCDATA)>
<!ELEMENT savg (#PCDATA)>
<!ELEMENT sgrade (#PCDATA)>
]>
<!DOCTYPE student [
<!ELEMENT student (sid,sname,savg,sgrade)>
<!ELEMENT sid (#PCDATA)>
<!ELEMENT sname (#PCDATA)>
<!ELEMENT savg (#PCDATA)>
<!ELEMENT sgrade (#PCDATA)>
]>
<student>
  <sid>503</sid>
  <sname>Ramesh</sname>
  <savg>8.5</savg>
  <sgrade>A</sgrade>
</student>
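Python's standard library does not validate against a DTD, so the sketch below hand-codes the one rule that the student DTD expresses: a `student` element containing `sid`, `sname`, `savg`, `sgrade` in that order, each holding only text. It is an illustration of what a validating parser checks, not a general DTD validator; the function and constant names are hypothetical, and the DOCTYPE is left out of the sample documents for brevity.

```python
import xml.etree.ElementTree as ET

# Content model from the DTD: student (sid,sname,savg,sgrade)
EXPECTED_CHILDREN = ["sid", "sname", "savg", "sgrade"]


def matches_student_model(document: str) -> bool:
    """Check the student content model by hand (illustrative only)."""
    root = ET.fromstring(document)
    if root.tag != "student":
        return False
    children = list(root)
    # The comma in the content model means: exactly these, in this order.
    if [child.tag for child in children] != EXPECTED_CHILDREN:
        return False
    # (#PCDATA) means text only, so no child may contain elements.
    return all(len(child) == 0 for child in children)


valid = (
    "<student><sid>503</sid><sname>Ramesh</sname>"
    "<savg>8.5</savg><sgrade>A</sgrade></student>"
)
missing_grade = (
    "<student><sid>503</sid><sname>Ramesh</sname>"
    "<savg>8.5</savg></student>"
)
wrong_order = (
    "<student><sname>Ramesh</sname><sid>503</sid>"
    "<savg>8.5</savg><sgrade>A</sgrade></student>"
)

for doc in (valid, missing_grade, wrong_order):
    print(matches_student_model(doc))
```

All three sample documents are well formed; only the first would also be valid against the DTD.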
 Syntax (external DTD):
<!DOCTYPE rootnode SYSTEM "filename.dtd">
Note: The file must be saved with the .dtd extension.
<!ELEMENT student (sid,sname,savg,sgrade)>
<!ELEMENT sid (#PCDATA)>
<!ELEMENT sname (#PCDATA)>
<!ELEMENT savg (#PCDATA)>
<!ELEMENT sgrade (#PCDATA)>
Note: Save this file as student.dtd
<!DOCTYPE student SYSTEM "student.dtd">
<student>
<sid>503</sid>
<sname>Ramesh</sname>
<savg>8.5</savg>
<sgrade>A</sgrade>
</student>
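A program can read which external DTD a document refers to. The following sketch uses Python's standard `xml.dom.minidom`, which records the DOCTYPE but, as far as this example relies on it, does not load or apply the external file; the variable names are illustrative assumptions.

```python
from xml.dom import minidom

document = (
    '<!DOCTYPE student SYSTEM "student.dtd">'
    "<student><sid>503</sid><sname>Ramesh</sname>"
    "<savg>8.5</savg><sgrade>A</sgrade></student>"
)

dom = minidom.parseString(document)
doctype = dom.doctype

root_name = doctype.name      # name given after <!DOCTYPE
dtd_file = doctype.systemId   # file named after SYSTEM

print(root_name, dtd_file)
```

The name in the DOCTYPE must match the root element of the document, which is why both are `student` here.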
 XML Schema is an XML-based alternative to DTD
for validating XML documents.
 Two major schema models exist: W3C XML
Schema and Microsoft XML Schema.
 You can define XML schema elements in the
following ways -
◦ Simple Type
◦ Complex Type
 A simple type element contains only text; it
cannot have child elements or attributes. Some of
the predefined simple types are: xs:integer,
xs:boolean, xs:string, xs:date.
 Syntax -
<xs:element name="xxx" type="yyy"/>
 For example -
<xs:element name = "phone_number" type = "xs:int" />
 A complex type is a container for other
element definitions. This allows you to
specify which child elements an element can
contain and to provide some structure within
your XML documents. For example -
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name = "Address">
    <xs:complexType>
      <xs:sequence>
        <xs:element name = "name" type = "xs:string" />
        <xs:element name = "company" type = "xs:string" />
        <xs:element name = "phone" type = "xs:int" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
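Because a schema is itself an XML document, it can be read with an ordinary XML parser. The sketch below uses Python's `xml.etree.ElementTree` to list the child elements that the Address complex type declares. It assumes the schema is wrapped in an `xs:schema` root bound to the W3C XML Schema namespace, and it only reads the schema; validating a document against it would need a schema-aware library.

```python
import xml.etree.ElementTree as ET

# Namespace of W3C XML Schema, in ElementTree's {uri}tag notation.
XS = "{http://www.w3.org/2001/XMLSchema}"

SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Address">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="company" type="xs:string"/>
        <xs:element name="phone" type="xs:int"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = ET.fromstring(SCHEMA)

# The first top-level element declaration is the complex element.
address = schema.find(XS + "element")

# Collect (name, type) for each child declared inside xs:sequence.
children = [
    (child.get("name"), child.get("type"))
    for child in address.findall(f"{XS}complexType/{XS}sequence/{XS}element")
]

print(address.get("name"), children)
```

The order of the collected pairs matters: `xs:sequence` requires the children to appear in exactly this order in a document.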
 XML Schema has a lot of built-in data types.
The most common types are:

 xs:string
 xs:decimal
 xs:integer
 xs:boolean
 xs:date
 xs:time
 The date data type is used to specify a date.
 The date is specified in the following form
"YYYY-MM-DD" where YYYY is the year, MM the
month, and DD the day.
 The following is an example of a date
declaration in a schema:
<xs:element name="start" type="xs:date"/>
 An element in your document might look like
this:
<start>2002-09-24</start>
 The time data type is used to specify a time.
 The time is specified in the following form
"hh:mm:ss" where hh is the hour, mm the
minute, and ss the second.
 The following is an example of a time
declaration in a schema:
<xs:element name="start" type="xs:time"/>
 An element in your document might look like
this:
<start>09:00:00</start>
 The boolean data type is used to specify a
true or false value.
 The following is an example of a boolean
declaration in a schema:
<xs:attribute name="disabled" type="xs:boolean"/>
 An element in your document might look like
this:
<prize disabled="true">999</prize>
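The forms described for the date, time and boolean types can be approximated with a few simple checks. This Python sketch is a simplification: the function names are hypothetical, and the real xs:date and xs:time types also allow an optional time zone (and xs:time allows fractional seconds), which are not handled here. The boolean check accepts the four legal values true, false, 1 and 0.

```python
import re
from datetime import date, time

DATE_FORM = re.compile(r"(\d{4})-(\d{2})-(\d{2})")   # YYYY-MM-DD
TIME_FORM = re.compile(r"(\d{2}):(\d{2}):(\d{2})")   # hh:mm:ss


def looks_like_xs_date(value: str) -> bool:
    """True for a real calendar date written as YYYY-MM-DD."""
    match = DATE_FORM.fullmatch(value)
    if match is None:
        return False
    try:
        date(*(int(part) for part in match.groups()))
    except ValueError:  # e.g. month 13 or day 31 in a 30-day month
        return False
    return True


def looks_like_xs_time(value: str) -> bool:
    """True for a real clock time written as hh:mm:ss."""
    match = TIME_FORM.fullmatch(value)
    if match is None:
        return False
    try:
        time(*(int(part) for part in match.groups()))
    except ValueError:  # e.g. hour 25 or minute 61
        return False
    return True


def looks_like_xs_boolean(value: str) -> bool:
    """True for the legal boolean values; they are case sensitive."""
    return value in {"true", "false", "1", "0"}


print(looks_like_xs_date("2002-09-24"))
print(looks_like_xs_time("09:00:00"))
print(looks_like_xs_boolean("true"))
```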
 END OF UNIT-3
