Unit 5-4

An XML document has both logical and physical structures, consisting of entities and markup elements like declarations, elements, comments, and processing instructions. The document is structured into three parts: an XML declaration, a document type declaration, and the document body, with the prolog combining the first two. The document body contains the actual data, while the markup defines the structure, and parsing involves breaking down the document into its components.

Uploaded by

unknown.ro.myself

XML Document Structure

The XML Recommendation states that an XML document has both logical and physical
structure. Physically, it is composed of storage units called entities, each of which may refer to
other entities, similar to the way that #include works in the C language. Logically, an XML
document consists of declarations, elements, comments, character references, and processing
instructions, collectively known as the markup.
NOTE

Although throughout this book we refer to an "XML document," it is crucial to understand that
XML may not exist as a physical file on disk. XML is sometimes used to convey messages
between applications, such as from a Web server to a client. The XML content may be
generated on the fly, for example by a Java application that accesses a database. It may be
formed by combining pieces of several files, possibly mixed with output from a program.
However, in all cases, the basic structure and syntax of XML is invariant.

An XML document consists of three parts, in the order given:

1. An XML declaration (which is technically optional, but recommended in most normal
cases)

2. A document type declaration that refers to a DTD (which is optional, but required if you
want validation)

3. A body or document instance (which is required)

Collectively, the XML declaration and the document type declaration are called the XML prolog.

XML Declaration
The XML declaration is a piece of markup (which may span multiple lines of a file) that identifies
this as an XML document. The declaration also indicates whether the document can be
validated by referring to an external Document Type Definition (DTD). DTDs are the subject of
chapter 4; for now, just think of a DTD as a set of rules that describes the structure of an XML
document.

The minimal XML declaration is:

<?xml version="1.0" ?>

XML is case-sensitive (more about this in the next subsection), so it's important that you use
lowercase for xml and version. The quotes around the value of the version attribute are
required, as are the ? characters. At the time of this writing, "1.0" is the only acceptable value
for the version attribute, but this is certain to change when a subsequent version of the XML
specification appears.
NOTE

Do not include a space before the string xml or between the question mark and the angle
brackets. The strings <?xml and ?> must appear exactly as indicated. The space before the ?> is
optional. No blank lines or space may precede the XML declaration; adding white space here
can produce strange error messages.
In most cases, this XML declaration is present. If so, it must be the very first line of the
document and must not have leading white space. This declaration is technically optional; cases
where it may be omitted include when combining XML storage units to create a larger,
composite document.

Actually, the formal definition of an XML declaration, according to the XML 1.0 specification is
as follows:

XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'

This Extended Backus-Naur Form (EBNF) notation, characteristic of many W3C specifications,
means that an XML declaration consists of the literal sequence '<?xml', followed by the
required version information, followed by optional encoding and standalone declarations,
followed by an optional amount of white space, and terminating with the literal sequence '?>'.
In this notation, a question mark not contained in quotes means that the term that precedes it
is optional.
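
To make the notation concrete, here is a minimal sketch that translates the XMLDecl production into a Python regular expression. It is an approximation for illustration only: the helper names (quoted, is_xml_decl) are our own, and the patterns for the version number and encoding name are simplified readings of the specification's grammar rather than a complete implementation.

```python
import re

# Simplified translation of the XML 1.0 productions (an approximation,
# not a full implementation of the specification's grammar).
S = r"[ \t\r\n]+"                  # S: one or more white space characters
EQ = r"[ \t\r\n]*=[ \t\r\n]*"      # Eq ::= S? '=' S?


def quoted(value_pattern):
    """Match value_pattern wrapped in matching double or single quotes."""
    return rf"""(?:"{value_pattern}"|'{value_pattern}')"""


VERSION_INFO = S + "version" + EQ + quoted(r"1\.[0-9]+")
ENCODING_DECL = S + "encoding" + EQ + quoted(r"[A-Za-z][A-Za-z0-9._-]*")
SD_DECL = S + "standalone" + EQ + quoted(r"(?:yes|no)")

# XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
XML_DECL = re.compile(
    r"<\?xml" + VERSION_INFO
    + f"(?:{ENCODING_DECL})?"
    + f"(?:{SD_DECL})?"
    + r"[ \t\r\n]*\?>"
)


def is_xml_decl(text):
    """Return True if text is exactly one XML declaration."""
    return XML_DECL.fullmatch(text) is not None


print(is_xml_decl('<?xml version="1.0" ?>'))                                   # True
print(is_xml_decl('<?xml version="1.0" encoding="UTF-8" standalone="no"?>'))   # True
```

Each trailing question mark in the EBNF becomes an optional group in the pattern, which is why the minimal declaration and the fully specified one are both accepted.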
The following declaration means that there is an external DTD on which this document
depends. See the next subsection for the DTD that this negative standalone value implies.
<?xml version="1.0" standalone="no" ?>

On the other hand, if your XML document has no associated DTD, the correct XML declaration
is:

<?xml version="1.0" standalone="yes" ?>

The XML 1.0 Recommendation states: "If there are external markup declarations but there is no
standalone document declaration, the value 'no' is assumed."

The optional encoding part of the declaration tells the XML processor (parser) how to interpret
the bytes based on a particular character set. The default encoding is UTF-8, which is one of
seven character-encoding schemes defined by the Unicode standard. In UTF-8, each ASCII
character occupies one byte and every other character occupies two to four bytes. UTF-8 is
therefore an efficient form of Unicode for ASCII-based documents. In fact, UTF-8 is a superset
of ASCII.

<?xml version="1.0" encoding="UTF-8" ?>

For text that consists mostly of Asian (CJK) characters, however, an encoding of UTF-16 can be
more compact, because most such characters take two bytes in UTF-16 but three in UTF-8. It is
also possible to specify an ISO character encoding, such as in the following example, which
refers to ASCII plus Greek characters. Note, however, that some XML processors may not
handle ISO character sets correctly since the specification requires only that they handle
UTF-8 and UTF-16.

<?xml version="1.0" encoding="ISO-8859-7" ?>
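
The size trade-off between encodings is easy to check for yourself. The following sketch counts the bytes needed for a few sample characters; the characters and the helper name encoded_length are our own choices for illustration, and the code relies only on Python's built-in codecs.

```python
# Sample characters chosen for illustration (written as escapes):
# Latin capital A, Greek small alpha, a CJK ideograph, and an emoji.
samples = ["A", "\u03b1", "\u4e2d", "\U0001F600"]


def encoded_length(text, encoding):
    """Number of bytes needed to store text in the given encoding."""
    return len(text.encode(encoding))


for ch in samples:
    # "utf-16-be" is used so that no byte order mark is added to the count.
    print(f"U+{ord(ch):04X}",
          encoded_length(ch, "utf-8"),
          encoded_length(ch, "utf-16-be"))

# ISO-8859-7 stores a Greek letter in a single byte.
print(encoded_length("\u03b1", "iso-8859-7"))
```

The ASCII letter is smaller in UTF-8, the ideograph is smaller in UTF-16, and the Greek letter is smallest in the ISO encoding, which cannot represent the ideograph or the emoji at all.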

Both the standalone and encoding information may be supplied, in which case the encoding
must come first:

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>

Is the next example valid?

<?xml version="1.0" encoding='UTF-8' standalone='no'?>

Yes, it is. The order of the attributes is fixed by the formal definition above (version, then
encoding, then standalone), and this example follows it. Single and double quotes can be used
interchangeably, provided they are of matching kind around any particular attribute value.
(Although there is no good reason in this example to use double quotes for version and single
quotes for the others, you may need to do so if the attribute value already contains the kind of
quotes you prefer.) Finally, the lack of a blank space between 'no' and ?> is not a problem.

Neither of the following XML declarations is valid.

<?XML VERSION="1.0" STANDALONE="no"?>

<?xml version="1.0" standalone="No"?>

The first is invalid because these particular attribute names must be lowercase, as must "xml".
The problem with the second declaration is that the value of the standalone attribute must be
literally "yes" or "no", not "No". (Do I dare call this a "no No"?)
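
If you would rather let a parser be the judge, a short experiment settles such questions. This sketch uses the ElementTree module from the Python standard library; the helper name parses and the trivial Root element appended to each declaration are our own assumptions, added only so that each string is a complete document.

```python
import xml.etree.ElementTree as ET


def parses(document):
    """Return True if the standard library parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# A trivial root element follows each declaration so the string is a full document.
candidates = [
    '<?xml version="1.0" standalone="no"?><Root/>',
    '<?XML VERSION="1.0" STANDALONE="no"?><Root/>',
    '<?xml version="1.0" standalone="No"?><Root/>',
]

for candidate in candidates:
    print(parses(candidate), candidate)
```

Other parsers may word their error messages differently, but a conforming parser should accept the first candidate and reject the other two.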

Document Type Declaration

The document type declaration follows the XML declaration. The purpose of this declaration is
to announce the root element (sometimes called the document element) and to provide the
location of the DTD. The general syntax is:

<!DOCTYPE RootElement (SYSTEM | PUBLIC)
    ExternalDeclarations? [InternalDeclarations]? >

where <!DOCTYPE is a literal string, RootElement is whatever you name the outermost element
of your hierarchy, followed by either the literal keyword SYSTEM or PUBLIC. The
optional ExternalDeclarations portion is typically the relative path or URL to the DTD that
describes your document type. (It is really only optional if the entire DTD appears as
an InternalDeclaration, which is neither likely nor desirable.) If there are InternalDeclarations,
they must be enclosed in square brackets. In general, you'll encounter far more cases
with ExternalDeclarations than InternalDeclarations, so let's ignore the latter for now. They
constitute the internal subset, which is described in chapter 4.
Let's start with a simple but common case. In this example, we are indicating that the DTD and
the XML document reside in the same directory (i.e., the ExternalDeclarations are contained in
the file employees.dtd) and that the root element is Employees:

<!DOCTYPE Employees SYSTEM "employees.dtd">

Similarly,

<!DOCTYPE PriceList SYSTEM "prices.dtd">

indicates that the root element is PriceList and that the DTD is in the local file prices.dtd.
In the next example, we use normal directory path syntax to indicate a different location for the
DTD.

<!DOCTYPE Employees SYSTEM "../dtds/employees.dtd">

As is often the case, we might want to specify a URL for the DTD since the XML file may not
even be on the same host as the DTD. This case also applies when you are using an XML
document for message passing or data transmission across servers and still want the validation
by referencing a common DTD.

<!DOCTYPE Employees SYSTEM
    "http://somewhere.com/dtds/employees.dtd">
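
A relative system identifier is interpreted relative to the location of the document that contains it, in the same way that a relative link in a Web page is. The sketch below illustrates this with urljoin from the Python standard library; the document URL and the helper name resolve_system_id are hypothetical, chosen only for this example.

```python
from urllib.parse import urljoin

# Hypothetical location of the XML document itself (an assumption for this sketch).
document_url = "http://somewhere.com/data/employees.xml"


def resolve_system_id(base_url, system_id):
    """Resolve a DOCTYPE system identifier against the document's own URL."""
    return urljoin(base_url, system_id)


for system_id in ("employees.dtd",
                  "../dtds/employees.dtd",
                  "http://somewhere.com/dtds/employees.dtd"):
    print(resolve_system_id(document_url, system_id))
# http://somewhere.com/data/employees.dtd
# http://somewhere.com/dtds/employees.dtd
# http://somewhere.com/dtds/employees.dtd
```

An absolute URL is returned unchanged, which is why it is the safest choice when the document and the DTD live on different hosts.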

Next, we have the case of the PUBLIC identifier. This is used in formal environments to declare
that a given DTD is available to the public for shared use. Recall that XML's true power as a
syntax relates to developing languages that permit exchange of structured data between
applications and across company boundaries. The syntax is a little different:

<!DOCTYPE RootElement PUBLIC PublicID URI>

The new aspect here is the notion of a PublicID, which is a slightly involved formatted string
that identifies the source of the DTD whose path follows as the URI. This is sometimes known as
the Formal Public Identifier (FPI).
For example, I was part of a team that developed (Astronomical) Instrument Markup Language
(AIML, IML) for NASA Goddard Space Flight Center. We wanted our DTD to be available to
other astronomers. Our document type declaration (with a root element named Instrument)
was:

<!DOCTYPE Instrument PUBLIC
    "-//NASA//Instrument Markup Language 0.2//EN"
    "http://pioneer.gsfc.nasa.gov/public/iml/iml.dtd">

In this case the PublicID is:

"-//NASA//Instrument Markup Language 0.2//EN"

The URI that locates the DTD is:

http://pioneer.gsfc.nasa.gov/public/iml/iml.dtd

Let's decompose the PublicID. The leading hyphen indicates that NASA is not a standards body.
If it were, a plus sign would replace the hyphen, except if the standards body were ISO, in which
case the string "ISO" would appear. Next we have the name of the organization responsible for
the DTD (NASA, in this case), surrounded with double slashes, then a short free-text description
of the DTD ("Instrument Markup Language 0.2"), double slashes, and a two-character language
identifier ("EN" for English, in this case).
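
The same decomposition can be sketched in a few lines of Python. The PublicId type and the parse_public_id function are names we made up for this illustration, and the code assumes the simple four-field layout just described rather than the full formal public identifier syntax.

```python
from typing import NamedTuple


class PublicId(NamedTuple):
    registration: str   # "-", "+", or "ISO", as described in the text
    owner: str          # organization responsible for the DTD
    description: str    # short free-text description of the DTD
    language: str       # two-character language identifier


def parse_public_id(public_id):
    """Split a PublicID into its four double-slash-separated fields."""
    parts = public_id.split("//")
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, found {len(parts)}")
    return PublicId(*parts)


fpi = parse_public_id("-//NASA//Instrument Markup Language 0.2//EN")
print(fpi.registration, fpi.owner, fpi.language)   # - NASA EN
print(fpi.description)                             # Instrument Markup Language 0.2
```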
Since the XML prolog is the combination of the XML declaration and the document type
declaration, for our NASA example the complete prolog is:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>

<!DOCTYPE Instrument PUBLIC
    "-//NASA//Instrument Markup Language 0.2//EN"
    "http://pioneer.gsfc.nasa.gov/public/iml/iml.dtd">
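
To see how a parser reports the two halves of a prolog, consider this sketch based on the expat bindings in the Python standard library. The function name read_prolog and the empty Instrument element that completes the document are assumptions made for the example. Expat is a non-validating parser, so the DTD is never actually fetched here.

```python
import xml.parsers.expat


def read_prolog(document):
    """Collect what the parser reports for the XML declaration and the DOCTYPE."""
    found = {}

    def on_xml_decl(version, encoding, standalone):
        # standalone is 1 for "yes", 0 for "no", and -1 when it is omitted.
        found["declaration"] = (version, encoding, standalone)

    def on_doctype(name, system_id, public_id, has_internal_subset):
        found["doctype"] = (name, public_id, system_id)

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_xml_decl
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(document, True)
    return found


doc = (
    '<?xml version="1.0" encoding="UTF-8" standalone="no"?>\n'
    '<!DOCTYPE Instrument PUBLIC\n'
    '    "-//NASA//Instrument Markup Language 0.2//EN"\n'
    '    "http://pioneer.gsfc.nasa.gov/public/iml/iml.dtd">\n'
    '<Instrument/>'
)
info = read_prolog(doc)
print(info["declaration"])
print(info["doctype"])
```

The first handler receives the three pieces of the XML declaration, and the second receives the root element name, the PublicID, and the URI from the document type declaration.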

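As another example, let's consider a common case involving DTDs from the W3C, such as those
for XHTML 1.0.

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

The XHTML Basic 1.0 PublicID ("-//W3C//DTD XHTML Basic 1.0//EN") is similar but not
identical to the XHTML 1.0 case, and of course the DTD is different since it's a different
language.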
If you noticed that the NASA example uses uppercase for the encoding value UTF-8 and the
W3C examples use lowercase, you may have been bothered because that is inconsistent with
what we learned about the case-sensitive value for the standalone attribute. The explanation
is that although element and attribute names are always case-sensitive, attribute values may
or may not be. The XML 1.0 Recommendation says that processors should match encoding
names in a case-insensitive way, whereas the standalone value must be literally "yes" or "no".
A reasonable rule of thumb is that if the possible attribute values are easily enumerated (i.e.,
"yes" or "no", or other relatively short lists of choices), then case probably matters.
NOTE

DTD-related keywords such as DOCTYPE, PUBLIC, and SYSTEM must be uppercase. XML-related
attribute names such as version, encoding, and standalone must be lowercase.

Document Body
The document body, or instance, is the bulk of the information content of the document.
Whereas across multiple instances of a document of a given type (as identified by
the DOCTYPE) the XML prolog will remain constant, the document body changes with each
document instance (in general). This is because the prolog defines (either directly or indirectly)
the overall structure while the body contains the real instance-specific data. Comparing this to
data structures in computer languages, the DTD referenced in the prolog is analogous to
a struct in the C language or a class definition in Java, and the document body is analogous to a
runtime instance of the struct or class.
Because the document type declaration specifies the root element, this must be the first
element the parser encounters. If any other element but the one identified by
the DOCTYPE line appears first, the document is immediately invalid.
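
The rule that the root element must match the name in the document type declaration can be checked mechanically. The following sketch uses the expat bindings in the Python standard library; the function name root_matches_doctype is our own. Because expat does not validate, it parses a mismatched document without complaint, so the comparison has to be made explicitly.

```python
import xml.parsers.expat


def root_matches_doctype(document):
    """Return True if the first element has the name announced by the DOCTYPE."""
    names = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        names["doctype"] = name

    def on_start(name, attributes):
        # Only the first start tag (the root element) is recorded.
        names.setdefault("root", name)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start
    parser.Parse(document, True)
    # A document without a DOCTYPE announces no root, so it never matches.
    return "doctype" in names and names["doctype"] == names.get("root")


print(root_matches_doctype(
    '<!DOCTYPE Employees SYSTEM "employees.dtd"><Employees/>'))   # True
print(root_matches_doctype(
    '<!DOCTYPE Employees SYSTEM "employees.dtd"><PriceList/>'))   # False
```

A validating parser performs this check for you, along with all the other constraints expressed in the DTD.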
Listing 3-1 shows a very simple XHTML 1.0 document. The DOCTYPE is "html" (not "xhtml"), so
the document body begins with <html ....> and ends with </html>.
Listing 3-1 Simple XHTML 1.0 Document with XML Prolog and Document Body

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE html
    PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>XHTML 1.0</title>
  </head>
  <body>
    <h1>Simple XHTML 1.0 Example</h1>
    <p>See the <a href=
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">DTD</a>.</p>
  </body>
</html>

Markup, Character Data, and Parsing

An XML document contains text characters that fall into two categories: either they are part of
the document markup or part of the data content, usually called character data, which simply
means all text that is not part of the markup. In other words, XML text consists of intermingled
character data and markup. Let's revisit an earlier fragment.

<Address>
<Street>123 Milky Way</Street>
<City>Columbia</City>
<State>MD</State>
<Zip>20777</Zip>
</Address>

The character data comprises the four strings "123 Milky Way", "Columbia", "MD", and
"20777"; the markup comprises the start and end tags for the five
elements Address, Street, City, State, and Zip. Note that this is similar, but not identical, to what
we previously called content. For example, although each chunk of character data is the
content of a particular element, the content of the Address element is all of its child
elements. We can think of each piece of character data as belonging directly to the element
that contains it and indirectly to Address. (In fact, in some XML applications such as XSLT, if we
ask for the text content of Address, we'll get the concatenation of all the individual strings.)
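
The difference between direct and indirect character data shows up clearly in code. This sketch uses ElementTree from the Python standard library rather than XSLT; the helper name all_text is our own, and the fragment is written on one line (an assumption of this example) so that no white space between tags appears in the result.

```python
import xml.etree.ElementTree as ET

address = ET.fromstring(
    "<Address>"
    "<Street>123 Milky Way</Street>"
    "<City>Columbia</City>"
    "<State>MD</State>"
    "<Zip>20777</Zip>"
    "</Address>"
)


def all_text(element):
    """Concatenate the character data of an element and all its descendants."""
    return "".join(element.itertext())


print(address.find("City").text)   # Columbia
print(address.text)                # None: Address directly contains no character data
print(all_text(address))           # 123 Milky WayColumbiaMD20777
```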
The markup itself can be divided into a number of categories, as per section 2.4 of the XML 1.0
specification.
• start tags and end tags (e.g., <Address> and </Address> )
• empty-element tags (e.g., <Divider/> )
• entity references (e.g., &footer; or %otherDTD; )
• character references (e.g., &#60; or &#62; )
• comments (e.g., <!-- a comment --> )
• CDATA section delimiters (e.g., <![CDATA[ insert code here ]]> )
• document type declarations (e.g., <!DOCTYPE ....> )
• processing instructions (e.g., <?myJavaApp numEmployees="25" location="Columbia" .... ?> )
• XML declarations (e.g., <?xml version=.... ?> )
• text declarations (e.g., <?xml encoding=.... ?> )
• any white space at the top level (before or after the root element)

We will discuss each of these markup aspects in either this chapter or the next. Note that for all
types of markup, there are some delimiters, most but not all of which involve angle brackets.

The specification states that all text that is not markup constitutes the character data of the
document. In other words, if you stripped all markup from the document, the remaining
content would be the character data. Consider this example:

<?xml version="1.0" standalone="no" ?>

<!DOCTYPE Message SYSTEM "message.dtd">
<Message mime-type="text/plain">

<From>The Kenster</From>
<To>Silly Little Cowgirl</To>
<Body>
Hi, there. How is your gardening going?
</Body>
</Message>

The character data when the markup is removed would be:

The Kenster Silly Little Cowgirl Hi, there. How is your gardening going?
In general this is essentially the text between the start and end tags, which we previously called
the content of the element, but there is a subtlety related to parsing. Depending on parser
details, the newlines after </From> and </To> might be replaced by single spaces, as shown.
Alternatively, the newlines might be preserved.
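
You can observe this subtlety directly. The sketch below, which uses the expat bindings in the Python standard library, gathers every piece of character data the parser reports for the Message example and then collapses runs of white space in a separate step. The function name character_data is our own, and the blank line inside the example has been dropped for brevity.

```python
import xml.parsers.expat


def character_data(document):
    """Return all character data reported by the parser, exactly as delivered."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = chunks.append
    parser.Parse(document, True)
    return "".join(chunks)


message = """<?xml version="1.0" standalone="no" ?>
<!DOCTYPE Message SYSTEM "message.dtd">
<Message mime-type="text/plain">
<From>The Kenster</From>
<To>Silly Little Cowgirl</To>
<Body>
Hi, there. How is your gardening going?
</Body>
</Message>
"""

raw = character_data(message)
print(repr(raw))             # the newlines between the tags are preserved
print(" ".join(raw.split())) # white space collapsed to single spaces
```

With this parser the newlines are preserved, and it is the application, not the parser, that decides whether to replace them with single spaces.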
Parsing is the process of splitting up a stream of information into its constituent pieces (often
called tokens). In the context of XML, parsing refers to scanning an XML document (which need
not be a physical file—it can be a data stream) in order to split it into its various markup and
character data, and more specifically, into elements and their attributes. XML parsing reveals
the structure of the information since the nesting of elements implies a hierarchy. It is possible
for an XML document to fail to parse completely if it does not follow the well-formedness
rules described in the XML 1.0 Recommendation. A successfully parsed XML document may be
either well-formed (at a minimum) or valid, as discussed in detail later in this chapter and the
next.
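
Both ideas, the hierarchy revealed by parsing and the failure to parse at all, can be sketched with ElementTree from the Python standard library. The helper names outline and try_parse are our own inventions for this example.

```python
import xml.etree.ElementTree as ET


def outline(element, depth=0):
    """List (depth, tag) pairs showing how the elements are nested."""
    rows = [(depth, element.tag)]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows


def try_parse(text):
    """Return the root element, or None if the text is not well-formed."""
    try:
        return ET.fromstring(text)
    except ET.ParseError:
        return None


good = "<Address><Street>123 Milky Way</Street><City>Columbia</City></Address>"
bad = "<Address><Street>123 Milky Way</Address>"   # Street is never closed

for depth, tag in outline(try_parse(good)):
    print("  " * depth + tag)
print(try_parse(bad))   # None
```

Note that this parser checks only well-formedness; it does not validate the document against a DTD.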

There is a subtlety about processing character data. During the parsing process, if there is
markup that contains entity references, the markup will be converted into character data. A
typical example from XHTML would be:

<p>&quot;AT&amp;T is a winning company,&quot; he said.</p>

After the parser substitutes for the entities, the resultant character data is:

"AT&T is a winning company," he said.
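
The substitution is easy to watch in action. In this sketch, which uses ElementTree from the Python standard library, the source string spells out the entity references (&quot; for the quotation marks and &amp; for the ampersand), and the parser hands back the resulting character data. The variable names are our own.

```python
import xml.etree.ElementTree as ET

# The markup as it must be written, with entity references spelled out.
source = "<p>&quot;AT&amp;T is a winning company,&quot; he said.</p>"

paragraph = ET.fromstring(source)
print(paragraph.text)   # "AT&T is a winning company," he said.

# A bare ampersand in the source, by contrast, is not well-formed.
try:
    ET.fromstring("<p>AT&T</p>")
    bare_ampersand_accepted = True
except ET.ParseError:
    bare_ampersand_accepted = False
print(bare_ampersand_accepted)   # False
```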

Text that the parser scans for markup and entity references in this way is parsed character
data, which is referred to as #PCDATA in DTDs and always refers to the textual content of
elements. In DTDs the keyword CDATA, by contrast, is used only as an attribute type, for
attribute values that are plain character strings.
