XML Faqs


1. What Is XML?

XML stands for Extensible Markup Language. It is a text-based meta-markup
language defined by the World Wide Web Consortium (W3C), and it has become a
widely used standard for data interchange on the Web.

2. Why do we need to learn XML?

Because XML is a meta-markup language, it lets you create your own markup
language.

XML data is easy to exchange and customize, and XML documents are
self-describing and structured, which makes them easy to integrate with
other systems.

3. What does XML look like?

<?xml version="1.0" encoding="UTF-8"?>
<DOCUMENT>
<MESSAGE>
Hello guys!
</MESSAGE>
</DOCUMENT>

10. What is the difference between HTML and XML?

HTML and XML are both based on the Standard Generalized Markup Language
(SGML), but

o HTML uses predefined tags, whereas XML uses user-defined tags that can
identify data relationships such as a hierarchical structure (elements,
subelements, sub-subelements, and so on).
o HTML specifies how content is presented, whereas XML identifies what the
content is.
o Unlike an HTML document, an XML document must be well-formed. XML data is
searchable, independent of presentation format, and reusable.

11. What is an XML attribute?

An attribute is additional information attached to a tag. For example:

<message to="[email protected]" from="[email protected]"
subject="Discuss XML issues">
<text>
here we go
</text>
</message>

Here "to", "from" and "subject" are attributes of the "message" tag.
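
As a rough illustration of how a program sees these attributes, the following
Python sketch parses a similar message with the standard-library
xml.etree.ElementTree module. The addresses alice@example.com and
bob@example.com are placeholders assumed for this example; they are not values
from the original document.

```python
import xml.etree.ElementTree as ET

# Placeholder addresses, assumed for illustration only.
doc = """<message to="alice@example.com" from="bob@example.com"
         subject="Discuss XML issues">
  <text>here we go</text>
</message>"""

message = ET.fromstring(doc)

# Attributes are exposed as a dictionary on the element.
print(message.tag)
print(message.attrib["subject"])

# get() returns a default when the attribute is absent.
print(message.get("cc", "n/a"))

for name, value in message.attrib.items():
    print(name, "=", value)
```

Element content (here the text element) and attributes are read through
different calls, which mirrors the distinction the question draws.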


12. How do you deal with double quotes in an attribute value?

Enclose the attribute value in single quotes when it contains double quotes,
or write each double quote as the entity reference &quot;. For example,

<quote txt='he said, "hello guys" ' />

13. What is an empty tag?

An empty tag marks an element that has no content, which is why it is called
an "empty tag". It can be written as a single tag ending in "/>" or as a start
tag immediately followed by its end tag, and it may carry attributes:

<info/>
Note: not </info>
or
<tg></tg>
or
<Greeting text="hello guys" />

14. How are comments written in XML?

Comments are ignored by XML parsers. A program never sees them unless you
activate special settings in the parser. XML comments look very much like
HTML comments.

<!-- this is a comment -->

Comments must not come before the XML declaration or inside a tag. The string
"--" must not appear inside a comment. You can use comments to hide or remove
parts of a document as long as the enclosed parts do not themselves contain
any comments.

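15. How to deal with special characters in XML like < or &, etc.?

As in HTML, replace them with entity references such as &lt; and &amp;, even
inside embedded JavaScript code. XML predefines five entities: &lt; (<),
&gt; (>), &amp; (&), &apos; (') and &quot; (").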
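
When XML is generated by a program, the escaping is usually delegated to a
library. The sketch below uses Python's standard xml.sax.saxutils helpers; the
sample strings are made up for the example.

```python
from xml.sax.saxutils import escape, quoteattr, unescape

raw = 'if (a < b && name == "x") { run(); }'

# escape() replaces &, < and > so the text is safe as element content.
text = escape(raw)
print(text)

# quoteattr() escapes the value and wraps it in a suitable pair of quotes.
attr = quoteattr('he said, "hello guys"')
print("<quote txt=" + attr + " />")

# unescape() reverses escape().
restored = unescape(text)
print(restored == raw)
```

Because the attribute value contains double quotes but no single quotes,
quoteattr() chooses single quotes as the delimiter, the same technique shown
for question 12.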

16. What is the XML prolog?

The prolog comes at the very beginning of an XML document, before the root
element. It starts with the XML declaration, which identifies the file as
XML, and it may also contain a document type declaration. The XML declaration
looks like one of the following:

<?xml version="1.0"?>
or
<?xml version='1.0' encoding='utf-8'?>
or
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>

The declaration contains some or all of three attributes:

o version -- required whenever the declaration is present
o encoding -- optional; parsers assume UTF-8 or UTF-16 when it is omitted
o standalone -- optional; "yes" or "no"

It is good practice to include the XML declaration whenever you create an XML
file, though it is optional.


17. What is Unicode?

Unicode assigns a number (a code point) to every character, in the range 0 to
0x10FFFF (1,114,111). Early versions of Unicode fit in 2 bytes, covering 0 to
65,535; that range is now called the Basic Multilingual Plane.
ASCII (American Standard Code for Information Interchange) is a 7-bit code
covering 0 to 127. Unicode code points 0 to 127 correspond to the ASCII codes,
and code points 0 to 255 correspond to ISO-8859-1 (Latin-1). Unicode can
therefore include the symbols used in character and ideograph sets worldwide.

UTF-8 is a variable-length encoding of Unicode that uses one to four 8-bit
bytes per character; ASCII characters take a single byte. UTF-16 encodes
UCS (Universal Character Set) characters using one or two 16-bit units, that
is, 2 or 4 bytes per character.
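
The difference between the encodings is easy to check. The following Python
sketch prints how many bytes a few sample characters occupy in UTF-8 and
UTF-16; the helper name byte_lengths is introduced here only for
illustration.

```python
def byte_lengths(ch):
    """Return (UTF-8 length, UTF-16 length) in bytes for one character."""
    # "utf-16-be" is used so that no byte-order mark is counted.
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-be"))

# An ASCII letter, a Latin-1 letter, the euro sign, and an emoji
# that lies outside the Basic Multilingual Plane.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    utf8_len, utf16_len = byte_lengths(ch)
    print(f"U+{ord(ch):04X}: UTF-8 {utf8_len} byte(s), UTF-16 {utf16_len} byte(s)")
```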

18. How to write XML processing instructions?

XML processing instructions give commands or information to an application
that is processing the XML data; they are application-specific. A processing
instruction has the following format:

<?target instructions?>

where <? marks the start and ?> the end of the processing instruction,
"target" is the name of the application that is expected to do the
processing, and "instructions" is a string of characters that embodies the
information or commands for the application to process. Note: there cannot be
any space between the initial <? and the target identifier. The
"instructions" begin after the first space. Fully qualifying the target with
the complete web-unique package prefix is recommended. For readability, use a
colon (:) after the target name, as in

<?my.subdirectory.myprograme: query="..."?>

Be aware that the Namespaces in XML recommendation does not allow colons in
processing instruction targets, so a namespace-aware parser may reject this
form.

An XML file may have multiple processing instructions, so that different
applications can each be given their own instructions.
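
To see how an application receives a processing instruction, here is a small
Python sketch using the standard xml.dom.minidom module. It uses the
well-known xml-stylesheet target; the file name style.css is an assumption
made for the example.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<?xml version="1.0"?>'
    '<?xml-stylesheet type="text/css" href="style.css"?>'
    '<DOCUMENT><MESSAGE>Hello guys!</MESSAGE></DOCUMENT>'
)

# Processing instructions in the prolog are children of the document node.
# The XML declaration itself is not reported as a processing instruction.
pis = [
    (node.target, node.data)
    for node in doc.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]

for target, data in pis:
    print("target:", target)
    print("instructions:", data)
```

The parser only splits the instruction into target and data; interpreting the
data string is left entirely to the application.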

19. How does XML treat whitespace?

Spaces, carriage returns, line feeds and tabs are all treated as whitespace
in XML. XML uses the UNIX convention for line endings: a parser normalizes
every line break, whether a DOS-style pair of ASCII codes 13 and 10 or a lone
carriage return (13), to a single linefeed character (ASCII code 10) before
passing the text to the application.

To signal that whitespace in an element should be preserved, set the special
attribute xml:space to "preserve"; the value "default" tells the application
to use its default whitespace handling. In a document that is validated, the
attribute must also be declared in the DTD.
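
The line-ending rule can be observed directly. This Python sketch feeds text
with DOS and old Mac line endings to the standard-library ElementTree parser;
the note element is a made-up example.

```python
import xml.etree.ElementTree as ET

# \r\n is a DOS line ending, a lone \r is an old Mac line ending.
source = "<note xml:space='preserve'>line one\r\nline two\rline three</note>"
note = ET.fromstring(source)

# Both kinds of line break arrive as a single linefeed.
print(repr(note.text))

# The xml: prefix is bound to a fixed namespace, so ElementTree reports
# the attribute under that namespace name.
XML_NS = "{http://www.w3.org/XML/1998/namespace}"
print(note.get(XML_NS + "space"))
```

Note that xml:space is only a hint: the parser passes the whitespace through
either way, and the application decides what to do with it.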

20. Are XML tags case-sensitive?

Yes. XML tags are case-sensitive. The start and end tags must match
exactly.
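
A quick way to confirm this is to hand a parser two nearly identical
documents. The helper is_well_formed below is a hypothetical name used only
for this Python sketch.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Start and end tags match exactly.
print(is_well_formed("<Message>hi</Message>"))

# The end tag differs only in case, so the parser reports a mismatch.
print(is_well_formed("<Message>hi</message>"))
```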

21. How can a browser display an XML file?

There are two ways to do so:

o Use a style sheet, such as Cascading Style Sheets (CSS) or the Extensible
Stylesheet Language (XSL), to tell the browser how the content of the
elements should be displayed.
o Use a programming language, such as Java or JavaScript, to process the XML
document in code.

22. Which is better for storing data, elements or attributes?

There is no clear-cut answer; it depends on the case. It is worth noting,
however, that too many attributes make documents hard to read and that
attribute names must be unique within an element. As a rule of thumb, if a
tag has more than 4 attributes, think about breaking it up into a number of
enclosed tags.


23. How to use JavaScript to display XML document?

To illustrate it, we use a simple XML document hello.xml as follows:

<?xml version="1.0" encoding="UTF-8"?>


<DOCUMENT>
<MESSAGE>
Hello guys!
</MESSAGE>
</DOCUMENT>

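In an HTML file called getxml.html (this example relies on XML data islands,
that is, the <xml> element, and on document.all, both of which are specific
to Internet Explorer):

<html>
<head>
<xml id="Myxml" src="hello.xml"></xml>
<script type="text/javascript">
// Read the text of the MESSAGE element from the XML data island.
function getData() {
  var xmldoc = document.all("Myxml").XMLDocument;
  var node = xmldoc.documentElement;   // the DOCUMENT element
  var nodeMsg = node.firstChild;       // the MESSAGE element
  var output = "From hello.xml file, you get -- " +
      nodeMsg.firstChild.nodeValue;
  document.getElementById("message").innerHTML = output;
}
</script>
</head>
<body>
<h1 align="center">Get data from XML document</h1>
Here comes: <div id="message"></div>
<input type="button" value="get data from xml"
onclick="getData()">
</body>
</html>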

Load the page in your browser and click the button to see the message appear
after "Here comes:". Press the "F5" key on your keyboard to refresh the
display.
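
The same DOM calls exist outside the browser. As a minimal sketch, assuming
the hello.xml content shown above, the following Python code reads the
message with the standard xml.dom.minidom module; the function name
get_message is hypothetical.

```python
from xml.dom import minidom

HELLO_XML = """<?xml version="1.0" encoding="UTF-8"?>
<DOCUMENT>
<MESSAGE>
Hello guys!
</MESSAGE>
</DOCUMENT>"""

def get_message(xml_text):
    """Return the text inside the first MESSAGE element."""
    doc = minidom.parseString(xml_text)
    root = doc.documentElement
    # minidom keeps the whitespace between tags as text nodes, so the
    # element is looked up by name rather than through firstChild.
    message = root.getElementsByTagName("MESSAGE")[0]
    return message.firstChild.nodeValue.strip()

print("From hello.xml file, you get -- " + get_message(HELLO_XML))
```

Looking the element up by name avoids depending on whether a given parser
reports whitespace-only text nodes.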

24. What is a DTD tag?

A DTD tag is a declaration used in a Document Type Definition (DTD). It
starts with <! and ends with >. DTD declarations tell the parser what
structure a valid XML file must have.

25. What is CDATA?

A CDATA section works like <pre>...</pre> in HTML, only more so: all
whitespace in a CDATA section is significant, and characters in it are not
interpreted as XML markup. A CDATA section starts with <![CDATA[ and ends
with ]]>. If a section of text contains many & or < characters, you can mark
it as CDATA so that the XML processor does not parse it as markup.
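
To illustrate, the Python sketch below parses the same script text twice,
once inside a CDATA section and once with entity references, and shows that
the application receives identical text. The script element and its content
are invented for the example.

```python
import xml.etree.ElementTree as ET

with_cdata = """<script><![CDATA[
if (a < b && b < c) { swap(a, c); }
]]></script>"""

with_entities = """<script>
if (a &lt; b &amp;&amp; b &lt; c) { swap(a, c); }
</script>"""

cdata_text = ET.fromstring(with_cdata).text
entity_text = ET.fromstring(with_entities).text

# The CDATA markers are removed by the parser; only the text remains.
print(cdata_text)
print(cdata_text == entity_text)
```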

26. What is the basic syntax for the document type declaration?

The basic syntax:

<!DOCTYPE root-name [DTD]>

where the <!DOCTYPE> declaration is part of the document's prolog, root-name
is the name of the root tag, and DTD is a document type definition, which can
be internal or external.

The document type declaration may have the following forms:

<!DOCTYPE root-name [DTD]>
<!DOCTYPE root-name SYSTEM URL>
<!DOCTYPE root-name SYSTEM URL [DTD]>
<!DOCTYPE root-name PUBLIC identifier URL>
<!DOCTYPE root-name PUBLIC identifier URL [DTD]>

27. What does an internal DTD look like?

<?xml version="1.0" standalone="yes"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (CUSTOMER)*>

]>
<DOCUMENT>
 <CUSTOMER>
....
</DOCUMENT>

37. What is the syntax of an element declaration?

<!ELEMENT NAME CONTENT_MODEL>

where NAME is the name of the element; CONTENT_MODEL can be set to EMPTY or
ANY, or it can hold mixed content or child elements.

39. What is the meaning of the following statement?

<!ELEMENT slideshow (slide+)>

This is a DTD element declaration. The notation says that a slideshow element
consists of one or more slide elements. Without the plus sign after slide, a
slideshow would contain exactly one slide element. If the plus sign is
replaced with a question mark "?", there may be zero or one slide. If it is
replaced with an asterisk "*", there may be zero or more slide elements.

41. What syntax is used to describe multiple child elements?

For example, if a and b represent child elements:

o a+ -- one or more occurrences of a
o a* -- zero or more occurrences of a
o a? -- a or nothing
o a, b -- a followed by b
o a | b -- a or b, but not both
o (expression) -- parentheses group an expression into a unit to which any
of the above can be applied

42. What do the following lines tell us in a DTD file?

1. <!ELEMENT slideshow (slide+)>
2. <!ELEMENT slide (title, item*)>
3. <!ELEMENT title (#PCDATA)>
4. <!ELEMENT item (#PCDATA | item)*>
5. <!ELEMENT %content; >

The first line says that a slideshow element contains one or more slide
elements. The second line says that a slide element consists of a title
followed by zero or more item elements. The third line says that a title
element consists entirely of parsed character data (PCDATA). The "#" sign
that precedes PCDATA indicates that what follows is a special word. The
fourth line says that the item element is either PCDATA or an item. The
asterisk at the end says that either one can occur zero or more times in
succession. The fifth line uses %content;, which is a parameter entity
reference.
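
The operators used in content models behave much like regular-expression
operators. The Python sketch below makes that analogy concrete by translating
an element-only content model into a regular expression and testing lists of
child names against it. It is an illustration only, not how a real validating
parser works, it does not handle #PCDATA or parameter entities, and the
function names are hypothetical.

```python
import re

def model_to_regex(model):
    """Translate an element-only DTD content model into a compiled regex.

    Each child name is matched as 'name;' so that the child sequence
    ['title', 'item'] is represented by the string 'title;item;'.
    """
    parts = []
    for token in re.findall(r"[A-Za-z_][\w.-]*|[()|,+*?]", model):
        if token == "(":
            parts.append("(?:")            # grouping only, no capture
        elif token == ",":
            continue                       # a sequence is plain concatenation
        elif token in ")|+*?":
            parts.append(token)            # same meaning as in a regex
        else:
            parts.append("(?:" + re.escape(token) + ";)")
    return re.compile("".join(parts))

def children_match(model, children):
    """Return True if the list of child element names satisfies the model."""
    text = "".join(name + ";" for name in children)
    return model_to_regex(model).fullmatch(text) is not None

print(children_match("(slide+)", ["slide", "slide"]))
print(children_match("(slide+)", []))
print(children_match("(title, item*)", ["title", "item", "item"]))
print(children_match("(title, item*)", ["item", "title"]))
```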

48. What is the mixed-content model?

In a mixed-content model, the content of a tag in the XML file can be
#PCDATA (text) interleaved with any number of the listed child elements, as
in the declaration of item in the fourth line above.

49. Is a DTD definition hierarchical?

No. A DTD definition is not hierarchical: an element declaration applies
wherever that element appears. You can work around this to make your XML tags
hierarchical. For example, if you have a title for the slideshow and a title
for each slide, you can use slide-title to represent the title in a slide and
write a separate definition for slide-title. This is the so-called
"hyphenation hierarchy". Otherwise, the title definition applies to every
title in the XML file.

50. What are the special element values and how are they used?

There are two special values: ANY and EMPTY. "ANY" says that the element may
contain any other defined element, or PCDATA. "EMPTY" says that the element
has no content, as with an empty tag.

51. How to reference a DTD file?

If the DTD definition is in a separate file from the XML document, you have
to reference it from the XML document. For example, if your slideshow.dtd is
ready for use, then, in your XML document, after the XML declaration, write:

<!DOCTYPE slideshow SYSTEM "slideshow.dtd">

The above statement says that the root element slideshow uses the definitions
in slideshow.dtd. The SYSTEM identifier specifies the location of the DTD
file; a relative path is resolved against the location of the XML document.
You may also use an http:// or file:/ URL to indicate the path of the DTD
file.

You can also put definitions within the XML document itself by enclosing them
in square brackets, as in the following, in addition to (or instead of)
referring to an external DTD file.

<!DOCTYPE slideshow SYSTEM "slideshow1.dtd" [
...local subset definitions here...
]>

52. How to declare a public DTD?

Replace SYSTEM with PUBLIC, then give a public identifier followed by the URL
of the DTD.

<!DOCTYPE slideshow PUBLIC "-//somewhere//customized XML Version 1.0//EN"
"http://www.somewhere.com/someones/slideshow1.dtd" >

53. What is the meaning of ATTLIST? What do the following statements tell us?

ATTLIST means attribute list. The name that follows ATTLIST specifies the
element for which the attributes are being defined. For example, if you have
a slideshow tag with title, date and author attributes, you may write:

<!ELEMENT slideshow (slide+)>
<!ATTLIST slideshow
title CDATA #REQUIRED
date CDATA #IMPLIED
author CDATA "unknown"
>
<!ELEMENT slide (title, item*)>

The ATTLIST declaration begins the series of attribute definitions. The
slideshow element has three attributes, named title, date and author. CDATA
is the type of each attribute; it means unparsed character data, that is, a
text string.

#REQUIRED means the attribute value must be specified in the document.
#IMPLIED means the value need not be specified in the document. A quoted
value such as "unknown" is the default used when the document does not
specify the attribute.

54. What does the & sign mean in a DTD file?

The & sign begins an entity reference: it is followed by the entity name and
must be terminated by a semicolon ";". For example:

<!DOCTYPE slideshow SYSTEM "slideshow.dtd" [
<!ENTITY copyright "&#169;" >
...
]>

...
&copyright; ...

Wherever &copyright; is parsed, it is replaced with the copyright sign ©.
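
The substitution can be observed with any parser that reads the internal DTD
subset. Here is a minimal Python sketch using the standard-library
ElementTree module; the footer element and the word "notice" are assumptions
made for the example, and only an internal subset is used so that no external
slideshow.dtd file is needed.

```python
import xml.etree.ElementTree as ET

doc = """<!DOCTYPE slideshow [
<!ENTITY copyright "&#169;">
]>
<slideshow><footer>&copyright; notice</footer></slideshow>"""

root = ET.fromstring(doc)

# The parser has already replaced &copyright; with the character U+00A9.
footer = root.find("footer").text
print(footer)
```

Be careful when parsing documents from untrusted sources: entity expansion is
performed automatically, and deeply nested entity definitions can be abused
to exhaust memory.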

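55. How to declare a parameter entity reference?

Use <!ENTITY % name "value"> to declare a parameter entity, and reference it
inside the DTD with % as the start and ; as the end, as in %name;. A general
entity, by contrast, is declared without the % and is referenced in the
document content with & and ;. In the following example, TODAY is a general
entity.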

<?xml version='1.0' standalone="yes"?>

<!DOCTYPE DOCUMENT [
<!ELEMENT GREETING (#PCDATA)>
<!ELEMENT MESSAGE (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>
<!ENTITY TODAY "NOV 1, 2003">
]>

<DOCUMENT>
<GREETING>
HELLO FROM HTML
</GREETING>
<MESSAGE>
WELCOME TO THE WILD WORLD
</MESSAGE>
<DATE> &TODAY; </DATE>
</DOCUMENT>

Save the above file as greeting.xml and open it in your browser. You should
see the document with the <DATE> element displaying "NOV 1, 2003" in place of
&TODAY;.

Note: Your browser may not support XML. With MS IE 5.5 or later, or NS 5.0 or
later, you should be able to see it.

56. How to declare an image tag in a DTD file?

You can declare an image tag in the following form:

1. <!ELEMENT image EMPTY>
2. <!ATTLIST image
alt CDATA #IMPLIED
src CDATA #REQUIRED
type CDATA "image/gif"
>
3. <!NOTATION GIF SYSTEM "image/gif">
4. <!ENTITY some SYSTEM "image.gif" NDATA GIF>

Line 1 declares image as an empty element. Line 2 declares the attributes of
an image tag. The type attribute cannot be declared as an enumeration such as
("image/gif", "image/jpeg"), because enumerated values must be name tokens
and a name token cannot contain a slash. Line 3 declares a notation named GIF
that stands for the image/gif MIME type. Line 4 creates an external unparsed
entity named some that refers to the external image file, image.gif.

57. What is a conditional section?

A conditional section lets you choose which parts of a DTD are included or
ignored. It starts with <![ followed by the keyword INCLUDE or IGNORE and a
second [, and it ends with ]]>. Conditional sections may appear only in an
external DTD. For example, if you want to use different versions of a DTD for
XML documents and SGML documents, you may write:

<![ INCLUDE [
... XML-only definitions
]]>
<![ IGNORE [
... SGML-only definitions
]]>
... common definitions

58. What categories of entities are there in a DTD?

o Internal entity: An entity whose content is given in its declaration,
within the document itself.
o External entity: An entity whose content is stored in another file.
o General entity: An internal or external entity that is referenced from the
document content.
o Parameter entity: An entity that contains DTD specifications and is
referenced from within the DTD.
o Parsed entity: An entity that contains XML (text and markup) and is
therefore parsed.
o Unparsed entity: An entity that contains non-XML data, such as binary
image data.


59. What is xmlns?

xmlns stands for XML namespace. It is an attribute of a tag that associates
the tag with a namespace name, so that tags with the same name from different
vocabularies do not conflict. For example, each of the following places the
title or slideshow element in a designated namespace.

<!ELEMENT title (%inline;)*>
<!ATTLIST title
xmlns CDATA #FIXED "http://www.example.com/slideshow">
...
or
<title xmlns="http://www.example.com/slideshow">
Overview
</title>
or
<SL:slideshow xmlns:SL='http://www.example.com/slideshow'
...>
...
</SL:slideshow>
or
<SL:slideshow xmlns:SL='http://www.example.com/slideshow'
...>
...
<slide>
<SL:title>Overview</SL:title>
</slide>
...
</SL:slideshow>
or
<SL:slideshow xmlns:SL='http://www.example.com/slideshow'
xmlns:xhtml='urn:...'>
...
</SL:slideshow>

Here we use "http:"; you may use "urn:" instead.
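
A namespace-aware parser identifies an element by its namespace name plus its
local name; the prefix is only shorthand. The Python sketch below, based on
the SL:slideshow example above, shows this with the standard-library
ElementTree module. The prefix "s" in the lookup is deliberately different
from "SL" to show that prefixes need not match.

```python
import xml.etree.ElementTree as ET

doc = """<SL:slideshow xmlns:SL="http://www.example.com/slideshow">
  <slide>
    <SL:title>Overview</SL:title>
  </slide>
</SL:slideshow>"""

root = ET.fromstring(doc)

# ElementTree writes qualified names as {namespace}localname.
print(root.tag)

# Any prefix can be used for lookups as long as it maps to the same name.
ns = {"s": "http://www.example.com/slideshow"}
title = root.find("slide/s:title", ns)
print(title.text)

# An unqualified lookup does not find the namespaced title element.
print(root.find("slide/title"))
```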

60. What is xsi?

"xsi" is the conventional prefix for the XML Schema instance namespace,
http://www.w3.org/2001/XMLSchema-instance. It qualifies attributes such as:

xsi:noNamespaceSchemaLocation='YourSchemaDefinition.xsd'

This attribute specifies the schema to use for elements in the document that
are not in any namespace.

61. Where to use XML?

XML can be used in many places:

o Data representation on the Web, especially in Java client/server web
applications.
o Data interchange in all sorts of transactions, as long as both sides agree
on the format.
o Document-Driven Programming (DDP)
o Binding
o Archiving

62. How is XML related to other technologies?

Many technologies are related to XML, directly or indirectly:

o SAX -- Simple API for XML: an event-driven API that reads XML data
serially, without building the whole document in memory.
o DOM -- Document Object Model: converts an XML document into a collection
of objects (a tree) in memory.
o JDOM -- Java DOM: processes more data-oriented structures.
o dom4j -- DOM for Java: a factory-based implementation, easier to modify
for complex, special-purpose apps.
o DTD -- Document Type Definition: specifies the kinds of tags that can be
included in an XML document.
o Namespace -- lets you write an XML document that uses two or more sets of
XML tags in modular fashion.
o XSL -- Extensible Stylesheet Language: specifies how XML data is displayed
or transformed; XML itself specifies how to identify data, not how to
display it.
o XSLT -- Extensible Stylesheet Language for Transformations: specifies
what to convert an XML tag into.
o XSL-FO -- Extensible Stylesheet Language for Formatting Objects:
specifies page layout, such as how content flows across areas on a page.
o SAAJ -- SOAP with Attachments API for Java.
o JAXR -- Java API for XML Registries, used to look up and find web
services.
o etc.

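
The difference between SAX and DOM is easiest to see side by side. The list
above describes Java APIs; as a language-neutral illustration, this Python
sketch uses the standard xml.sax and xml.dom.minidom modules on a small
made-up document. The class name MessageCounter is hypothetical.

```python
import xml.sax
from xml.dom import minidom

XML = b"<DOCUMENT><MESSAGE>Hello</MESSAGE><MESSAGE>guys!</MESSAGE></DOCUMENT>"

class MessageCounter(xml.sax.ContentHandler):
    """SAX style: react to events while the parser streams the document."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        # Called once for every start tag the parser encounters.
        if name == "MESSAGE":
            self.count += 1

handler = MessageCounter()
xml.sax.parseString(XML, handler)
print("SAX counted", handler.count, "MESSAGE elements")

# DOM style: build the whole tree first, then navigate it freely.
doc = minidom.parseString(XML)
messages = [m.firstChild.nodeValue for m in doc.getElementsByTagName("MESSAGE")]
print("DOM found", messages)
```

SAX keeps memory use low for large inputs, while DOM is more convenient when
the program needs random access to the document.
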
63. What are the major subcomponents of XSL?

The Extensible Stylesheet Language (XSL) has three major subcomponents:

1. XSL-FO -- The largest subcomponent. It describes font sizes, page layouts,
and how information "flows" from one page to another.
2. XSLT -- A transformation language that lets you define a transformation
from XML into some other format, such as HTML, a different XML structure,
plain text or another document format.
3. XPath -- A specification that lets you specify a path to an element.

2. What is transformation language in XSL?

The Extensible Stylesheet Language (XSL) has two parts: a transformation
language (XSLT) and a formatting language.

The transformation language lets you transform documents into different forms,
while the formatting language actually formats and styles documents in various
ways.

3. What is JAX-RPC?

JAX-RPC stands for Java API for XML-based RPC. It is an API for building
Web services and clients that use RPC and XML.

In JAX-RPC, a remote procedure call is represented by an XML-based protocol
such as SOAP. The SOAP specification defines the envelope structure, encoding
rules, and conventions for representing remote procedure calls and responses.
These calls and responses are transmitted as SOAP messages (XML files) over
HTTP.

With JAX-RPC, the developer does not generate or parse SOAP messages. It is
the JAX-RPC runtime system that converts the API calls and responses to and
from SOAP messages.

4. What is a value type?

A value type is a class whose state may be passed between a client and a
remote service as a method parameter or return value. For example, an account
class may have account number, account owner and amount fields. This
information may be passed between client and server as a parameter of a
deposit method or as the return value of an account query method.

5. What rules must a value type follow?

To be supported by JAX-RPC, a value type must conform to the following rules:

o It must have a public default constructor.
o It must not implement (either directly or indirectly) the java.rmi.Remote
interface.
o Its fields must be supported JAX-RPC types.
o The value type may contain public, private, or protected fields. The fields
of a value type must meet these requirements:
    o A public field cannot be final or transient.
    o A non-public field must have corresponding getter and setter
    methods.


64. What is SAAJ?

SAAJ stands for SOAP (Simple Object Access Protocol) with Attachments API
for Java. SAAJ is used mainly for the SOAP messaging that goes on behind the
scenes in JAX-RPC and JAXR implementations.

Secondarily, it is an API that developers can use when they choose to write
SOAP messaging applications directly rather than using JAX-RPC.

65. What is XML Registry?

An XML registry is an infrastructure that enables the building, deployment,
and discovery of Web services. It is a neutral third party that facilitates
dynamic and loosely coupled business-to-business (B2B) interactions. A
registry is available to organizations as a shared resource, often in the
form of a Web-based service. There are many kinds of specifications for XML
registries, including the ebXML Registry and Repository and Universal
Description, Discovery, and Integration (UDDI).

66. What is JAXR?

JAXR stands for Java API for XML Registries. It enables Java software
programmers to use an API to access a variety of XML registries. The current
version of the JAXR specification can be found at
http://java.sun.com/xml/downloads/jaxr.html

67. What is XHTML?

XHTML is an application of XML that tries to make XML documents look and
act like HTML documents. The XHTML specification is a reformulation of
HTML 4.0 in XML. The following is an example of XHTML code.

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"
lang="en">
<head><title>Welcome to see xhtml</title></head>
<body>
<h1 align="center"> Welcome to XHTML!</h1>
</body>
</html>


68. How to display data from an XML file in a tabular format?

Let's use the customer.xml file as the source and use table tags to display
it. The <xml> data island and the datasrc and datafld attributes are Internet
Explorer data-binding features.

<html>
<head><title>Here we display data from XML</title></head>

<xml src="customer.xml" ID="customers"></xml>

<body>
<h1 align="center">Here we display data from XML</h1>
<center>
<table datasrc="#customers" cellspacing="15">
  <thead>
    <tr>
      <th>Name</th>
      <th>Customer ID</th>
      <th>Purchase</th>
      <th>Department</th>
      <th>Product</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>
        <span datafld="NAME"></span>
      </td>
      <td>
        <span datafld="CUSTOMER_ID"></span>
      </td>
      <td>
        <span datafld="PURCHASE_DATE"></span>
      </td>
      <td>
        <span datafld="DEPARTMENT"></span>
      </td>
      <td>
        <span datafld="PRODUCT_NAME"></span>
      </td>
    </tr>
  </tbody>
</table>

</center>
</body>
</html>

The page displays one table row per customer record under the headings Name,
Customer ID, Purchase, Department and Product. If your browser does not show
the correct result, copy the above code, save it to a separate file and open
that file.

117. How to use the XML DSO applet to connect to an XML file?

Using the customer.xml file as an example, use an applet tag to make the
connection. Replace the xml tag

<xml src="customer.xml" ID="customers"></xml>

with

<applet
 code="com.ms.xml.dso.XMLDSO.class"
 id="customers"
 width="0" height="0"
 mayscript="true">
 <param name="url" value="customer.xml">
</applet>

Make sure the com.ms.xml.dso.XMLDSO.class is available for loading.
