Introduction to XML
XML:
XML stands for Extensible Markup Language. It is a text-based markup language derived
from Standard Generalized Markup Language (SGML).
XML tags identify the data and are used to store and organize it. Unlike HTML tags, which
specify how data is displayed, XML tags do not describe presentation.
Characteristics of XML:
• XML is extensible − XML allows you to create your own self-descriptive tags, or
language, that suit your application.
• XML carries the data, does not present it − XML allows you to store the data
irrespective of how it will be presented.
• XML is a public standard − XML was developed by an organization called the World
Wide Web Consortium (W3C) and is available as an open standard.
XML Usage:
• XML can work behind the scenes to simplify the creation of HTML documents for
large web sites.
• XML can be used to exchange information between organizations and systems.
• XML can be used to store and arrange data in a way that suits your data
handling needs.
• XML can easily be combined with style sheets to create almost any desired output.
1 Prepared By:
Prof. Chirag Prajapati – SDJ International College, Vesu, Surat
What is Markup?
XML is a markup language that defines a set of rules for encoding documents in a
format that is both human-readable and machine-readable.
More specifically, a markup language is a set of symbols that can be placed in the text of a
document to demarcate and label the parts of that document.
The following example shows how XML markup looks when embedded in a piece of text −
<message>
<text>Hello, world!</text>
</message>
This snippet includes the markup symbols, or tags, <message>...</message> and
<text>...</text>.
The tags <message> and </message> mark the start and the end of the XML code fragment.
The tags <text> and </text> surround the text Hello, world!
A programming language is used to write programs that instruct the computer to perform
specific tasks. XML does not qualify as a programming language, because it does not perform
any computation or express algorithms.
It is usually stored in a simple text file and is processed by special software that is capable of
interpreting XML.
XML - Syntax:
The following diagram depicts the syntax rules to write different types of markup
and text in an XML document.
XML Declaration
The XML document can optionally have an XML declaration. It is written as follows −
<?xml version = "1.0" encoding = "UTF-8"?>
Here version is the XML version and encoding specifies the character encoding used in the
document.
• The XML declaration is case sensitive and must begin with "<?xml", where "xml" is
written in lower case.
• If the document contains an XML declaration, it must be the first statement
of the XML document.
• The HTTP protocol can override the value of encoding that you put in the XML
declaration.
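The rules above can be sketched with a complete, minimal document. The encoding value and the comment are illustrative assumptions; the message element is reused from the earlier markup example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The declaration above is the very first statement in the file -->
<message>
  <text>Hello, world!</text>
</message>
```

Moving the declaration below the comment or the root element would typically cause a parser to reject the document.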
An XML file is structured by several XML-elements, also called XML-nodes or XML-tags. The
names of XML-elements are enclosed in angle brackets < > as shown below −
<element>
Syntax Rules for Tags and Elements
Element Syntax − Each XML-element needs to be closed, either with a matching end tag or,
for an empty element, with a self-closing tag, as shown below −
<element>....</element>
<element/>
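Nesting of Elements − An XML-element can contain other XML-elements as its children, but
the children must not overlap: an end tag must match the most recent unclosed start tag.
The following example shows incorrectly nested tags −
<contact-info>
<company>Google
</contact-info>
</company>
The following example shows correctly nested tags −
<contact-info>
<company>Google</company>
</contact-info>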
Root Element − An XML document can have only one root element. For example, the following
is not a correct XML document, because both the x and y elements occur at the top level
without a root element −
<x>...</x>
<y>...</y>
The following example shows a correctly formed XML document with a single root element −
<root>
<x>...</x>
<y>...</y>
</root>
Case Sensitivity − The names of XML-elements are case-sensitive. That means the names in
the start and end tags need to be in exactly the same case.
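As a small sketch of the case-sensitivity rule (the element names and company value are borrowed from the contact-info example used elsewhere in these notes), the fragment below is well-formed because every end tag repeats its start tag in the same case:

```xml
<!-- Well-formed: start and end tags use exactly the same case -->
<contact-info>
  <company>TutorialsPoint</company>
</contact-info>
```

Writing the start tag as <Company> while keeping </company> as the end tag would make the document ill-formed, because the two names would no longer match.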
XML Attributes
An attribute specifies a single property for the element, using a name/value pair. An
XML-element can have one or more attributes.
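A minimal sketch of an element with one attribute follows. The URL and the link text are placeholders chosen for illustration, not values from the original notes.

```xml
<!-- href is the attribute name; the quoted URL is its value (placeholder) -->
<a href="https://www.example.com/">Example link</a>
```

Here href is the attribute name and https://www.example.com/ is the attribute value.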
• Attribute names in XML (unlike HTML) are case sensitive. That is, HREF and href are
considered two different XML attributes.
• The same attribute cannot appear twice in one start tag. The following example shows
incorrect syntax because the attribute b is specified twice −
<a b = "x" c = "y" b = "z">....</a>
• Attribute names are defined without quotation marks, whereas attribute values
must always appear in quotation marks. The following example demonstrates incorrect
XML syntax −
<a b = x>....</a>
In the above syntax, the attribute value is not defined in quotation marks.
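For contrast, a corrected version of the attribute rules above might look like the sketch below. The attribute c and the values are illustrative assumptions; the point is only that each attribute name appears once and every value is quoted.

```xml
<!-- Each attribute appears once, and every value is in quotation marks -->
<a b="x" c="y">....</a>
```

Single quotes (b='x') are equally acceptable to an XML parser, as long as the opening and closing quotes match.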
XML References
References usually allow you to add or include additional text or markup in an XML
document. References always begin with the symbol "&" which is a reserved character and
end with the symbol ";". XML has two types of references −
• Entity References − An entity reference contains a name between the start and the
end delimiters. For example, &amp; where amp is the name. The name refers to a
predefined string of text and/or markup.
• Character References − These contain a hash mark ("#") followed by a number, as in
&#65;. The number always refers to the Unicode code point of
a character. In this case, 65 refers to the letter "A".
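The two kinds of reference can be sketched together in one element. The element name and sentence are made up for illustration.

```xml
<!-- One entity reference (amp) and one character reference (65) in the text -->
<text>Tom &amp; Jerry scored an &#65;</text>
```

After parsing, the text content of this element reads: Tom & Jerry scored an A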
XML Text
The names of XML-elements and XML-attributes are case-sensitive, which means the names
in start and end tags need to be written in the same case. To avoid character encoding
problems, all XML files should be saved as Unicode UTF-8 or UTF-16 files.
Whitespace characters like blanks, tabs and line-breaks between XML-elements and
between XML-attributes are generally ignored.
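Some characters are reserved by the XML syntax itself. Hence, they cannot be used directly.
To use them, some replacement-entities are used, which are listed below −
• < is written as &lt; (less than)
• > is written as &gt; (greater than)
• & is written as &amp; (ampersand)
• ' is written as &apos; (apostrophe)
• " is written as &quot; (quotation mark)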
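As a sketch of how replacement entities are used in practice, the element below stores a comparison expression as text. The element name condition and the expression are illustrative assumptions.

```xml
<!-- Reserved characters in the text are written as entity references -->
<condition>a &lt; b &amp;&amp; b &gt; c</condition>
```

A parser hands the application the text with the entities replaced by the less-than sign, two ampersands and the greater-than sign. Typing those characters directly would break the markup, because the less-than sign would be read as the start of a tag.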
XML – Documents
An XML document can contain a wide variety of data, for example a database of
numbers, numbers representing a molecular structure, or a mathematical equation.
A simple document is shown in the following example −
<contact-info>
<name>Tanmay Patil</name>
<company>TutorialsPoint</company>
<phone>(011) 123-4567</phone>
</contact-info>
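The document prolog comes at the top of the document, before the root element, and can
contain the following −
• XML declaration
The document elements follow the prolog. They let you separate a document into multiple
sections so that they can be rendered differently, or used by a search engine.
The elements can be containers, with a combination of text and other elements.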
XML – Declaration
The XML declaration contains details that prepare an XML processor to parse the XML
document. It is optional, but when used, it must appear on the first line of the XML document.
Syntax
<?xml
version = "version_number"
encoding = "encoding_declaration"
standalone = "standalone_status"
?>
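Each parameter consists of a parameter name, an equals sign (=), and a parameter value
inside quotes. The parameters are as follows −
• version − the version of the XML standard used, for example 1.0.
• encoding − the character encoding used in the document, for example UTF-8 or UTF-16.
• standalone − either yes or no; it tells the parser whether the document relies on
external declarations, such as an external DTD. The default is no.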
Rules
An XML declaration should abide by the following rules −
• If the XML declaration is present in the XML, it must be placed on the first line of the
XML document.
• The order of the parameters is important. The correct order is: version,
encoding and standalone.
XML Declaration Examples
The following is a minimal XML declaration; the version parameter is required whenever a
declaration is present −
<?xml version = "1.0"?>
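Two further declarations are sketched below, following the parameter order version, encoding, standalone. The encoding and standalone values are illustrative. These are alternatives; a real document carries only one declaration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
```

The first declares the version and the character encoding. The second defines all three parameters.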