Unit 5
XML: Defining data for web applications, basic XML, document type definition, presenting XML, document
object model. Web Services
1. XML: Defining data for web applications
Extensible Markup Language (XML) is used to describe data. The XML standard is a flexible way
to create information formats and electronically share structured data via the public Internet, as well as via
corporate networks.
XML, a formal recommendation from the World Wide Web Consortium (W3C), is similar to
Hypertext Markup Language (HTML). Both XML and HTML contain markup symbols to describe page or file
contents. HTML, however, describes web page content (mainly text and graphic images) only in terms of how it
is to be displayed and interacted with, whereas XML describes what the content is.
XML data is known as self-describing or self-defining, meaning that the structure of the data is
embedded with the data. When the data arrives there is no need to pre-build a structure to store it, because the
structure is dynamically understood within the XML. The XML format can be used by any individual, group, or
company that wants to share information in a consistent way. XML is a simpler and easier-to-use subset of the
Standard Generalized Markup Language (SGML), the standard for defining document structure.
The basic building block of an XML document is an element, defined by tags. An element has a
beginning tag and an ending tag. All elements in an XML document are contained in an outermost element known
as the root element. XML also supports nested elements, that is, elements within elements. This ability allows
XML to represent hierarchical structures. Element names describe the content of the element, and the structure
describes the relationships between the elements.
An XML document is considered to be "well formed" (that is, able to be read and understood by
an XML parser) if its format complies with the XML specification: it is properly marked up and its elements are
properly nested. XML also supports attributes, which describe characteristics of an element and are written in
the element's beginning tag.
For example, XML documents can be very simple, such as the following:
<?xml version="1.0"?>
<conversation>
<greeting>Hello World !</greeting>
<response>Have a nice day…</response>
</conversation>
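As a rough illustration of how a parser reads such a document, the sketch below loads the conversation example with Python's standard xml.etree.ElementTree module and lists the root element and its children. The helper name describe is an assumption made for this sketch, not part of any XML standard.

```python
import xml.etree.ElementTree as ET

# The conversation document from the text, held in a string for this sketch.
xml_text = """<?xml version="1.0"?>
<conversation>
  <greeting>Hello World !</greeting>
  <response>Have a nice day…</response>
</conversation>"""


def describe(xml_source):
    """Return the root tag and a list of (child tag, child text) pairs."""
    root = ET.fromstring(xml_source)
    return root.tag, [(child.tag, child.text) for child in root]


root_tag, children = describe(xml_text)
print(root_tag)
for tag, text in children:
    print(f"  {tag}: {text}")
```

The element names travel with the data, so the program needs no prior knowledge of the structure beyond the fact that the text is XML.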
Applications for XML are endless. For example, computer makers might agree upon a standard
or common way to describe the information about a computer product (processor speed, memory size, etc.) and
then describe the product information format with XML code. Such a standard way of describing data would
enable a user to send an intelligent agent (a program) to each computer maker's website, gather data, and then
make a valid comparison.
XML’s power resides in its simplicity. It can take large chunks of information and consolidate
them into an XML document – meaningful pieces that provide structure and organization to the information.
2. Basic XML
In computing, Extensible Markup Language (XML) is a markup language that defines a set of rules for
encoding documents in a format that is both human-readable and machine-readable.
There are three important characteristics of XML that make it useful in a variety of systems and solutions:
What is Markup?
Markup is information added to a document that enhances its meaning in certain ways, in that it identifies the
parts and how they relate to each other. More specifically, a markup language is a set of symbols that can be
placed in the text of a document to demarcate and label the parts of that document.
The following example shows how XML markup looks when embedded in a piece of text −
<message>
<text>Hello, world!</text>
</message>
XML Syntax:- This section covers the simple syntax rules for writing an XML document. The following is a
complete XML document −
<contact-info>
<name>Rahul</name>
<company>Designer World</company>
<phone>(011) 123-4567</phone>
</contact-info>
Notice that there are two kinds of information in the above example −
• Markup, such as <contact-info>.
• The text, or the character data, such as Designer World and (011) 123-4567.
The following diagram depicts the syntax rules to write different types of markup and text in an XML document.
Let us see each component of the above diagram in detail.
XML Declaration:-
The XML document can optionally have an XML declaration. It is written as follows −
<?xml version="1.0" encoding="UTF-8"?>
Where version is the XML version and encoding specifies the character encoding used in the document.
• If the document contains an XML declaration, it must be the first statement of the XML document.
• The HTTP protocol can override the value of encoding that you put in the XML declaration.
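Tags and Elements − The name of an XML element is enclosed in angle brackets −
<element>
Element Syntax − Each XML element must be closed, either with a matching end tag or, for an empty element,
with a self-closing tag −
<element>....</element> (or)
<element/>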
Nesting of Elements − An XML element can contain multiple XML elements as its children, but the child
elements must not overlap; i.e., an end tag of an element must have the same name as that of the most recent
unmatched start tag.
The following example shows incorrectly nested tags −
<contact-info>
<company>Designer World
</contact-info>
</company>
The following example shows correctly nested tags −
<contact-info>
<company>Designer World</company>
</contact-info>
Root Element − An XML document can have only one root element. For example, the following is not a correct
XML document, because both the x and y elements occur at the top level without a root element −
<x>...</x>
<y>...</y>
The following example shows a correctly formed XML document −
<root>
<x>...</x>
<y>...</y>
</root>
Case Sensitivity − The names of XML elements are case-sensitive. That means the names in the start tag and the
end tag must be in exactly the same case. For example, <contact-info> is different from <Contact-Info>.
XML Attributes − An attribute specifies a property of an element as a name/value pair in the start tag.
• The same attribute cannot be specified twice in one element. The following example shows incorrect syntax
because the attribute b is specified twice −
<a b = "x" c = "y" b = "z">....</a>
• Attribute names are written without quotation marks, whereas attribute values must always appear in
quotation marks. The following example demonstrates incorrect XML syntax −
<a b = x>....</a>
In the above syntax, the attribute value is not enclosed in quotation marks.
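The syntax rules above can be checked mechanically. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree parser as the well-formedness checker; the helper is_well_formed and the sample labels are illustrative names chosen here. Each sample mirrors one of the rules: nesting, single root element, case sensitivity, and attribute syntax.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_source):
    """Return True if the parser accepts the text, False if it raises ParseError."""
    try:
        ET.fromstring(xml_source)
        return True
    except ET.ParseError:
        return False


# One sample per syntax rule discussed in the text.
samples = {
    "properly nested": "<contact-info><company>Designer World</company></contact-info>",
    "overlapping tags": "<contact-info><company>Designer World</contact-info></company>",
    "two top-level elements": "<x>...</x><y>...</y>",
    "single root": "<root><x>...</x><y>...</y></root>",
    "case mismatch": "<Name>Rahul</name>",
    "duplicate attribute": '<a b="x" c="y" b="z">....</a>',
    "unquoted attribute value": "<a b = x>....</a>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well formed' if ok else 'rejected'}")
```

Only the "properly nested" and "single root" samples are accepted; a conforming parser must reject the others rather than guess at the author's intent.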
3. Document Type Definition (DTD)
The XML Document Type Definition, commonly known as DTD, is a way to describe an XML language
precisely. DTDs check the vocabulary and validity of the structure of XML documents against the
grammatical rules of the appropriate XML language.
An XML DTD can either be specified inside the document, or be kept in a separate document and
then linked to it.
Syntax
Basic syntax of a DTD is as follows −
<!DOCTYPE element DTD identifier
[
declaration1
declaration2
........
]>
In the above syntax,
• The DTD starts with <!DOCTYPE delimiter.
• An element tells the parser to parse the document from the specified root element.
• DTD identifier is an identifier for the document type definition, which may be the path to a file
on the system or URL to a file on the internet. If the DTD is pointing to external path, it is
called External Subset.
• The square brackets [ ] enclose an optional list of entity declarations called Internal Subset.
Two Types of Document Type Definition:-
1.Internal Document Type Definition and 2.External Document Type Definition.
1. Internal DTD:-
A DTD is referred to as an internal DTD if the elements are declared within the XML file. To reference it
as an internal DTD, the standalone attribute in the XML declaration must be set to yes. This means the
declaration works independently of any external source.
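Syntax
Following is the syntax of internal DTD −
<!DOCTYPE root-element [element-declarations]>
where root-element is the name of the root element and element-declarations is where you declare the elements.
Example
Following is a simple example of an internal DTD −
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE address [
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
]>
<address>
<name>Rahul</name>
<company>Designer World</company>
<phone>(011) 123-4567</phone>
</address>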
Let us go through the above code −
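Start Declaration − Begin the XML declaration with the following statement.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
DTD − Immediately after the XML header comes the document type declaration, commonly referred to as
the DOCTYPE −
<!DOCTYPE address [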
The DOCTYPE declaration has an exclamation mark (!) at the start of the element name. The
DOCTYPE informs the parser that a DTD is associated with this XML document.
DTD Body − The DOCTYPE declaration is followed by the body of the DTD, where you declare
elements, attributes, entities, and notations.
Several elements are declared here that make up the vocabulary of the <address> document. For example,
<!ELEMENT name (#PCDATA)> defines the element name to be of type "#PCDATA". Here
#PCDATA means parsed character data, that is, text that the parser will parse.
End Declaration − Finally, the declaration section of the DTD is closed using a closing bracket and a
closing angle bracket (]>). This effectively ends the definition, and thereafter, the XML document
follows immediately.
Rules:
• The document type declaration must appear at the start of the document (preceded only by the
XML header) − it is not permitted anywhere else within the document.
• Similar to the DOCTYPE declaration, the element declarations must start with an exclamation
mark.
• The Name in the document type declaration must match the element type of the root element.
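The last rule can be illustrated with a short sketch. It assumes Python's standard xml.dom.minidom module, which reads the DOCTYPE but does not validate the document against the DTD, so only the name check is shown. The element declarations in the sample follow the address example and are an assumption of this sketch; doctype_matches_root is a hypothetical helper name.

```python
from xml.dom import minidom

# Address document with an internal DTD (declarations assumed for this sketch).
document_text = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE address [
  <!ELEMENT address (name,company,phone)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT company (#PCDATA)>
  <!ELEMENT phone (#PCDATA)>
]>
<address>
  <name>Rahul</name>
  <company>Designer World</company>
  <phone>(011) 123-4567</phone>
</address>"""


def doctype_matches_root(xml_source):
    """Check that the name in <!DOCTYPE ...> equals the root element's name."""
    doc = minidom.parseString(xml_source)
    if doc.doctype is None:
        return False  # no document type declaration at all
    return doc.doctype.name == doc.documentElement.tagName


print(doctype_matches_root(document_text))
# Renaming the DOCTYPE breaks the rule even though the text is still well formed.
print(doctype_matches_root(document_text.replace("<!DOCTYPE address", "<!DOCTYPE contact")))
```

A validating parser would go further and check the children of address against the declared content model (name, company, phone); that step is outside this sketch.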
2. External DTD:-
In an external DTD, the elements are declared outside the XML file. They are accessed by specifying the
system attribute, which may be either a legal .dtd file or a valid URL. To reference it as an external
DTD, the standalone attribute in the XML declaration must be set to no. This means the declaration includes
information from an external source.
Syntax
Following is the syntax for external DTD −
<!DOCTYPE root-element SYSTEM "file-name">
where file-name is the file with the .dtd extension.
Example
The following example shows external DTD usage −
<address>
<name>Rahul</name>
<company>Designer World</company>
<phone>(011) 123-4567</phone>
</address>
Note:-
Types
You can refer to an external DTD by using either system identifiers or public identifiers.
System Identifiers
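A system identifier enables you to specify the location of an external file containing DTD declarations. Syntax is
as follows −
<!DOCTYPE root-element SYSTEM "system-identifier">
Public Identifiers
Public identifiers provide a mechanism to locate DTD resources and are written as follows −
<!DOCTYPE root-element PUBLIC "public-identifier" "system-identifier">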
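To show where these identifiers end up after parsing, here is a small sketch using Python's standard xml.dom.minidom module. The file name address.dtd and the public identifier "-//Example//DTD Address//EN" are hypothetical values invented for illustration; the parser records them without fetching the external file.

```python
from xml.dom import minidom


def read_identifiers(xml_source):
    """Return (doctype name, public identifier, system identifier)."""
    doctype = minidom.parseString(xml_source).doctype
    return doctype.name, doctype.publicId, doctype.systemId


# Hypothetical identifiers, used only to illustrate the two forms.
system_doc = '<!DOCTYPE address SYSTEM "address.dtd"><address/>'
public_doc = ('<!DOCTYPE address PUBLIC "-//Example//DTD Address//EN" '
              '"address.dtd"><address/>')

system_ids = read_identifiers(system_doc)
public_ids = read_identifiers(public_doc)
print(system_ids)
print(public_ids)
```

In the PUBLIC form the system identifier still follows the public one, so that a processor that cannot resolve the public identifier has a location to fall back on.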
4. Presenting XML
Representing data with XML opens up new possibilities for transport and distribution. XML
presentation technologies provide a modular way to deliver and display content to a variety of devices.
Here we examine some technologies for display, including CSS, XSL, XForms, and VoiceXML.
1. CSS:- Cascading Style Sheets
➢ Cascading Style Sheets (CSS) is an XML-supporting technology for adding style display properties such as
fonts, colors, or spacing to Web documents.
➢ CSS origins may be traced to the SGML world, which used a style sheet technology called DSSSL to
control the display of SGML documents.
➢ Style sheet technology is important because it lets developers separate presentation from content,
which greatly enhances software's longevity.
As the figure shows, a style sheet tells a browser or other display engine how to display content.
Figure: HTML or XML may be delivered to a browser with CSS, which controls how data is
presented on the screen.
Each rule is made up of a selector, typically an element name such as an HTML heading (H1) or
paragraph (P) or a user-defined XML element (Book), and the style to be applied to the selector.
The CSS specification defines numerous properties (color, font style, point size, and so on) that may
be defined for an element. Each property takes a value which describes how the selector should be
presented.
2. XSL:- eXtensible Stylesheet Language
XSL 1.0 is a W3C Recommendation that provides users with the ability to describe how
XML data and documents are to be formatted. XSL does this by defining "formatting objects," such as
footnotes, headers, or columns.
1.XSL began as an effort to provide a better CSS - The XSL Working Group split into two subgroups,
one focused on trying to build a better display-oriented style sheet technology and a second group trying
to define a transformation language that could be used to transform XML into a variety of target
languages including HTML, other dialects of XML, or any text document, including a program.
➢ The patterns are similar to CSS's selectors, but the action part may create an arbitrary number of
"objects." The action part of the rule is called the "template" in XSL, and a template and a pattern
together are referred to as a "template rule."
➢ The result of applying all matching patterns to a document recursively is a tree of objects, which is
then interpreted top-down according to the definition of each object.
Figure: Options for using CSS and XSLT with XML and HTML.
3. XForms:-
Forms are widely used in all aspects of E-commerce:-
5. VoiceXML:-
➢ VoiceXML uses XML text to drive voice dialogs.
➢ VoiceXML is an emerging standard for speech-enabled applications. Its XML syntax defines elements to
control a sequence of interaction dialogs between a user and an implementation platform.
➢ The elements defined as part of VoiceXML control dialogs and rules for presenting information to and
extracting information from an end-user using speech.
➢ For ZwiftBooks, VoiceXML opens up options for extending its service to voice through cellular networks.
Figure: VoiceXML documents are used to drive voice interactions over conventional or wireless
phones.
5. Document Object Model (DOM)
"The W3C Document Object Model (DOM) is a platform and language-neutral interface that allows programs
and scripts to dynamically access and update the content, structure, and style of a document."
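DOM defines the objects, properties, and methods (interface) to access all XML elements. It is separated into
3 different parts / levels −
• Core DOM - standard model for any structured document
• XML DOM - standard model for XML documents
• HTML DOM - standard model for HTML documents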
The HTML DOM defines a standard way for accessing and manipulating HTML documents. It presents an
HTML document as a tree-structure.
The XML DOM defines a standard way for accessing and manipulating XML documents. It presents an XML
document as a tree-structure.
In other words: The XML DOM is a standard for how to get, change, add, or delete XML elements.
This code retrieves the text value of the first <title> element in an XML document:
txt = xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue;
This example loads a text string into an XML DOM object (xmlDoc) and retrieves the text value of the first
<title> element:
Example
<!DOCTYPE html>
<html>
<body>
<p id="demo"></p>
<script>
var parser, xmlDoc;
var text = "<bookstore><book>" +
"<title>Everyday Italian</title>" +
"<author>Giada De Laurentiis</author>" +
"<year>2005</year>" +
"</book></bookstore>";
parser = new DOMParser();
xmlDoc = parser.parseFromString(text,"text/xml");
document.getElementById("demo").innerHTML =
xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue;
</script>
</body>
</html>
Example Explained
Programming Interface
The DOM models XML as a set of node objects. The nodes can be accessed with JavaScript or other
programming languages. In this tutorial we use JavaScript.
The programming interface to the DOM is defined by a set of standard properties and methods.
Properties are often referred to as something that is (e.g. the node name is "book"), and methods as something
that is done (e.g. delete "book").
• XML DOM is traversable - Information in XML DOM is organized in a hierarchy which allows
developer to navigate around the hierarchy looking for specific information.
• XML DOM is modifiable - It is dynamic in nature providing the developer a scope to add, edit, move or
remove nodes at any point on the tree.
• Due to its extensive memory usage, its operational speed is slower than that of SAX.
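Because the DOM is language-neutral, the same properties and methods are available outside JavaScript. The sketch below uses Python's standard xml.dom.minidom module on the bookstore string from the example to show access, traversal, and modification. The <price> element and its value are hypothetical additions made only to demonstrate modification.

```python
from xml.dom import minidom

text = ("<bookstore><book>"
        "<title>Everyday Italian</title>"
        "<author>Giada De Laurentiis</author>"
        "<year>2005</year>"
        "</book></bookstore>")

xml_doc = minidom.parseString(text)

# Access: the same call chain as the JavaScript example.
title = xml_doc.getElementsByTagName("title")[0].childNodes[0].nodeValue

# Traverse: walk the child nodes of the first <book> element.
book = xml_doc.getElementsByTagName("book")[0]
child_names = [node.nodeName for node in book.childNodes]

# Modify: append a hypothetical <price> element and remove <year>.
price = xml_doc.createElement("price")
price.appendChild(xml_doc.createTextNode("30.00"))
book.appendChild(price)
book.removeChild(xml_doc.getElementsByTagName("year")[0])

print(title)
print(child_names)
print(book.toxml())
```

The whole document is held in memory as a tree, which is what makes this kind of random access and editing possible, and also what makes DOM heavier than a streaming parser such as SAX.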
6. Web Services
Definition:-
A web service is any piece of software that makes itself available over the internet and uses a
standardized XML messaging system. XML is used to encode all communications to a web service. For
example, a client invokes a web service by sending an XML message, then waits for a corresponding
XML response. As all communication is in XML, web services are not tied to any one operating system
or programming language—Java can talk with Perl; Windows applications can talk with Unix
applications.
(or)
Web services are XML-based information exchange systems that use the Internet for direct application-
to-application interaction. These systems can include programs, objects, messages, or documents.
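A web service enables communication among various applications by using open standards such as HTML, XML,
WSDL, and SOAP. A web service takes the help of:
• XML to tag the data
• SOAP to transfer a message
• WSDL to describe the availability of the service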
You can build a Java-based web service on Solaris that is accessible from your Visual Basic program
that runs on Windows.
You can also use C# to build new web services on Windows that can be invoked from your web
application that is based on JavaServer Pages (JSP) and runs on Linux.
The basic web services platform is XML + HTTP. All the standard web services work using the following
components:
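1. SOAP:- SOAP (Simple Object Access Protocol) is an XML-based protocol for exchanging messages between
applications. HTTP is a convenient transport for application-to-application communication because it is
supported by all Internet browsers and servers; SOAP was created to accomplish this. A SOAP message has the
following skeleton: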
<?xml version="1.0"?>
<soap:Envelope
xmlns:soap="http://www.w3.org/2003/05/soap-envelope/"
soap:encodingStyle="http://www.w3.org/2003/05/soap-encoding">
<soap:Header>
...
</soap:Header>
<soap:Body>
...
<soap:Fault>
...
</soap:Fault>
</soap:Body>
</soap:Envelope>
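A minimal sketch of how a client might assemble and read such an envelope, assuming Python's standard xml.etree.ElementTree module. The operation name GetPrice and its Item parameter are hypothetical, as are the helper names build_request and read_body; the namespace URI is the one given in the skeleton. No network call is made here; a real client would send the resulting text in an HTTP request.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the SOAP skeleton in the text.
SOAP_NS = "http://www.w3.org/2003/05/soap-envelope/"


def build_request(operation, params):
    """Build a SOAP envelope whose Body holds a single operation element."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, operation)
    for name, value in params.items():
        ET.SubElement(op, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def read_body(envelope_text):
    """Return the operation name and its parameters from the soap:Body."""
    root = ET.fromstring(envelope_text)
    body = root.find(f"{{{SOAP_NS}}}Body")
    operation = body[0]  # first element inside the Body
    return operation.tag, {child.tag: child.text for child in operation}


# Hypothetical operation and parameter, for illustration only.
message = build_request("GetPrice", {"Item": "Apples"})
print(message)
print(read_body(message))
```

Because both sides agree only on the XML envelope, the sender and the receiver can be written in different languages and run on different platforms.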
2. WSDL:- WSDL (Web Services Description Language) is an XML-based language for describing web services
and how to access them.
<definitions>
<types>
data type definitions........
</types>
<message>
definition of the data being communicated....
</message>
</definitions>
3. RDF:- RDF was designed to provide a common way to describe information so it can be read and understood
by computer applications.
<RDF>
<Description about="https://www.designerworld.com/rdf">
<author>Rahul</author>
<homepage>https://www.designerworld.com</homepage>
</Description>
</RDF>
4. RSS:- With RSS it is possible to distribute up-to-date web content from one web site to thousands of other web
sites around the world. RSS allows fast browsing for news and updates.
Example:
</rss>
5. UDDI:- UDDI is an XML-based standard for describing, publishing and finding web services.
Security is critical to web services. However, neither XML-RPC nor SOAP specifications make any explicit
security or authentication requirements. There are three specific security issues with web services:
1. Confidentiality
2. Authentication and
3. Network Security
1. Exposing the Existing Function on the Network - Once a function is exposed on the network, other applications
can use the functionality of your program.
2. Interoperability - Web services allow various applications to talk to each other and share data and services
among themselves.
3. Standardized Protocol - Standardization gives a wide range of choices, reduces cost through competition, and
increases quality.
4. Low Cost Communication - Web services use SOAP over the HTTP protocol, so you can use your existing
low-cost internet connection for implementing web services.
1. Loosely Coupled
2. Ability to be Synchronous (or) Asynchronous
3. Supports Remote Procedure Calls ( RPCs ) and
4. Supports Document Exchange
HTML | XML
HTML is an abbreviation for HyperText Markup Language. | XML stands for eXtensible Markup Language.
HTML was designed to display data, with focus on how data looks. | XML was designed to be a software- and hardware-independent tool used to transport and store data, with focus on what data is.
HTML is a markup language itself. | XML provides a framework for defining markup languages.
HTML is used for designing a web page to be rendered on the client side. | XML is used basically to transport data between the application and the database.
HTML has its own predefined tags. | XML is flexible because custom tags can be defined; the tags are invented by the author of the XML document.
HTML is not strict if the user does not use the closing tags. | XML makes it mandatory for the user to close each tag that has been used.
HTML does not preserve white space. | XML preserves white space.
HTML is about displaying data, hence static. | XML is about carrying information, hence dynamic.