WT - Unit 1 (HTML)
• XML (eXtensible Markup Language): XML is a markup language designed to store and
transport data. It's both human-readable and machine-readable, making it widely used for
exchanging data over the internet. XML documents are structured by using tags to define
data elements and their relationships.
Characteristics of XML:
• XML is extensible − XML allows you to create your own self-descriptive tags, or
language, that suits your application.
• XML carries the data, does not present it − XML allows you to store the data irrespective
of how it will be presented.
• XML is a public standard − XML was developed by an organization called the World
Wide Web Consortium (W3C) and is available as an open standard.
Uses of XML:
• XML can work behind the scenes to simplify the creation of HTML documents for large web
sites.
• XML can be used to exchange the information between organizations and systems.
• XML can be used for offloading and reloading of databases.
• XML can be used to store and organize data in a structure tailored to your data-handling
needs.
• XML can easily be merged with style sheets to create almost any desired output.
• Virtually any type of data can be expressed as an XML document.
• Rules for XML Declaration: The XML declaration is case sensitive and must begin with
"<?xml", where "xml" is written in lower case.
• If the document contains an XML declaration, it must be the very first statement of the
XML document; nothing, not even whitespace or a comment, may come before it.
• Encoding information supplied over HTTP (for example, the charset parameter of the
Content-Type header) can override the encoding value that you put in the XML declaration.
• If the XML declaration is included, it must contain the version number attribute.
• The parameter names and values are case-sensitive.
• The parameter names are always in lower case.
• The order of placing the parameters is important. The correct order is: version, encoding and
standalone.
• Either single or double quotes may be used.
• The XML declaration has no closing tag (there is no </?xml>); it simply ends with "?>".
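The declaration rules above can be checked with a small experiment. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module (which relies on the expat parser in CPython); the helper name is_well_formed and the sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: bytes) -> bool:
    """Return True if the parser accepts the document, False otherwise.

    Hypothetical helper written for this illustration only.
    """
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# Declaration is the first statement; order is version, encoding, standalone.
valid = b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?><note>Hi</note>'
# Single quotes are accepted as well as double quotes.
single_quotes = b"<?xml version='1.0' encoding='UTF-8'?><note>Hi</note>"
# A newline before the declaration breaks the "first statement" rule.
not_first = b'\n<?xml version="1.0"?><note>Hi</note>'
# Putting encoding before version breaks the ordering rule.
wrong_order = b'<?xml encoding="UTF-8" version="1.0"?><note>Hi</note>'
# The version attribute is mandatory when a declaration is present.
no_version = b'<?xml encoding="UTF-8"?><note>Hi</note>'
# "XML" in upper case is not accepted as the declaration keyword.
upper_case = b'<?XML version="1.0"?><note>Hi</note>'

for label, doc in [
    ("valid", valid),
    ("single_quotes", single_quotes),
    ("not_first", not_first),
    ("wrong_order", wrong_order),
    ("no_version", no_version),
    ("upper_case", upper_case),
]:
    print(label, is_well_formed(doc))
```

Only the first two documents should be accepted. Other parsers may word their error messages differently, but a conforming parser is expected to reject the remaining four.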
• Tags: XML documents are made up of elements enclosed within start and end tags. Tags are
used to define the structure and hierarchy of the document's content.
• Start tags begin with a less-than symbol (<), followed by the element name, and end with a
greater-than symbol (>).
• End tags have a similar syntax, but also include a forward slash (/) before the element name.
• Elements: Elements are the building blocks of an XML document. They represent
individual pieces of information within the document. Each element has a start tag, an end
tag, and may contain nested elements and/or text content.
• Attributes: Attributes provide additional information about an element. They are placed
within the start tag of an element and consist of a name-value pair. Attribute values are
enclosed in quotation marks.
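To make tags, elements, and attributes concrete, here is a small sketch that parses a made-up document with Python's standard xml.etree.ElementTree module. The element and attribute names (library, book, id, lang, currency) are illustrative assumptions, not part of any standard vocabulary.

```python
import xml.etree.ElementTree as ET

# <library> is the root element; <book> is nested inside it and carries
# two attributes (id, lang) in its start tag.
document = """
<library>
  <book id="b1" lang="en">
    <title>Learning XML</title>
    <price currency="USD">39.95</price>
  </book>
</library>
"""

root = ET.fromstring(document)
book = root.find("book")

root_name = root.tag                                  # element name from the start tag
book_attributes = dict(book.attrib)                   # name-value pairs of the start tag
title_text = book.find("title").text                  # text content of a nested element
price_currency = book.find("price").get("currency")   # a single attribute value

print(root_name, book_attributes, title_text, price_currency)
```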
DTD (Document Type Definition): DTD is a formal specification that defines the structure
and the legal elements and attributes of an XML document. It acts as a blueprint for the XML
document, specifying what elements can appear, their order, and their attributes. A DTD can be
declared inside the XML document itself (an internal DTD) or in a separate file, conventionally with a .dtd extension.
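As an illustration, the sketch below embeds an internal DTD in a sample document and reads it with Python's standard xml.etree.ElementTree module. The note vocabulary (to, from, body, priority) is a made-up example. Note the assumption: the standard-library parser reads the DTD but does not validate against it, so actual validation would need a validating parser from a third-party library.

```python
import xml.etree.ElementTree as ET

# The DOCTYPE block is the internal DTD: it names the root element (note),
# lists the allowed children in order, and declares one attribute.
NOTE_WITH_DTD = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to, from, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
  <!ATTLIST note priority (low|high) "low">
]>
<note priority="high">
  <to>Asha</to>
  <from>Ravi</from>
  <body>Meeting at 10.</body>
</note>
"""

# Parsing checks well-formedness only; the DTD rules are NOT enforced here.
root = ET.fromstring(NOTE_WITH_DTD)
child_order = [child.tag for child in root]
priority = root.get("priority")

print(root.tag, child_order, priority)
```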
XML Schemas:
• XML Schemas: XML Schema (XSD) is another way to define the structure and constraints
of XML documents. It provides more powerful and flexible validation rules compared to
DTDs. XML Schemas use XML syntax to define elements, attributes, data types, and their
relationships.
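Because an XML Schema is itself written in XML, it can be read with an ordinary XML parser. The sketch below shows a made-up schema for a note element and lists what it declares, using Python's standard xml.etree.ElementTree module. This only inspects the schema; the standard library does not perform XSD validation, which is assumed to require a third-party library.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the XML Schema vocabulary.
XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: a note with three typed children and one attribute.
NOTE_SCHEMA = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="sent" type="xs:date"/>
      </xs:sequence>
      <xs:attribute name="priority" type="xs:integer"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

schema = ET.fromstring(NOTE_SCHEMA)

# Collect every declared element that has a built-in data type.
element_types = {
    element.get("name"): element.get("type")
    for element in schema.iter(f"{{{XS}}}element")
    if element.get("type") is not None
}
# Collect the declared attribute names.
attribute_names = [a.get("name") for a in schema.iter(f"{{{XS}}}attribute")]

print(element_types, attribute_names)
```

Unlike a DTD, the schema can state that sent holds a date and priority an integer, which is the kind of data typing the text refers to.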
Object Models:
• Presenting and Using XML: XML data can be presented and used in various ways, such as
displaying it in web browsers, processing it with programming languages, transforming it
using XSLT (eXtensible Stylesheet Language Transformations), or querying it with XPath
(XML Path Language).
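As one example of querying with XPath, the sketch below uses Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath expressions. The catalog data is made up for illustration.

```python
import xml.etree.ElementTree as ET

CATALOG = """
<catalog>
  <book category="web"><title>Learning XML</title><price>39.95</price></book>
  <book category="cooking"><title>Everyday Italian</title><price>30.00</price></book>
  <book category="web"><title>XQuery Kick Start</title><price>49.99</price></book>
</catalog>
"""

root = ET.fromstring(CATALOG)

# Attribute predicate: titles of all books whose category is "web".
web_titles = [t.text for t in root.findall("./book[@category='web']/title")]

# Positional predicate: XPath positions start at 1, so [2] is the second book.
second_title = root.find("./book[2]/title").text

# Descendant search: every <price> element anywhere below the root.
prices = [float(p.text) for p in root.findall(".//price")]

print(web_titles, second_title, prices)
```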
• XML Processors: DOM and SAX: XML processors are software libraries or APIs used to
parse and manipulate XML documents. There are two main types: DOM and SAX.
DOM
DOM (Document Object Model): DOM represents the XML document as a tree structure of
nodes, where each node corresponds to an XML element, attribute, or text. DOM parsers load
the entire XML document into memory, allowing for random access and manipulation of the
document's contents.
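A minimal DOM sketch, assuming Python's standard xml.dom.minidom module and a made-up team document: the whole document is loaded into memory as a tree, after which any node can be reached directly and changed.

```python
from xml.dom.minidom import parseString

# The entire document is parsed into an in-memory tree of nodes.
dom = parseString(
    "<team><member role='lead'>Asha</member><member role='dev'>Ravi</member></team>"
)

# Random access: jump straight to the second <member> element.
members = dom.getElementsByTagName("member")
second = members[1]
second_name = second.firstChild.data       # text node inside the element
old_role = second.getAttribute("role")     # attribute value

# Manipulation: change an attribute and append a new element to the root.
second.setAttribute("role", "tester")
new_member = dom.createElement("member")
new_member.setAttribute("role", "dev")
new_member.appendChild(dom.createTextNode("Meera"))
dom.documentElement.appendChild(new_member)

# Query again to see the modified tree.
roles_after = [m.getAttribute("role") for m in dom.getElementsByTagName("member")]
print(second_name, old_role, roles_after)
```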
SAX
• SAX (Simple API for XML): SAX, on the other hand, is an event-driven XML parsing API. It
processes XML documents sequentially, generating events (e.g., start element, end element,
text) as it encounters elements in the document. SAX parsers are memory-efficient but less
convenient for random access to XML data compared to DOM.
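A minimal SAX sketch, assuming Python's standard xml.sax module: the handler class EventRecorder and the sample team document are made up for illustration. The parser walks the document once and calls the handler methods as it goes; no tree is built.

```python
import xml.sax


class EventRecorder(xml.sax.ContentHandler):
    """Hypothetical handler that records each SAX event in arrival order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Called when a start tag is encountered.
        self.events.append(("start", name))

    def endElement(self, name):
        # Called when an end tag is encountered.
        self.events.append(("end", name))

    def characters(self, content):
        # Called for text content; whitespace-only chunks are skipped.
        if content.strip():
            self.events.append(("text", content.strip()))


recorder = EventRecorder()
xml.sax.parseString(
    b"<team><member>Asha</member><member>Ravi</member></team>", recorder
)

for event in recorder.events:
    print(event)
```

Because only the current event is held at any time, memory use stays small even for large documents, but going back to an earlier element means re-parsing or keeping your own state.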