XML Extensible Markup Language: Aleksandar Bogdanovski Programming Environment Laboratory

XML is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. It allows users to define their own element names and attributes, and impose constraints on them using a document type definition. The key components of XML include elements, attributes, entities, comments, processing instructions, and the DTD. XML provides a way to store and transport structured data between different systems.

XML Extensible Markup Language

Aleksandar Bogdanovski Programming Environment LABoratory [email protected]

What is XML ?

Informal definition:

XML is a markup language for representing documents that contain structured information.

New in XML?
http://www.w3.org/XML/1999/XML-in-10-points.html

Aleksandar Bogdanovski : XML Extensible Markup Language

February 2002

Document

The word "document" refers not only to traditional documents, but also to the countless number of other XML "data formats". These include vector graphics, e-commerce transactions, mathematical equations, object meta-data, server APIs, and a thousand other kinds of structured information.

SGML XML HTML

SGML provides arbitrary structure. Full SGML systems solve large, complex problems that justify their expense. XML is defined as an application profile of SGML, or roughly speaking, a restricted form of SGML. XML specifies neither semantics nor a tag set. XML provides a facility to define tags and the structural relationships between them. In HTML, both the tag semantics and the tag set are fixed.

10 Commandments of XML

1. XML shall be straightforwardly usable over the Internet.
2. XML shall support a wide variety of applications.
3. XML shall be compatible with SGML.
4. It shall be easy to write programs which process XML documents.
5. The number of optional features in XML is to be kept to the absolute minimum, ideally zero.
6. XML documents should be human-legible and reasonably clear.
7. The XML design should be prepared quickly.
8. The design of XML shall be formal and concise.
9. XML documents shall be easy to create.
10. Terseness in XML markup is of minimal importance.

Look and feel of XML Document

A Simple XML Document :


<?xml version="1.0"?>
<oldjoke>
  <burns>Say <quote>goodnight</quote>, Gracie.</burns>
  <allen><quote>Goodnight, Gracie.</quote></allen>
  <applause/>
</oldjoke>
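
As a rough illustration of how a program might read this document, the sketch below parses it with Python's standard xml.etree.ElementTree module and lists its elements. The choice of Python and the variable names are assumptions made for illustration; they do not come from the slides.

```python
import xml.etree.ElementTree as ET

# The document from the slide, stored as a string for this sketch.
OLDJOKE = """<?xml version="1.0"?>
<oldjoke>
  <burns>Say <quote>goodnight</quote>, Gracie.</burns>
  <allen><quote>Goodnight, Gracie.</quote></allen>
  <applause/>
</oldjoke>"""

root = ET.fromstring(OLDJOKE)

# Names of the root's direct children, in document order.
child_names = [child.tag for child in root]

# Full text of <burns>, including the text inside the nested <quote>.
burns_text = "".join(root.find("burns").itertext())

print(root.tag, child_names)
print(burns_text)
```

The nesting of the markup becomes the nesting of the tree: quote is reached through burns, not directly from the root.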

Markup in XML

There are six kinds of markup in XML:
Elements
Entity references
Comments
Processing instructions (PIs)
Marked (CDATA) sections
Document type declarations (DTD)

Elements
Elements are the most common form of markup. Elements identify the nature of the content they surround. Some elements may be empty, in which case they have no content. A non-empty element begins with a start-tag, <element>, ends with an end-tag, </element>, and has some content in between. Example:
<burns>Say <quote>goodnight</quote>, Gracie.</burns>

Attributes are name-value pairs that occur in start-tags after the element name. Example:
<volvo type="s40">

In XML, all attribute values must be quoted.
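
A small sketch of the quoting rule, assuming Python's standard ElementTree parser as the XML processor (the helper name parses is made up for this example): the quoted form is accepted, while the HTML-style unquoted form is rejected as not well-formed.

```python
import xml.etree.ElementTree as ET

def parses(text):
    """Return True if the parser accepts text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

quoted = parses('<volvo type="s40"/>')    # attribute value in quotes
unquoted = parses('<volvo type=s40/>')    # HTML-style, no quotes
print(quoted, unquoted)
```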



Entity references

To represent special characters in XML, entities are used. Entities are also used to refer to often repeated or varying text and to include the content of external files.

Every entity must have a unique name.


A special form of entity reference, called a character reference, can be used to insert arbitrary Unicode characters into your document. Character references take one of two forms: decimal references, &#8478;, and hexadecimal references, &#x211E;.
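
To make the two forms concrete, the following sketch (Python's standard ElementTree parser; the element name rx is invented for the example) parses both references and checks that they produce the same character.

```python
import xml.etree.ElementTree as ET

decimal = ET.fromstring("<rx>&#8478;</rx>").text
hexadecimal = ET.fromstring("<rx>&#x211E;</rx>").text

# Both references name the same Unicode code point, U+211E.
same = decimal == hexadecimal == "\u211e"
print(same, hex(ord(decimal)))
```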

Comments

Comments begin with <!-- and end with -->. Comments can contain any data except the literal string --. Comments can be placed anywhere between markup in the document. Comments are not part of the textual content of an XML document. An XML processor is not required to pass them along to an application.
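
Whether comments reach the application depends on the processor. As an illustration with two parsers from Python's standard library (the sample document is made up): ElementTree's default builder drops comments, while minidom keeps them as separate nodes. In both cases the comment is not part of the text content.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

SOURCE = "<note>before<!-- editorial remark -->after</note>"

# ElementTree's default tree builder skips comments entirely.
text_seen = "".join(ET.fromstring(SOURCE).itertext())

# minidom passes them along as comment nodes.
dom = minidom.parseString(SOURCE)
comments = [node.data for node in dom.documentElement.childNodes
            if node.nodeType == node.COMMENT_NODE]
print(text_seen, comments)
```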

Processing instructions (PIs)

Like comments, PIs are not textually part of the XML document, but the XML processor is required to pass them to an application. Processing instructions have the form: <?name pidata?>. The name, called the PI target, identifies the PI to the application. Any data that follows the PI target is optional. The names used in PIs may be declared as notations in order to formally identify them. PI names beginning with xml are reserved for XML standardization.
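
A minimal sketch of how an application might receive a PI, using Python's standard minidom module. The PI target render and its data are hypothetical and chosen only for this example.

```python
from xml.dom import minidom

dom = minidom.parseString('<doc><?render mode="draft"?>text</doc>')

# The PI arrives as its own node, split into target and data.
pi = dom.documentElement.firstChild
is_pi = pi.nodeType == pi.PROCESSING_INSTRUCTION_NODE
print(is_pi, pi.target, pi.data)
```

The parser does not interpret the data; it is up to the application that recognizes the target to decide what the data means.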

CDATA sections

A CDATA section instructs the parser to ignore most markup characters. Example:
<![CDATA[ *p = &q; b = (i <= 3); ]]>

Between the start of the section, <![CDATA[ and the end of the section, ]]>, all character data is passed directly to the application, without interpretation. The only string that cannot occur in a CDATA section is ]]>.
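
The sketch below, assuming Python's standard ElementTree parser and an invented wrapper element named code, shows that a CDATA section and the same text written with entity references yield identical character data.

```python
import xml.etree.ElementTree as ET

# The CDATA section from the slide, wrapped in an element.
with_cdata = ET.fromstring(
    "<code><![CDATA[ *p = &q; b = (i <= 3); ]]></code>").text

# The same content written with entity references instead.
with_entities = ET.fromstring(
    "<code> *p = &amp;q; b = (i &lt;= 3); </code>").text

print(repr(with_cdata))
print(with_cdata == with_entities)
```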

Document Type Declaration (DTD)

One of the greatest strengths of XML is that it allows you to create your own tag names. But it is not meaningful for tags to occur in a completely arbitrary order. Example:
<gracie><quote><oldjoke>Goodnight, <applause/>Gracie</oldjoke></quote>
<burns><gracie>Say <quote>goodnight</quote>, </gracie>Gracie.</burns></gracie>

This doesn't make any sense, but syntactically there's nothing wrong. For the document to have a meaning, some constraints on the sequence and nesting of tags should be imposed.

Document Type Declaration (DTD)

Constraints are expressed in declarations. Declarations allow a document to communicate meta-information about its content to the parser. There are four kinds of declarations in XML:
Element Type Declarations
Attribute List Declarations
Entity Declarations
Notation Declarations

Element Type Declaration

Element type declarations identify the names of elements and the nature of their content. Example:
<!ELEMENT oldjoke  (burns+, allen, applause?)>
<!ELEMENT burns    (#PCDATA | quote)*>
<!ELEMENT allen    (#PCDATA | quote)*>
<!ELEMENT quote    (#PCDATA)*>
<!ELEMENT applause EMPTY>
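
A content model such as (burns+, allen, applause?) behaves much like a regular expression over the sequence of child element names. The sketch below illustrates that idea in Python; it is not a validating parser, it checks only the direct children of oldjoke, and the function name and the space-separated encoding of names are assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET

# (burns+, allen, applause?) rewritten as a regular expression over a
# sequence of child element names, each followed by a space.
OLDJOKE_MODEL = re.compile(r"(burns )+(allen )(applause )?")

def matches_oldjoke_model(xml_text):
    """Check only the children of <oldjoke> against the content model."""
    root = ET.fromstring(xml_text)
    names = "".join(child.tag + " " for child in root)
    return root.tag == "oldjoke" and OLDJOKE_MODEL.fullmatch(names) is not None

good = matches_oldjoke_model(
    "<oldjoke><burns/><burns/><allen/><applause/></oldjoke>")
bad = matches_oldjoke_model(
    "<oldjoke><allen/><burns/></oldjoke>")
print(good, bad)
```

Here + means one or more, ? means optional, and the comma fixes the order, which is why the second document (allen before burns) does not match.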

Attribute List Declarations


Attribute list declarations identify which elements may have attributes, what attributes they may have, what values the attributes may hold, and what value is the default. Example:
<!ATTLIST oldjoke
          name   ID                 #REQUIRED
          label  CDATA              #IMPLIED
          status (funny | notfunny) 'funny'>

Each attribute in a declaration has three parts: a name, a type, and a default value.
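
To see the three parts as a parser reports them, the sketch below feeds the declaration to Python's standard xml.parsers.expat module and records what its AttlistDeclHandler callback receives. Placing the declaration in an internal subset and the sample oldjoke element are assumptions made to keep the example self-contained.

```python
import xml.parsers.expat

DOC = """<!DOCTYPE oldjoke [
<!ATTLIST oldjoke
          name   ID                 #REQUIRED
          label  CDATA              #IMPLIED
          status (funny | notfunny) 'funny'>
]>
<oldjoke name="j1"/>"""

declared = {}

def on_attlist(elname, attname, type_, default, required):
    # Called once for each attribute named in the declaration.
    declared[attname] = (type_, default, bool(required))

parser = xml.parsers.expat.ParserCreate()
parser.AttlistDeclHandler = on_attlist
parser.Parse(DOC, True)

for attname, parts in declared.items():
    print(attname, parts)
```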

Attribute List Declarations


There are six possible attribute types:

CDATA
CDATA attributes are strings; any text is allowed.

ID
The value of an ID attribute must be a name. All of the ID values used in a document must be different. IDs uniquely identify individual elements in a document. Elements can have only a single ID attribute.

IDREF or IDREFS
An IDREF attribute's value must be the value of a single ID attribute on some element in the document. The value of an IDREFS attribute may contain multiple IDREF values separated by white space.

ENTITY or ENTITIES
An ENTITY attribute's value must be the name of a single entity. The value of an ENTITIES attribute may contain multiple entity names separated by white space.

NMTOKEN or NMTOKENS
Restricted form of string attribute. An NMTOKEN attribute must consist of a single word, but there are no additional constraints. The value of an NMTOKENS attribute may contain multiple NMTOKEN values separated by white space.

A list of names
The value of an attribute must be taken from a specific list of names. This is frequently called an enumerated type. Alternatively, you can specify that the names must match a notation name.
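
The ID and IDREF rules can be sketched as a simple check over a parsed tree. The code below is an illustration only: it assumes that an attribute named id is declared as ID and one named ref as IDREF (both names are hypothetical), whereas a real validating parser would take the types from the DTD.

```python
import xml.etree.ElementTree as ET

def check_ids(xml_text, id_attr="id", idref_attr="ref"):
    """Return a list of problems with ID uniqueness and IDREF targets.

    Assumes id_attr is declared as ID and idref_attr as IDREF.
    """
    root = ET.fromstring(xml_text)
    problems, seen = [], set()
    # First pass: every ID value must be unique in the document.
    for elem in root.iter():
        value = elem.get(id_attr)
        if value is not None:
            if value in seen:
                problems.append("duplicate ID: " + value)
            seen.add(value)
    # Second pass: every IDREF must match some ID in the document.
    for elem in root.iter():
        target = elem.get(idref_attr)
        if target is not None and target not in seen:
            problems.append("dangling IDREF: " + target)
    return problems

ok = check_ids('<jokes><joke id="j1"/><see ref="j1"/></jokes>')
broken = check_ids(
    '<jokes><joke id="j1"/><joke id="j1"/><see ref="j9"/></jokes>')
print(ok, broken)
```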

Attribute List Declarations


There are four possible default values:

#REQUIRED
The attribute must have an explicitly specified value on every occurrence of the element in the document.

#IMPLIED
The attribute value is not required, and no default value is provided. If a value is not specified, the XML processor must proceed without one.

"value"
An attribute can be given any legal value as a default. The attribute value is not required on each element in the document, and if it is not present, it will appear to be the specified default.

#FIXED "value"
An attribute declaration may specify that an attribute has a fixed value. In this case, the attribute is not required, but if it occurs, it must have the specified value. If it is not present, it will appear to be the specified default.
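
The phrase "it will appear to be the specified default" can be observed with a parser that reads the declarations. In this sketch, Python's standard xml.parsers.expat module reports the status attribute for both elements, although only the second one specifies it. The jokes wrapper element and the internal subset are assumptions made for the example.

```python
import xml.parsers.expat

DOC = """<!DOCTYPE jokes [
<!ATTLIST oldjoke status (funny | notfunny) 'funny'>
]>
<jokes><oldjoke/><oldjoke status="notfunny"/></jokes>"""

statuses = []

def on_start(name, attrs):
    if name == "oldjoke":
        # attrs includes defaulted attributes unless the parser's
        # specified_attributes flag has been set.
        statuses.append(attrs.get("status"))

parser = xml.parsers.expat.ParserCreate()
parser.StartElementHandler = on_start
parser.Parse(DOC, True)
print(statuses)
```

A processor that does not read the declarations would see no status attribute on the first element, which is one reason documents relying on defaults are not declared standalone.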

Entity Declarations
Entity declarations allow you to associate a name with some other fragment of content. That construct can be a chunk of regular text, a chunk of the document type declaration, or a reference to an external file containing either text or binary data. Example:
<!ENTITY ATI "ArborText, Inc.">
<!ENTITY boilerplate SYSTEM "/standard/legalnotice.xml">
<!ENTITY ATIlogo SYSTEM "/standard/logo.gif" NDATA GIF87A>

There are three kinds of entities:
Internal Entities
External Entities
Parameter Entities

Internal Entities
Internal entities associate a name with a string of literal text. Example:
<!ENTITY ATI "ArborText, Inc.">

Internal entities allow you to define shortcuts for frequently typed text or text that is expected to change, such as the revision status of a document. The XML specification predefines five internal entities:
1. &lt; produces the left angle bracket, <
2. &gt; produces the right angle bracket, >
3. &amp; produces the ampersand, &
4. &apos; produces a single quote character (an apostrophe), '
5. &quot; produces a double quote character, "
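
A short sketch of both kinds of internal entity at work, using Python's standard library. The memo element and its text are invented for the example; the ATI entity is the one declared on the slide.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

DOC = """<!DOCTYPE memo [
<!ENTITY ATI "ArborText, Inc.">
]>
<memo>Published by &ATI; &lt;info&gt; &amp; others</memo>"""

# The parser replaces &ATI; and the predefined references with text.
text = ET.fromstring(DOC).text

# Going the other way: escape() replaces &, < and > with predefined
# entity references before text is written into an XML document.
escaped = escape("a < b & c")
print(text)
print(escaped)
```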

External Entities

External entities associate a name with the content of another file. External entities contain either text or binary data. Example:
<!ENTITY boilerplate SYSTEM "/standard/legalnotice.xml">
<!ENTITY ATIlogo SYSTEM "/standard/logo.gif" NDATA GIF87A>

The textual content of the external file is inserted at the point of reference and parsed as part of the referring document. Binary data is not parsed and may only be referenced in an attribute.

Parameter Entities

Parameter entities can only occur in the document type declaration. Parameter entity references are immediately expanded in the Document type declaration and their replacement text is part of the declaration. Example:
<!ENTITY % personcontent "#PCDATA | quote">
<!ELEMENT burns (%personcontent;)*>
<!ELEMENT allen (%personcontent;)*>

Notation Declaration

Notation declarations identify specific types of external binary data. This information is passed to the processing application, which may make whatever use of it it wishes. Example:
<!NOTATION GIF87A SYSTEM "GIF">

Including a DTD
The DTD must be the first thing in the document after the optional processing instructions and comments. The DTD identifies the root element of the document and may contain additional declarations. Example:
<?xml version="1.0" standalone="no"?>
<!DOCTYPE chapter SYSTEM "dbook.dtd" [
<!ENTITY % ulink.module "IGNORE">
<!ELEMENT ulink (#PCDATA)*>
<!ATTLIST ulink
          xml:link       CDATA #FIXED "SIMPLE"
          xml-attributes CDATA #FIXED "HREF URL"
          URL            CDATA #REQUIRED>
]>
<chapter>...</chapter>
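
As an illustration of what the document type declaration tells a processor, the sketch below uses Python's standard xml.parsers.expat module to read the root element name and the system identifier from a shortened version of the example. The dictionary name found and the shortened internal subset are assumptions for this sketch; the parser does not fetch dbook.dtd.

```python
import xml.parsers.expat

DOC = """<?xml version="1.0" standalone="no"?>
<!DOCTYPE chapter SYSTEM "dbook.dtd" [
<!ELEMENT ulink (#PCDATA)*>
]>
<chapter>...</chapter>"""

found = {}

def on_doctype(name, system_id, public_id, has_internal_subset):
    # Reported when the parser reaches <!DOCTYPE ...>.
    found["root"] = name
    found["system_id"] = system_id
    found["internal_subset"] = bool(has_internal_subset)

parser = xml.parsers.expat.ParserCreate()
parser.StartDoctypeDeclHandler = on_doctype
parser.Parse(DOC, True)
print(found)
```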

Well-formed Documents

A document can only be well-formed if it obeys the syntax of XML. The document must meet all of the following conditions:
The document instance must conform to the grammar of XML documents.
Non-empty tags must be properly nested.
Parameter entities must be declared before they are used.
The replacement text for all parameter entities referenced inside a markup declaration consists of zero or more complete markup declarations.
All entities except the following: amp, lt, gt, apos, and quot must be declared.
No attribute may appear more than once on the same start-tag.
A binary entity cannot be referenced in the flow of content; it can only be used in an attribute declared as ENTITY or ENTITIES.
String attribute values cannot contain references to external entities.
Neither text nor parameter entities are allowed to be recursive, directly or indirectly.
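
Several of these conditions can be checked simply by handing a document to a parser. The sketch below, assuming Python's standard ElementTree parser and a hypothetical helper named is_well_formed, tries a few small documents that each exercise one condition.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return (True, None) or (False, message) for an XML string."""
    try:
        ET.fromstring(text)
        return True, None
    except ET.ParseError as err:
        return False, str(err)

samples = {
    "nested": "<a><b>text</b></a>",
    "crossed": "<a><b>text</a></b>",
    "duplicate attribute": '<a x="1" x="2"/>',
    "undeclared entity": "<a>&unknown;</a>",
    "predefined entity": "<a>&amp;</a>",
}
results = {label: is_well_formed(text)[0]
           for label, text in samples.items()}
print(results)
```

Passing this check says nothing about validity; that requires a document type declaration and a validating processor.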

Valid Documents

A well-formed document is valid only if it contains a proper document type declaration and if the document obeys the constraints of that declaration (element sequence and nesting is valid, required attributes are provided, attribute values are of the correct type, etc.). The XML specification identifies all of the criteria in detail.

The XML family

A set of modules that offer useful services to accomplish important and frequently demanded tasks. The XML family consists of:
XLink
XPointer
XSchemas
CSS
XSL
XSLT
DOM
etc.

XLink

Some of the highlights of XLink are:
XLink gives you control over the semantics of the link.
XLink introduces Extended Links. Extended Links can involve more than two resources.

There are two types of links:
Simple Links
Extended Links

Simple Links

A Simple Link strongly resembles an HTML <A> link. Example:


<link xml:link="simple" href="locator">Link Text</link>

A Simple Link identifies a link between two resources, one of which is the content of the linking element itself. This is an in-line link. The locator identifies the other resource. The locator may be a URL, a query, or an Extended Pointer.

Extended Links
Extended Links allow you to express relationships between more than two resources. Example:
<elink xml:link="extended" role="annotation">
  <locator xml:link="locator" href="text.loc">The Text</locator>
  <locator xml:link="locator" href="annot1.loc">Annotations</locator>
  <locator xml:link="locator" href="annot2.loc">More Annotations</locator>
  <locator xml:link="locator" href="litcrit.loc">Literary Criticism</locator>
</elink>

XPointer

XPointers offer a syntax that allows you to locate a resource by traversing the element tree of the document containing the resource. Example:
child(2,oldjoke).(3,.)

locates the third child (whatever it may be) of the second oldjoke in the document. XPointers can span regions of the tree. Example:
span(child(2,oldjoke),child(3,oldjoke))

selects the second and third oldjoke elements in the document.
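
Python's standard library has no XPointer implementation, so the sketch below only imitates the two expressions with ordinary tree navigation in ElementTree. The sample document and the reading of child(2,oldjoke) as the second oldjoke child of the root are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<jokes>
  <oldjoke><burns/><allen/><applause/></oldjoke>
  <oldjoke><burns/><allen/><applause/></oldjoke>
  <oldjoke><burns/><allen/></oldjoke>
</jokes>"""

root = ET.fromstring(DOC)
oldjokes = root.findall("oldjoke")    # oldjoke children of the root

# Roughly child(2,oldjoke).(3,.): third child of the second oldjoke.
third_child = oldjokes[1][2]

# Roughly span(child(2,oldjoke),child(3,oldjoke)): the second and
# third oldjoke elements.
spanned = oldjokes[1:3]
print(third_child.tag, len(spanned))
```

XPointer counts from 1 while Python lists count from 0, hence the shifted indices.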

XSchemas

XML Schemas help developers to precisely define the structures of their own XML-based formats.

CSS XSL XSLT

CSS, the style sheet language, is applicable to XML as it is to HTML. XSL is the advanced language for expressing style sheets. It is based on XSLT, a transformation language used for rearranging, adding and deleting tags and attributes.

Conclusion

XML isn't always the best solution, but it is always worth considering.
