XML Tutorial

This document provides an introduction to XML, discussing what markup is, different types of computer markup, and what makes XML declarative. It describes XML as the Extensible Markup Language, a standard for representing documents in a flexible way. The document outlines two main uses of XML - for markup of documents and for data exchange/protocol design. It discusses opportunities that XML provides and how XML systems can be architected, including server-side publishing approaches.

Uploaded by

Abhinav Bajpai


Introduction to XML

David G. Durand
Director of Electronic Publishing Services
Ingenta Inc.
Adjunct Associate Professor
Brown University
Thanks
Steven J. DeRose, Eve Maler
What Is Markup?

Information added to a text to make its

structure comprehensible
Pre-computer markup (punctuational
and presentational)
Word divisions
Punctuation
Copy-editors' and typesetters' marks
Formatting conventions
The Friendly letter

This shows something about what third

graders learn about reading and writing
That documents are alike in key ways
That they have parts, with names
That those parts are (usually)
distinctively displayed
Computer markup

Any kind of codes added to a document

Typesetting (presentational markup)
MS Word and its ilk, TeX, Scribe,
Lout, Script, nroff, XYVision
Declarative markup
HTML (sometimes)
XML
What do we mean by declarative?

Names and structure

Framework for indirection
Finer level of detail (most human-
legible signals are overloaded)
Independent of presentation (abstract)
People often call this “semantic”
XML

The Extensible Markup Language

XML is a standard, interoperable way to
represent documents for flexible
processing
Multi-format delivery
Schema-aware information retrieval
Transformation and dynamic data
customization
Archival: standardized, self-describing
The two worlds of XML

Markup of documents: the original

This perspective is our focus here
Document representation was the primary
problem XML was created to solve
Data exchange and protocol design
XML turned out to fill important gaps
Relational databases needed a way to
share records and multi-table data
Protocol designers wanted a way to
encapsulate structured data
The two worlds united

•
Documents and “semi-structured” data share features
–
Hierarchical structure
–
String content
–
Variations in structure
•
Their applications also share needs
–
Need for a lingua franca, independent of APIs
–
Ability to cope with international characters
–
“Fit” with WWW and HTTP.
XML is more general

•
Tags label arbitrary information units
–
More suited to multiple purposes
–
“Looking right” is needed but not enough
•
Supports custom information structures
–
If you have “price” or “procedure”, you can make a tag for it, and validate its usage
–
Can support many different information models
•
E.g., molecular models, vector graphics, etc.
•
More “teeth” to enforce consistent syntax
–
Works hard to avoid semi-interoperable docs
Better rendering than HTML
•
Fully internationalized
–
Also better for visually-impaired users
•
Supports multiple renderings
–
Customize to the user, time, situation, device
–
Separates formatting from structure
–
And processing other than rendering
•
Large documents don’t break it
–
Easy to trade off server/client work
–
Artificial “next tiny bit” links no longer necessary
–
No searches that fail because big doc was split
•
XHTML is XML-conforming flavor of HTML
–
Clean existing HTML is already close...
XML treats documents like databases
XML brings benefits of DBs to
documents
Schema to model information directly
Formal validation, locking, versioning,
rollback...
But
Not all traditional database concepts
map cleanly, because documents are
fundamentally different in some ways
What is structure

•
To Relational Database theorists, structure is:
–
Tables with fixed sets of non-repeating named fields, that have little internal
structure
–
E-R diagrams with fixed number of nodes
•
Structured documents are different:
–
The order of SECs, Ps, etc. matters (a lot)
–
Many hierarchical layers (which text crosses)
–
Text/graphic data mixes with aggregate objects
–
Optional or repeatable sub-parts abound
–
Interaction with natural language phenomena
•
These are very different requirements
When structure is essential

•
Large scale data
•
Data with individual parts you care about
–
(like price-tag, tool-list, citation, author,...)
•
Need for good navigation tools
•
Mission-critical information
•
Information that must last
•
Multi-author publishing process
•
Multiple delivery media
What’s the difference?

Without structure
Data conversion is far more expensive
Multi-platform and/or multi-media delivery
require re-authoring and hand-work
Paper production is inconsistent
Late format changes are far more risky
Retrieval is prone to many false hits
“Pay me now, or pay me later”
XML design principles

•
Straightforwardly usable over the Internet
•
Support for a wide variety of applications
•
Compatible with SGML
•
Make writing XML programs easy
•
Avoid optional features
•
Human-readable (if not terse) markup
•
Formal and concise design
•
Design produced quickly
Opportunities with XML

•
Scalability and openness of Web solutions
•
“Rich clients” for complex information
–
Dynamic user views
•
XML as interprocess communication protocol for “data” (as opposed to “text”)
•
eCommerce integration
•
New methods of creation
–
Schema combination/composition
–
Free-form, schema-less data development
Web usage
XML works with familiar Web paradigms
Locations are expressed as URIs
High interoperability because of few
options
Easily implementable and usable
Robust against network failures
Avoids serving schemas every time with
documents
(but can do better validation anyway,
when needed)
Some additional XML details

Well-formedness
Error handling
Case sensitivity
HTML compatibility
Well-formedness

•
Document has a single root element, and
•
Elements nest properly
–
Try <b>foo<i>bar</b>baz</i> in your browser!
•
Entities are whole subtrees (not unbalanced fragments such as a lone start-tag)
•
No tag omission (close what you open)
•
Attributes must be quoted
•
< and & must always be escaped in some way
•
A document can be well-formed (and parsable) whether or not it fits a given
schema
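
The well-formedness rules above can be tried out with any non-validating parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` and the sample strings are illustrative assumptions, not part of the original slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Each sample exercises one of the rules listed above (hypothetical inputs).
samples = {
    "single root, proper nesting": "<doc><b>foo<i>bar</i></b>baz</doc>",
    "overlapping elements": "<doc><b>foo<i>bar</b>baz</i></doc>",
    "two root elements": "<a/><b/>",
    "unclosed element": "<doc><p>text</doc>",
    "unquoted attribute": "<doc id=x1/>",
    "unescaped ampersand": "<doc>AT&T</doc>",
    "escaped ampersand": "<doc>AT&amp;T</doc>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'NOT well-formed'}")
```

No DTD or schema is involved at any point: the check depends only on the document text itself.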
Partial and missing DTDs

•
DTDs (schemas) are needed for validation
•
DTD processing adds a burden
•
Because of Well-formedness,
–
DTDs are not needed just to parse
–
Even subtrees can be parsed in isolation
•
One exception: Default attributes
•
Very handy for development/experimentation
Error handling

•
“Draconian error handling”
–
Major errors cause processor to stop passing data in the “normal way”
•
Fatal errors:
–
Ill-formed document
–
Certain entity references in incorrect places
–
Misplaced character-encoding declarations
•
This helps save huge $ on error-recovery
–
Hopefully, the $ will go to better features instead
–
Netscape and Microsoft both wanted this (détente?)
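
As a rough illustration of draconian error handling, the sketch below feeds an ill-formed document to Python's standard `xml.etree.ElementTree` parser. The sample document is an assumption made for the example. The point is that the application receives an error with a position, not a partially repaired tree.

```python
import xml.etree.ElementTree as ET

# Hypothetical ill-formed input: the second <item> is never closed.
broken = "<order>\n  <item>widget</item>\n  <item>gadget\n</order>"

try:
    tree = ET.fromstring(broken)
except ET.ParseError as err:
    tree = None                      # no partial result is handed over
    line, column = err.position      # where the parser gave up
    print(f"fatal error at line {line}, column {column}: {err}")
```

Compare this with HTML browsers of the period, which would silently guess at a repair; here the producer of the document has to fix it.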
Case sensitivity

•
HTML is
–
Case-insensitive for tag names: <P> = <p>
–
Case-sensitive for entity names: &LT; ≠ &lt;
•
XML is case-sensitive for both!
–
Unicode standard advises against case-folding
–
Folding is not well-defined for all languages
•
Turkish has two lower-case i’s, only one upper
•
In languages with no accented caps, can’t reverse
•
Error-prone for programmers
•
XHTML uses lower case
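
A small sketch of both points, using only the Python standard library. The element names and sample characters are chosen for illustration. It shows that an XML parser compares names exactly, and that case folding does not round-trip for every character.

```python
import xml.etree.ElementTree as ET

# Tag names are compared exactly, so <P> cannot be closed by </p>.
try:
    ET.fromstring("<P>Hello</p>")
    mixed_case_parses = True
except ET.ParseError:
    mixed_case_parses = False

# Names that differ only in case are different element types.
root = ET.fromstring("<doc><Title/><title/></doc>")
names = [child.tag for child in root]

# Case folding is lossy for some characters.
dotless_i = "\u0131"                        # Turkish lower-case dotless i
round_trip_i = dotless_i.upper().lower()    # "I", then "i": the dotless form is lost
sharp_s = "\u00df"                          # German sharp s
round_trip_s = sharp_s.upper().lower()      # "SS", then "ss"

print(mixed_case_parses, names, round_trip_i == dotless_i, round_trip_s == sharp_s)
```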
Summary

•
XML has:
–
Representational power and extensibility
•
Custom tags, order constraints, etc.
–
Validation and consistency (several ways)
–
Much of HTML’s simplicity for users/implementors
•
XML trashes:
–
SGML’s syntax/feature complexity
–
SGML’s high startup costs
–
HTML’s inflexibility
–
ASCII legacy
XML System Architectures
First, an HTML system

[Figure: HTML document → Web server → Internet → Web client (parser, formatter, interface)]
How do you get the data?
Documents, stylesheets, and other data can all be expressed in XML.
But their information is accessed directly.
Any application can plug in via an API called the “Document Object Model”.

[Figure: XML (with DTD/Schema) → Parser → information data structure (tree + links) → DOM interface]

This model can work locally or over a network. Parsing, tree-building, and access can shift between client and server.
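
To make the parser, tree, and DOM interface pipeline concrete, here is a minimal sketch using `xml.dom.minidom` from the Python standard library. The sample document reuses the `title` and `xref` fragments shown elsewhere in these slides; the `catalog` wrapper and the added `note` element are assumptions for illustration.

```python
from xml.dom import minidom

source = '<catalog><title>Tools, computer</title><xref tgt="#h185"/></catalog>'

# Parser: text in, tree out.
doc = minidom.parseString(source)
root = doc.documentElement

# DOM interface: the application reads the tree through generic calls.
title = root.getElementsByTagName("title")[0]
title_text = title.firstChild.data
xref = root.getElementsByTagName("xref")[0]
target = xref.getAttribute("tgt")

# The same interface is used to change the tree.
note = doc.createElement("note")
note.appendChild(doc.createTextNode("added via DOM"))
root.appendChild(note)

serialized = root.toxml()
print(title_text, target)
print(serialized)
```

Nothing in the application code depends on the tag set, which is why the same API can serve documents, stylesheets, and other XML data alike.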
Server-side XML publishing
Server transforms to HTML/CSS;
Ship to client browser for display

[Figure: XML data + stylesheet → XSLT → HTML + CSS → HTTP → browser/interface]

Very common current strategy;
Leverages current technology
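
The server-side step can be sketched as follows. This is not XSLT: the Python standard library has no XSLT processor, so a hand-written function stands in for the stylesheet. The function name `letter_to_html`, the `class` attribute values, and the choice of HTML elements are assumptions; the input uses the friendly-letter vocabulary from the example DTD in these slides.

```python
import xml.etree.ElementTree as ET


def letter_to_html(xml_text: str) -> str:
    """Transform a LETTER document into an HTML page (stand-in for a stylesheet)."""
    src = ET.fromstring(xml_text)
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "p", {"class": "date"}).text = src.findtext("DATE")
    ET.SubElement(body, "h1").text = "Dear " + src.findtext("GREET") + ","
    for para in src.iter("P"):
        # itertext() flattens inline markup such as EMPH into plain text.
        ET.SubElement(body, "p").text = "".join(para.itertext())
    ET.SubElement(body, "p", {"class": "sig"}).text = src.findtext("SIG")
    return ET.tostring(html, encoding="unicode", method="html")


letter = (
    "<LETTER><DATE>October 3, 1998</DATE><GREET>Sammy</GREET>"
    "<BODY><P>How <EMPH>are</EMPH> you doing?</P></BODY>"
    "<SIG>Todd</SIG></LETTER>"
)
page = letter_to_html(letter)
print(page)
```

The browser only ever sees the HTML result, which is why this strategy works with clients that know nothing about XML.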
XML everywhere

XML separates representation from structure

So you can use the same parsers, network protocols,
tree managers, and APIs to access documents,
stylesheets, search and query, etc.
XML allows separating application parts
So you can mix and match formatters, search
engines, networks and protocols, etc.
XML separates out semantics
So you can control style or search semantics without
having to mangle your documents to do it
What are the parts?

Header stuff
The XML Processing Instruction
<?xml version="1.0" standalone="yes"?>

Schema/DTD (referenced or included)

The DOCTYPE
<!DOCTYPE catalog SYSTEM
"http://www.xyz.com/DTDs/catalog.dtd">
Main document stuff

–
Elements: <title>...</title>
–
Attributes: <xref tgt="#h185">
–
Text or other content: Tools, computer
–
Entity references: &lt; … &reg;
–
Comments: <!-- ... -->
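Anatomy of an element

[Figure: an annotated element. The start-tag carries the element type, an attribute name, and an attribute value; the content is the text “Use a hyphen: ” followed by a (character) entity reference; the end-tag repeats the element type. Start-tag + content + end-tag = element.]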
Audiences XML aims to help

Parser writers
The Mythical CS Grad Student
Application writer
The Desperate Perl Hacker
Document creators
Newbies of all stripes
The World Wide Web itself
HTML compatibility

XHTML is an XML application

One schema among many (probably a
popular one, of course)
Web browsers should start supporting
generic XML regardless of tag-set.
Don’t hard-code sizes and names
Open eBook spec has a nice
compromise that accommodates XML,
HTML, CSS, and MIME
What are the parts of an XML Document?
The DTD
Elements
Attributes
General entities
Character references
Comments
Marked sections
Processing instructions
Notations
Identifiers and catalogs
Schema Languages

•
3 Leading contenders (all can win):
•
XML Schema
–
Backed by the W3C
–
Very powerful
–
Very large + Complex theory
•
RELAX NG
–
Backed by ISO
–
Based on tree automata
–
Very small
•
Schematron
–
Independent effort
–
Validation tool, not complete language
The DTD (schema)

•
A DTD is a simple schema, based on SGML
•
They consist of declarations for the parts:
–
<!ELEMENT CHAP (TI, SEC*, SUM)>
–
<!ATTLIST P ID ID #IMPLIED>
–
<!ELEMENT P (#PCDATA)>
•
Can reference from DOCTYPE, or include:
•
<!DOCTYPE book SYSTEM "book.dtd" [
<!ELEMENT P (#PCDATA)>…
]>
•
Other schema languages are available
–
They use XML syntax (why not?)
Elements

Identify structural/semantic components

Can (usually do) have children
Represented by start-tags and end-tags:
<P>Hello, world.</P>
Some elements are EMPTY
Special syntax so parser knows: <HR/>
Schemas control what sub-element patterns can occur
with any given type of element
Order matters / Context does not
Attributes

•
Specify properties/characteristics of elements
–
That generally apply to the elements as wholes
•
Values are atomic strings
–
Though applications may impose more structure
•
Represented by assignments within start-tags:
–
<xref tgt="#h185">
•
Schemas control what attributes can occur on any given type of element
•
One special type: ID, unique per document
•
Attributes are not ordered
General Entities

•
A lexical mechanism for inclusion
–
But, constrained to including subtrees
–
This preserves fragment parsability
–
This allows lazy evaluation of structure nodes
•
Also used for referring to graphic or other non-directly-XML data objects
•
References occur in the document instance:
–
<PROCEDURE TYPE="REPAIR">
&warn37;&warn12;...</PROCEDURE>
•
Declarations associate the name with a URI or a “public identifier”
Predefined entities

Used for escaping markup characters

In XML, tags start with “<”.
Represented just like other entities:
–
&lt; “<”
–
&amp; “&”
–
&gt; “>” (more for symmetry than need)
–
&apos; “'”
–
&quot; “"”
Schemas may not redefine these names
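
In practice, escaping is best left to a library. The sketch below uses `escape`, `unescape`, and `quoteattr` from Python's standard `xml.sax.saxutils`; the sample string and the element names `code` and `x` are assumptions for illustration.

```python
from xml.sax.saxutils import escape, quoteattr, unescape
import xml.etree.ElementTree as ET

raw = 'if a < b && b > c then "AT&T"'

# escape() replaces &, < and > with the predefined entity references.
escaped = escape(raw)
print(escaped)

# A parser turns the references back into the original characters.
parsed_text = ET.fromstring(f"<code>{escaped}</code>").text

# quoteattr() escapes the value and picks a safe quote character for it.
parsed_attr = ET.fromstring(f"<x v={quoteattr(raw)}/>").get("v")

restored = unescape(escaped)
```

Note that `escape` leaves quotation marks alone by default, which is fine in element content but is the reason attribute values go through `quoteattr` instead.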
Character references

Can be used to obtain untypable characters

Such as Kanji for users with English keyboards
Map directly to a Unicode code point
Represented much like entity references:
Decimal: &#13041; (㋱)
Hex: &#xBEEF; (뻯)
Schemas do not affect these
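
A character reference is simply the Unicode code point written in decimal or hexadecimal. The sketch below builds both forms for one character and checks that a parser resolves them; the helper name `char_refs` and the sample character are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def char_refs(ch: str) -> tuple[str, str]:
    """Return the decimal and hexadecimal character references for `ch`."""
    code_point = ord(ch)
    return f"&#{code_point};", f"&#x{code_point:X};"


kanji = "\u6f22"                       # the first character of the word "kanji"
decimal_ref, hex_ref = char_refs(kanji)

# The document itself stays pure ASCII; the parser supplies the character.
doc = f"<p>{decimal_ref}{hex_ref}</p>"
text = ET.fromstring(doc).text
print(decimal_ref, hex_ref, text == kanji * 2)
```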
Comments

Can go most anywhere

(though not inside tags)
Represented as:
–
<!-- comment text -->
Have simpler syntax than in SGML/HTML
Not <!-- foo -- -- bar -->
Not <!-- foo -- >
Schemas can contain comments, too
Marked sections

Two purposes:
Escaping a lot of markup
Conditional inclusion
In XML:
Escaping only in the document instance:
•
<![CDATA[ Hello ]]>
Conditional content only in schemas:
•
<![IGNORE[ ... ]]>
•
<![INCLUDE[ ... ]]>
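
For the escaping use, a CDATA section and entity-escaped text are two spellings of the same content. A minimal sketch with Python's standard parser (the sample strings are assumptions):

```python
import xml.etree.ElementTree as ET

with_cdata = (
    "<example><![CDATA[if (a < b && b < c) { <b>not markup</b> }]]></example>"
)
with_escapes = (
    "<example>if (a &lt; b &amp;&amp; b &lt; c) "
    "{ &lt;b&gt;not markup&lt;/b&gt; }</example>"
)

text_a = ET.fromstring(with_cdata).text
text_b = ET.fromstring(with_escapes).text
print(text_a == text_b)
```

The application sees identical character data either way; the CDATA form is just easier to type and read when a passage contains a lot of markup characters.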
Processing instructions

•
Form/example:
–
<?target-name target-specific-stuff ?>
–
<?xmleditor insertionpoint?>
•
Used to insert instructions to processors
–
Not commonly needed
–
No way to escape “?>” inside
–
May declare targets in DTD as Notations
•
One special one: to identify XML documents
–
<?xml version="1.0"?>
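
Processing instructions survive parsing as nodes of their own, separate from the text around them. The sketch below reads the `xmleditor` example through `xml.dom.minidom`; the surrounding `doc` and `p` elements are assumptions for illustration.

```python
from xml.dom import minidom

source = "<doc><p>Some text<?xmleditor insertionpoint?> continues.</p></doc>"
doc = minidom.parseString(source)
para = doc.getElementsByTagName("p")[0]

# Collect (target, data) for each processing instruction inside the paragraph.
pis = [
    (node.target, node.data)
    for node in para.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]

# The character data is unaffected by the instruction.
text = "".join(
    node.data for node in para.childNodes if node.nodeType == node.TEXT_NODE
)
print(pis, text)
```

A processor that does not recognize the target can simply skip the node.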
The “XML Declaration” PI

At top of each XML document:

<?xml version="1.0"
encoding="UTF-8"
standalone="yes"?>
This marks the document as being XML
“Encoding” can be double-checked
You can detect the encoding from the first few
bytes, for many common ones (even EBCDIC)
MIME types also can signal encoding
(watch out if server re-encodes document)
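
Detecting the encoding family from the first few bytes can be sketched as below. The byte signatures follow the non-normative autodetection appendix of the XML 1.0 specification; the function name `sniff_encoding` and its return strings are assumptions, and the 4-byte UCS-4 signatures are left out to keep the sketch short.

```python
import codecs


def sniff_encoding(first_bytes: bytes) -> str:
    """Guess the encoding family of an XML entity from its first bytes."""
    signatures = [
        (b"\xef\xbb\xbf", "UTF-8 (byte order mark)"),
        (b"\xfe\xff", "UTF-16, big-endian (byte order mark)"),
        (b"\xff\xfe", "UTF-16, little-endian (byte order mark)"),
        (b"\x00<\x00?", "UTF-16, big-endian (no byte order mark)"),
        (b"<\x00?\x00", "UTF-16, little-endian (no byte order mark)"),
        (b"<?xm", "ASCII-compatible; read the declaration for the exact encoding"),
        (b"\x4c\x6f\xa7\x94", "EBCDIC family; read the declaration for the code page"),
    ]
    for prefix, guess in signatures:
        if first_bytes.startswith(prefix):
            return guess
    return "UTF-8 (default when nothing else is signalled)"


declaration = '<?xml version="1.0"?>'
guesses = {
    "utf-8": sniff_encoding(declaration.encode("utf-8")),
    "utf-16-be with BOM": sniff_encoding(
        codecs.BOM_UTF16_BE + declaration.encode("utf-16-be")
    ),
    "utf-16-le without BOM": sniff_encoding(declaration.encode("utf-16-le")),
    "cp037 (EBCDIC)": sniff_encoding(declaration.encode("cp037")),
}
for label, guess in guesses.items():
    print(f"{label}: {guess}")
```

The trick works because every XML declaration starts with the same characters, so the parser knows enough to read the `encoding` value and then switch to the exact encoding.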
Notations
Used to name foreign data formats referenced
Ties a notation name to a URI (presumably
pointing to the format’s specification)
Entities can state their data’s notation
Processing instructions can (should) use them
as target names
Declared in the schema
<!NOTATION gif SYSTEM
"http://specs.com/gif10.html">
Can also use PUBLIC
Identifiers

Used in entity declarations to state where the data to

be included later can be found
<!ENTITY warning SYSTEM
"http://www.warnsource.com/w993.xml">
Uses a URI reference
Probably will later allow referencing subtrees directly
by appending an XPointer
Accommodates persistent naming schemes under
development; but doesn’t define one.
XML 1.0 DTDs

DTDs let you say:

What element types can occur and where
What attributes each element type can have
What notations are in use
What external entities can be referenced
Standard DTDs exist in almost every domain
Robin Cover’s site at oasis-open.org has references
Some repositories exist, such as xml.org
An Example DTD

<!ELEMENT LETTER (DATE, GREET, BODY, SIG)>
<!ELEMENT DATE (#PCDATA)>
<!ELEMENT GREET (#PCDATA)>
<!ELEMENT BODY (P)*>
<!ELEMENT SIG (#PCDATA)>
<!ELEMENT P (#PCDATA | EMPH | FIG)*>
<!ELEMENT EMPH (#PCDATA)>
<!ATTLIST EMPH TYPE NMTOKEN "WOW">
<!ELEMENT FIG EMPTY>
<!ATTLIST FIG HREF CDATA #REQUIRED>
Another Example

<!ENTITY % inline "emph | strong">
<!ELEMENT doc (chap*)>
<!ELEMENT chap (title, section*)>
<!ELEMENT title (#PCDATA | %inline;)*>
<!ELEMENT section (p+)>
<!ELEMENT p (#PCDATA | %inline;)*>
<!ATTLIST p ID ID #IMPLIED>
<!ELEMENT emph (#PCDATA)>
<!ELEMENT strong (#PCDATA)>
A corresponding document

–
<?xml version="1.0"?>
<!DOCTYPE LETTER PUBLIC
"-//sjd//DTD Friendly letter//EN"
"letter.dtd" []>
<LETTER><DATE>October 3, 1998</DATE>
<GREET>Sammy</GREET>
<BODY>
<P>How <EMPH>are</EMPH> you doing?
This is my dog:
<FIG HREF="http://www.me.com/dog.gif"/></P>
</BODY>
<SIG>Todd</SIG>
</LETTER>
Content Models

These are modeled on regular

expressions
In DTD, each element has one content
model for all time
Similarly, each element has one set of
attributes for all time
Attributes and content models are
completely independent
Basic Operators

Joining
Sequence a,b,c
Alternation a | b | c
Grouping (a)
Repetition
0 or more a*
1 or more a+
Optional a?
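
Because the operators mirror regular-expression syntax, an element-only content model can be translated into an ordinary regex over the sequence of child element names. The sketch below does this with Python's `re` module. The function names `model_to_regex` and `matches` are assumptions, and `#PCDATA`, `ANY`, and `EMPTY` are deliberately not handled.

```python
import re


def model_to_regex(model: str) -> re.Pattern:
    """Translate an element-only DTD content model into a compiled regex."""
    pattern = []
    for token in re.findall(r"[A-Za-z_][\w.-]*|[(),|*+?]", model):
        if token == ",":
            continue                      # sequence is plain concatenation
        elif token == "(":
            pattern.append("(?:")         # grouping
        elif token in ")|*+?":
            pattern.append(token)         # same meaning in both notations
        else:
            # One child element: its name followed by a separator space.
            pattern.append(f"(?:{re.escape(token)} )")
    return re.compile("".join(pattern))


def matches(model: str, children: list[str]) -> bool:
    """Check a list of child element names against a content model."""
    sequence = "".join(name + " " for name in children)
    return model_to_regex(model).fullmatch(sequence) is not None


chapter = "(TI, SEC*, SUM)"               # the CHAP model shown earlier
print(matches(chapter, ["TI", "SEC", "SEC", "SUM"]))
print(matches(chapter, ["TI", "SUM"]))
print(matches(chapter, ["SEC", "SUM"]))
```

A real validator does not work this way (it must also enforce the determinism rule), but the translation shows how directly the two notations correspond.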
Data

#PCDATA
Element names
Model groups
Mixed content (#PCDATA | x | …)*
ANY
EMPTY
Not quite regular expressions

Ambiguity restriction
A parser must never have to choose between alternatives when matching a model group (no look-ahead)
This restriction is preserved in W3C Schema, relaxed in RELAX NG
Handy terminology decoder ring
Element: a text feature distinguished by
markup
Tag: a string in angle brackets. <a> or </a>.
Two tags delimit an element
Content: anything in an element (children in the parse tree): the tags and characters between an element’s tags
Attribute: a (name, value) pair associated with
an element
Element Type Name: a string like “p” or “img”
that identifies the type of an element
Decoder ring…

Entity: abstraction of an item of data storage.

Internal entity: entity whose text is contained in its declaration.
External entity: entity whose content is stored
externally to its declaration
Declaration: meta-markup that declares
entities, content models, etc.
Document instance: the tags and content in an
XML document, not counting declarations
Decoder…

Document Type declaration (DOCTYPE):

declaration of root element of a document
instance, can refer to:
External subset: DTD (XML declarations)
stored as an external entity.
Internal subset: declarations contained
within a DOCTYPE declaration. ATTLIST
declarations must be parsed, and
interpreted.
Decoder…

•
Content Model: description of restrictions on the content of an element
•
Model Group: content model subexpression in parentheses
•
Repetition indicator: *, +, ?
•
Prolog: All of the stuff before the document instance starts.
Ambiguity

A content model is ambiguous if it

contains an alternation (a | b) where
the content models a and b cannot be
distinguished by their first element.
A content model is ambiguous if an
optional occurrence indicator is
followed by a submodel whose first
element is not different.
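
The first rule can be checked mechanically in the simplest case. The sketch below is a deliberately simplified assumption: each alternative is a flat sequence of required element names, so its "first element" is just the first name. The function name `ambiguous_first_elements` is hypothetical.

```python
from collections import Counter


def ambiguous_first_elements(alternatives: list[list[str]]) -> set[str]:
    """Return the element names that begin more than one alternative."""
    firsts = Counter(alt[0] for alt in alternatives if alt)
    return {name for name, count in firsts.items() if count > 1}


# ((a, b) | (a, c)): after reading <a>, the parser cannot tell which
# branch it is in without looking ahead.
clash = ambiguous_first_elements([["a", "b"], ["a", "c"]])

# (a, (b | c)) accepts the same sequences, and its alternation is
# decided by the very next element.
fixed = ambiguous_first_elements([["b"], ["c"]])

print(clash, fixed)
```

Factoring out the common prefix, as in the second model, is the usual way to repair an ambiguous content model.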
Attributes

Data types
Default values / omissibility
<!ATTLIST p
type (summary | body) "body"
id ID #IMPLIED
prefix CDATA "">
<!ATTLIST syntax

•
<!ATTLIST element-name
att-name type defaults
att-name type defaults
…>
•
<!ATTLIST element-group
att-name type defaults
att-name type defaults
…>
Attribute Data Types

CDATA
NMTOKEN / NMTOKENS
Enumeration Type (a | b)
ENTITY / ENTITIES
ID / IDREF / IDREFS
NOTATION
Attribute defaults

#REQUIRED
#IMPLIED
#FIXED "value"
Literal default value
Parameter Entities

• Declaring
<!ENTITY % pent "value">
<!ENTITY % include-file SYSTEM
"http://www.w3.org//">
• Using
%include-file;
<![%option;[ <!… optional
declaration …> ]]>
General Entities

Simple
<!ENTITY ent "value">
External
<!ENTITY include-file SYSTEM
"http://www.w3.org//">
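
A simple (internal) general entity can be seen in action with Python's standard parser, which reads declarations from the internal subset and expands references to them. The `doc` element and the sample text are assumptions; external entities are not fetched in this sketch.

```python
import xml.etree.ElementTree as ET

document = """<!DOCTYPE doc [
  <!ENTITY ent "value">
]>
<doc>The entity expands to: &ent;</doc>"""

# The parser replaces &ent; with the declared text.
text = ET.fromstring(document).text
print(text)

# A reference with no declaration is a fatal error.
try:
    ET.fromstring("<doc>&undeclared;</doc>")
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False
```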
Notations

•
declaring
•
<!NOTATION blob SYSTEM "application/binary">
•
Using (to declare entity datatypes)
•
<!ENTITY something SYSTEM "http://blob.org/blobel"
–
NDATA blob>
•
Using an NDATA entity
•
<!ATTLIST img ref ENTITY #REQUIRED>
•
… in instance …
•
<img ref="something"/>
•
Or one can just use URIs and MIME types in software… less
validation, more simplicity
Processing instructions

Escape to procedural markup

•
<!NOTATION my-app SYSTEM "http://my.com/">
•
<?my-app does something, anything …. ?>

Escape hatch
Way to add declarations to XML in
some cases
Way to “pickle” application state in a
document.
Namespaces

Helps to “uniquify” markup names

Colon delimiter allowed in names
–
<cals:table>
<html:table xyz:key="2">
Attributes associate a prefix with a
namespace URI
–
<div xmlns:xhtml=
"http://www.w3.org/1999/xhtml">
Sets default for element and
descendants
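
The sketch below shows how a namespace-aware parser reports names, using Python's `xml.etree.ElementTree`, which writes each name as `{namespace-URI}local-name`. The XHTML URI comes from the slide; the `CALS` URI and the sample document are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
CALS = "http://example.org/cals"          # hypothetical URI for illustration

source = f"""<doc xmlns:html="{XHTML}" xmlns:cals="{CALS}">
  <html:table/>
  <cals:table/>
  <section xmlns="{XHTML}"><p/></section>
</doc>"""

root = ET.fromstring(source)
tags = [element.tag for element in root.iter()]
for tag in tags:
    print(tag)

# The prefix is only a local abbreviation: a query may use a different one.
xhtml_tables = root.findall("h:table", {"h": XHTML})
```

The two `table` elements end up with different names, and the default namespace declared on `section` applies to its descendant `p` as well.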
Things namespaces almost do

Allow arbitrary mixing of DTDs

/schemas
Provide a “type system” for referents of
markup
Allow automatic processing of foreign
markup
Pros and Cons of Namespaces
You can uniquely label element types in a
global way
You must change the element name to
take advantage of this
Attempts to re-use large numbers of
namespace-qualified elements are often
clumsy/redundant
Detection of a namespace is very easy
There can only be one namespace for an
instance of an element
Things that are confusing about namespaces
The URI reference in a namespace is just a string
The URI reference in a namespace may not exist,
it’s just a string
The URI reference in a namespace may exist and
contain something irrelevant or unexpected: it’s
just a string
Relative URI references in namespaces are well-
defined, but don’t do what you might expect,
because they are just strings…
Fragment identifiers are allowed in namespace
URIs, if you want to use them.
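
"Just a string" can be demonstrated directly: a parser compares namespace names character by character and never fetches or normalizes them. In this sketch the `example.org` URIs are hypothetical and nothing is retrieved from the network.

```python
import xml.etree.ElementTree as ET

a = ET.fromstring('<x:t xmlns:x="http://example.org/ns"/>')
b = ET.fromstring('<y:t xmlns:y="http://example.org/ns"/>')    # other prefix
c = ET.fromstring('<x:t xmlns:x="http://EXAMPLE.org/ns"/>')    # other case
d = ET.fromstring('<x:t xmlns:x="http://example.org/ns/"/>')   # trailing slash

same_despite_prefix = a.tag == b.tag
same_despite_case = a.tag == c.tag
same_despite_slash = a.tag == d.tag
print(same_despite_prefix, same_despite_case, same_despite_slash)
```

URIs that a web server would treat as equivalent are different namespaces here, while a different prefix makes no difference at all.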
Namespace URI dereferencing

There are applications within which this

has been defined
There isn’t anything yet which works
across arbitrary domains
RDF, DAML/OIL, other semantic web
efforts may also address this in time.
XML Information Set
What data in an XML document “counts”?
Elements, attributes, content
Order and hierarchy of elements
No whitespace within tags
All whitespace within elements
Not which kind of quotes around attributes
Required for interoperability
Applications must not count nodes differently
W3C “Document Object Model” is related
DOM is an API for XML, not an O.M.
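
What "counts" can be illustrated by parsing documents that differ only in details the Information Set ignores. The sketch below uses Python's standard parser; the helper `summary` and the sample documents are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

first = ET.fromstring('<p id="p1" type="body">Hello,  world.</p>')
# Different quotes, attribute order, and whitespace inside the tags.
second = ET.fromstring("<p   type='body'\n    id='p1' >Hello,  world.</p  >")
# Different whitespace inside the element content.
third = ET.fromstring('<p id="p1" type="body">Hello, world.</p>')


def summary(element):
    """Reduce an element to the information an application can rely on."""
    return (element.tag, dict(element.attrib), element.text)


same_information = summary(first) == summary(second)
content_whitespace_matters = summary(first) != summary(third)
print(same_information, content_whitespace_matters)
```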
XML and related specs
XML: The basic syntax, plus namespaces
XML Namespaces: disambiguation
XML-Information Set: What counts
XML-Schemas: datatyping and structure
XPath: Expressions to find whole nodes
XPointer: XPath++ for hyperlink addressing
XLink: hypermedia
XML Base (relative URLs)
XSL: stylesheets and transforms
DOM: API to the Information Set
XML specification

A “Recommendation” since 2/1998

The highest level for a W3C specification
Defines the syntax/grammar
Schemas or DTDs then define particular applications
(poetry, manuals, eCommerce,…)
All these can be parsed by a generic XML parser, just as
new words can be readily fitted into existing
sentence structures
Schemas are political as well as technical
The W3C standards* process
World Wide Web Consortium (W3C)
Development is organized into WGs.
Working Group (~10) - set agenda /decide
Special Interest Group (~100) -
discuss/recommend
W3C members (~500) - vote
W3C Director (TimBL) - may veto
The public--comment on public WDs;
adopt/reject
The beginning of XML

Originally chartered to work on a suite:

XML (Extensible Markup Language)
XML-Linking (Extensible Linking Language)
XSL (Extensible Style Language)
Founder/chair: Jon Bosak (Sun);
W3C contact: Dan Connolly (W3C)
First presented 11/1996; ratified 2/1998
Quickly added XML Namespaces spec
The current XML organization

Work products done by several WGs

“XML Plenary” coordinates these WGs
Document analysis

Cycle of steps; repeat until out of time

Identify project requirements/audience
Using those, identify information items in the
document that could be important
Make sure you have a way to use that information
Identify restrictions on those items
Identify structural constraints that may be needed
Identify non-semantic features that may be
important for presentation, etc.
Project requirements

Know the audience/readers

Know the authors
Don’t forget the editorial/clerical staff
These 3 groups are the experts, you are
the detail person
Don’t make a lifetime commitment to
your processing model, but have one in
mind; analysis without limitations is
dangerous
Identifying information items

This is pretty much a manual process

Often best done with paper and
highlighters and post-its
In later stages, adding tags to a text
transcript can be useful.
The more documents you’ve looked at
and thought about, the easier this
becomes.
Issues to think about

Cross-references
Structural divisions (headings, blurbs,
ambiguities)
Tradeoff between freedom and
processing
Normalization of data items
What external data and catalogs may
exist
Restrictions on data items

Content model
Data values (are there controlled or
semi-controlled vocabularies?)
Are there “authority files” for large
open sets (like lists of authors)
How variable is the content, and how
realistic the idea to normalize it.
Presentation issues

Some text can be auto-generated,

some cannot
Some text can be “almost” auto-generated
(you can’t avoid special cases)
Punctuation can kill you, either when
you leave it to authors, or when you
take it away from them
