1 - XML 2020 Lab.01 - XML Standard (v3.0)

This document provides an overview of XML including: - A definition of XML, its purpose for structuring and exchanging data, and its history and evolution over time. - A comparison of XML to HTML, noting XML focuses on data structure while HTML focuses on display. - An outline of the main components that make up an XML document including elements, attributes, and namespaces. - Syntax rules for XML documents including requirements for elements to be properly nested and attributes to have values. - An introduction to creating and viewing XML documents using text editors, browsers, and IDEs like Eclipse.

Uploaded by

Alex Alex

XML 2020 Lab.01 - XML standard (v3.0)
Content
1.1. XML definition, purpose and history
1.2. XML versus HTML (comparison)
1.3. XML document components
1.4. XML documents syntax rules: “well-formed”
1.5. XML Namespaces
1.6. XML Document processing - Use Cases
1.7. How to create and view XML Documents
1.8. Working environments
1.9. Things to try
1.10. References

1.1. XML definition, purpose and history


1.1.1. XML (eXtensible Markup Language)
XML is a hardware- and software-platform-independent language used for structuring, storing, exchanging and handling data.
XML is defined by Standards / Specifications (named Recommendations). These are proposed, developed,
maintained and released by the World Wide Web Consortium (W3C).
XML is related to the better-known HTML (HyperText Markup Language); both have a
common ancestor: SGML (Standard Generalized Markup Language). HTML has also been published as W3C
Recommendations, whereas SGML is an ISO standard (ISO 8879:1986).
1.1.2. What does “markup” mean?
• (original meaning, taken from printing technology) “the process or result of correcting text in
preparation for printing.”
• (current, extended meaning in software development) a construct that indicates how data should be
handled; equivalent to a directive/command/instruction given to a processing facility.
1.1.3. XML standard history
1998 – XML 1.0 (several revisions followed; the 5th was released on 26 Nov 2008)
2004 – XML 1.1
XML 2.0 – Proposals only:
– elimination of DTDs from syntax,
– integration of namespaces, XML Base and XML Information Set (infoset) into the base standard.
Note: during this Lab we are going to use only XML 1.0 Standard (Recommendation).
1.2. XML versus HTML (comparison)

HTML | XML
Created for formatting and presenting data | Created for structuring and storing data
Indicates how the data in the document will be displayed | Describes the data; focuses on what the data is
The tags are defined by the HTML Standard (specification) | The tags are user-defined
Fixed number of tags => limitation | Unlimited number of tags

Note:
• HTML 5.1 was released as a W3C Recommendation, on 01.11.2016
• HTML 5.2 was released as a W3C Recommendation, on 14.12.2017
• HTML 5.3 is W3C Working Draft, since 18.10.2018

1.3. XML Document Components


The possible components of an XML document are listed below:
1. XML declaration
2. Processing instructions
3. Comments
4. Elements
5. Attributes
6. Entities
7. DOCTYPE sections
8. CDATA sections
9. XML Namespaces
Notes:
• Please take your time and read more about them in the indicated References (on-line Tutorials &
book).
• DOCTYPE sections are used to define/reference a Document Type Definition (DTD) for XML validation
purposes (will be presented during Lab. 03).
• The XML document components preceding the root element (see below) are known as the prolog of
the XML document.
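
The components listed above can be seen together in one small document. The sketch below is only an illustration: the sample document, its element names, the namespace URI and the entity are all made up for this example, and the class name ComponentsDemo is hypothetical. It parses the sample with the DOM parser that ships with the JDK and lists the top-level nodes, which shows what belongs to the prolog and what is the root element.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class ComponentsDemo {

    // Hypothetical sample document; every name, URI and value is invented.
    static final String SAMPLE = String.join("\n",
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>",                    // 1. XML declaration
        "<?xml-stylesheet type=\"text/css\" href=\"library.css\"?>",     // 2. processing instruction
        "<!-- A small book list -->",                                    // 3. comment
        "<!DOCTYPE library [ <!ENTITY pub \"Example Press\"> ]>",        // 7. DOCTYPE declaring an entity
        "<library xmlns:b=\"http://example.org/books\">",                // 4. element, 9. namespace
        "  <b:book id=\"b1\">",                                          // 5. attribute
        "    <b:publisher>&pub;</b:publisher>",                          // 6. entity reference
        "    <b:note><![CDATA[Markup like <b> & </b> is plain text here]]></b:note>", // 8. CDATA
        "  </b:book>",
        "</library>");

    // Parses an XML string into a DOM tree; parsing problems become unchecked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse the document", e);
        }
    }

    // Describes the direct children of the document node, in document order.
    // The XML declaration is not a DOM node, so it does not appear in the list.
    static List<String> topLevelNodes(Document doc) {
        List<String> result = new ArrayList<>();
        for (Node n = doc.getFirstChild(); n != null; n = n.getNextSibling()) {
            switch (n.getNodeType()) {
                case Node.PROCESSING_INSTRUCTION_NODE:
                    result.add("processing instruction: " + n.getNodeName());
                    break;
                case Node.COMMENT_NODE:
                    result.add("comment");
                    break;
                case Node.DOCUMENT_TYPE_NODE:
                    result.add("doctype: " + n.getNodeName());
                    break;
                case Node.ELEMENT_NODE:
                    result.add("root element: " + n.getNodeName());
                    break;
                default:
                    result.add("other");
                    break;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        topLevelNodes(doc).forEach(System.out::println);
        // The entity reference &pub; is replaced by its declared text.
        System.out.println(doc.getElementsByTagName("b:publisher").item(0).getTextContent());
    }
}
```

Everything listed before the root element is part of the prolog.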

1.4. XML Document Syntax Rules


1.4.1. General syntax rules
XML documents must adhere to the following syntax rules:
1. An XML document should begin with the XML declaration
2. An XML document must have one unique root element (to include the entire content of the document)
3. Start-tags must have matching end-tags (i.e. end tags with same name are required)
4. All elements must be closed
5. All elements must be properly nested (i.e. elements can’t overlap)
6. Attributes must have values
7. Attribute values must be quoted (using either " or ' separator characters)
8. No duplicate names are allowed for attributes of an element
9. XML Names (element and attribute) must observe XML naming rules (see below)
10. Entity references (such as &lt; and &amp;) must be used for special characters

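Note:
• Exception for Rule 3: an empty element may be written as a single empty-element tag (e.g. <item/>) instead of a start-tag/end-tag pair.
An XML document that observes these rules is considered “well-formed” (in W3C terminology!), that is,
“syntactically correct”.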
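
A quick way to see these rules in action is to hand small documents to an XML parser, which must reject anything that is not well-formed. The following is a minimal sketch, assuming the DOM parser bundled with the JDK; the class name WellFormedCheck and the sample strings are hypothetical and exist only for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {

    // Returns true if the parser accepts the text as a well-formed XML document.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to the console.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // a well-formedness error was reported
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Parser could not be set up or input not read", e);
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "<a><b></b></a>",     // properly nested
            "<a/>",               // empty element (exception for Rule 3)
            "<a>1 &lt; 2</a>",    // special character written as an entity reference
            "<a/><b/>",           // breaks Rule 2: two root elements
            "<a><b></a></b>",     // breaks Rule 5: overlapping elements
            "<a x/>",             // breaks Rule 6: attribute without a value
            "<a x=1/>",           // breaks Rule 7: unquoted attribute value
            "<a x='1' x='2'/>",   // breaks Rule 8: duplicate attribute name
            "<a>1 < 2</a>"        // breaks Rule 10: raw '<' in content
        };
        for (String s : samples) {
            System.out.println(isWellFormed(s) + "  " + s);
        }
    }
}
```

The first three samples are accepted even though they have no XML declaration, which matches the wording of Rule 1 ("should", not "must").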
1.4.2. XML Naming Rules
XML names have to observe the following rules:
• XML Names are case-sensitive
• XML Names cannot start with a digit or a punctuation character (except “:” and “_”)
• XML Names should not start with the letters xml (or XML, or Xml, etc.), as such names are reserved for the XML standards themselves
• XML Names can contain letters, digits, and other characters (“:”, “_”, “-”, “.”, …)
• XML Names cannot contain spaces

Note:
• An XML document observing these rules (1.4.1. & 1.4.2.) is considered “well-formed” (in W3C
terminology!) that means “syntactically correct”.
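
The naming rules can be tried out without writing a checker by hand: the sketch below wraps a candidate name in an empty-element tag and lets the JDK parser decide. This is a rough illustration under that assumption, not a complete validator; the class name NameCheck and the candidate names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class NameCheck {

    // Returns true if the parser accepts <name/> and reads back exactly that element name.
    static boolean isUsableName(String name) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler()); // keep fatal errors off the console
            Document doc = builder.parse(new InputSource(new StringReader("<" + name + "/>")));
            // Guards against input such as "a b='1'", which parses but is not a single name.
            return doc.getDocumentElement().getNodeName().equals(name);
        } catch (SAXException e) {
            return false; // the parser rejected the name
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Parser could not be set up or input not read", e);
        }
    }

    public static void main(String[] args) {
        String[] candidates = { "_item", "item-1.a", "Item", "1item", "-item", "my item" };
        for (String c : candidates) {
            System.out.println(isUsableName(c) + "  " + c);
        }
    }
}
```

Case sensitivity shows up as soon as tags have to match: with the well-formedness rules above, <Item></item> is rejected because Item and item are two different names.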
1.4.3. The content of an Element
The information located between the start tag of an element and its end tag is called the content of the
element. This content is described by the so-called content model. There are four kinds of content model:
- empty content (i.e. there is nothing between the tags of the element)
- simple text (i.e. only data is stored in the element)
- child elements (i.e. describing the hierarchical structure of the element)
- mixed content (i.e. both data and structure)
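
The four kinds of content can be told apart by looking at the child nodes of an element. The sketch below does this for a made-up order document; the class name ContentKinds, the enum and the sample are hypothetical. It inspects the actual content of one element instance (not a declared model from a DTD) and, as a simplifying assumption, ignores whitespace-only text.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class ContentKinds {

    enum Kind { EMPTY, TEXT, CHILDREN, MIXED }

    // Hypothetical sample with one element of each kind.
    static final String SAMPLE =
        "<order>"
        + "<paid/>"                                   // empty content
        + "<customer>Ann</customer>"                  // simple text
        + "<items><item>pen</item></items>"           // child elements
        + "<note>Deliver <b>today</b> please</note>"  // mixed content
        + "</order>";

    // Classifies the content of one element by inspecting its direct child nodes.
    static Kind classify(Element element) {
        boolean hasElements = false;
        boolean hasText = false;
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            short type = n.getNodeType();
            if (type == Node.ELEMENT_NODE) {
                hasElements = true;
            } else if ((type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE)
                    && !n.getNodeValue().trim().isEmpty()) {
                hasText = true; // whitespace-only text is ignored (assumption)
            }
        }
        if (hasElements && hasText) return Kind.MIXED;
        if (hasElements) return Kind.CHILDREN;
        if (hasText) return Kind.TEXT;
        return Kind.EMPTY;
    }

    // Convenience helper: classifies the first element with the given name in SAMPLE.
    static Kind classifyInSample(String elementName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(SAMPLE)));
            return classify((Element) doc.getElementsByTagName(elementName).item(0));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse the sample", e);
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] { "paid", "customer", "items", "note" }) {
            System.out.println(name + ": " + classifyInSample(name));
        }
    }
}
```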

1.5. XML Namespaces


The main purpose of XML namespaces is to help avoid name conflicts. They may be used to qualify
the names of elements, attributes and datatypes that belong to a specific namespace.
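
A typical name conflict is two vocabularies that both define an element called table. The sketch below assumes two invented namespace URIs and a made-up document (the class name NamespaceDemo is also hypothetical). It shows that a namespace-aware parser keeps the two table elements apart even though their local names are identical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class NamespaceDemo {

    // Invented namespace URIs; they only need to be distinct, not resolvable.
    static final String HTML_NS = "http://example.org/html";
    static final String FURNITURE_NS = "http://example.org/furniture";

    static final String SAMPLE =
        "<root xmlns:h=\"" + HTML_NS + "\" xmlns:f=\"" + FURNITURE_NS + "\">"
        + "<h:table><h:tr><h:td>Apples</h:td></h:tr></h:table>"
        + "<f:table><f:name>Coffee table</f:name></f:table>"
        + "</root>";

    // Counts the elements with local name "table" in the given namespace ("*" matches any).
    static int countTables(String namespaceUri) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default for DOM parsers
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(SAMPLE)));
            return doc.getElementsByTagNameNS(namespaceUri, "table").getLength();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse the sample", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("HTML tables:      " + countTables(HTML_NS));
        System.out.println("Furniture tables: " + countTables(FURNITURE_NS));
        System.out.println("Any namespace:    " + countTables("*"));
    }
}
```

The prefixes h and f are only shorthand inside the document; what identifies each namespace is its URI.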

1.6. XML Document Processing - Use Cases


• Creation, Editing, Searching/Navigating, Viewing, Saving
• Parsing => is an XML document “well-formed” (or not) (see Lab.02)
• Validation => is an XML document “valid” (see Lab.03 & 04)
• Transforming and Formatting
– XSLT/XSL-FO (see Lab.06 & 07)
• Additional Standards and Applications:
– XPath/XLink/XPointer/XQuery/XForms/XEvents (see Lab. 05 & 11, 12, 14)
– Data storage, retrieval, conversion and exchange
– XML in Databases: relational and native (see Lab. 13 & 14)
– Web Applications and Web Services

1.7. How to create and view XML Documents


• by hand, using a plain Text Editor or specialized XML Editors:
– Notepad, Notepad++, …
• using Internet browsers or specialized XML Viewers:
– most Internet browsers with basic formatting
• using an IDE (Integrated Development Environment) like Eclipse, IntelliJ IDEA, NetBeans, Visual
Studio or … others (Altova XML IDE, oXygen, Liquid XML Studio)

1.8. Working Environments


For writing/developing, running and debugging the examples/exercises we will use:
• As language: Java
• As operating system: Windows is preferred, but any flavour of Linux or Mac OS is also usable.
• As IDE:
– the recommended option is Eclipse.
Notes:
o for most of the topics in this Lab you can use any version and variant of the Eclipse IDE.
o however, we will have topics that require the JEE variant of the IDE, and others that will not
work on versions more recent than Luna.
o as such my strong recommendation is to install Eclipse IDE (version) Luna (4.4.x), (variant)
JEE!!!
– Alternative choice is IntelliJ IDEA: Ultimate or Community
Notes:
o for those of you who choose to use this IDE, a word of warning: you will have to find
your own way around it, as I have had only brief access to an older version of the Community Edition
of this software!
– Other possibilities are: NetBeans, Microsoft Visual Studio, Altova IDE, Oxygen

1.9. Things to try


1. Download, install and run the IDE: Eclipse Luna JEE (recommended!).
Note: https://www.eclipse.org/downloads/packages/release/luna
2. Create a Java project and name it L01 or Lab01 (or whatever will help you keep track of your practical
work).
Note: I would recommend creating a separate project for each Lab!
3. Create an XML Document of your own, describing and storing whatever data you want, but taking care to
use as many XML components as possible.
Note: You can create the document inside the IDE or outside (using a text editor).
4. View the created XML document using various browsers (Google Chrome, Mozilla Firefox, Microsoft
Internet Explorer/Edge, Opera, …) and maybe in Microsoft Word. Notice the differences (if any).
5. Create a CSS (Cascading Style Sheet) file in order to format the XML Document.
6. Add a Processing Instruction (similar to the one below) in the prolog of the XML Document to format it
with a CSS file:
<?xml-stylesheet type="text/css" href="c:/xml/L01_02.css"?>

7. View it again using various browsers (Google Chrome, Mozilla Firefox, Microsoft IE/Edge, Opera, …) and
maybe in Microsoft Word. Notice the differences (if any).
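
Step 6 can be done by hand in a text editor, but the same processing instruction can also be added from Java. The sketch below is one possible way to do it with the DOM and Transformer APIs of the JDK; the class name StylesheetLink and the sample catalog document are hypothetical, and the CSS path is the one used in the example above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.ProcessingInstruction;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class StylesheetLink {

    // Parses the XML text, puts an xml-stylesheet instruction in the prolog
    // (directly before the root element) and returns the serialized result.
    static String withStylesheet(String xml, String cssHref) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            ProcessingInstruction pi = doc.createProcessingInstruction(
                    "xml-stylesheet", "type=\"text/css\" href=\"" + cssHref + "\"");
            doc.insertBefore(pi, doc.getDocumentElement());

            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | SAXException | IOException
                | TransformerException e) {
            throw new IllegalStateException("Could not process the document", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical one-element document standing in for your own XML file.
        String xml = "<catalog><item>pen</item></catalog>";
        System.out.println(withStylesheet(xml, "c:/xml/L01_02.css"));
    }
}
```

The output can be saved to a file and opened in a browser in the same way as a document edited by hand.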

1.10. References:
1. W3C XML 1.0 Recommendation: Extensible Markup Language (XML) 1.0 (Fifth Edition)
W3C Recommendation 26 Nov 2008, https://www.w3.org/TR/2008/REC-xml-20081126/
2. W3C Namespaces: Namespaces in XML 1.0 (Third Edition)
W3C Recommendation 8 Dec 2009, https://www.w3.org/TR/2009/REC-xml-names-20091208/
3. XML Tutorial (from Introduction to Display)
https://www.w3schools.com/xml/
4. XML Tutorial (from Overview to White-spaces)
https://www.tutorialspoint.com/xml/index.htm
5. Beginning XML, 5th Edition, Joe Fawcett, Liam Quin, Danny Ayers, John Wiley & Sons, Inc., 2012
(Chapters 1, 2, 3 – but without Advanced Parsing, XML Infoset, XML Schema & Common Namespaces)

Note: There are plenty of free books on the Lab topics to be found on the Internet. If the proposed References do
not satisfy your needs, you are encouraged to search for the one that better suits your way of learning!
