1.1 XML
Markup languages
2021 - 2022
Joan Puigcerver
[email protected]
❏ XML
❏ Elements
❏ Comments
❏ Relations and structures
❏ Attributes
❏ XML declaration
❏ Entity relationship
❏ Well-structured XML
❏ Namespaces
XML (eXtensible Markup Language)
❏ XML (eXtensible Markup Language) is a language developed by W3C (World Wide
Web Consortium) that is based on SGML.
❏ XML is a language used for storing and exchanging structured data between
different platforms.
❏ XML is a metalanguage, meaning that it can be used to define other languages,
called XML dialects. Some languages based on XML are:
❏ GML (Geography Markup Language).
❏ MathML (Mathematical Markup Language).
❏ RSS (Really Simple Syndication).
❏ SVG (Scalable Vector Graphics).
❏ XHTML (eXtensible HyperText Markup Language).
Elements
XML documents are plain text documents (with no formatting) that contain marks
(tags) defined by the developer. These marks are surrounded by the less-than “<” and
greater-than “>” characters. To close a tag, the slash “/” character is used.
Example: We can store the name “Mireia” in an XML document with the following syntax:
<name>Mireia</name>
ELEMENTS: Syntax
<tagName>value</tagName>
● tagName: The tag identifier. The name in the opening and closing tags must be the same.
● value: The value we want to store.
Example: <name>Mireia</name>
→ A “name” tag is opened with <name> and then closed with </name>. The
value, “Mireia” in this case, is written between the opening and closing tags.
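As a quick check of the syntax above, the following minimal sketch parses the same element with Python's standard `xml.etree.ElementTree` module (the choice of Python is an assumption of this example, not part of the course material) and reads back the tag identifier and the value.

```python
import xml.etree.ElementTree as ET

# Parse the single element shown on the slide.
element = ET.fromstring("<name>Mireia</name>")

print(element.tag)   # the tag identifier
print(element.text)  # the value stored between the opening and closing tags
```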
ELEMENTS: Empty elements
XML documents support empty elements, which do not contain any value. In that case,
we can write:
<tag></tag>
Or also:
<tag/>
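A small sketch, again assuming Python's standard `xml.etree.ElementTree` module, suggests that a parser treats the two notations as the same empty element: neither has text or children.

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring("<tag></tag>")
short_form = ET.fromstring("<tag/>")

# Both notations yield an element with no text and no children.
for element in (long_form, short_form):
    print(element.tag, element.text, len(element))
```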
❏ The colon ":" may also be used in names, but its use is reserved for namespaces.
<country/>
<day>21</dáy>
<month>10<month/>
<city>Barcelona</endcity>
<City>Barcelona</city>
<2colors>Orange and green</2colors>
< red>
< Hobbies >Cinema, Dancing, Swimming</ Hobbies >
<person><name>Maria</person></name>
<favorite color>white</favorite colo>
ELEMENTS: Content
The content of an element is everything contained between its opening and
closing tags. It can be just text or other elements.
Example: The element <person> contains two more elements, <name> and <surname>,
which in turn contain values:
<person>
<name>Pedro</name>
<surname>Martí</surname>
</person>
ELEMENTS: Mixed content
The two previous cases can also be combined, so that the content of an element is
made up of both text and elements. These are called mixed content elements.
<person>
Pere Martí
<position>Director</position>
</person>
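The mixed content example can be explored with a short sketch. It assumes Python's standard `xml.etree.ElementTree` module, which keeps the text written directly inside an element separate from the text of its children.

```python
import xml.etree.ElementTree as ET

person = ET.fromstring(
    "<person>\nPere Martí\n<position>Director</position>\n</person>"
)

# Text written directly inside <person>, before its first child element.
own_text = person.text.strip()
# Text stored in the child element <position>.
position = person.find("position").text

print(own_text)
print(position)
```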
Elements contained in other elements are called children. Following the
naming of human relationships, the children of child elements are called
grandchildren.
Every XML document must have a single root element from which all others descend.
Structure: Inverted tree
The structure of any XML document can be represented as an inverted tree of
elements. The tree starts with the root element and each branch corresponds to one of its
child elements, and so on. Elements with no children are called leaves. The
elements give semantic information about the document.
<person>
<name>Maria</name>
<woman/>
<birth_date>
<day>21</day>
<month>10</month>
<year>1998</year>
</birth_date>
<city>Bilbao</city>
</person>
Tree:
person
  name
  woman
  birth_date
    day
    month
    year
  city
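The inverted tree can also be produced programmatically. The sketch below assumes Python's standard `xml.etree.ElementTree` module; the helper names `tree_lines` and `leaves` are hypothetical and exist only for this illustration. It walks the document from the root and lists the leaf elements.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """
<person>
  <name>Maria</name>
  <woman/>
  <birth_date>
    <day>21</day>
    <month>10</month>
    <year>1998</year>
  </birth_date>
  <city>Bilbao</city>
</person>
"""


def tree_lines(element, depth=0):
    """Return one indented line per element, parents before children."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(tree_lines(child, depth + 1))
    return lines


def leaves(element):
    """Return the tags of all elements that have no children."""
    return [node.tag for node in element.iter() if len(node) == 0]


root = ET.fromstring(DOCUMENT)
print("\n".join(tree_lines(root)))
print(leaves(root))
```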
Exercise 2
Draw the tree that corresponds to the following XML document:
<company>
<employee>
<surname>Perez</surname>
<name>Juan</name>
<employee_id>1234567890</employee_id>
<email>[email protected]</email>
<phone_number>666 555 444</phone_number>
<address>
<road>Carrer de Pau Claris, 121</road>
<city>Barcelona</city>
<postal_code>08009</postal_code>
</address>
</employee>
</company>
ATTRIBUTES
Attributes can be defined inside the opening tag of the elements of an XML document.
Attributes are used to provide additional information about the element.
<tagName attributeName="attributeValue">elementValue</tagName>
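To illustrate the attribute syntax, here is a minimal sketch that assumes Python's standard `xml.etree.ElementTree` module. The element name `message` and the attribute `priority` are hypothetical examples, not taken from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical element with one attribute and a text value.
message = ET.fromstring('<message priority="high">Server restarted</message>')

print(message.attrib)           # all attributes as a dictionary
print(message.get("priority"))  # the value of one attribute
print(message.get("missing"))   # None when the attribute is absent
print(message.text)             # the element value is separate from attributes
```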
Should road be an attribute of address? Should city be? Justify your answer.
Exercise 4
Write the XML code that corresponds to the following tree:
ATTRIBUTE xml:lang
The xml:lang attribute is predefined in the XML language. Its purpose is
to specify the language of the content of a specific element.
Example
<message xml:lang="ca">Bon dia</message>
<message xml:lang="en">Good morning</message>
The xml:lang value is inherited by all the children of an element, unless a child sets its own xml:lang.
Example
<message xml:lang="ca">
<greeting>Bon dia</greeting>
<goodbye>Adéu</goodbye>
</message>
The greeting and goodbye elements inherit xml:lang="ca", so their content is also marked as Valencian (Catalan).
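The inheritance of xml:lang can be sketched as follows, assuming Python's standard `xml.etree.ElementTree` module. The helper `languages` is hypothetical, and the `goodbye` element here overrides the language only to show the difference. The sketch maps each tag name to one language, so it assumes tag names are unique in the document.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def languages(element, inherited=None):
    """Map each element tag to its effective language (own or inherited)."""
    lang = element.get(XML_LANG, inherited)
    result = {element.tag: lang}
    for child in element:
        result.update(languages(child, lang))
    return result


doc = ET.fromstring(
    '<message xml:lang="ca">'
    "<greeting>Bon dia</greeting>"
    '<goodbye xml:lang="en">Goodbye</goodbye>'
    "</message>"
)
effective = languages(doc)
print(effective)
```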
Exercise 5
Fix the following XML code snippets:
<runner position=1>Manel Roure</runner>
<student delegate>Jaume Ravent</delegate>
<car fuel type=”gasoline”>Seat Ibiza</car>
<politician position="mayor" position="minister">
Diana Morant
</politician>
<car>
<brand>Seat</brand>
<color>red</color>
<plate>B3456L</plate>
<driver>Jorge</driver>
<driver>Maria</driver>
<rented/>
</car>
XML DECLARATION
The XML declaration is a special tag that contains details that prepare an XML processor to
parse the XML document. It is optional, but if present it must be the very first thing in
the document.
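The position rule can be tried out with a short sketch. It assumes Python's standard `xml.etree.ElementTree` module; the helper `parses` and the `pet` element are hypothetical. A document parses with the declaration at the start or with no declaration at all, but not when anything, even a blank line, comes before it.

```python
import xml.etree.ElementTree as ET


def parses(data: bytes) -> bool:
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


at_start = b'<?xml version="1.0" encoding="UTF-8"?>\n<pet>Rex</pet>'
no_declaration = b"<pet>Rex</pet>"
after_blank_line = b'\n<?xml version="1.0" encoding="UTF-8"?>\n<pet>Rex</pet>'

for sample in (at_start, no_declaration, after_blank_line):
    print(parses(sample))
```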
Attribute: encoding
Values: UTF-8, UTF-16, ISO-10646-UCS-2, ISO-10646-UCS-4, ISO-8859-1, etc.
Description: Character encoding in which the text has been defined.
-------------------------------------------
<?xml standalone="si"?>
</?xml>
-------------------------------------------
<cat></cat>
<?xml version="1.0" encoding="UTF-8"?>
<dog></dog>
Reserved characters
In XML, some characters have a special meaning. To write them as literal text
in an XML document, you can use the predefined entity references shown in the
following table:
Character    Entity reference
<            &lt;
>            &gt;
&            &amp;
'            &apos;
"            &quot;
-------------------------------------------------------
<?xml version="1.0" encoding ="UTF-8"?>
<test>
<text>The characters < and & cannot
be written as an element value </text>
</test>
-------------------------------------------------------
<?xml version="1.0" encoding ="UTF-8"?>
<test>
<text><![CDATA["The characters < and & cannot
be written as an element value"]]>
</text></test>
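The two approaches above, entity references and CDATA sections, can be compared with a small sketch. It assumes Python's standard library (`xml.etree.ElementTree` and `xml.sax.saxutils.escape`); the variable names are hypothetical. Both documents should give back the same text after parsing.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "The characters < and & are reserved"

# Option 1: replace reserved characters with entity references.
escaped_doc = "<text>" + escape(raw) + "</text>"
# Option 2: wrap the raw text in a CDATA section.
cdata_doc = "<text><![CDATA[" + raw + "]]></text>"

print(escaped_doc)
print(ET.fromstring(escaped_doc).text)
print(ET.fromstring(cdata_doc).text)
```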
Exercise 7
Fix the following XML code snippets and explain why they are invalid and how you resolve them.
<person>
<name>Pere Garcia</name>
</person>
<dog>
<name>Rufy</name>
</dog>
--------------------------------------
<person>
<name>Pere</name>
<surname> Pérez </surname>
</person>
<person>
<surname2>Garcia</surname2>
<age>22</age>
</person>
Well-structured XML
Checking the correctness of an XML document means checking that it is well-formed,
that is, that the XML rules seen previously are met:
❏ There is only one root element that contains all the other elements.
❏ Tag names must follow the naming rules.
❏ The opening, closing and empty tags are properly nested (do not overlap) and no
opening or closing tag is missing or left over.
❏ Closing tag names match opening tag names exactly, including uppercase and
lowercase letters.
❏ Closing tags do not contain attributes.
❏ Attribute values must be enclosed in quotation marks (single or double) and every
attribute must have a value.
❏ No tag has two attributes with the same name.
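Several of the rules above can be checked automatically. The following sketch assumes Python's standard `xml.etree.ElementTree` module, whose parser rejects documents that break these rules; the helper `is_well_formed` and the sample snippets are hypothetical. It only reports whether parsing succeeds, not which rule was broken.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as an XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


samples = {
    "single root": "<a><b/></a>",
    "two roots": "<a/><b/>",
    "overlapping tags": "<a><b></a></b>",
    "case mismatch": "<Town>Bilbao</town>",
    "unquoted attribute": "<a x=1/>",
    "duplicate attribute": '<a x="1" x="2"/>',
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {ok}")
```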
Well-structured XML
The easiest way to check whether an XML document is well-formed is to open it in a browser.
There are also command-line tools and libraries that can check XML, such as libxml2
(GNOME project), Apache Xerces (Java) and MSXML (Microsoft).
You can also use online tools such as “XML formatter” to check and format your
XML.
XML Editors
An XML document is a plain text file, so any text editor can be used to create and edit
XML files.
1. The student Carles Femenia was born on February 23, 2001 in Barcelona.
2. Establishments: “Bar Pasqual” is a tapas bar on 232 Diagonal Avenue owned by Joan Alcover.
“The Meravelles Restaurant” is a menu restaurant on Meridiana 111 owned by Maria Garcia. The
“Racó de l'Àvia” restaurant is a menu restaurant on Av. Rome 221 owned by Alba Puig.
3. List of artworks in a museum: Blue on white is an abstract painting by Pere Àguila from 2014
made in oil. Dark Gray is a 2010 abstract Pere Àguila painting made in watercolor. Camino largo is a
1981 marble sculpture by Marta Lambert.
Exercise 9: Recipes
Create an XML that stores cooking recipe information:
1. The book must include the list of authors (with last name and
first name).
2. All items must have a title, except the paragraph that contains
only text. = Chapter
3. Propose an XML structuring of this document on the image
(with 2 authors, 2 sections, 2 chapters for each section and 2
paragraphs for the first chapter).
4. Check that it is a valid XML document.
Warning: Don’t use attributes.
Exercise 10: Writing a book (II)
1. Attributes
We want to complete the structure of the XML document from the previous exercise with the attributes of the
authors' first and last names and the title of the book, the sections and the chapters.
Analyze the structure of the new document. Are there possible simplifications?
Check, with the help of the editor, that your document is valid.
2. Tree
Draw the structure tree of this previous XML document, including the attributes.
NAMESPACES
Namespaces are the XML mechanism for avoiding name collisions between
elements and attributes.
XML allows you to mix documents, so there may be tags with the same name that
represent different things.
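As an illustration of such a collision, the sketch below mixes two vocabularies that both define a `table` element. It assumes Python's standard `xml.etree.ElementTree` module; the prefixes `f` and `d` and the `http://example.com/...` namespace URIs are hypothetical. Prefixes are bound to URIs with `xmlns:prefix` attributes, and the parser reports each element name together with its namespace URI, so the two `table` elements stay distinct.

```python
import xml.etree.ElementTree as ET

# Hypothetical document mixing a furniture vocabulary and a data vocabulary.
DOCUMENT = """
<catalog xmlns:f="http://example.com/furniture"
         xmlns:d="http://example.com/data">
  <f:table>Oak dining table</f:table>
  <d:table>Rows and columns</d:table>
</catalog>
"""

root = ET.fromstring(DOCUMENT)

# ElementTree writes qualified names as {namespace-uri}local-name.
tags = [child.tag for child in root]
furniture = root.find("{http://example.com/furniture}table").text

print(tags)
print(furniture)
```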
https://www.x3dom.org/