MODULE B1
DATA REPRESENTATION FOR LEXICOGRAPHY
PART 1
INTRODUCTION TO MARKUP LANGUAGES AND XML
ASSIGNMENT
READINGS:
- POWERPOINT PRESENTATION »» INTRODUCTION TO MARKUP LANGUAGES AND XML
- TEXT ENCODING INITIATIVE, A GENTLE INTRODUCTION TO XML:
URL: HTTP://WWW.TEI-C.ORG/RELEASE/DOC/TEI-P5-DOC/EN/HTML/SG.HTML
Check that you understand some of the core principles of Markup Languages and
XML by answering the following questions:
1. PART A:
1. What does the word “Extensible” in the designation “Extensible Markup
Language” mean?
The word “Extensible” in the designation “Extensible Markup Language”
means that XML does not consist of a fixed set of tags, but enables people to
create their own names for the elements.
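To illustrate extensibility, the short Python sketch below builds a small document with the standard-library module xml.etree.ElementTree. The element and attribute names (recipe, ingredient, unit) are invented for this example; XML itself predefines none of them.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names chosen for this example; XML defines none of them.
recipe = ET.Element("recipe")
ingredient = ET.SubElement(recipe, "ingredient", {"unit": "g"})
ingredient.text = "flour"

# Serialise the tree back to text to see the self-defined tags.
xml_text = ET.tostring(recipe, encoding="unicode")
print(xml_text)
```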
2. Why is XML considered to be a meta-markup language?
XML is considered to be a meta-markup language because it is a language or
grammar used to define and build other markup languages.
3. Define the concept “descriptive annotation/markup language” and give an
example.
The concept “descriptive annotation/markup language” refers to markup that
identifies the document’s structure and the nature and function of its parts,
rather than prescribing how those parts should be processed or displayed. For
example: <front_page>: This tag marks the beginning of front page data without
saying how that data should be formatted.
4a. What is a well-formed XML document?
A well-formed XML document is one that is syntactically correct, that is, it
obeys all the syntax rules of XML.
4b. Name 5 rules to create well-formed XML documents.
5 rules to create well-formed XML documents:
- A single root element must contain the entire XML document.
- All elements except empty elements must have an opening tag and a closing
tag.
- XML attribute values must be quoted.
- Opening and closing tags must be written with the same case
(uppercase/lowercase).
- The elements of an XML document must all nest inside one
another, without overlapping. This means that an XML document always forms a tree.
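The rules above can be tried out with a parser. The following is a minimal Python sketch, assuming the standard-library module xml.etree.ElementTree as the checker; the helper name is_well_formed and the sample strings are made up for this illustration, one per rule.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# One hypothetical sample per rule; each one breaks exactly that rule.
broken_samples = {
    "two root elements": "<a>1</a><b>2</b>",
    "missing end-tag": "<a><b>text</a>",
    "unquoted attribute value": "<a type=text>text</a>",
    "case mismatch": "<Note>text</note>",
    "overlapping elements": "<a><b>text</a></b>",
}

results = {rule: is_well_formed(sample) for rule, sample in broken_samples.items()}
for rule, ok in results.items():
    print(f"{rule}: {'well-formed' if ok else 'not well-formed'}")
```

A parser of this kind only checks well-formedness; it does not check validity against a schema or DTD.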
5. What is an XML declaration?
An XML declaration is the optional first line of an XML file. It has the form of a
processing instruction and states the XML version and, optionally, the character
encoding and a standalone flag.
For example: <?xml version='1.0' encoding='iso-8859-1' standalone='yes'?>
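The effect of the encoding part of the declaration can be seen in a small Python sketch. It assumes the standard-library module xml.etree.ElementTree; the element name word and its content are made up for the example.

```python
import xml.etree.ElementTree as ET

# Bytes encoded as ISO-8859-1: the byte 0xE9 stands for "é" in that encoding.
raw = b"<?xml version='1.0' encoding='iso-8859-1' standalone='yes'?><word>caf\xe9</word>"

# The parser reads the declaration and decodes the bytes accordingly.
root = ET.fromstring(raw)
decoded_text = root.text
print(decoded_text)
```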
2. PART B:
1. Below is an extract from an XML document.
<book>
<coverInfo>
<title>The XML Handbook</title>
<author>Charles F. Goldfarb</author>
<author>Paul Prescod</author>
<edition>Second</edition>
<description>The definitive XML resource: applications, products, and technologies.
Revised and expanded ~ over 600 new pages</description>
</coverInfo>
</book>
<book>
<coverInfo>
<title>PHP 5 Power Programming</title>
<author>Andi Gutmans</author>
<author>Stig Saether Bakken</author>
<author>Derick Rethans</author>
<edition>First</edition>
<description>Expert PHP 5 programming techniques, including PEAR, extensions
and database access</description>
</coverInfo>
</book>
1. What extra element would you need to add to the above document to fulfill the
XML requirements?
A single root element that encloses both <book> elements, because a well-formed
XML document must have exactly one root element. For example (the name “books”
is only a suggestion):
<books>
<book> ... </book>
<book> ... </book>
</books>
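The extract contains two top-level <book> elements, so a parser rejects it as it stands. The Python sketch below shows this on a shortened copy of the extract and then wraps it in a root element; the root name library is a hypothetical choice, and any legal element name would work.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the extract: two <book> elements and no common parent.
extract = """
<book><coverInfo><title>The XML Handbook</title></coverInfo></book>
<book><coverInfo><title>PHP 5 Power Programming</title></coverInfo></book>
"""

try:
    ET.fromstring(extract)
    parses_without_root = True
except ET.ParseError:
    parses_without_root = False

# "library" is a hypothetical root name introduced for this example.
wrapped = ET.fromstring("<library>" + extract + "</library>")
titles = [title.text for title in wrapped.iter("title")]
print(parses_without_root, titles)
```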
2. Draw a hierarchical tree diagram to represent the data relationships.
book
    coverInfo
        title
        author (repeated, one per author)
        edition
        description
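The same hierarchy can be derived from the markup itself. The sketch below, in Python with the standard-library module xml.etree.ElementTree, walks the second <book> of the extract (description shortened) and lists each element indented under its parent; the function name outline is made up for this example.

```python
import xml.etree.ElementTree as ET

# The second <book> from the extract, with the description shortened.
book = ET.fromstring(
    "<book><coverInfo>"
    "<title>PHP 5 Power Programming</title>"
    "<author>Andi Gutmans</author>"
    "<author>Stig Saether Bakken</author>"
    "<author>Derick Rethans</author>"
    "<edition>First</edition>"
    "<description>Expert PHP 5 programming techniques</description>"
    "</coverInfo></book>"
)

def outline(element, depth=0):
    """Return one indented line per element, parents before children."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

tree_lines = outline(book)
print("\n".join(tree_lines))
```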
3. PART C:
1. Check the following document instances for well-formedness.
a. <list> <title>List 1</title> <item>Item 1</list>
The “item” tag is opened but never closed.
b. <item>An Item</item><item>Another Item</item>
There is no single root element enclosing the two “item” elements.
c. <Paragraph>Bathing a cat is <emph>relatively</emph> easy provided the cat
is willing!</paragraph>
The start-tag <Paragraph> begins with an uppercase letter, but the end-tag
</paragraph> is all lowercase, so the two do not match.
d. <biblio><title>How to Bath a Cat<author></title>Peter
Scratchmore<author></biblio>
The “author” start-tag opens inside “title”, so the two elements overlap, and
“author” is never closed: the second <author> should be the end-tag </author>.
e. <segment type="text">some text<segment/>
The closing tag is written as <segment/>, which is an empty-element tag. In an
end-tag the slash must come before the name: </segment>.
f. <segment type=text>some text</segment>
The value of the type attribute must be enclosed in quotes (single or double).
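These diagnoses can be cross-checked with a parser. The following Python sketch, assuming the standard-library module xml.etree.ElementTree, feeds the six instances to the parser and collects the error reported for each; the exact wording of the messages depends on the underlying parser.

```python
import xml.etree.ElementTree as ET

# The six document instances from the exercise, joined onto single lines.
instances = {
    "a": "<list> <title>List 1</title> <item>Item 1</list>",
    "b": "<item>An Item</item><item>Another Item</item>",
    "c": "<Paragraph>Bathing a cat is <emph>relatively</emph> easy "
         "provided the cat is willing!</paragraph>",
    "d": "<biblio><title>How to Bath a Cat<author></title>Peter "
         "Scratchmore<author></biblio>",
    "e": '<segment type="text">some text<segment/>',
    "f": "<segment type=text>some text</segment>",
}

errors = {}
for label, text in instances.items():
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        # Keep the parser's own message for each rejected instance.
        errors[label] = str(exc)

for label, message in errors.items():
    print(f"{label}: {message}")
```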
2. Correct any errors you may find.
a.
<list>
<title>List 1</title>
<item>Item 1</item>
</list>
b.
<items>
<item>An Item</item>
<item>Another Item</item>
</items>
c. <Paragraph>Bathing a cat is <emph>relatively</emph> easy provided the
cat is willing!</Paragraph>
d.
<biblio>
<title>How to Bath a Cat</title>
<author>Peter Scratchmore</author>
</biblio>
e. <segment type="text">some text</segment>
f. <segment type="text">some text</segment>
4. PART D:
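As a quick check, the corrected versions can be parsed in the same way. This Python sketch, again assuming the standard-library module xml.etree.ElementTree, parses each corrected instance and records the name of its root element; a parse error here would mean a correction is still not well-formed.

```python
import xml.etree.ElementTree as ET

# The corrected instances, joined onto single lines.
corrected = {
    "a": "<list><title>List 1</title><item>Item 1</item></list>",
    "b": "<items><item>An Item</item><item>Another Item</item></items>",
    "c": "<Paragraph>Bathing a cat is <emph>relatively</emph> easy "
         "provided the cat is willing!</Paragraph>",
    "d": "<biblio><title>How to Bath a Cat</title>"
         "<author>Peter Scratchmore</author></biblio>",
    "e": '<segment type="text">some text</segment>',
    "f": '<segment type="text">some text</segment>',
}

# ET.fromstring raises ParseError if any instance is not well-formed.
root_tags = {label: ET.fromstring(text).tag for label, text in corrected.items()}
print(root_tags)
```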
Create a well-formed XML document corresponding to the following entry (word
‘sample’ noun) from the online Longman Dictionary of Contemporary English:
http://www.ldoceonline.com/dictionary/sample_1
The well-formed XML document corresponding to this entry can be found at the
following location:
file:/C:/Users/herve/Downloads/XML.assignment_Hervé_stephane_Mbea