Relational Databases For Querying XML Documents
http://ocw.mit.edu
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
Relational Databases for Querying XML Documents:
Limitations and Opportunities
..., a*, ..., a*, ... → a*, ...
..., a*, ..., a?, ... → a*, ...
..., a?, ..., a*, ... → a*, ...
..., a?, ..., a?, ... → a*, ...
..., a, ..., a, ... → a*, ...
Figure 7

The transformations are of three types: (a) flattening transformations, which convert a nested definition into a flat representation (i.e., one in which the binary operators “,” and “|” do not appear inside any operator – see Figure 5), (b) simplification transformations, which reduce many unary operators to a single unary operator (Figure 6), and …

The Basic Inlining Technique, hereafter referred to as Basic, solves the fragmentation problem by inlining as many descendants of an element as possible into a single relation. However, Basic creates relations for every element because an XML document can be rooted at any element in a DTD. For example, the author element in Figure 2 would be mapped to a relation with attributes firstname, lastname and address. In addition, relations would be created for firstname, lastname and address.

We must address two complications: set-valued attributes and recursion. In the example DTD in Figure 2, when creating a relation for article, we cannot inline the set of authors because the traditional relational model does not support set-valued attributes. Rather, we follow the standard technique for storing sets in an RDBMS and create a relation for author and link authors to articles using a foreign key. Just using inlining (if we want the process to terminate) necessarily limits the level of nesting in the recursion. Therefore, we express the recursive relationship using the notion of relational keys and use relational recursive processing to retrieve the relationship. In order to do this in a general fashion, we introduce the notion of a DTD graph.

Figure 8 (DTD graph for the DTD in Figure 2; node labels: book, article, monograph, booktitle, title, contactauthor, editor, authorID, author, name, address, authorid, firstname, lastname, “*”, “?”)

Figure 9 (element graph for the editor element; node labels: editor, name, monograph, title, author, name, address, authorid, firstname, lastname, “*”, “?”)

A DTD graph represents the structure of a DTD. Its nodes are elements, attributes and operators in the DTD. Each element appears exactly once in the graph, while attributes and operators appear as many times as they appear in the DTD. The DTD graph corresponding to the DTD in Figure 2 is given in Figure 8. Cycles in the DTD graph indicate the presence of recursion.

The schema created for a DTD is the union of the sets of relations created for each element. In order to determine the set of relations to be created for a particular element, we create a graph structure called the element graph. The element graph is constructed as follows.

Do a depth first traversal of the DTD graph, starting at the element node for which we are constructing relations. Each node is marked as “visited” the first time it is reached and is unmarked once all its children have been traversed.

If an unmarked node in the DTD graph is reached during depth first traversal, a new node bearing the same name is created in the element graph. In addition, a regular edge is created from the most recently created node in the element graph with the same name as the DFS parent of the current DTD node to the newly created node.

If an attempt is made to traverse an already marked DTD node, then a backpointer edge is added from the most recently created node in the element graph to the most recently created node in the element graph with the same name as the marked DTD node.

The element graph for the editor element in the DTD graph in Figure 8 is shown in Figure 9. Intuitively, the element graph expands the relevant part of the DTD graph into a tree structure.

Given an element graph, relations are created as follows. A relation is created for the root element of the graph. All the element’s descendants are inlined into that relation with the following two exceptions: (a) children directly below a “*” node are made into separate relations – this corresponds to creating a new relation for a set-valued child; and (b) each node having a backpointer edge pointing to it is made into a separate relation – this corresponds to creating a new relation to handle recursion.

Figure 10 shows the relational schema that would be generated for the DTD in Figure 2. There are several features to note in the schema. Attributes in the relations are named by the path from the root element of the relation. Each relation has an ID field that serves as the key of that relation. All relations corresponding to element nodes having a parent also have a parentID field that serves as a foreign key. For instance, the article.author relation has a foreign key article.author.parentID that joins authors with articles.

The XML document in Figure 1 would be converted to the following tuple in the book relation:

(1, The Selfish Gene, Richard, Dawkins, <city>Timbuktu</city><zip>99999</zip>, dawkins)

The ANY field, address, is stored as an uninterpreted string; thus the nested structure is not visible to the database system without further support for XML (see Section 6). Note that if the author Richard Dawkins has authored many books, then the author information will be replicated for each book because it is replicated in the corresponding XML documents.

While Basic is good for certain types of queries, such as “list all authors of books”, it is likely to be grossly inefficient for other queries. For example, queries such as “list all authors having first name Jack” will have to be executed as the union of 5 separate queries. Another disadvantage of Basic is the large number of relations it creates. Our next technique attempts to resolve these problems.
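The element-graph construction and the Basic rule for choosing separate relations can be sketched roughly as follows. This is only an illustration under stated assumptions: the graph fragment `DTD_GRAPH` is hypothetical (its edges are not copied from Figure 8), attribute nodes are left out, operator nodes are given made-up unique names such as `*1` and `?1`, and the source of a backpointer edge is read as the element-graph node of the current DFS parent.

```python
# Hypothetical DTD graph fragment: node -> ordered children.
# Operator nodes get unique made-up names ("*1", "?1") because an operator
# may occur several times in a DTD graph. Attribute nodes are omitted.
DTD_GRAPH = {
    "editor": ["*1"],
    "*1": ["monograph"],
    "monograph": ["title", "author", "editor"],
    "author": ["name", "address"],
    "name": ["?1", "lastname"],
    "?1": ["firstname"],
    "title": [], "address": [], "firstname": [], "lastname": [],
}


def build_element_graph(dtd, root):
    """Depth-first expansion of the DTD graph into an element graph."""
    nodes = []         # element-graph node names; list index = node id
    edges = []         # regular edges (parent id, child id)
    backpointers = []  # backpointer edges (from id, to id)
    latest = {}        # DTD node name -> id of its most recently created node
    marked = set()     # DTD nodes on the current DFS path

    def visit(name, parent_id):
        if name in marked:
            # Already marked: add a backpointer instead of expanding again.
            # Assumption: the edge starts at the current DFS parent's node.
            backpointers.append((parent_id, latest[name]))
            return
        marked.add(name)
        node_id = len(nodes)
        nodes.append(name)
        latest[name] = node_id
        if parent_id is not None:
            edges.append((parent_id, node_id))
        for child in dtd.get(name, []):
            visit(child, node_id)
        marked.discard(name)  # unmark once all children are traversed

    visit(root, None)
    return nodes, edges, backpointers


def separate_relation_nodes(nodes, edges, backpointers):
    """Ids of element-graph nodes that become their own relation in Basic."""
    separate = {0}  # the root element always gets a relation
    for parent, child in edges:
        if nodes[parent].startswith("*"):
            separate.add(child)  # child directly below a "*" node
    for _, target in backpointers:
        separate.add(target)     # target of a backpointer (recursion)
    return sorted(separate)


nodes, edges, backpointers = build_element_graph(DTD_GRAPH, "editor")
relation_roots = [nodes[i] for i in separate_relation_nodes(nodes, edges, backpointers)]
print(relation_roots)
```

For this fragment the sketch selects editor and monograph as separate relations, which lines up with the editor and editor.monograph relations listed in Figure 10; every other node reached from editor would be inlined.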
book (bookID: integer, book.booktitle : string, book.author.name.firstname: string, book.author.name.lastname: string,
book.author.address: string, author.authorid: string)
booktitle (booktitleID: integer, booktitle: string)
article (articleID: integer, article.contactauthor.authorid: string, article.title: string)
article.author (article.authorID: integer, article.author.parentID: integer, article.author.name.firstname: string,
article.author.name.lastname: string, article.author.address: string, article.author.authorid: string)
contactauthor (contactauthorID: integer, contactauthor.authorid: string)
title (titleID: integer, title: string)
monograph (monographID: integer, monograph.parentID: integer, monograph.title: string, monograph.editor.name: string,
monograph.author.name.firstname: string, monograph.author.name.lastname: string,
monograph.author.address: string, monograph.author.authorid: string)
editor (editorID: integer, editor.parentID: integer, editor.name: string)
editor.monograph (editor.monographID: integer, editor.monograph.parentID: integer, editor.monograph.title: string,
editor.monograph.author.name.firstname: string, editor.monograph.author.name.lastname: string,
editor.monograph.author.address: string, editor.monograph.author.authorid: string)
author (authorID: integer, author.name.firstname: string, author.name.lastname: string, author.address: string,
author.authorid: string)
name (nameID: integer, name.firstname: string, name.lastname: string)
firstname (firstnameID: integer, firstname: string)
lastname (lastnameID: integer, lastname: string)
address (addressID: integer, address: string)
Figure 10
Figure 11
3.4 The Shared Inlining Technique

The Shared Inlining Technique, hereafter referred to as Shared, attempts to avoid the drawbacks of Basic by ensuring that an element node is represented in exactly one relation. The principal idea behind Shared is to identify the element nodes that are represented in multiple relations in Basic (such as the firstname, lastname and address elements in the example) and to share them by creating separate relations for these elements.

We must first decide what relations to create. In Shared, relations are created for all elements in the DTD graph whose nodes have an in-degree greater than one. These are precisely the nodes that are represented as multiple relations in Basic. Nodes with an in-degree of one are inlined. Element nodes having an in-degree of zero are also made separate relations, because they are not reachable from any other node. As in Basic, elements below a “*” node are made into separate relations. Finally, of the mutually recursive elements all having in-degree one (such as monograph and editor in Figure 8), one of them is made a separate relation. We can find such mutually recursive elements by looking for strongly connected components in the DTD graph.

Once we decide which element nodes are to be made into separate relations, it is relatively easy to construct the relational schema. Each element node X that is a separate relation inlines all the nodes Y that are reachable from it such that the path from X to Y does not contain a node (other than X) that is to be made a separate relation. Figure 11 shows the schema derived from the DTD graph of Figure 8. One striking feature is the small number of relations compared to the Basic schema (Figure 10).

Inlining an element X into a relation corresponding to another element Y creates problems when an XML document is rooted at the element X. To facilitate queries on such elements we make use of isRoot fields.

The element sharing in Shared has query processing implications. For example, a selection query over all authors accesses only one relation in Shared compared to five relations in Basic. Despite the fact that Shared addresses some of the shortcomings and shares some of
book (bookID: integer, book.booktitle.isroot: boolean, book.booktitle : string, author.name.firstname: string,
author.name.lastname: string, author.address: string, author.authorid: string)
article (articleID: integer, article.contactauthor.isroot: boolean, article.contactauthor.authorid: string,
article.title.isroot: boolean, article.title: string)
monograph (monographID: integer, monograph.parentID: integer, monograph.parentCODE: integer,
monograph.title: string, monograph.editor.isroot: boolean, monograph.editor.name: string,
author.name.firstname: string, author.name.lastname: string, author.address: string, author.authorid: string)
author (authorID: integer, author.parentID: integer, author.parentCODE: integer, author.name.isroot: boolean,
author.name.firstname.isroot: boolean, author.name.firstname: string, author.name.lastname.isroot: boolean,
author.name.lastname: string, author.address.isroot: boolean, author.address: string, author.authorid: string)
Figure 12
the strengths of Basic, Basic performs better in one important respect – reducing the number of joins starting at a particular element node. Thus we explore a hybrid approach that combines the join reduction properties of Basic with the sharing features of Shared.

3.5 The Hybrid Inlining Technique

The Hybrid Inlining Technique, or Hybrid, is the same as Shared except that it inlines some elements that are not inlined in Shared. In particular, Hybrid additionally inlines elements with in-degree greater than one that are not recursive or reached through a “*” node. Set sub-elements and recursive elements are treated as in Shared. Figure 12 shows the relational schema generated using this hybrid approach. Note how this schema combines features of both Basic and Shared – author is inlined with book and monograph even though it is shared, while monograph and editor are represented exactly once.

So far, we have implicitly assumed that the data model is unordered, i.e., the position of an element does not matter. Order could, however, be easily incorporated into our framework by storing a position field for each element.

3.6 A Qualitative Evaluation of the Basic, Shared and Hybrid Techniques

In this section we qualitatively evaluate our relation-conversion algorithms using 37 DTDs available from Robin Cover's SGML/XML Web page [8]. We did not pose any criterion for selecting DTDs except for availability for easy download and validity. Some DTDs were excluded because they did not pass our XML parser, the IBM alphaWorks xml4j.

3.6.1 Evaluation Metric

Our major concern in evaluating the algorithms is the efficiency of query processing. Our metric is the average number of SQL joins required to process path expressions of a certain length N. We use this metric because path expressions are at the heart of query languages proposed for semi-structured data. We are particularly concerned about path expressions because we use a relational database which uses joins to process path expressions.

This subsection logically contains “forward references” to Section 4, in which we describe how SQL queries are generated from semi-structured XML queries. However, the only point from Section 4 that is necessary to understand the results here is that a single semi-structured query could give rise to a union of several SQL queries, and that each of these queries may contain some number of joins. The use of Basic vs. Shared vs. Hybrid determines how many queries are generated, and how many joins are found in each query. Although Basic and Hybrid reduce the number of joins per SQL query, their higher degree of inlining could cause more SQL queries to be generated. For each algorithm, each DTD, and a variable number of path lengths, we make the following measurements:
• The average number of SQL queries generated for path expressions of length N.
• The average number of joins in each SQL query for path expressions of length N.
• The total average number of joins in order to process path expressions of length N (the product of the two previous measurements).

In Sections 3.6.2 and 3.6.3, we assume that path expressions start from an arbitrary element in the DTD. We relax this assumption in Section 3.6.4.

3.6.2 Evaluation Results for Expression Paths of Length 3

In this section we show the results for path expressions of length 3, which is the longest path length applicable to all 37 DTDs. We shall examine the results for other path lengths in the next section. In the interest of space, we show the results only for a subset of the DTDs and summarize the others.

First we consider whether the Basic approach is practical. For 11 of our 37 DTDs, Basic did not run to completion because it ran out of virtual memory. The reason for this is that Basic generates huge numbers of relations if DTDs have large strongly connected components. We can see this effect clearly on some of the DTDs that Basic did run to completion. One 19 node DTD has a SCC size of 4, and the number of relations created is 204 times as many as created by Hybrid, totalling 3462 relations. Due to this severe limitation of Basic, we concentrate on the comparisons between Shared and Hybrid.

… produces at least the number of SQL queries as Shared. Figure 15 shows the total number of joins.

Using the average total number of joins required to process path expressions of length 3, we can roughly categorize the 37 DTDs into four groups:
Group 1. DTDs for which Hybrid reduces a large percentage of joins per SQL query but incurs a smaller increase in the number of SQL queries. The net result is Hybrid requires fewer joins than Shared. Example: DTD “ofx1516”.

Group 2. DTDs for which Hybrid reduces a large percentage of joins per SQL query and incurs a comparable increase in the number of SQL queries. The total number of joins is about the same. Example: DTD “vrml”.

… DTD “math”.

[Charts comparing Shared and Hybrid for each DTD; the only recoverable axis label is Joins/Query.]

Hybrid inlines more than Shared in Groups 1, 2 and 3. This reduces the number of joins per SQL query but increases the number of SQL queries. The net increase or …

The number of DTDs in each group from all 37 DTDs is summarized in the table above. We can infer that in a large number of DTDs (Group 4), most of the shared …

[Chart with x-axis Path Length, 1 to 11.]

… expressions with various operators and wild cards. The challenge is to rewrite these queries in SQL exploiting DTD information. In this section, we consider only queries with string values as results. Queries with more complex result formats are dealt with in Section 5. For ease of exposition, we present the translation algorithm only in the context of the Shared approach. The generalization to the other approaches is straightforward.
Figure 22 Figure 23
… time, the tag attribute in the result tuple can be converted to the appropriate XML tag (Figure 23).

5.3 Grouping

Consider the query in Figure 24 that requires all the publications of an author (assuming an author is uniquely identified by his/her last name) to be grouped together, and within this structure, requires the titles of publications to be grouped by the type of the publication. The relational result from the translation of this query will be a set of tuples having fields corresponding to last name of author, title of publication and type of publication. However, we cannot use the relational group-by operator to group by last name and type of publication because the SQL group-by semantics implies that we should apply an aggregate function to title, which does not make sense. Thus, the options are either (a) have the relational engine order the result tuples first by last name and then by type and scan the result in order to construct the XML document or (b) get an unordered set of tuples and do a grouping operation, by last name and then by type, outside the relational engine. The first approach is illustrated in Figure 25.

Figure 25 illustrates several points. The first is that treating tag variables as attributes in the result relation provides a way of uniformly treating the contents of the result XML document. In this case, we are able to group by the tag variable just like any other attribute. The second observation is that some relational database functionality (hash-based group-by) is either not fully exploited or is duplicated outside.

5.4 Complex Element Construction

Unfortunately, returning tag values as tuple attributes cannot handle all result construction problems. In particular, queries that are required to return complex XML elements are problematic. Consider a query that asks for all article elements in the XML data set, and furthermore assume that an article may have multiple authors and multiple titles. In object-relational terminology, article has two set-valued attributes, authors and titles, corresponding to two set sub-elements in XML terminology.

    WHERE <book>
            <article> $a </article>
          </> IN * CONFORMS TO pubs.dtd
    CONSTRUCT <article> $a </>

To create the appropriate result, we must retrieve all authors and all titles for each article. This is difficult to do in the relational model because flattening multiple set-valued attributes into tuple format gives rise to a multi-valued dependency [11] and is likely to be very inefficient when the sets are large, for example, if papers have many authors and many titles. There appears to be no efficient way to tackle this problem in the traditional relational model. One solution would be to return separate relations, each flattening one set-valued attribute and “join” these relations outside the database while constructing the XML document. However, this requires duplication of database functionality both in terms of execution and optimization. This solution would be particularly bad for an element with many set-valued attributes. A related problem occurs when reconstructing recursive elements. We return to these issues in Section 6.
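To make the size difference in Section 5.4 concrete, the following small sketch compares the two result shapes for one article. The article data, the function names and the dictionary layout are all made up for illustration; the sketch only counts tuples and does not model how a database would actually evaluate either plan.

```python
from itertools import product

# Hypothetical article with two set-valued attributes (values are made up).
article = {"id": 1, "authors": ["A1", "A2", "A3"], "titles": ["T1", "T2"]}


def flatten(art):
    """Single flat relation: one tuple per (author, title) combination."""
    return [(art["id"], a, t) for a, t in product(art["authors"], art["titles"])]


def split(art):
    """One relation per set-valued attribute, each keyed by the article id."""
    authors = [(art["id"], a) for a in art["authors"]]
    titles = [(art["id"], t) for t in art["titles"]]
    return authors, titles


def rebuild(authors, titles):
    """Combine the two relations outside the database, grouped by article id."""
    result = {}
    for art_id, a in authors:
        result.setdefault(art_id, {"authors": [], "titles": []})["authors"].append(a)
    for art_id, t in titles:
        result.setdefault(art_id, {"authors": [], "titles": []})["titles"].append(t)
    return result


flat = flatten(article)
authors, titles = split(article)
print(len(flat), len(authors) + len(titles))  # prints "6 5"
```

With m authors and n titles the flat relation holds m × n tuples while the two separate relations hold m + n, but the final `rebuild` step is exactly the work that then has to be repeated outside the relational engine.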
    WHERE <$p>
            <(title|booktitle)> $t </>
            <author>
              <lastname> $l </lastname>
            </>
          </> IN * CONFORMS TO pubs.dtd
    CONSTRUCT <author ID=authorID($l)>
                <name> $l </name>
                <$p ID=pID($p)>
                  <title> $t </>
                </>
              </>

Figure 24

    (Darwin, book, Origin of Species)
    (Darwin, book, Descent of Man)
    (Darwin, monograph, Subclass Cirripedia)
    (Dawkins, book, The Selfish Gene)

    <author>
      <name> Darwin </name>
      <book>
        <title> Origin of Species </title>
        <title> The Descent of Man </title>
      </book>
      <monograph>
        <title> Subclass Cirripedia </title>
      </monograph>
    </author>
    <author>
      <name> Dawkins </name>
      <book>
        <title> The Selfish Gene </title>
      </book>
    </author>

Figure 25
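Option (a) of Section 5.3, ordering the tuples by last name and then by type and constructing the document in a single scan, can be sketched as below. The tuples are the ones shown in Figure 25; the function name `construct` is hypothetical, and the `sorted` call merely stands in for an ORDER BY that the relational engine would perform.

```python
from itertools import groupby
from operator import itemgetter
from xml.sax.saxutils import escape

# (last name, publication type, title) tuples, deliberately out of order.
rows = [
    ("Dawkins", "book", "The Selfish Gene"),
    ("Darwin", "monograph", "Subclass Cirripedia"),
    ("Darwin", "book", "Origin of Species"),
    ("Darwin", "book", "Descent of Man"),
]


def construct(result_rows):
    """Build nested XML from tuples in one ordered scan."""
    # Stand-in for: ORDER BY lastname, type (done by the database in option (a)).
    ordered = sorted(result_rows, key=itemgetter(0, 1))
    lines = []
    for lastname, by_author in groupby(ordered, key=itemgetter(0)):
        lines.append("<author>")
        lines.append(f"  <name> {escape(lastname)} </name>")
        # The tag variable (publication type) is grouped like any other field.
        for pub_type, by_type in groupby(by_author, key=itemgetter(1)):
            lines.append(f"  <{pub_type}>")
            for _, _, title in by_type:
                lines.append(f"    <title> {escape(title)} </title>")
            lines.append(f"  </{pub_type}>")
        lines.append("</author>")
    return "\n".join(lines)


xml_text = construct(rows)
print(xml_text)
```

Because the input is sorted, each group is contiguous and the scan needs no hash table, which is the point of option (a); option (b) would instead need a grouping structure outside the engine.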
5.5 Heterogeneous Results

Consider the following XML-QL query that creates a result document having both titles and authors as elements (this is the heterogeneous result). This is easily handled in our approach for translating queries because this query would be split into two queries, one for selecting titles and another for selecting authors. The results of the two queries can be handled in different ways, one constructing title elements and another constructing author elements. The results can then be merged together.

    WHERE <article>
            <$p> $y </>
          </article> IN * CONFORMING TO pubs.dtd
    CONSTRUCT <$p> $y </>

5.6 Nested Queries

XML-QL is structured in terms of query blocks and one query block can be nested under another. These nested queries can be rewritten in terms of SQL queries, using outer joins (and possibly skolem function ids) to construct the association between a query and a sub-query. The details are complex and we omit them in the interest of space.

6. Conclusions

With the growing importance of XML documents as a means to represent data in the World Wide Web, there has been a lot of effort on devising new technologies to process queries over XML documents. Our focus in this paper, however, has been to study the virtues and limitations of the traditional relational model for processing queries over XML documents conforming to a schema. The potential advantages of this approach are many – reusing a mature technology, using an existing high performance system, and seamlessly querying over data represented as XML documents or relations. We have shown that it is possible to handle most queries on XML documents using a relational database, barring certain types of complex recursion.

Our qualitative evaluation based on real DTDs from diverse domains raises some performance concerns – specifically, in many cases relatively simple XML queries require either many SQL queries, or require a few SQL queries with many joins in them. It is an open question whether semi-structured query processing techniques can do this kind of work more efficiently. The fact that semi-structured models represent a sequence of joins as a path expression, or handle what is logically a union of queries by using wildcards and “or” operators, does not automatically imply more efficient evaluation strategies.

Our experience has shown that relational systems could more effectively handle XML query workloads with the following extensions:

Support for Sets: Set-valued attributes would be useful in two important ways. First, storing set sub-elements as set-valued attributes [19,21] would reduce fragmentation. This is likely to be a big win because most of the fragmentation we observed in real DTDs was due to sets. Second, set-valued attributes, along with support for nesting [13], would allow a relational system to perform more of the processing required for generating complex XML results.

Untyped/Variable-Typed References: IDREFs are not typed in XML. Therefore, queries that navigate through IDREFs cannot be handled in current relational systems without a proliferation of joins – one for each possible reference type.

Information Retrieval Style Indices: More powerful indices, such as Oracle8i’s ConText search engine for XML [17], that can index over the structure of string attributes would be useful in querying over ANY fields in a DTD. Further, under restricted query requirements, whole fragments of a document can be stored as an indexed text field, thus reducing fragmentation.

Flexible Comparisons Operators: A DTD schema treats every value as a string. This often creates the need to compare a string attribute with, say, an integer value, after typecasting the string to an integer. The traditional relational model cannot support such comparisons. The problem persists even in the presence of DCDs or XML
Schemas because different DTDs may represent “comparable” values as different types. A related issue is that of flexible indices. Techniques for building such indices have been proposed in the context of semi-structured databases [14].

Multiple-Query Optimization/Execution: As outlined in Section 4, complex path expressions are handled in a relational database by converting them into many simple path expressions, each corresponding to a separate SQL query. Since these SQL queries are derived from a single regular path expression, they are likely to share many relational scans, selections and joins. Rather than treating them all as separate queries, it may be more efficient to optimize and execute them as a group [20].

More Powerful Recursion: As mentioned in Section 4, in order to fully support all recursive path expressions, support for fixed point expressions defined in terms of other fixed point expressions (i.e., nested fixed point expressions) is required.

These extensions are not by themselves new and have been proposed in other contexts. However, they gain new importance in light of our evaluation of the requirements for processing XML documents. Another important issue to be considered in the context of the World Wide Web is distributed query processing – taking advantage of queryable XML sources. Further research on these techniques in the context of processing XML documents will, we believe, facilitate the use of sophisticated relational data management techniques in handling the novel requirements of emerging XML-based applications.

7. Acknowledgements

Funding for this work was provided by DARPA through Rome Research Laboratory Contract No. F30602-97-2-0247 and NSF through NSF Award CDA-9623632.

8. References

1. S. Abiteboul, D. Quass, J. McHugh, J. Widom, J. Wiener, “The Lorel Query Language for Semistructured Data”, International Journal on Digital Libraries, 1(1), pp. 68-88, April 1997.
2. J. Bosak, T. Bray, D. Connolly, E. Maler, G. Nicol, C. M. Sperberg-McQueen, L. Wood, J. Clark, “W3C XML Specification DTD”, http://www.w3.org/XML/1998/06/xmlspec-report-19980910.htm.
3. T. Bray, J. Paoli, C. M. Sperberg-McQueen, “Extensible Markup Language (XML) 1.0”, http://www.w3.org/TR/REC-xml.
4. T. Bray, C. Frankston, A. Malhotra, “Document Content Description for XML”, http://www.w3.org/TR/NOTE-dcd.
5. P. Buneman, S. Davidson, G. Hillebrand, D. Suciu, “A Query Language and Optimization Techniques for Unstructured Data”, Proceedings of the ACM SIGMOD Conference, Montreal, Canada, June 1996.
6. V. Christophides, S. Abiteboul, S. Cluet, M. Scholl, “From Structured Documents to Novel Query Facilities”, Proceedings of the ACM SIGMOD Conference, Minneapolis, Minnesota, May 1994.
7. G. Copeland, S. Khoshafian, “A Decomposition Storage Model”, Proceedings of the ACM SIGMOD Conference, Austin, Texas, May 1985.
8. R. Cover, “The SGML/XML Web Page”, http://www.oasis-open.org/cover/xml.html.
9. A. Deutsch, M. Fernandez, D. Florescu, A. Levy, D. Suciu, “XML-QL: A Query Language for XML”, http://www.w3.org/TR/NOTE-xml-ql.
10. A. Deutsch, M. Fernandez, D. Suciu, “Storing Semi-structured Data with STORED”, Proceedings of the ACM SIGMOD Conference, Philadelphia, Pennsylvania, May 1999.
11. R. Fagin, “Multi-valued Dependencies and a New Normal Form for Relational Databases”, ACM Transactions on Database Systems, 2(3), pp. 262-278, 1977.
12. M. Fernandez, D. Suciu, “Optimizing Regular Path Expressions Using Graph Schemas”, Proceedings of the Fourteenth ICDE Conference, Orlando, Florida, February 1998.
13. G. Jaeschke, H. J. Schek, “Remarks on the Algebra of Non First Normal Form Relations”, Proceedings of the ACM Symposium on Principles of Database Systems, Los Angeles, California, March 1982.
14. J. McHugh, S. Abiteboul, R. Goldman, D. Quass, J. Widom, “Lore: A Database Management System for Semistructured Data”, SIGMOD Record, 26(3), pp. 54-66, September 1997.
15. J. McHugh, J. Widom, “Compile-Time Path Expansion in Lore”, Workshop on Query Processing for Semistructured Data and Non-Standard Data Formats, Jerusalem, Israel, January 1999.
16. Microsoft Corporation, XML Schema, http://www.microsoft.com/xml/schema/reference/star.asp.
17. Oracle Corporation, “XML Support in Oracle 8 and beyond”, Technical white paper, http://www.oracle.com/xml/documents.
18. The Query Languages Workshop (QL’98), http://www.w3.org/TandS/QL/QL98/, December 1998.
19. K. Ramasamy, J. F. Naughton, D. Maier, “Storage Representations for Set-Valued Attributes”, Working Paper, Department of Computer Sciences, University of Wisconsin-Madison.
20. T. Sellis, “Multiple-Query Optimization”, ACM Transactions on Database Systems, 12(1), pp. 23-52, June 1990.
21. C. Zaniolo, “The Database Language GEM”, Proceedings of the ACM SIGMOD Conference, San Jose, California, May 1983.