R Taha
1
Department of Mathematics
Faculty of Science
Ain Shams University
Cairo, Egypt
2
Central Lab. for Agricultural Expert System
Agricultural Research Center
Giza, Egypt
3
College of Business
University of Jeddah
Jeddah, Saudi Arabia
4
Faculty of Computers and Artificial Intelligence
Helwan University
Helwan, Cairo, Egypt
email: [email protected]
Abstract
1 Introduction
The World Wide Web was basically designed for human usage, and in spite of
the fact that everything on it is machine-readable, the data is not machine-
understandable [1]. The semantic web and semantic web technologies provide
a new approach for managing and processing data; the semantic web’s main
idea is the creation and usage of semantic metadata [2].
The basic data model for writing simple statements about web resources
is the Resource Description Framework (RDF). One of the representations
of the RDF is the graph data model, RDF graph, which is a standard model
for data interchange on the web. It is utilized for expressing data about
data resources on the semantic web to become suitable for processing by
applications [3]. Therefore, well-constructed RDF is the basis of the
success of the semantic web [4].
One of the essential features of the RDF graph model is its ability to
interconnect resources of the RDF in an extensible way. The basic notions of
graph theory like node, edge, path, neighborhood, connectivity, distance, and
RDF-BF-Hypergraph Representation for Relational Database 43
degree play a central role in expressing the RDF with graph-like structure.
Introducing graphs as a modeling tool for connected data has many
advantages. Graph structures are visible to the user and permit a natural
way of handling application data. Queries can refer directly to this graph
structure. Associated with graphs are graph operations in the query language
algebra, like finding shortest paths and determining certain subgraphs using
graph traversal algorithms [5, 6].
An active field of research during the last decade is making data hosted
in relational databases (RDB) machine understandable to the semantic web
[7]. Relational databases should be represented as RDF, in order to integrate
relational databases into semantic web applications.
Relational database schemes have been defined by Codd [8] as a collection
of table skeletons. These tables can be represented as hypergraphs, where
every attribute of a database scheme $\mathbf{R}$ corresponds to a node in a
hypergraph $H$, and every relation scheme $R$ in $\mathbf{R}$ corresponds to
a hyperedge in $H$ [9, 10].
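The attribute-to-node and relation-to-hyperedge correspondence can be sketched in a few lines of Python. This is only an illustration: the dictionary encoding, the function name `schema_to_hypergraph`, and the exact attribute sets (borrowed from the SUPPLIER/PART/SHIPMENT example used later in the paper, with underscores assumed in the city attributes) are assumptions made for this sketch.

```python
# Hypothetical encoding of a database scheme: relation name -> attribute names.
# The attribute sets are assumed from the SUPPLIER/PART/SHIPMENT example.
schema = {
    "SUPPLIER": {"S#", "Sname", "Status", "Supplier_City"},
    "PART": {"P#", "Pname", "Color", "Weight", "Part_City"},
    "SHIPMENT": {"S#", "P#", "Date", "Qty"},
}


def schema_to_hypergraph(schema):
    """Return (nodes, hyperedges): one node per attribute, one hyperedge per relation."""
    hyperedges = {name: frozenset(attrs) for name, attrs in schema.items()}
    # The node set is the union of all hyperedges.
    nodes = frozenset().union(*hyperedges.values())
    return nodes, hyperedges


nodes, hyperedges = schema_to_hypergraph(schema)
```

Shared attributes such as `S#` appear in more than one hyperedge, which is how joins between relations show up in the hypergraph.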
In recent decades, the class of acyclic database schemes and different
degrees of acyclicity have been introduced [11], such as α-acyclicity,
β-acyclicity, and γ-acyclicity [10]. The least restrictive degree of
hypergraph acyclicity is α-acyclicity, and it has been studied more in the
literature than the other two degrees. A database scheme is called α-acyclic
if the corresponding hypergraph is α-acyclic [9, 12].
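One well-known way to test α-acyclicity is the GYO (Graham, Yu and Ozsoyoglu) reduction: repeatedly delete nodes that occur in only one hyperedge and hyperedges contained in another hyperedge; the hypergraph is α-acyclic exactly when everything can be eliminated. The sketch below is a plain, non-linear-time version of that standard test, not the α-node detection used by the model in this paper, and the function name `is_alpha_acyclic` is hypothetical.

```python
def is_alpha_acyclic(hyperedges):
    """GYO reduction on an iterable of node sets; True if the hypergraph is alpha-acyclic."""
    edges = [set(e) for e in hyperedges if e]
    changed = True
    while changed:
        changed = False
        # Rule 1: remove every node that occurs in exactly one hyperedge.
        counts = {}
        for e in edges:
            for v in e:
                counts[v] = counts.get(v, 0) + 1
        for e in edges:
            lonely = {v for v in e if counts[v] == 1}
            if lonely:
                e -= lonely
                changed = True
        # Rule 2: remove empty hyperedges and hyperedges covered by another one.
        # Among equal hyperedges only the earliest copy is kept.
        kept = []
        for i, e in enumerate(edges):
            covered = not e or any(
                e <= f and (e != f or j < i)
                for j, f in enumerate(edges)
                if j != i
            )
            if not covered:
                kept.append(e)
        if len(kept) != len(edges):
            changed = True
        edges = kept
    return not edges


triangle = [{"A", "B"}, {"B", "C"}, {"A", "C"}]
covered_triangle = triangle + [{"A", "B", "C"}]
```

Here `triangle` cannot be reduced, so it is α-cyclic, while `covered_triangle` reduces completely because the larger hyperedge absorbs the three smaller ones.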
For a database whose hypergraph is acyclic, query optimization is easier
than for a cyclic database, and acyclicity can be recognized in linear
time. Furthermore, acyclic hypergraphs are preferred because they reduce
the time needed to answer a query and the space needed for its access
paths. The size of the query result is also reduced, especially when the
RDB is represented as a graph data model [13, 14, 15, 16, 17].
In the literature, there have been some contributions to represent the
RDB as a graph model. For example, Berners-Lee [18] introduced an informal
model approach to represent the relational database as a graph model as
follows:
• Each cell with a foreign key constraint as the object of an object prop-
erty.
44 F. F. M. Ghaleb, A. A. Taha, M. Hazman, M. Abd ElLatif, M. Abbass
into an acyclic one if possible, which is used in the third step. The second
algorithm generates the RDF-BF-hypergraph schema that corresponds to an
acyclic/cyclic RDB schema according to its set of functional dependencies,
which is used in the fourth step.
The paper is organized as follows. In Section 2, the basic definitions
of graphs, RDF graphs, hypergraphs, and relational databases are given. In
Section 3, a model to generate the RDF-BF-hypergraph schema is introduced.
In Section 4, the results of the two introduced algorithms and a systematic
discussion are given. Finally, Section 5 presents the conclusion of the paper.
2 Preliminaries
In this section, the basic definitions of graphs, RDF graphs, hypergraphs,
and relational databases are given.
Figure 1: Simple Graph. Figure 2: MultiGraph. Figure 3: Directed Graph.
set $V(H) = \bigcup_{i=1}^{n} e_i$ of nodes of a hypergraph $H$ is defined to be the union
Figure 6: An undirected hypergraph with four edges.
Figure 7: H' = {e1, e2, e3} is a subhypergraph of the hypergraph in Figure 6, and V(H) = V(H').
Figure 11:
Definition 2.27. A set $\mathcal{F}$ of all functional dependencies on the
attribute set $N$ can be represented by a directed hypergraph
$\vec{H} = (\vec{V}, \vec{E})$, with $\vec{V} = N$ and
$\vec{E} = \{(X, Y \setminus X) : F(X, Y) \in \mathcal{F},\ Y \nsubseteq X\}$ [31].
For example, Figure 11 (a) represents the set of functional dependencies of
the SUPPLIER relation, Figure 11 (b) represents the set of functional
dependencies of the PART relation, and Figure 11 (c) represents the set of
functional dependencies of the SHIPMENT relation in the database schema of
Figure 8.
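The construction in Definition 2.27 can be illustrated with a short Python sketch. The representation of a functional dependency as a pair of attribute sets `(X, Y)` and the function name `fd_hyperedges` are assumptions made for this example; the rule itself (emit the hyperedge `(X, Y \ X)` unless `Y` is contained in `X`) is taken from the definition.

```python
def fd_hyperedges(fds):
    """Map functional dependencies X -> Y to directed hyperedges (X, Y minus X).

    Trivial dependencies, where Y is a subset of X, produce no hyperedge.
    """
    edges = set()
    for x, y in fds:
        x, y = frozenset(x), frozenset(y)
        if not y <= x:
            edges.add((x, y - x))
    return edges


# Illustrative dependencies; the second one is trivial and is dropped.
example_fds = [
    ({"S#"}, {"Sname"}),
    ({"S#"}, {"S#"}),
    ({"S#", "P#", "Date"}, {"Qty", "Date"}),
]
example_edges = fd_hyperedges(example_fds)
```

Each resulting pair is a directed hyperedge whose tail is the determinant `X` and whose head contains only the attributes not already in `X`.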
which is used in the third step. The second algorithm generates the
RDF-BF-hypergraph schema that corresponds to an acyclic/cyclic RDB schema
according to its set of functional dependencies, which is used in the fourth
step. The four steps are illustrated in Figure 12.
Note that taking database schemes in third normal form guarantees
that every non-key attribute A in R is fully functionally dependent on
the primary key of R, and that no non-key attribute of R is transitively
dependent on the primary key.
$\vec{V} = \{S\#, \mathit{Sname}, \mathit{Status}, \mathit{Supplier\_City}, P\#, \mathit{Pname}, \mathit{Color}, \mathit{Weight}, \mathit{Part\_City}, \mathit{Date}, \mathit{Qty}\}$,
and the set of directed hyperedges
$\vec{E} = \{(\{S\#\}, \{\mathit{Sname}\}), (\{S\#\}, \{\mathit{Status}\}), (\{S\#\}, \{\mathit{Supplier\_City}\}), (\{P\#\}, \{\mathit{Pname}\}), (\{P\#\}, \{\mathit{Color}\}), (\{P\#\}, \{\mathit{Weight}\}), (\{P\#\}, \{\mathit{Part\_City}\}), (\{S\#, P\#, \mathit{Date}, B_1\}, \{\mathit{Qty}\})\}$,
as shown in Figure 15.
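As a rough illustration of how such a set of directed hyperedges can be used, the sketch below encodes the hyperedges listed above as Python pairs `(tail, head)` and computes which nodes can be reached from a starting set, where a hyperedge fires only once all of its tail nodes have been reached. This traversal is a common notion for directed hypergraphs and is not an algorithm taken from this paper; the function name `reachable` and the underscore spelling of the city attributes are assumptions.

```python
# Directed hyperedges (tail, head) as listed above; B1 is the B-connector.
E = [
    ({"S#"}, {"Sname"}),
    ({"S#"}, {"Status"}),
    ({"S#"}, {"Supplier_City"}),
    ({"P#"}, {"Pname"}),
    ({"P#"}, {"Color"}),
    ({"P#"}, {"Weight"}),
    ({"P#"}, {"Part_City"}),
    ({"S#", "P#", "Date", "B1"}, {"Qty"}),
]


def reachable(start, hyperedges):
    """Nodes reachable from `start`, firing a hyperedge only when its whole tail is reached."""
    reached = set(start)
    changed = True
    while changed:
        changed = False
        for tail, head in hyperedges:
            if tail <= reached and not head <= reached:
                reached |= head
                changed = True
    return reached


from_supplier = reachable({"S#"}, E)
```

Starting from `{"S#"}` only the supplier attributes are reached; `Qty` is reached only when `S#`, `P#`, `Date`, and the connector `B1` are all present in the starting set.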
• Each node will have a label that corresponds to the value of an attribute.
Each hyperedge will have a label that corresponds to that attribute's
name [32].
• In the case of many-to-many relational joins, each B-connector will
be replaced by the name of the bridge table, as shown in Figure 16, in
which the B-connector B1 of Figure 15 is replaced by shipment, which
is the name of the bridge table of the database of Figure 8.
• In the case of one-to-many relational joins, the hyperedge will connect
the value of the primary key in the "many" table to the referenced value
of the foreign key in the "one" table through a label that corresponds
to the name of the "one" table. For example, Figure 19 shows the instance
of the RDF-BF-hypergraph for one-to-many relational joins of the
RDF-BF-Hypergraph schema of Figure 18, which corresponds to the acyclic
database schema of Figure 17.
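The one-to-many rule in the last bullet can be sketched as follows. Because the schema of Figure 17 is not reproduced here, the example uses hypothetical DEPARTMENT ("one") and EMPLOYEE ("many") tables; the row encoding as dictionaries and the function name `one_to_many_edges` are likewise assumptions for illustration only.

```python
def one_to_many_edges(many_rows, pk, fk, one_table):
    """Build labeled edges (pk value in the "many" table, "one" table name, referenced key value)."""
    return [
        (row[pk], one_table, row[fk])
        for row in many_rows
        if row.get(fk) is not None  # rows without a reference produce no edge
    ]


# Hypothetical "many" table: each employee references at most one department.
employees = [
    {"E#": "e1", "Ename": "Ali", "D#": "d1"},
    {"E#": "e2", "Ename": "Mona", "D#": "d1"},
    {"E#": "e3", "Ename": "Omar", "D#": None},
]
edges = one_to_many_edges(employees, pk="E#", fk="D#", one_table="DEPARTMENT")
```

Each edge is labeled with the name of the "one" table, mirroring the labeling rule stated above; rows whose foreign key is empty are skipped.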
queries are expressed as a graph and query evaluation relies on graph match-
ing between the query and the database [34]. SPARQL shares a conceptual
core, which consists of two natural operations in the context of querying
graphs:
The semantic web is an initiative that aims to improve the current state of
the World Wide Web by making data machine-understandable. The semantic web
depends on the RDF graph data model, which is a standard model for data
interchange on the web. Even web content that is generated automatically
from databases is usually presented without the original structural
information found in the databases. In recent decades, a special class of
database schemes, called acyclic database schemes, was introduced. For
acyclic hypergraphs, query optimization is easier than for cyclic ones, and
acyclicity can be recognized in linear time. This paper introduces a
BF-hypergraph representation for the RDF schema that corresponds to an RDB
schema. The BF-hypergraph is a suitable model to represent the set of
functional dependencies of an RDB, since the domain and codomain of a
functional dependency may contain more than one attribute. We propose a
model, consisting of four steps, to represent the RDF-BF-hypergraph schema
that corresponds to an acyclic/cyclic RDB schema according to its set of
functional dependencies. In the first step, a given database schema is
represented as a hypergraph by identifying its tables and their attributes.
The second step checks for α-cyclicity by detecting the set of α-nodes. If
the hypergraph is α-cyclic, the third step examines the cycle(s) to see
whether they can be removed. In the fourth step, an RDF-BF-hypergraph schema
is generated for the resulting acyclic undirected hypergraph if the cycle(s)
can be removed, or otherwise for the original undirected hypergraph of step
one. Moreover, two algorithms of the proposed model are introduced. The
first algorithm converts a cyclic undirected hypergraph that corresponds to
an RDB schema into an acyclic one if possible, and is used in the third
step. The second algorithm generates the RDF-BF-hypergraph schema that
corresponds to an acyclic/cyclic RDB schema according to its set of
functional dependencies, and is used in the fourth step. A formal
representation of the RDF-graph schema preserves the integrity constraints,
ensuring integrity and data semantics. Moreover, a formal representation of
the RDF-graph schema facilitates data conversion and allows graph
connectivity, which is essential for querying the RDF by using concepts,
techniques, and various traversal algorithms. In the case of an acyclic
RDF-BF-hypergraph instance, the time and space needed for query answering
are reduced and the semantics of the RDB schema are maintained. For future
work, we will extend our model to propose a tool that generates the
RDF-BF-hypergraph schema and instance from an existing database.
References
[1] O. Lassila, R. Swick, Resource Description Framework (RDF) Model
and Syntax Specification, W3C Recommendation, World Wide Web,
http://www.w3.org/TR/1999/REC-rdf-syntax-19990222, 1999.
[8] E. F. Codd, A relational model of data for large shared data banks,
Communications of the ACM, 13, no. 6, (1970), 377–387.