0% found this document useful (0 votes)
33 views24 pages

R Taha

R-Taha
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
33 views24 pages

R Taha

R-Taha
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 24

b b

International Journal of Mathematics and M


Computer Science, 15(2020), no. 1, 41–64 CS

RDF-BF-Hypergraph Representation for


Relational Database

Fayed F. M. Ghaleb1 , Azza A. Taha1 , Maryam Hazman2 ,


Mahmoud Abd ElLatif3,4 , Mona Abbass2

1
Department of Mathematics
Faculty of Science
Ain Shams University
Cairo, Egypt

2
Central Lab. for Agricultural Expert System
Agricultural Research Center
Giza, Egypt

3
College of Business
University of Jeddah
Jeddah, Saudi Arabia

4
Faculty of Computers and Artificial Intelligence
Helwan University
Helwan, Cairo, Egypt

email: [email protected]

(Received August 17, 2019, Accepted September 19, 2019)

Abstract

The semantic web is designed to provide machine accessible mean-


ing for its constructs. Semantic web depends on Resource Description
Framework (RDF), which is a standard model for data interchange
on the Web. The RDF can be represented as graph which is known
as RDF graph data model. One of the goals of the semantic web

Key words and phrases: Hypergraph, Hypergraph Acyclicity, Relational


Database Scheme, α-cycle, Semantic Web, RDF.
AMS (MOS) Subject Classifications: 68P15, 05C65, 97R50.
ISSN 1814-0432, 2020, http://ijmcs.future-in-tech.net
42 F. F. M. Ghaleb, A. A. Taha, M. Hazman, M. Abd ElLatif, M. Abbass

is to display the huge quantities of data that is stored in the Rela-


tional Databases (RDB) for computer processing. In order to integrate
RDB into semantic web applications, RDB should be represented as
RDF graph data model, which is fundamental for developing the se-
mantic web in the recent decade. In the last decades a special class
of database schemes, called acyclic database schema was introduced.
This class is preferred in order to minimize the time of answering query
and decrease its space efficient access paths. Also the result of answer-
ing a query will be reduced specially in the case of representing RDB
as graph data model. This paper introduces a Backward and Forward
hypergrapg (BF-hypergraph) representation for the RDF schema that
corresponds to RDB schema. The BF-hypergraph is a suitable model
to represent the set of functional dependencies of RDB, since the do-
main and codomain of the functional dependency may contain more
than one attribute. Finally, the paper proposes a model to represent
the RDF-BF-hypergraph schema that corresponds to an acyclic/cyclic
RDB schema according to its set of functional dependency. A formal
representation of RDF graph schema will preserve the integrity con-
straints and data semantics. Moreover, a formal representation of
RDF graph schema will facilitate the data conversion and will allow
graph connectivity which is essential for querying the RDF using var-
ious traversal algorithms.

1 Introduction
The World Wide Web was basically designed for human usage, and in spite of
the fact that everything on it is machine-readable, the data is not machine-
understandable [1]. The semantic web and semantic web technologies provide
a new approach for managing and processing data; the semantic web’s main
idea is the creation and usage of semantic metadata [2].
The basic data model for writing simple statements about web resources
is the Resource Description Framework (RDF). One of the representations
of the RDF is the graph data model, RDF graph, which is a standard model
for data interchange on the web. It is utilized for expressing data about
data resources on the semantic web to become suitable for processing by
applications [3]. Therefore, the good creation of the RDF is the base of the
success of the semantic web [4].
One of the essential features of the RDF graph model is its ability to
interconnect resources of the RDF in an extensible way. The basic notions of
graph theory like node, edge, path, neighborhood, connectivity, distance, and
RDF-BF-Hypergraph Representation for Relational Database 43

degree play a central role in expressing the RDF with graph-like structure.
Introducing graphs as a modeling tool for a connected data has many
advantages. Graph structures are visible to the user and permit a natural
way of handling applications data. Queries can refer directly to this graph
structure. Associated with graphs are graph operations in the query language
algebra, like finding shortest paths and determining certain subgraphs using
graph traversal algorithms [5, 6].
An active field of research during the last decade is making data hosted
in relational databases (RDB) machine understandable to the semantic web
[7]. Relational databases should be represented as RDF, in order to integrate
relational databases into semantic web applications.
Relational database schemes have been defined by Codd [8] as a collection
of table skeletons. These tables can be represented as hypergraphs, where
every attribute of a database scheme R corresponds to a node in a hypergraph
H, also every relation scheme R in R corresponds to a hyperedge in H [9, 10].
In the last decades a class of acyclic database scheme and different de-
grees of acyclicity has been introduced [11], such as α-acyclicity, β-acyclicity,
and γ-acyclicity [10]. The least restrictive degree of hypergraph acyclicity
is α-acyclicity and in the literature it has more studies than the other two
acyclicity degrees. A database scheme is called α-acyclic if the corresponding
hypergraph is [9, 12].
In the case of acyclic hypergraphs database the query optimization be-
comes easier than in the case of cyclic database, and might be recognized
in linear time. Furthermore, acyclic hypergraphs is preferred in order to
minimize the time of answering query and decrease its space efficient access
paths. Also the result of answering query will be reduced specially in the
case of representing RDB as graph data model [13, 14, 15, 16, 17].
In the literature, there have been some contributions to represent the
RDB as a graph model. For example, Berners-Lee [18] introduced an informal
model approach to represent the relational database as a graph model as
follows:

• Each row of a table as a resource of the RDF,

• The RDB column names as an RDF predicates,

• Each cell with a literal value as the object of a data property,

• Each cell with a foreign key constraint as the object of an object prop-
erty.
44 F. F. M. Ghaleb, A. A. Taha, M. Hazman, M. Abd ElLatif, M. Abbass

Applying the above set of rules automatically generate mappings between


RDB and RDF graph model, which is called a directed mapping. Byrne [19],
presented the Simple Knowledge Organization System (SKOS) framework,
which is used to transform the Royal Commission on the Ancient and His-
torical Monuments of Scotland (RCAHMS) thesauri to the semantic web. In
SKOS framework the size of the RDF dataset has been reduced by about
2.8 million triples. Reducing graph size makes a colossal difference to per-
formance in terms of storage, loading, and query times, but the result is not
a graph according to the mathematical definition.
Most of the academic works that have been proposed to represent the
RDB as RDF graph data model are depending on transforming the data that
is stored in the relational model not the schema. Furthermore, none of the
existing models represents a formal graph model for the RDF graph schema
or instance. A formal representation of the RDF graph schema should pre-
serve the integrity constraints to ensure integrity and data semantics, which
is captured from a database schema. Moreover, a formal representation of
RDF graph schema should facilitate the data conversion and should allow
graph connectivity which is essential for querying the RDF using various
traversal algorithms.
This paper introduces a Backward and Forward hypergrapg (BF-hypergraph)
representation for the RDF schema that corresponds to RDB schema. The
BF-hypergraph is a suitable model to represent the set of functional depen-
dencies of RDB, since the domain and codomain of the functional depen-
dency may contain more than one attribute. Finally, the paper proposes a
model to represent the RDF-BF-hypergraph schema that corresponds to an
acyclic/cyclic RDB schema according to its set of functional dependency to
preserve the integrity constraints and data semantics. Moreover, a formal
representation of RDF graph schema should facilitate the data conversion
and should allow graph connectivity which is essential for querying the RDF
using various traversal algorithms. The model consists of four steps. In the
first step a given database schema is represented as hypergraph by identi-
fying its tables and their attributes. Second step will check for α-cyclicity
by detecting the set of α-nodes. If the hypergraph is α-cyclic, then we will
treat this cycle(s) to see if it can be removed or not, which is the third step.
In the fourth step, an RDF-BF-hypergraph schema will be generated for
the resulted acyclic undirected hypergraph in the case if the cycle(s) can be
removed or for the original undirected hypergraph of step one. Moreover,
two algorithms of the proposed model are introduced. The first algorithm
converts a cyclic undirected hypergraph that corresponds to RDB schema
RDF-BF-Hypergraph Representation for Relational Database 45

into acyclic one if it is possible, which is used in the third step. The second
algorithm generates the RDF-BF-hypergraph schema that corresponds to an
acyclic/cyclic RDB schema according to its set of functional dependency,
which is used in the fourth step.
The paper is organized as follows: in Section 2, the basic definitions
of graphs, RDF graph, hypergraphs, and relational databases are given. In
Section 3, a model to generate the RDF-BF-hypergraph schema is introduced.
In Section 4, the result of the two introduced algorithms and a systematic
discussion are given, Finally, Section 5, illustrates the conclusion of the paper.

2 Preliminaries
In this section the basic definitions of Graphs, RDF Graph, hypergraphs,
and relational databases are given.

2.1 Graphs and RDF Graph


Definition 2.1. A graph G is a pair (V, E), where V is a set of nodes, and E
is a set of edges. Each edge e ∈ E is an unordered pairs {u, v}, u, v ∈ V [21].
For example, Figure 1, shows a graph with set of nodes V = {A, B, C, D}
and set of edges E = {e1 , e2 , e3 , e4 , e5 }, where e1 = {A, B}, e2 = {B, C}, e3 =
{C, D}, e4 = {A, C}, and e5 = {B, D}.
Definition 2.2. Two edges ei and ej are said to be incident if they share a
node [21]. For example, the two edges e1 and e2 of the graph of Figure 1 are
incident.
Definition 2.3. A graph G is a multigraph if multiple edges are permitted
between two nodes [21]. For example, Figure 2, shows a multigraph with
set of nodes V = {A, B, C, D} and set of edges E = {e1 , e2 , e3 , e4 , e5 , e6 },
where e1 = {A, D}, e2 = {A, C}, e3 = {B, D}, e4 = {B, C}, e5 = {B, C},
and e6 = {D, D} and two multiple edges e4 and e5 .
Definition 2.4. A directed graph or digraph D is an ordered pair (V, E)
with V is a set of nodes, and E a set of ordered pairs of nodes, called edges,
directed edges, or arrows as shown in Figure 3. An edge e = (A, B) is con-
sidered directed from A to B; A is called the head and B is called the tail of
the edge [22].
Definition 2.5. A path in a graph G = (V, E) is a sequence of edges e1 , ..., en
where each edge ei is incident to ei+1 , for 1 ≤ i < n [21]. A path is said to
be simple if ei 6= ej for i, j ≤ n, i 6= j. For example, the simple path from
46 F. F. M. Ghaleb, A. A. Taha, M. Hazman, M. Abd ElLatif, M. Abbass

node A to node D in the graph of Figure 1 is (e1 , e2 , e3 ).

The RDF statements are triples <Sub><Pred><Obj>, consisting of


three parts a subject <Sub>, a predicate <Pred> and an object <Obj>,
where the subject is the resource to be described, the predicate is a prop-
erty, and the object is a propertys value. For example, The RDF statement
<kareem><is a><doctor>, gives Kareem as the subject of the triple, ”is a”
as the predicate of the triple, and doctor as the object of the triple. An
RDF graph T is the set of RDF triples represented as graph. Figure 4,
shows the RDF graph data model [20].
Definition 2.6. let T be an RDF triple, the node-edge-labeled multigraph
δ(T ) is (V, E, lv , le ) where,
1. V = {vx : x ∈ subj(T ) ∪ obj(T )}, lv (vx ) = x} and
2. E = {es,p,o : (s, p, o) ∈ T }, such that there is an edge e from vs to
vo , le (es,p,o) = p [21].
For example, Figure 5, shows the node-edge-labeled multigraph for the RDF
T1 = {(P icasso, paints, Guernica), (Guernica, type, P aint), (Zapata, type, P aint),
(P aints, range, P aint), (P aints, domain, P ainter)}, where
V = {v1 , v2 , v3 , v4 , v5 , v6 },lv = {P icasso, Guernica, P aint, Zapata, paints, P ainter}
, E = {e1 , e2 , e3 , e4 }, and le = {paints, type, range, domain}.
Note that the set of edge labels and node labels might not be disjoint.
Definition 2.7. Two nodes x, y are connected if there exists a path e1 , ..., en
with x ∈ e1 and y ∈ en . The length of a path is the number of edges it
contains [21]. For example, the length of the path (e1 , e2 , e3 ) from node A to
node D in the graph of Figure 2 is 3.

Figure 3: Directed
Figure 1: Simple Graph Figure 2: MultiGraph Graph

2.2 Hypergraphs and Relational Databases


Definition 2.8. An undirected hypergraph H = {e1 , ..., en } is a finite set of
non-empty finite sets ei , which is called its hyperedge (or simply edge), the
RDF-BF-Hypergraph Representation for Relational Database 47

Sn
set V (H) = i=1 ei of nodes of a hypergraph H, is defined to be the union

Figure 4: RDF triple

Figure 5: The node-edge-labeled multigraph for the RDF T1 of definition


2.6
of all its hyperedges [23]. For example, Figure 6, shows a hypergraph with
four edges {e1 , e2 , e3 , e4 }, where e1 = {ABC}, e2 = {CDE},e3 = {EF A},
and e4 = {ACE} (Note that ABC is abbreviation of {A, B, C}, etc.), and
V (H) = {A, B, C, D, E, F }.
Definition 2.9. The size of a hypergraph |H| is defined to be the number
of edges in it [1]. For example, the size of the hypergraph H in Figure 6 is 4.
Also, the size of an edge |e| is defined to be the number of nodes in it. For
example, the size of any edge in the hypergraph of Figure 6 is 3.
Definition 2.10. An edge e ∈ H is said to be a singleton, if |e| = 1, i.e., if
it contains exactly one node [10].
Definition 2.11. A hypergraph H ′ is said to be a subhypergraph of a hy-
pergraph H if H ′ ⊆ H [23]. Note that, the suphypergraph H ′ , in H ′ ⊆ H,
is obtained from H by removing edges, and not removing nodes from edges.
For example, Figure 7 shows that H ′ = {e1 , e2 , e3 } is subhypergraph of the
hypergraph in Figure 6 where V (H) = V (H ′ ).
Definition 2.12. let V be a finite set of database attributes, a database
scheme R = {R1 , ..., Rp }, where each relation schemes Ri is a set of subsets
of V, i.e., Ri = {Ai1 , Ai2 , ..., Aini } , and i = 1, ..., p : ni is the number of
the attributes of the relation scheme Ri . Each attribute Aij is associated
with a domain Dij . This database scheme can be represented as hypergraph
Sn=
H {B1 , B2 , ..., Bp } where
Sp Bi , i = 1, ..., p are its hyperedges such that, Bi =
j=1 Aij and V (H) = i=1 Bi . For example, in the hypergraph of Figure 9,
i

H = {{S#, Sname, Status, City}, {P #, P name, Color, W eight, City},


{S#, P #, Date, Qty}}. The hypergraph corresponds to the SUPPLIER-
PART database schemes [24] of Figure 8, where R = {SUPPLIER, PART,
SHIPMENT}. The database R consists of three relation schemes: a SUP-
48 F. F. M. Ghaleb, A. A. Taha, M. Hazman, M. Abd ElLatif, M. Abbass

PLIER relation scheme with attributes S# (for supplier number), Sname


(for supplier name), Status, and City; a P ART relation scheme with at-
tributes P # (for part number), P name (for part name), Color, W eight,
and City; and a SHIP MENT relation scheme with attributes S#, P # ,
Date, and Qty (for quantity).
Definition 2.13. Two edges e and f of a hypergraph are called properly
intersecting if e * f, f * e and e ∩ f 6= ∅ [20]. For example, the two edges
e1 , e2 in Figure 6 are properly intersecting.
Definition 2.14. Let H be a hypergraph and v ∈ V (H), the star H(v), of
the node v, is defined to be the set of all edges containing v ,i.e., H(v) =
{e ∈ H|v ∈ e}. For example, the star of the node A in the hypergraph H of
Figure 6 is H(A) = {e1 , e3 , e4 } [23].
Definition 2.15. Let e and f be two properly intersecting hyperedges of
a hypergraph H. A sequence (e = e1 , ..., ep = f ), such that p > 2 is called
a sequence of neighborhoods between e and f , if e ∩ f ( ek ∩ ek+1 , for
k = 1, ..., p − 1 [25]. For example, (e1 , e4 , e2 ) in Figure 6 is a sequence of
neighborhoods connecting e1 and e2 .

Figure 7: H’ = {e1 , e2 , e3 }
Figure 6: An undirected hy- is subhypergraph of the hyper-
pergraph with four edges graph figure 6 and V(H) =
V(H)’

Figure 8: A cyclic SUPPLIER-PART database schemes


Definition 2.16. Let e and f be two properly intersecting hyperedges
of a hypergraph H, the two edges e and f are α−neighboring, if there is no
sequence of neighborhoods between them [23]. For example, the two inter-
secting edges e1 and e2 of the hypergrapgh in Figure 7 are α−neighboring,
because there is no sequence of neighborhoods connecting e1 and e2 , while
the two intersecting edges e1 and e2 of the hypergrapgh in Figure 6 are not
RDF-BF-Hypergraph Representation for Relational Database 49

Figure 9: The undirected hypergraph corresponding to the cyclic database


schemes of Figure 8

α−neighboring, because (e1 , e4 , e2 ) is a sequence of neighborhoods connect-


ing e1 and e2 .
Definition 2.17. An α−path in a hypergraph H is a sequence of hyperedges
(e1 , ..., ep ), such that ∀j, 1 ≤ j < p; ej and ej+1 are α−neighboring [25]. For
example, the sequence (e1 , e2 , e3 ) in Figure 7 is α−path.
Definition 2.18. An α−cycle in a hypergrph H is an α−path (e1 , ..., ep )
such that p > 3, e1 = ep , ∄1 ≤ a < b < p, such that ea ∩ ea+1 ⊂ eb ∩ eb+1 [25].
For example, the sequence (e1 , e2 , e3 , e1 ) in Figure 7 is α−cycle.
Definition 2.19. An α − path(e1 , ..., eq ), q > 2, is called quasi α−cycle
(αQ −cycle) [26] in a hypergraph H if

1. The two edges e1 and eq are α−neighboring and

2. ∄1 ≤ a < b < q, such that ea ∩ ea+1 ⊂ eb ∩ eb+1 . For example, the


sequence (e1 , e2 , e3 ) in Figure 7 is αQ −cycle.
Definition 2.20. Let H be a hypergraph, the set of all α−cycles in H,
is the set Ć ¸ α−cycle = {C : C is an α−cycle in H}, and for every cycle
C = (e1 , ..., ep ) ∈ Ƹ α−cycle , p > 3, then the set VαC (H) = ∪p−1
i=1 ei ∩ ei+1 is
called the set of α−nodes of the α−cycle C in a hypergraph H, and the set
Vα (H) = ∪ V C (H) is called the set of α − nodes of the all cycles
¸ α−cycle α
C∈Ć

¸ α−cycle in a hypergraph H [26]. For example, Vα (H) = {S#, P #, City} is



the set of α−nodes of the hypergraph of Figure 9.
Lemma 2.1. A hypergraph H is α−acyclic if and only if it contains no
α−node [26].

The GYO algorithm [27] was introduced to determine the acyclicity of a


hyperhraph H = {e1 , ..., en } in linear time. The algorithm conveys to two
rules:

• Rule 1: If an edge ei contains an isolated node, delete this node from


ei .
50 F. F. M. Ghaleb, A. A. Taha, M. Hazman, M. Abd ElLatif, M. Abbass

• Rule 2: Delete edge ei if there is another edge ej such that ei ⊆ ej .


These two rules are applied repeatedly until no rules can be applied by getting
either GY O(H) = ∅ and hence H is acyclic or GY O(H) 6= ∅ and hence H is
cyclic.

− → −
− → →

Definition 2.21. A directed hypergraph H = ( V , E ), where V is a finite


set of nodes and E ⊂ {(T (e), H(e)) : T (e), H(e) ∈ P (V ) \ ∅ & T (e) ∩ H(e) =
∅} is the set of directed edges, where P (V ) is the power set of V, T (e) and
H(e) are said to be the tail and the head of e respectively. The head and tail
represent the set of nodes where the hyperedge ends and starts respectively
[28]. It is clear that |T (e)| > 0, |H(e)| > 0. A directed hyperedge e is
simple if |T (e)| = 1 and |H(e)| = 1. For example, Figure 10, shows a

− →

directed hypergraph H with a set of nodes V = {v1 , v2 , v3 , v4 , v5 , v6 , v7 } and


set of hyperedges E = {e1 , e2 , e3 , e4 }, where e1 = ({v1 , v2 }, {v3 , v4 }), e2 =
({v1 , v3 }, {v5 }), e3 = ({v7 }, {v3 , v4 }) and e4 = ({v5 }, {v6 , v7 }).

Figure 10: A BF-hypergraph H with 7 nodes and 4 hyperedges [29]


Definition 2.22. A Backward hyperedge, or B-hyperedge, is a hyperedge
e = (T (e), H(e)) with |H(e)| = 1. Also a forward hyperedge, or F-hyperedge,
is a hyperedge e = (T (e), H(e)) with |T (e)| = 1. For example, in the hy-
pergraph of Figure 10, e2 is a B-hyperedge, e3 and e4 are F-hyperedges in
the hypergraph of Figure 10 [28]. A B-graph (or B-hypergraph) is a hyper-
graph whose hyperedges are B-hyperedges. An F-graph (or F-hypergraph)
is a hypergraph whose hyperedges are F-hyperedges. A BF-graph (or BF-
hypergraph) is a hypergraph whose hyperedges are either B-hyperedges or
F-hyperedges. For example, the hypergraph of Figure 10 is BF-hypergraph.

− → →
− −
Definition 2.23. Let a hypergraph H = ( V , E ), be a directed hypergraph

− →

and let s, t ∈ V . A simple directed hypergraph path Pst from s to t in H is
a sequence (v1 , e1 , v2 , e2 , ..., vn , en , vn+1 ) consisting of
1. Nodes vi , where 1 ≤ i ≤ n + 1, vi ∈ T (ei ) and
2. Distinct hyperedges ej , where 1 ≤ j ≤ n, such that s = v1 , t = vn+1 ,
and for every 1 ≤ i ≤ n, vi ∈ T (ei ) and vi+1 ∈ H(ei ). For exam-
RDF-BF-Hypergraph Representation for Relational Database 51

ple, in Figure 10 Pv1 v4 = (v1 , e2 , v5 , e4 , v7 , e3 , v4 ) is a simple directed


hypergraph path from v1 to v4 .

Definition 2.24. A simple directed hypergraph path




Pst = (v1 , e1 , v2 , e2 , ..., vn , en , vn+1 ) from s = v1 to t = vn+1 in hypergraph H
is called a cycle if |T (e1 )| > 1 and t ∈ T (e1 ). The simple path Pst is called
elementary if all the nodes vi , where 1 ≤ i ≤ n + 1, are distinct [29]. For


example, in Figure 10, Pv1 v3 = (v1 , e2 , v5 , e4 , v7 , e3 , v3 ) is a simple cycle in H
(since v3 ∈ T (e2 ) = {v1 , v3 }).
Definition 2.25. Given a relation R, a set of attributes X in R is said
to functionally determine another set of attributes Y , also in R, (written
X −→ Y ) if and only if each X value in R is associated with precisely one
Y value in R; R is then said to satisfy the functional dependency X −→ Y
[30]. i.e., ∃f : X −→ Y .
Definition 2.26. Let V be the set of attributes of a RDB. A Functional
Dependency F (X, Y ), with both X and Y are subsets of V , defines uniquely
the value of the attributes in Y once the value of the attributes in X is
given [31]. For example, the functional dependency (FD) of the database
schema of Figure 8, is {P #} −→ {{P name}, {City}, {Color}, {W eight}}
for the PART relation scheme, {S#} −→ {{Sname}, {Status}, {City}} for
the SUPPLIER relation scheme, and {S#, P #, Date} −→ {Qty} for the
SHIPMENT relation scheme.

(a) FD F-hypergraph (b) FD F-hypergraph (c) FD B-hypergraph


for SUPPLIER relation for PART relation for SHIPMENT rela-
tion

Figure 11:
Definition 2.27. A set F of all Functional Dependencies on the attribute

− → −
− → →

set N can be represented by a directed hypergraph H = ( V , E ), with V =
52 F. F. M. Ghaleb, A. A. Taha, M. Hazman, M. Abd ElLatif, M. Abbass



N and E = {(X, Y \X) : F (X, Y ) ∈ F, Y * X} [31]. For example, Figure 11
(a), represents the set of functional dependencies of the SUPPLIER relation,
Figure 11 (b), represents the set of functional dependencies of the PART
relation, and Figure 11 (c), represents the set of functional dependencies of
the SHIPMENT relation in the database schema of Figure 8.

3 The Proposed Model


In this section we propose a model that represents RDF-BF-hypergraph
schema that corresponds to an acyclic/cyclic RDB schema according to its
set of functional dependency.

Figure 12: The four steps of the proposed model


The model consists of four steps. In the first step a given database schema
is represented as hypergraph by identifying its tables and their attributes.
Second step will check for α-cyclicity by detecting the set of α-nodes. If
the hypergraph is α-cyclic, then we will treat this cycle(s) to see if it can
be removed or not, which is the third step. In the fourth step, an RDF-
BF-hypergraph schema will be generated for the resulted acyclic undirected
hypergraph in the case if the cycle(s) can be removed or for the original
undirected hypergraph of step one. Moreover, two algorithms of the pro-
posed model are introduced. The first algorithm converts a cyclic undirected
hypergraph that corresponds to RDB schema into acyclic one if it is possible,
RDF-BF-Hypergraph Representation for Relational Database 53

which is used in the third step. The second algorithm generates the RDF-
BF-hypergraph schema that corresponds to an acyclic/cyclic RDB schema
according to its set of functional dependency, which is used in the fourth
step. The four steps are illustrated in Figure 12.

3.1 Representing the Relational Database Schema R


as Hypergraph
In this step, the relational database schemes is represented as hypergraphs,
by creating a node for every attribute of a database scheme R and creating
a hyperedge for every relation scheme R in R.

3.2 Checking for α-cycles


In this step, the algorithm A0 (H) which is proposed by Ghaleb et. al [26]
is used. Algorithm A0 (H) depends on the output of the GYO algorithm to
detect the set of α−nodes which is based on the existence of quasi α−cycle(s)
in a given hypergraph H. This algorithm returns the set of α−nodes, Vα (H)
if the hypergraph is cyclic, or returns Vα (H) = ∅ if the hypergraph is acyclic.

3.3 Treatment of α-cycle(s)


In this step, the following proposed algorithm, αRem (H,n, K) is introduced
to convert a cyclic hypergraph H, which corresponds to RDB into an acyclic
one if it is possible or return failure.

Note that taking a database schemes in the third normal form will guar-
antee that every non-key attribute A in R is fully functionally dependent on
the primary key of R, and no non-key attribute of R is transitively dependent
on the primary key.
54 F. F. M. Ghaleb, A. A. Taha, M. Hazman, M. Abd ElLatif, M. Abbass

Algorithm 1 αRem (H,n, K)


Input: An undirected hypergraph H which corresponds to a cyclic database
schemes in the third normal form, the set of α−nodes Vα (H) which is returned
from A0 (H) algorithm, and a set K, which is the set of all keys of this
database schemes.
Output: An α−acyclic hypergraph or failure.
Begin
Assume the set of α−nodes Vα (H) = {v1 , ..., vn }
1: H ′ = H
2: Flag = true
3: while Flag do
4: for i = 1 to n do
5: if vi ∈ K then
6: continue
7: else
8: rename vi to table name vi , where table name is the name of the
table it belongs to
9: end if
10: end for
11: Vα (H ′ ) = A0 (H ′ ) // calling algorithm A0 on H ′
Assume the output of A0 (H ′ ) is the set of α−nodes Vα (H ′ ) =
{v1 , ..., vm }
12: if Vα (H ′) = ∅ then
13: Flag = false // H ′ becomes α−acyclic
14: else if (Vα (H ′) 6= ∅ && ∀v ∈ Vα (H ′ ) : v ∈ K) then
15: Flag = false and Return failure
// all the α−nodes are keys which cannot be renamed
16: else
17: αRem (H ′, m, K)
18: end if
19: end while
return H ′
End.
RDF-BF-Hypergraph Representation for Relational Database 55

3.4 Generating the RDF-BF-Hypergraph schema that


corresponds to RDB schema
In the last step, the proposed algorithm, RDF − BF (V (H), F ) is intro-
duced to generate the RDF-BF-hypergraph schema that corresponds to an
acyclic/cyclic RDB schema according to its set of functional dependency.

4 Results and Discussion


In this section, the result of the proposed model is illustrated through the
following example. Consider the undirected hypergraph of Figure 9, which
corresponds to database scheme of Figure 8. After applying algorithm A0 (H)
to this hypergraph, we get the set of α−nodes: V (H) = {S#, P #, City} [26].
Then after applying algorithm αRem (H,n,K ), the attribute City in both the
SUPPLIER relation scheme and the PART relation scheme will be renamed
to be Supplier City and Part City respectively. This gives an α−acyclic
undirected hypergraph of Figure 14, which corresponding to the acyclic
database schemes of Figure 13.

Figure 13: An instance of acyclic database of the database schemes of figure


8

Figure 14: The acyclic hypergraph corresponding to the acyclic database


schemes of Figure 13
After applying algorithm RDF −BF (V (H), F ) to the resulted set of nodes
of the acyclic undirected hypergraph of Figure 14, we will get the RDF-BF-
Hypergraph schema with set of nodes that represent each attribute name:
56 F. F. M. Ghaleb, A. A. Taha, M. Hazman, M. Abd ElLatif, M. Abbass

Algorithm 2 RDF -BF (V (H), F )


Input: The set of nodes V (H) of the undirected hypergraph H which is the
output of the algorithm αRem (H,n,K ) in the case of the possibility of re-
moving the α−cycle(s) or the set of nodes V (H) of the original hypergraph
of step one and
The set of functional dependencies F of the form X −→ Y


Output: An RDF − BF −Hypergraph H .
Begin


Assume the set of all nodes of H is V (H) = {v1 , ..., vn } and let( H ) =
→ →
− −
( V , E ).

− →

1: V = ∅ and E = ∅
2: for i = 1 to n do

− →

3: V = V ∪ {vi }
4: end for
5: for all X −→ Y ∈ F do
6: if |X| = 1 && |Y | = 1 then

− →

7: E = E ∪ e = ({x}, {y})
// Add a directed edge from x ∈ X to y ∈ Y
// i.e., {x} ∈ T (e), {y} ∈ H(e),
// where X = {x1 , ..., xm }
8: else if (|X| > 1 && |Y | = 1) then

− →
− →
− →

9: E = E ∪ e = ({x1 , ..., xm }, {y}) && V = V ∪ {Bj }
// Add a directed B-hyperedge from all elements in X to the element
// in Y and create a new connector node Bj to connect all
// {x1 , ..., xm } ∈ X to Y i.e., {x1 , ..., xm } ∈ T (e), y ∈ H(e)
10: else if |X| > 1 && |Y | > 1 then

− →
− →
− →

11: E = E ∪e = ({x1 , ..., xm }, {y1 , ..., yk }) && V = V ∪{BFl }
// Add a directed BF-hyperedge from all elements in X to
// all elements in Y and create a new connector node
// BFl to connect all {x1 , ..., xm } ∈ X to {y1 , ..., yk } ∈ Y
// i.e., {x1 , ..., xm } ∈ T (e), {y1 , ..., yk } ∈ H(e)),
// where X = {x1 , ..., xm } and Y = {y1 , ..., yk }
12: end if
13: end for


return H
End.
RDF-BF-Hypergraph Representation for Relational Database 57



V = {S#, Sname, Status, Supplier City, P #, P name, Color,
W eight, P art City , Date, Qty}, and the set of directed hyperedges


E = ({S#}, {Sname}), ({S#}, {Status}), ({S#}, {Supplier City}), ({P #},
{P name}), ({P #}, {Color}), ({P #}, {W eight}), ({P #}, {P art City}),
({S#, P #, Date, B1 }, {Qty}), as shown in Figure 15.

Figure 15: the RDF-BF-Hypergraph schema that corresponds to the acyclic


database schema of Figure 13

Figure 16: The RDF-BF-hypergraph instance of the RDF-BF-hypergraph


schema of Figure 15
A formal representation of RDF graph schema will preserve the integrity
constraints to ensure integrity and data semantics, which is captured from a
database schema. Moreover, a formal representation of RDF graph schema
will facilitate the data conversion and will allow graph connectivity which is
essential for querying RDF by using concepts, techniques and various traver-
sal algorithms. The RDF-BF-hypergraph instance can be easily obtained as
follows:

• Each node will have a label that corresponds to attribute name’s value.
Each hyperedge will have a label that corresponds to that attribute
58 F. F. M. Ghaleb, A. A. Taha, M. Hazman, M. Abd ElLatif, M. Abbass

name [32].
• In the case of the many-to-many relational joins, each B-connector will
be replaced by the name of the bridge table as shown in Figure 16 in
which the B-connector B1 of Figure 15, is replaced by shipment which
is the name of the bridge table of the database of Figure 8.
• In the case of one-to-many relation joins, the hyperedge will connect
the value of the primary key in ”many” table to the referenced value
of the foreign key in the one table through a label that corresponds
to the name of ”one” table, for example Figure 19 shows the instance
of the RDF-BF-hypergraph for one-to-many relation joins of the RDF-
BF-Hypergraph schema of Figure 18, that corresponds to the acyclic
database schema of Figure 17.

Figure 17: An instance of acyclic database

Figure 18: The RDF-BF-Hypergraph schema that corresponds to the acyclic


database schema of Figure 17

Figure 19: The RDF-BF-hypergraph instance of the RDF-BF-hypergraph


schema of Figure 18
One of the earliest languages for the purposes of querying RDF graphs
is Simple Protocol and RDF Query Language (SPARQL) [33]. In SPARQL
RDF-BF-Hypergraph Representation for Relational Database 59

queries are expressed as a graph and query evaluation relies on graph match-
ing between the query and the database [34]. SPARQL shares a conceptual
core, which consists of two natural operations in the context of querying
graphs:

• Graph pattern matching, where a query pattern can be viewed as a


graph P = (Vp, Ep ). Here, the nodes and edges describe query condi-
tions that a subgraph of a given G must satisfy in order to be matched.
Given a graph G, the pattern matching problem is to find all possible
subgraphs of G that match a given pattern P [35].

• Graph navigation that relies on path existence, which asks if there is


some directed path between two nodes, irrespective of edge labels [36].

Figure 20: The RDF-BF-hypergraph instance of the cyclic database of Fig-


ure 8
Assume that we have a query to find all suppliers that ships all parts in City
Giza from the RDF-BF-hypergraph instance of Figure 20 that corresponds to
the cyclic database of Figure 8. By applying the two operations of SPARQL
then City is a constant that will only match edges with the corresponding
label in the original graph [36]. Hence the result will consist of two steps.
In the first step those subgraphs of the original graph, which consist of all
such valid matches corresponding to the label ”City” will be return. In the
second step the supgraphs that match all parts in giza city are chosen to be
the final result. Whereas for the RDF-BF-hypergraph instance of Figure 19
that corresponds to the RDF-BF-hypergraph schema of Figure 18, the result
will be reduced to those subgraphs which consist of all matches corresponding
to the label ”Part City”. Hence in the case of acyclic RDF-BF-hypergraph
instance, the time and space that needed for query answering will be reduced
and finally the semantics of the RDB schema will be maintained.
60 F. F. M. Ghaleb, A. A. Taha, M. Hazman, M. Abd ElLatif, M. Abbass

5 Conclusion and Future Work

The semantic web has an initiative that aims to improve the current state of
the World Wide Web to make data machine-understandable. Semantic web
depends on RDF-graph data model which is a standard model for data inter-
change on the web. Even web contents that are generated automatically from
databases are usually presented without the original structural information
found in databases. In the last decades a special class of database schemes,
called acyclic database schema, was introduced. In the case of acyclic hyper-
graphs the query optimization becomes easier than in the case of cyclicity and
might be recognized in linear time. This paper introduces a BF-hypergraph
representation for the RDF schema that corresponds to RDB schema. The
BF-hypergraph is a suitable model to represent the set of functional depen-
dencies of RDB since the domain and codomain of the functional dependency
may contain more than one attribute. We propose a model, consisting of four
steps, to represent the RDF-BF-hypergraph schema that corresponds to an
acyclic/cyclic RDB schema according to its set of functional dependency.
In the first step a given database schema is represented as hypergraph by
identifying its tables and their attributes. The second step will check for
α-cyclicity by detecting the set of α-nodes. If the hypergraph is α-cyclic,
then we will treat this cycle(s) to see if it can be removed or not, which is
the third step. In the fourth step an RDF-BF-hypergraph schema will be
generated for the resulted acyclic undirected hypergraph in the case if the
cycle(s) can be removed or for the original undirected hypergraph of step
one. Moreover, two algorithms of the proposed model are introduced. The
first algorithm converts a cyclic undirected hypergraph that corresponds to
RDB schema into acyclic one if it is possible, which is used in the third step.
The second algorithm generates the RDF-BF-hypergraph schema that cor-
responds to an acyclic/cyclic RDB schema according to its set of functional
dependency, which is used in the fourth step. A formal representation of the
RDF-graph schema will preserve the integrity constraints to ensure integrity
and data semantics. Moreover, a formal representation of RDF-graph schema
will facilitate the data conversion and will allow graph connectivity which is
essential for querying the RDF by using concepts, techniques and various
traversal algorithms. In the case of acyclic RDF-BF-hypergraph instance,
the time and space needed for query answering will be reduced and, finally,
the semantics of the RDB schema will be maintained. For future work, we
will extend our model to propose a tool to generate the RDF-BF-hypergraph
schema and instance from an existing database.
RDF-BF-Hypergraph Representation for Relational Database 61

References
[1] O. Lassila, R. Swick, Resource Description Framework (RDF) Model
and Syntax Specification, W3C Recommendation, World Wide Web,
http://www.w3.org/TR/1999/REC-rdf-syntax-19990222, 1999.

[2] J. Davies, R. Studer, P. Warren, Semantic Web Technologies Trends


and Research in Ontology-based Systems, 1st edition, John Wiley &
Sons Ltd, 2006.

[3] E. Miller, R. Swick, D. Brickley, Resource Description Frame-


work (RDF) / W3C Semantic Web Activity, World Wide Web,
http//www.w3.org/RDF/, 2004.

[4] T. Berners-Lee, Semantic Web Road map, World Wide Web,


http://www.w3.org/DesignIssues/Semantic.html, 1998.

[5] R. Angles, C. Gutierrez, Survey of graph database models, ACM Com-


puter Survey, 40, no. 1, (2008), 1–39.

[6] R. Virgilio, A. Maccioni, R. Torlone, Converting Relational to Graph


Databases, Proceedings of the First International Workshop on Graph
Data Management Experience and Systems, (2013), 1–3.

[7] F. Michel, J. Montagnat, C. Farozucker, A survey of RDB to RDF trans-


lation approaches and tools, Research Report number I3S/RR 2013-04-
FR, 25 pages, I3S laboratory, Sophia Antipolis, France, 2013.

[8] E. F. Codd, A relational model of data for large shared data banks,
Communication of the ACM, 13, no. 6, (1970), 377–387.

[9] M. C. Yeow, J. Lijun, L. Andrew, K. H. Anthony, Arboricity: An acyclic


hypergraph decomposition problem motivated by database theory, Dis-
crete Applied Mathematics, 160, (2012), 100–107.

[10] R. Fagin, Degrees of acyclicity for hypergraphs and relational database


schemes, Communication of the ACM, 30, no. 3, (1983), 514-550.

[11] C. Beeri, R. Fagin, M. David, Y. Mihalis, On the desirability of acyclic


database schemes, Communication of the ACM, 30, no. 3, (1983), 479–
513.
62 F. F. M. Ghaleb, A. A. Taha, M. Hazman, M. Abd ElLatif, M. Abbass

[12] M. David, D. Ullman, Connections in acyclic hypergraphs, Theoretical


Computer Science, 32, (1984), 185–199.
[13] C. Beeri, R. Fagin, D. Maie, A. O. Mendelzon, J. D. Ullman, M. Yan-
nakakis, Properties of acyclic database schemes, In: Proceedings of the
thirteenth annual ACM symposium on Theory of computing, (1981),
355–362.
[14] S. Chokkalingam, Database and query analysis tools for MySQL: ex-
ploting hypertree and hypergraph decompositions Master Thesis, Fac-
ulty of the Russ College of Engineering and Technology Ohio University,
Ohio, USA, 2006.
[15] K. Ahmet and O. Dan, Covers of Query Results. In: ICDT 2018, 21st
International Conference on Database Theory, 2018.
[16] X. Hu, K. Yi, Instance and Output Optimal Parallel Algorithms for
Acyclic Joins. In: Proceedings of the 38th ACM SIGMOD-SIGACT-
SIGAI Symposium on Principles of Database Systems, (2019).
[17] H. Chen, Y. Yoshida, Testability of Homomorphism Inadmissibility:
Property Testing Meets Database Theory. In: Proceedings of the 38th
ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database
Systems, (2019).
[18] T. Berners-Lee, Relational Databases on the Semantic Web, Internet
note, URL http://www.w3.org/DesignIssues/RDB-RDF.html, 22, 2006
(originally published September 1998).
[19] K. Byrne, Having Triplets Holding Cultural Data as RDF, Proceed-
ings of the ECDL 2008 Workshop on Information Access to Cultural
Heritage, Aarhus, Denmark, (2008), 1–13.
[20] G. Klyne, J. Carroll, Resource Description Framework (RDF): Con-
cepts and Abstract Syntax. W3C Recommendation, World WideWeb,
http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/, 2004.
[21] J. Hayes, C. Gutierrez, Bipartite graphs as intermediate model for RDF,
In: Proceedings of the 3th International Semantic Web Conference
(ISWC), 3298, (2004), 47-61.
[22] J. Bang-Jensen, and G. Gregory, directed graphs: Theory, Algorithms
and Applications, 1st edition, Springer, 2000.
RDF-BF-Hypergraph Representation for Relational Database 63

[23] J. Brault-Baron, Hypergraph acyclicity revisited, ACM Computing Sur-


veys (CSUR), 49.,Issue 3, (2016), 1–32.

[24] C. J. Date, An Introduction to Database System, 3rd edition, 1, Narosa


Publishing House, 2002.

[25] J. A. Philippe, N. N. Samba, On the notion of cycles in hypergraphs,


Discrete Mathematics, 309, (2009), 6535–6543.

[26] F. M. Ghaleb, A. A. Taha, M. Hazman, M. Abd ElLatif, M. Abbass,


On Quasi Cycles in Hypergraph Databases, in submission.

[27] M.H. Graham, On the universal relation, Technical Report, University


of Toronto, 1979.

[28] M. R. Moyano, F. Jose, Strong Connectivity in Directed Hyper-


graphs and its Application to the Atomic Decomposition of Ontologies,
Ph.D. Thesis, Artificial intelligent department Polytechnic university of
Madrid, 2016.

[29] M. Thakur, R. Tripathi, Linear connectivity problems in directed hy-


pergraphs, Theoretical Computer Science, 410, (2009) 2592–2618.

[30] R. El masri, B. Sh. Navathe, Fundamental of Database Systems, 6th


edition, Addison-Wesley, 2010.

[31] G. Gallo, G. Longo, S. Nguyen, S. Pallottino, Directed Hypergraphs


and Applications, Journal of Discrete Applied Mathematics - Special
issue: combinatorial structures and algorithms, 42, Issue 2-3, (1993),
177-201.

[32] A. Antonio, M. E. Vidal, Directed hyper-graphs for RDF documents,


Technical Journal of the Faculty of Engineering, 33, no. 1, (2010), 59–
67.

[33] E. Prudhommeaux, A. Seaborne, SPARQL Query Language for


RDF. W3C Recommendation, http://www.w3.org/TR/rdf-sparql-
query/, 2008.

[34] S. Harris, A. Seaborne, SPARQL 1.1 Query Language. W3C Recom-


mendation, http://www.w3.org/TR/sparql11-query/, 2013.
64 F. F. M. Ghaleb, A. A. Taha, M. Hazman, M. Abd ElLatif, M. Abbass

[35] A. Gubichev, M. Then, Graph Pattern Matching Do We Have to Rein-


vent the Wheel?, Proceedings of Workshop on Graph Data management
Experiences and Systems, (2014), 1–7.

[36] R. Angles, M. Arenas, P. Barcelo, A. Hogan, J. L. Reutter, D. Vrgoc,


Foundations of Modern Query Languages for Graph Databases, Com-
put. Surveys, 50, no. 5, (2017).

You might also like