A Comparative Study of Formal Concept Analysis and Rough Set Theory in Data Analysis

Yiyu (Y.Y.) Yao

Department of Computer Science, University of Regina


Regina, Saskatchewan, Canada S4S 0A2
[email protected]; http://www.cs.uregina.ca/~yyao

Abstract. The theory of rough sets and formal concept analysis are
compared in a common framework based on formal contexts. Different
concept lattices can be constructed. Formal concept analysis focuses on
concepts that are definable by conjunctions of properties, whereas rough set
theory focuses on concepts that are definable by disjunctions of properties. They
produce different types of rules summarizing knowledge embedded in
data.

1 Introduction

Rough set theory and formal concept analysis offer related and complementary
approaches for data analysis. Many efforts have been made to compare and
combine the two theories [1, 4–8, 11, 13]. The results have improved our under-
standing of their similarities and differences. However, there is still a need for
systematic and comparative studies of relationships and interconnections of the
two theories. This paper presents new results and interpretations on the topic.
The theory of rough sets is traditionally formulated based on an equiva-
lence relation on a set of objects called the universe [9, 10]. A pair of unary
set-theoretic operators, called approximation operators, are defined [15]. A con-
cept, represented by a subset of objects, is called a definable concept if its lower
and upper approximations are the same as the set itself. An arbitrary concept
is approximated from below and above by two definable concepts. The notion
of approximation operators can be defined based on two universes linked by a
binary relation [14, 18].
Formal concept analysis is formulated based on the notion of a formal context,
which is a binary relation between a set of objects and a set of properties or
attributes [3, 12]. The binary relation induces set-theoretic operators from sets
of objects to sets of properties, and from sets of properties to sets of objects,
respectively. A formal concept is defined as a pair of a set of objects and a set
of properties connected by the two set-theoretic operators.
The notion of formal contexts provides a common framework for the study of
rough set theory and formal concept analysis, if rough set theory is formulated
based on two universes. Düntsch and Gediga pointed out that the set-theoretic
operators used in the two theories have been considered in modal logics, and
therefore referred to them as modal-style operators [1, 4, 5]. They have demon-
strated that modal-style operators are useful in data analysis.
In this paper, we present a comparative study of rough set theory and for-
mal concept analysis. The two theories aim at different goals and summarize
different types of knowledge. Rough set theory is used for the goal of predic-
tion, and formal concept analysis is used for the goal of description. Two new
concept lattices are introduced in rough set theory. Rough set theory involves
concepts described by disjunctions of properties, whereas formal concept analysis
deals with concepts described by conjunctions of properties.

2 Concept Lattices Induced by Formal Contexts


The notion of formal contexts is used to define two pairs of modal-style operators,
one for formal concept analysis and the other for rough set theory [1, 4].

2.1 Binary relations as formal contexts


Let U and V be two finite and nonempty sets. Elements of U are called objects,
and elements of V are called properties or attributes. The relationships between
objects and properties are described by a binary relation R between U and V ,
which is a subset of the Cartesian product U × V . For a pair of elements x ∈ U
and y ∈ V , if (x, y) ∈ R, also written as xRy, we say that x has the property y,
or the property y is possessed by object x.
An object x ∈ U has the set of properties:
xR = {y ∈ V | xRy} ⊆ V. (1)
A property y is possessed by the set of objects:
Ry = {x ∈ U | xRy} ⊆ U. (2)
The complement of a binary relation is defined by:
$$R^c = U \times V - R = \{(x, y) \mid \neg(xRy)\}, \tag{3}$$
where $^c$ denotes the set complement. That is, $xR^c y$ if and only if $\neg(xRy)$. An
object $x \in U$ does not have the set of properties $xR^c = \{y \in V \mid xR^c y\} =
(xR)^c \subseteq V$. A property $y$ is not possessed by the set of objects
$R^c y = \{x \in U \mid xR^c y\} = (Ry)^c \subseteq U$.
The triplet (U, V, R) is called a binary formal context. For simplicity, we only
consider the binary formal context in the subsequent discussion.
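
As a minimal sketch of these definitions, a binary formal context can be stored as a set of (object, property) pairs. The toy context and the helper names `row` and `col` below are invented for illustration and do not come from the paper.

```python
# Toy formal context (U, V, R); the objects, properties and pairs are invented.
U = {1, 2, 3, 4}
V = {"a", "b", "c", "d"}
R = {(1, "a"), (1, "b"), (2, "a"), (2, "b"),
     (3, "b"), (3, "c"), (4, "c"), (4, "d")}


def row(x, rel):
    """xR: the set of properties related to object x under rel, as in (1)."""
    return {y for (u, y) in rel if u == x}


def col(y, rel):
    """Ry: the set of objects related to property y under rel, as in (2)."""
    return {x for (x, v) in rel if v == y}


# Complement relation R^c = U x V - R, as in (3).
R_c = {(x, y) for x in U for y in V} - R

print(sorted(row(1, R)))             # ['a', 'b']
print(sorted(col("b", R)))           # [1, 2, 3]
print(row(1, R_c) == V - row(1, R))  # True
```

The last line checks the identity $xR^c = (xR)^c$ for one object of the toy context.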

2.2 Formal concept analysis



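For a formal context $(U, V, R)$, we define a set-theoretic operator $^* : 2^U \longrightarrow 2^V$:
$$\begin{aligned}
X^* &= \{y \in V \mid \forall x \in U\,(x \in X \Longrightarrow xRy)\} \\
    &= \{y \in V \mid X \subseteq Ry\} \\
    &= \bigcap_{x \in X} xR.
\end{aligned} \tag{4}$$
It associates a subset of properties $X^*$ to the subset of objects $X$. Similarly, for
any subset of properties $Y \subseteq V$, we can associate a subset of objects $Y^* \subseteq U$:

$$\begin{aligned}
Y^* &= \{x \in U \mid \forall y \in V\,(y \in Y \Longrightarrow xRy)\} \\
    &= \{x \in U \mid Y \subseteq xR\} \\
    &= \bigcap_{y \in Y} Ry.
\end{aligned} \tag{5}$$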

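They have the properties: for $X, X_1, X_2 \subseteq U$ and $Y, Y_1, Y_2 \subseteq V$,

(1) $X_1 \subseteq X_2 \Longrightarrow X_1^* \supseteq X_2^*$, $Y_1 \subseteq Y_2 \Longrightarrow Y_1^* \supseteq Y_2^*$,
(2) $X \subseteq X^{**}$, $Y \subseteq Y^{**}$,
(3) $X^{***} = X^*$, $Y^{***} = Y^*$,
(4) $(X_1 \cup X_2)^* = X_1^* \cap X_2^*$, $(Y_1 \cup Y_2)^* = Y_1^* \cap Y_2^*$.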

A pair of mappings is called a Galois connection if it satisfies (1) and (2), and
hence (3).
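
The following sketch computes the two sufficiency operators on a small invented context and checks properties (1) to (4) by brute force on the object side. The context and all function names are assumptions made for illustration only.

```python
from itertools import combinations

# Toy formal context; invented for illustration.
U = {1, 2, 3, 4}
V = {"a", "b", "c", "d"}
R = {(1, "a"), (1, "b"), (2, "a"), (2, "b"),
     (3, "b"), (3, "c"), (4, "c"), (4, "d")}


def star_objects(X):
    """X*: properties shared by every object in X, as in (4)."""
    return {y for y in V if all((x, y) in R for x in X)}


def star_properties(Y):
    """Y*: objects possessing every property in Y, as in (5)."""
    return {x for x in U if all((x, y) in R for y in Y)}


def subsets(S):
    """All subsets of S as a list of sets."""
    items = sorted(S)
    return [set(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def galois_properties_hold():
    """Check properties (1)-(4) for all subsets of U in the toy context."""
    for X1 in subsets(U):
        closure = star_properties(star_objects(X1))
        if not X1 <= closure:                               # property (2)
            return False
        if star_objects(closure) != star_objects(X1):       # property (3)
            return False
        for X2 in subsets(U):
            if X1 <= X2 and not star_objects(X1) >= star_objects(X2):
                return False                                # property (1)
            if star_objects(X1 | X2) != star_objects(X1) & star_objects(X2):
                return False                                # property (4)
    return True


print(galois_properties_hold())  # True
```

A brute-force check like this only confirms the properties on one finite example; it is not a proof.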
Consider now the dual operator of ∗ defined by [1]:

$$\begin{aligned}
X^{\#} &= X^{c*c} \\
       &= \{y \in V \mid \exists x \in U\,(x \in X^c \wedge \neg(xRy))\} \\
       &= \{y \in V \mid \neg(X^c \subseteq Ry)\} \\
       &= \{y \in V \mid X^c \cap (Ry)^c \neq \emptyset\}.
\end{aligned} \tag{6}$$

For a subset of properties $Y \subseteq V$, $Y^{\#}$ can be similarly defined. Properties of $^{\#}$
can be obtained from the properties of $^*$. For example, we have $(X_1 \cap X_2)^{\#} =
X_1^{\#} \cup X_2^{\#}$.
By definition, $\{x\}^* = xR$ is the set of properties possessed by $x$, and $\{y\}^* =
Ry$ is the set of objects having property $y$. For a set of objects $X$, $X^*$ is the
maximal set of properties shared by all objects in $X$. Similarly, for a set of
properties $Y$, $Y^*$ is the maximal set of objects that have all properties in $Y$. For
a subset $X \subseteq U$, a property in $X^{\#}$ is not possessed by at least one object not
in $X$.
A pair $(X, Y)$, $X \subseteq U$, $Y \subseteq V$, is called a formal concept if $X = Y^*$ and
$Y = X^*$. The set of objects $X$ is referred to as the extension of the concept,
and the set of properties $Y$ is referred to as the intension of the concept. Objects
in $X$ share all properties $Y$, and only properties $Y$ are possessed by all objects
in $X$. The set of all formal concepts forms a complete lattice called a concept
lattice [3]. The meet and join of the lattice are given by:

$$\begin{aligned}
(X_1, Y_1) \wedge (X_2, Y_2) &= (X_1 \cap X_2, (Y_1 \cup Y_2)^{**}), \\
(X_1, Y_1) \vee (X_2, Y_2) &= ((X_1 \cup X_2)^{**}, Y_1 \cap Y_2).
\end{aligned} \tag{7}$$

By property (3), for any subset $X$ of $U$, we have a formal concept $(X^{**}, X^*)$,
and for any subset $Y$ of $V$, we have a formal concept $(Y^*, Y^{**})$.
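
Since every subset $X$ of $U$ yields the formal concept $(X^{**}, X^*)$, a small context can be explored by enumerating all subsets. The sketch below does this naively on an invented context and implements the meet of (7); it is exponential in the size of $U$ and is meant only as an illustration, not as an efficient lattice construction algorithm.

```python
from itertools import combinations

# Toy formal context; invented for illustration.
U = {1, 2, 3, 4}
V = {"a", "b", "c", "d"}
R = {(1, "a"), (1, "b"), (2, "a"), (2, "b"),
     (3, "b"), (3, "c"), (4, "c"), (4, "d")}


def star_objects(X):
    """X*: properties shared by every object in X."""
    return {y for y in V if all((x, y) in R for x in X)}


def star_properties(Y):
    """Y*: objects possessing every property in Y."""
    return {x for x in U if all((x, y) in R for y in Y)}


def formal_concepts():
    """Collect (X**, X*) for every subset X of U."""
    found = set()
    items = sorted(U)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            intent = star_objects(set(combo))
            extent = star_properties(intent)
            found.add((frozenset(extent), frozenset(intent)))
    return found


def meet(c1, c2):
    """Meet of two formal concepts, following (7)."""
    (X1, Y1), (X2, Y2) = c1, c2
    closed_intent = star_objects(star_properties(Y1 | Y2))
    return (frozenset(X1 & X2), frozenset(closed_intent))


concepts = formal_concepts()
print(len(concepts))  # 7
```

In this toy context the meet of $(\{1,2,3\}, \{b\})$ and $(\{3,4\}, \{c\})$ is $(\{3\}, \{b,c\})$, which is again one of the enumerated concepts.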
2.3 Rough sets

We consider a slightly different formulation of rough set theory based on a binary


relation between two universes [4, 14, 18].
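Given a formal context, we define a pair of dual approximation operators
$^{\Box}, ^{\Diamond} : 2^U \longrightarrow 2^V$,

$$\begin{aligned}
X^{\Box} &= \{y \in V \mid \forall x \in U\,(xRy \Longrightarrow x \in X)\} \\
         &= \{y \in V \mid Ry \subseteq X\},
\end{aligned} \tag{8}$$
$$\begin{aligned}
X^{\Diamond} &= \{y \in V \mid \exists x \in U\,(xRy \wedge x \in X)\} \\
             &= \{y \in V \mid Ry \cap X \neq \emptyset\} \\
             &= \bigcup_{x \in X} xR.
\end{aligned} \tag{9}$$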

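Similarly, we define another pair of approximation operators $^{\Box}, ^{\Diamond} : 2^V \longrightarrow 2^U$,

$$\begin{aligned}
Y^{\Box} &= \{x \in U \mid \forall y \in V\,(xRy \Longrightarrow y \in Y)\} \\
         &= \{x \in U \mid xR \subseteq Y\},
\end{aligned} \tag{10}$$
$$\begin{aligned}
Y^{\Diamond} &= \{x \in U \mid \exists y \in V\,(xRy \wedge y \in Y)\} \\
             &= \{x \in U \mid xR \cap Y \neq \emptyset\} \\
             &= \bigcup_{y \in Y} Ry.
\end{aligned} \tag{11}$$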

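They have the properties: for $X, X_1, X_2 \subseteq U$ and $Y, Y_1, Y_2 \subseteq V$,

(i) $X_1 \subseteq X_2 \Longrightarrow [X_1^{\Box} \subseteq X_2^{\Box},\ X_1^{\Diamond} \subseteq X_2^{\Diamond}]$,
    $Y_1 \subseteq Y_2 \Longrightarrow [Y_1^{\Box} \subseteq Y_2^{\Box},\ Y_1^{\Diamond} \subseteq Y_2^{\Diamond}]$,
(ii) $X^{\Box\Diamond} \subseteq X \subseteq X^{\Diamond\Box}$, $Y^{\Box\Diamond} \subseteq Y \subseteq Y^{\Diamond\Box}$,
(iii) $X^{\Diamond\Box\Diamond} = X^{\Diamond}$, $Y^{\Diamond\Box\Diamond} = Y^{\Diamond}$,
      $X^{\Box\Diamond\Box} = X^{\Box}$, $Y^{\Box\Diamond\Box} = Y^{\Box}$,
(iv) $(X_1 \cap X_2)^{\Box} = X_1^{\Box} \cap X_2^{\Box}$, $(X_1 \cup X_2)^{\Diamond} = X_1^{\Diamond} \cup X_2^{\Diamond}$,
     $(Y_1 \cap Y_2)^{\Box} = Y_1^{\Box} \cap Y_2^{\Box}$, $(Y_1 \cup Y_2)^{\Diamond} = Y_1^{\Diamond} \cup Y_2^{\Diamond}$.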
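
The approximation operators of (8) to (11) translate directly into set comprehensions. The sketch below uses an invented context and hypothetical function names, and checks properties (ii) and (iii) on the object side by exhaustive enumeration.

```python
from itertools import combinations

# Toy formal context; invented for illustration.
U = {1, 2, 3, 4}
V = {"a", "b", "c", "d"}
R = {(1, "a"), (1, "b"), (2, "a"), (2, "b"),
     (3, "b"), (3, "c"), (4, "c"), (4, "d")}


def row(x):
    """xR: properties of object x."""
    return {y for (u, y) in R if u == x}


def col(y):
    """Ry: objects having property y."""
    return {x for (x, v) in R if v == y}


def box_objects(X):
    """Necessity operator (8): properties whose objects all lie in X."""
    return {y for y in V if col(y) <= X}


def diamond_objects(X):
    """Possibility operator (9): properties held by some object in X."""
    return {y for y in V if col(y) & X}


def box_properties(Y):
    """Necessity operator (10): objects whose properties all lie in Y."""
    return {x for x in U if row(x) <= Y}


def diamond_properties(Y):
    """Possibility operator (11): objects with some property in Y."""
    return {x for x in U if row(x) & Y}


def check_ii_and_iii():
    """Check properties (ii) and (iii) for every subset X of U."""
    items = sorted(U)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            X = set(combo)
            lower = diamond_properties(box_objects(X))
            upper = box_properties(diamond_objects(X))
            if not (lower <= X <= upper):
                return False
            if diamond_objects(upper) != diamond_objects(X):
                return False
            if box_objects(lower) != box_objects(X):
                return False
    return True


print(sorted(box_objects({1, 2, 3})))      # ['a', 'b']
print(sorted(diamond_objects({1, 2, 3})))  # ['a', 'b', 'c']
print(check_ii_and_iii())                  # True
```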

Based on the notion of approximation operators, we introduce two new concept
lattices in rough set theory.
A pair $(X, Y)$, $X \subseteq U$, $Y \subseteq V$, is called an object oriented formal concept if
$X = Y^{\Diamond}$ and $Y = X^{\Box}$. If an object has a property in $Y$ then the object belongs
to $X$. Furthermore, only objects in $X$ have properties in $Y$. The family of all
object oriented formal concepts forms a lattice. Specifically, the meet $\wedge$ and join
$\vee$ are defined by:

$$\begin{aligned}
(X_1, Y_1) \wedge (X_2, Y_2) &= ((Y_1 \cap Y_2)^{\Diamond}, Y_1 \cap Y_2), \\
(X_1, Y_1) \vee (X_2, Y_2) &= (X_1 \cup X_2, (X_1 \cup X_2)^{\Box}).
\end{aligned} \tag{12}$$
For a set of objects $X \subseteq U$, we have an object oriented formal concept
$(X^{\Box\Diamond}, X^{\Box})$. For a set of properties $Y \subseteq V$, we have
$(Y^{\Diamond}, Y^{\Diamond\Box})$.
A pair $(X, Y)$, $X \subseteq U$, $Y \subseteq V$, is called a property oriented formal concept
if $X = Y^{\Box}$ and $Y = X^{\Diamond}$. If a property is possessed by an object in $X$ then the
property must be in $Y$. Furthermore, only properties $Y$ are possessed by objects
in $X$. The family of all property oriented formal concepts forms a lattice with
meet $\wedge$ and join $\vee$ defined by:
$$\begin{aligned}
(X_1, Y_1) \wedge (X_2, Y_2) &= (X_1 \cap X_2, (X_1 \cap X_2)^{\Diamond}), \\
(X_1, Y_1) \vee (X_2, Y_2) &= ((Y_1 \cup Y_2)^{\Box}, Y_1 \cup Y_2).
\end{aligned} \tag{13}$$
For a set of objects $X \subseteq U$, we can construct a property oriented formal concept
$(X^{\Diamond\Box}, X^{\Diamond})$. For a set of properties $Y \subseteq V$, there is a property oriented formal
concept $(Y^{\Box}, Y^{\Box\Diamond})$. The property oriented concept lattice was introduced by
Düntsch and Gediga [4].
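
Both families of concepts can be enumerated on a small context by applying the constructions $(X^{\Box\Diamond}, X^{\Box})$ and $(X^{\Diamond\Box}, X^{\Diamond})$ to every subset $X$ of $U$. The sketch below does so for an invented context; the function names are hypothetical and the enumeration is exponential, so it is only suitable for toy examples.

```python
from itertools import combinations

# Toy formal context; invented for illustration.
U = {1, 2, 3, 4}
V = {"a", "b", "c", "d"}
R = {(1, "a"), (1, "b"), (2, "a"), (2, "b"),
     (3, "b"), (3, "c"), (4, "c"), (4, "d")}


def row(x):
    return {y for (u, y) in R if u == x}


def col(y):
    return {x for (x, v) in R if v == y}


def box_objects(X):
    return {y for y in V if col(y) <= X}


def diamond_objects(X):
    return {y for y in V if col(y) & X}


def box_properties(Y):
    return {x for x in U if row(x) <= Y}


def diamond_properties(Y):
    return {x for x in U if row(x) & Y}


def subsets(S):
    items = sorted(S)
    return [set(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def object_oriented_concepts():
    """All pairs (X^box-diamond, X^box) for X a subset of U."""
    return {(frozenset(diamond_properties(box_objects(X))),
             frozenset(box_objects(X))) for X in subsets(U)}


def property_oriented_concepts():
    """All pairs (X^diamond-box, X^diamond) for X a subset of U."""
    return {(frozenset(box_properties(diamond_objects(X))),
             frozenset(diamond_objects(X))) for X in subsets(U)}


print(len(object_oriented_concepts()))    # 7
print(len(property_oriented_concepts()))  # 7
```

For instance, $(\{1,2,4\}, \{a,d\})$ is an object oriented formal concept of this toy context: the objects having property $a$ or $d$ are exactly $1$, $2$ and $4$.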

2.4 Relationships between operators and other representations


Düntsch and Gediga referred to the four operators $^*$, $^{\#}$, $^{\Box}$, and $^{\Diamond}$ as modal-
style operators, called the sufficiency, dual sufficiency, necessity and possibility
operators, respectively [1, 4].
The relationships between the four modal-style operators can be stated as follows:
$$\begin{aligned}
X^{\Box}_R &= \{y \in V \mid Ry \subseteq X\} \\
           &= \{y \in V \mid X^c \subseteq (Ry)^c\} \\
           &= \{y \in V \mid X^c \subseteq R^c y\} \\
           &= (X^c)^*_{R^c};
\end{aligned} \tag{14}$$
$$\begin{aligned}
X^{\Diamond}_R &= \{y \in V \mid X \cap Ry \neq \emptyset\} \\
               &= \{y \in V \mid X^{cc} \cap (Ry)^{cc} \neq \emptyset\} \\
               &= (X^c)^{\#}_{R^c},
\end{aligned} \tag{15}$$
where the subscript $R$ indicates that the operator is defined with respect
to the relation $R$. Conversely, we have $X^*_R = (X^c)^{\Box}_{R^c}$ and
$X^{\#}_R = (X^c)^{\Diamond}_{R^c}$.
The relationships between binary relations and operators are summarized by:
for $x \in U$, $y \in V$,
$$\begin{aligned}
& xR = \{x\}^* = \{x\}^{\Diamond}, \quad Ry = \{y\}^* = \{y\}^{\Diamond}, \\
& xRy \iff x \in \{y\}^* \iff y \in \{x\}^*, \\
& xRy \iff x \in \{y\}^{\Diamond} \iff y \in \{x\}^{\Diamond}.
\end{aligned} \tag{16}$$
From a binary relation $R$, we can define an equivalence relation $E_U$ on $U$:
$$x E_U x' \iff xR = x'R. \tag{17}$$
Two objects are equivalent if they have exactly the same set of properties [11].
Similarly, we define an equivalence relation $E_V$ on $V$:
$$y E_V y' \iff Ry = Ry'. \tag{18}$$
Two properties are equivalent if they are possessed by exactly the same set of
objects [11].
Now we define a mapping, $j : 2^U \longrightarrow 2^V$, called the basic set assignment as
follows:
$$j(X) = \{y \in V \mid Ry = X\}. \tag{19}$$
A property $y$ is assigned to the set of objects that have the property. The
following set:
$$\{j(X) \neq \emptyset \mid X \subseteq U\}, \tag{20}$$
is in fact the partition induced by the equivalence relation $E_V$. Similarly, a basic
set assignment $j : 2^V \longrightarrow 2^U$ is given by:

$$j(Y) = \{x \in U \mid xR = Y\}. \tag{21}$$

The set:
$$\{j(Y) \neq \emptyset \mid Y \subseteq V\}, \tag{22}$$
is the partition induced by the equivalence relation $E_U$.
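In terms of the basic set assignment, we can re-express the operators $^*$, $^{\#}$, $^{\Box}$
and $^{\Diamond}$ as:
$$\begin{aligned}
X^* &= \bigcup_{X \subseteq F} j(F), & X^{\#} &= \bigcup_{X \cup F \neq U} j(F), \\
X^{\Box} &= \bigcup_{F \subseteq X} j(F), & X^{\Diamond} &= \bigcup_{F \cap X \neq \emptyset} j(F).
\end{aligned} \tag{23}$$

It follows that $X^* \cap X^{\Box} = j(X)$.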
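
As an illustrative check of (23), the sketch below computes the basic set assignment $j$ on an invented context and rebuilds the four operators from it. The helper `union_j` and the other names are hypothetical.

```python
from itertools import combinations

# Toy formal context; invented for illustration.
U = {1, 2, 3, 4}
V = {"a", "b", "c", "d"}
R = {(1, "a"), (1, "b"), (2, "a"), (2, "b"),
     (3, "b"), (3, "c"), (4, "c"), (4, "d")}


def col(y):
    return {x for (x, v) in R if v == y}


def star(X):
    return {y for y in V if X <= col(y)}


def sharp(X):
    return V - star(U - X)


def box(X):
    return {y for y in V if col(y) <= X}


def diamond(X):
    return {y for y in V if col(y) & X}


def j(F):
    """Basic set assignment (19): properties whose object set is exactly F."""
    return {y for y in V if col(y) == F}


def subsets(S):
    items = sorted(S)
    return [set(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def union_j(condition):
    """Union of j(F) over all F (subsets of U) satisfying condition(F)."""
    out = set()
    for F in subsets(U):
        if condition(F):
            out |= j(F)
    return out


def formula_23_holds():
    """Check (23) and X* & X^box == j(X) for every subset X of U."""
    for X in subsets(U):
        if star(X) != union_j(lambda F: X <= F):
            return False
        if sharp(X) != union_j(lambda F: (X | F) != U):
            return False
        if box(X) != union_j(lambda F: F <= X):
            return False
        if diamond(X) != union_j(lambda F: bool(F & X)):
            return False
        if star(X) & box(X) != j(X):
            return False
    return True


print(formula_23_holds())  # True
```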

3 Data Analysis using Modal-style Operators


Modal-style operators provide useful tools for data analysis [1, 4]. Different
operators lead to different types of rules summarizing the knowledge embedded in
a formal context. By the duality of operators, we only consider $^*$ and $^{\Box}$.

3.1 Rough set theory: predicting the membership of an object based on its properties

For a set of objects $X \subseteq U$, we can construct a set of properties $X^{\Box}$. It can be
used to derive rules that determine whether an object is in $X$. If an object has
a property in $X^{\Box}$, the object must be in $X$. That is,

$$\forall x \in U\,[\exists y \in V\,(y \in X^{\Box} \wedge xRy) \Longrightarrow x \in X].$$

It can be re-expressed as a rule: for $x \in U$,

$$\bigvee_{y \in X^{\Box}} xRy \Longrightarrow x \in X. \tag{24}$$
In general, the reverse implication does not hold.
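In order to derive a reverse implication, we construct another set of objects
$X^{\Box\Diamond} \subseteq X$. For the set of objects, we have a rule: for $x \in U$,
$$x \in X^{\Box\Diamond} \Longrightarrow \bigvee_{y \in X^{\Box}} xRy. \tag{25}$$

This can be shown as follows:

$$\begin{aligned}
x \in X^{\Box\Diamond} &\Longrightarrow xR \cap X^{\Box} \neq \emptyset \\
&\Longrightarrow \exists y \in V\,(xRy \wedge y \in X^{\Box}) \\
&\Longrightarrow \bigvee_{y \in X^{\Box}} xRy.
\end{aligned} \tag{26}$$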

In general, $X$ is not the same as $X^{\Box\Diamond}$, which suggests that one cannot establish
a double implication rule for an arbitrary set.
For a set of objects $X \subseteq U$, the pair $(X^{\Box\Diamond}, X^{\Box})$ is an object oriented formal
concept. Applying rule (24) to the set $X^{\Box\Diamond}$ and using the property
$X^{\Box\Diamond\Box} = X^{\Box}$, it follows:
$$\bigvee_{y \in X^{\Box}} xRy \Longrightarrow x \in X^{\Box\Diamond}. \tag{27}$$

By combining it with rule (25), we have a double implication rule:

$$x \in X^{\Box\Diamond} \iff \bigvee_{y \in X^{\Box}} xRy. \tag{28}$$

The results can be extended to any object oriented formal concept. For $(X =
Y^{\Diamond}, Y = X^{\Box})$, we have a rule:
$$x \in X \iff \bigvee_{y \in Y} xRy. \tag{29}$$

That is, the set of objects $X$ and the set of properties $Y$ in $(X, Y)$ uniquely
determine each other.
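
The prediction rules (24) and (28) can be illustrated on a small invented context. In the sketch below, the target set `X`, the context, and the function `predicts_membership` are all assumptions made for the example.

```python
# Toy formal context; invented for illustration.
U = {1, 2, 3, 4}
V = {"a", "b", "c", "d"}
R = {(1, "a"), (1, "b"), (2, "a"), (2, "b"),
     (3, "b"), (3, "c"), (4, "c"), (4, "d")}


def row(x):
    return {y for (u, y) in R if u == x}


def col(y):
    return {x for (x, v) in R if v == y}


def box_objects(X):
    return {y for y in V if col(y) <= X}


def diamond_properties(Y):
    return {x for x in U if row(x) & Y}


X = {1, 3, 4}                              # hypothetical target set
sufficient = box_objects(X)                # X^box
definable = diamond_properties(sufficient)  # X^box-diamond


def predicts_membership(x):
    """Left-hand side of rule (24): x has at least one property in X^box."""
    return bool(row(x) & sufficient)


predicted = {x for x in U if predicts_membership(x)}

print(sorted(sufficient))      # ['c', 'd']
print(predicted <= X)          # True: rule (24)
print(predicted == definable)  # True: rule (28)
print(predicted == X)          # False: object 1 is in X but is not predicted
```

Here object 1 belongs to `X` but has neither `c` nor `d`, which is a concrete instance of the reverse of rule (24) failing.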

3.2 Formal concept analysis: summarizing the common properties of a set of objects

In formal concept analysis, we identify the properties shared by a set of objects,
which provides a description of the objects. Through the operator $^*$, one can
infer the properties of an object based on its membership in a set $X$. More
specifically, we have:
$$\forall y \in V\ \forall x \in U\,[(y \in X^* \wedge x \in X) \Longrightarrow xRy].$$
This leads to a rule: for $x \in U$,
$$x \in X \Longrightarrow \bigwedge_{y \in X^*} xRy. \tag{30}$$
The rule suggests that an object in $X$ must have all properties in $X^*$. The reverse
implication does not hold in general.
For the construction of a reverse implication, we construct another set of
objects $X^{**} \supseteq X$. In this case, we have:
$$\bigwedge_{y \in X^*} xRy \Longrightarrow x \in X^{**}. \tag{31}$$

An object having all properties in $X^*$ must be in $X^{**}$. For an arbitrary set $X$,
$X$ may be only a subset of $X^{**}$. One therefore may not be able to establish a
double implication rule for an arbitrary set of objects.
A set of objects $X$ induces a formal concept $(X^{**}, X^*)$. By property $X^{***} =
X^*$ and rule (30) applied to the set $X^{**}$, we have:
$$x \in X^{**} \Longrightarrow \bigwedge_{y \in X^*} xRy. \tag{32}$$

Combining it with rule (31) results in: for $x \in U$,

$$x \in X^{**} \iff \bigwedge_{y \in X^*} xRy. \tag{33}$$

In general, for a formal concept $(X = Y^*, Y = X^*)$, we have:

$$x \in X \iff \bigwedge_{y \in Y} xRy. \tag{34}$$

That is, the set of objects $X$ and the set of properties $Y$ determine each other.
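
The description rules (30) and (33) can be illustrated in the same way. The context, the set `X`, and the function `has_all_shared` below are invented for the example.

```python
# Toy formal context; invented for illustration.
U = {1, 2, 3, 4}
V = {"a", "b", "c", "d"}
R = {(1, "a"), (1, "b"), (2, "a"), (2, "b"),
     (3, "b"), (3, "c"), (4, "c"), (4, "d")}


def row(x):
    return {y for (u, y) in R if u == x}


def star_objects(X):
    """X*: properties shared by every object in X."""
    return {y for y in V if all((x, y) in R for x in X)}


def star_properties(Y):
    """Y*: objects possessing every property in Y."""
    return {x for x in U if all((x, y) in R for y in Y)}


X = {1, 3}                        # hypothetical set of objects
shared = star_objects(X)          # X*
closure = star_properties(shared)  # X**


def has_all_shared(x):
    """Right-hand side of rule (30): x has every property in X*."""
    return shared <= row(x)


described = {x for x in U if has_all_shared(x)}

print(sorted(shared))        # ['b']
print(X <= described)        # True: rule (30)
print(described == closure)  # True: rule (33)
print(described == X)        # False: object 2 has property b but is not in X
```

Object 2 has every property in $X^*$ without belonging to $X$, so the double implication holds for $X^{**} = \{1, 2, 3\}$ but not for $X$ itself.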

3.3 Comparison
Rough set theory and formal concept analysis offer two different approaches
for data analysis. A detailed comparison of the two methods may provide more
insights into data analysis.
Fayyad et al. identified two high-level goals of data mining as prediction and
description [2]. Prediction involves the use of some variables to predict the values
of some other variables. Description focuses on patterns that describe the data.
For a set of objects $X \subseteq U$, the operator $^{\Box}$ identifies a set of properties $X^{\Box}$
that can be used to predict the membership of an object $x$ with respect to $X$. It
attempts to achieve the goal of prediction. In contrast, the operator $^*$ identifies
a set of properties $X^*$ that are shared by all objects in $X$. In other words, it
provides a method for description and summarization. In special cases, the tasks
of prediction and description become the same one for certain sets of objects. In
rough set theory, this happens for the family of object oriented formal concepts.
In formal concept analysis, this happens for the family of formal concepts.
A property in $X^{\Box}$ is sufficient to decide that an object having the property
is in $X$. The set $X^{\Box}$ consists of sufficient properties for an object to be in $X$. On
the other hand, an object in $X$ must have properties in $X^*$. The set $X^*$ consists
of necessary properties of an object in $X$. Therefore, rough set theory and formal
concept analysis focus on two opposite directions of inference. The operator $^{\Box}$
enables us to infer the membership of an object based on its properties. On the
other hand, through the operator $^*$, one can infer the properties of an object
based on its membership in $X$. By combining the two types of knowledge, we
obtain a more complete picture of the data.
By comparing the rules derived by rough set theory and formal concept analysis,
we can conclude that the two theories focus on different types of concepts.
Rough set theory involves concepts described by disjunctions of properties, whereas
formal concept analysis deals with concepts described by conjunctions of properties.
They represent two extreme cases. In general, one may consider other types of
concepts.
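By definition, $^*$ and $^{\Diamond}$ represent the two extreme cases in describing a set
of objects based on their properties. Assume that $xR \neq \emptyset$ and $Ry \neq \emptyset$
for all $x \in U$ and $y \in V$. Then we have the rules: for $x \in U$,
$$\begin{aligned}
x \in X &\Longrightarrow \exists y \in V\,(y \in X^{\Diamond} \wedge xRy), \\
x \in X &\Longrightarrow \forall y \in V\,(y \in X^* \Longrightarrow xRy).
\end{aligned} \tag{35}$$
That is, an object in $X$ has all properties in $X^*$ and at least one property in $X^{\Diamond}$. The
pair $(X^*, X^{\Diamond})$, with $X^* \subseteq X^{\Diamond}$ for nonempty $X$, thus provides a characterization of $X$ in terms of
properties.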
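
A short sketch can confirm the rules in (35) on an invented context in which every object has at least one property and every property is possessed by at least one object. The names are hypothetical, and the check is restricted to nonempty $X$, since for the empty set $X^* = V$ while $X^{\Diamond} = \emptyset$.

```python
from itertools import combinations

# Toy formal context with no empty rows or columns; invented for illustration.
U = {1, 2, 3, 4}
V = {"a", "b", "c", "d"}
R = {(1, "a"), (1, "b"), (2, "a"), (2, "b"),
     (3, "b"), (3, "c"), (4, "c"), (4, "d")}


def row(x):
    return {y for (u, y) in R if u == x}


def star(X):
    """X*: properties shared by all objects in X (intersection of rows)."""
    return {y for y in V if all((x, y) in R for x in X)}


def diamond(X):
    """X^diamond: properties held by some object in X (union of rows)."""
    return {y for y in V if any((x, y) in R for x in X)}


def characterization_holds():
    """Check (35) and X* <= X^diamond for every nonempty subset X of U."""
    items = sorted(U)
    for r in range(1, len(items) + 1):
        for combo in combinations(items, r):
            X = set(combo)
            if not star(X) <= diamond(X):
                return False
            for x in X:
                if not (row(x) & diamond(X)):   # at least one property
                    return False
                if not star(X) <= row(x):       # all shared properties
                    return False
    return True


print(sorted(star({1, 3})), sorted(diamond({1, 3})))  # ['b'] ['a', 'b', 'c']
print(characterization_holds())                       # True
```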

4 Conclusion
Both the theory of rough sets and formal concept analysis formalize in some
meaningful way the notion of concepts. The two theories are compared in a
common framework consisting of a formal context. Different types of concepts
are considered in the two theories. They capture different aspects of concepts.
Rough set theory involves concepts described by disjunctions of properties, whereas
formal concept analysis deals with concepts described by conjunctions of properties.
The two theories support inference in opposite directions. The operator
$^{\Box}$ enables us to infer the membership of an object based on its properties,
and the operator $^*$ enables us to infer the properties of an object based on its
membership in $X$.
The combination of the two theories leads to a better understanding of knowl-
edge embedded in data. One may combine modal-style operators to obtain new
modal-style operators and analyze data using the new operators [1, 4, 5]. Fur-
ther studies on the relationships between the two theories would lead to new
results [16, 17].

References
1. Düntsch, I. and Gediga, G. Approximation operators in qualitative data analysis,
in: Theory and Application of Relational Structures as Knowledge Instruments, de
Swart, H., Orlowska, E., Schmidt, G. and Roubens, M. (Eds.), Springer, Heidel-
berg, 216-233, 2003.
2. Fayyad, U.M., Piatetsky-Shapiro, G. and Smyth, P. From data mining to knowl-
edge discovery: an overview, in: Advances in knowledge discovery and data mining,
Fayyad, U.M., Piatetsky-Shapiro, G., Smyth, P. and Uthurusamy, R. (Eds.), 1-34,
AAAI/MIT Press, Menlo Park, California, 1996.
3. Ganter, B. and Wille, R. Formal Concept Analysis, Mathematical Foundations,
Springer, Berlin, 1999.
4. Gediga, G. and Düntsch, I. Modal-style operators in qualitative data analysis,
Proceedings of the 2002 IEEE International Conference on Data Mining, 155-162,
2002.
5. Gediga, G. and Düntsch, I. Skill set analysis in knowledge structures, to appear in
British Journal of Mathematical and Statistical Psychology.
6. Hu, K., Sui, Y., Lu, Y., Wang, J. and Shi, C. Concept approximation in concept
lattice, Knowledge Discovery and Data Mining, Proceedings of the 5th Pacific-
Asia Conference, PAKDD 2001, Lecture Notes in Computer Science 2035, 167-173,
2001.
7. Kent, R.E. Rough concept analysis: a synthesis of rough sets and formal concept
analysis, Fundamenta Informaticae, 27, 169-181, 1996.
8. Pagliani, P. From concept lattices to approximation spaces: algebraic structures of
some spaces of partial objects, Fundamenta Informaticae, 18, 1-25, 1993.
9. Pawlak, Z. Rough sets, International Journal of Computer and Information Sci-
ences, 11, 341-356, 1982.
10. Pawlak, Z. Rough Sets, Theoretical Aspects of Reasoning about Data, Kluwer Aca-
demic Publishers, Dordrecht, 1991.
11. Saquer, J. and Deogun, J.S. Formal rough concept analysis, New Directions in
Rough Sets, Data Mining, and Granular-Soft Computing, 7th International Work-
shop, RSFDGrC ’99, Lecture Notes in Computer Science 1711, Springer, Berlin,
91-99, 1999.
12. Wille, R. Restructuring lattice theory: an approach based on hierarchies of con-
cepts, in: Ordered Sets, Rival, I. (Ed.), Reidel, Dordrecht-Boston, 445-470, 1982.
13. Wolff, K.E. A conceptual view of knowledge bases in rough set theory, Rough
Sets and Current Trends in Computing, Second International Conference, RSCTC
2000, Lecture Notes in Computer Science 2005, Springer, Berlin, 220-228, 2001.
14. Wong, S.K.M., Wang, L.S., and Yao, Y.Y. Interval structure: a framework for rep-
resenting uncertain information, Uncertainty in Artificial Intelligence: Proceedings
of the 8th Conference, Morgan Kaufmann Publishers, 336-343, 1992.
15. Yao, Y.Y. Two views of the theory of rough sets in finite universes, International
Journal of Approximate Reasoning, 15, 291-317, 1996.
16. Yao, Y.Y. Concept lattices in rough set theory, to appear in Proceedings of 23rd In-
ternational Meeting of the North American Fuzzy Information Processing Society,
2004.
17. Yao, Y.Y. and Chen, Y.H. Rough set approximations in formal concept analysis, to
appear in Proceedings of 23rd International Meeting of the North American Fuzzy
Information Processing Society, 2004.
18. Yao, Y.Y., Wong, S.K.M. and Lin, T.Y. A review of rough set models, in: Rough
Sets and Data Mining: Analysis for Imprecise Data, Lin, T.Y. and Cercone, N.
(Eds.), Kluwer Academic Publishers, Boston, 47-75, 1997.
