Journal Rough Set
1Anshit Mukherjee, 2,*Gunjan Mukherjee & 3Kamal Kumar Ghosh
1Bachelor in Computer Science, Abacus Institute of Engineering and Management, Hooghly, West Bengal, India
2Department of Computational Sciences, Brainware University, Barasat, West Bengal, India
3Department of Science and Humanities, Abacus Institute of Engineering & Management, Hooghly, West Bengal, India
*Corresponding Author Email: [email protected]
Abstract
Rough set theory is a mathematical approach to dealing with uncertainty and vagueness in data. It was
introduced as a way to approximate classical sets using lower and upper bounds. Rough set theory has
been applied to various domains such as data mining, knowledge discovery, machine learning, soft
computing, medical analysis, synthesis of switching circuits, and civil engineering. Rough set theory can
handle imprecise and noisy data by finding structural relationships and dependencies among attributes. It
can also reduce redundant and irrelevant attributes and generate decision rules from data. Rough set
theory is closely related to fuzzy set theory, but differs in that it uses multiple memberships instead of
partial memberships to model uncertainty. A large body of research work has taken place in this domain, with many fruitful outcomes that have expanded the field across mathematical and technological applications. Despite such research-oriented progress in many versatile domains, several areas of research remain untouched, as many problems are still unsolved. This paper deals mainly with some of the still unsolved questions on rough sets that are under study and with ways to overcome such limitations, citing proper techniques, examples, working models, and graphs.
Keywords: consistent, scalable, machine learning, pattern recognition, decision making, artificial intelligence.
INTRODUCTION
Rough set theory is a mathematical framework for dealing with uncertainty and vagueness in
data analysis. It was introduced by Zdzisław Pawlak (1982) in the early 1980s as a way of
representing and manipulating sets that have imprecise boundaries. Since then, a number of papers
have been published by Iwinski (1987), Bryniarski (1989), Nieminen (1990), Wiweger (1988)
and others. Rough sets are sets that can be approximated by two other sets, called the lower and
upper approximations, which define the boundary region of the original set. The lower
approximation consists of all the elements that definitely belong to the set, while the upper
approximation consists of all the elements that possibly belong to the set. The difference between
the upper and lower approximations is called the boundary region, which represents the degree
of uncertainty or roughness of the set.
Rough set theory has many applications in fields such as machine learning (1986), data
mining (2009), pattern recognition (1991)(2003), decision analysis (1988), and knowledge
discovery. It can be used to deal with problems such as feature selection, attribute reduction, rule
induction, classification, clustering, and inconsistency analysis. Rough set theory can also handle
incomplete, inconsistent, or noisy data by using different types of approximations and measures
of similarity or indiscernibility. One of the main advantages of rough set theory is that it does not
require any prior knowledge or assumptions about the data or the domain. Recently, the hybrid
concept of fuzzy set theory (1999)(2017)(2010) has been merged with rough set theory,
resulting in better classification accuracy.
Rough set theory is a branch of mathematics that deals with the analysis of imprecise or
uncertain data. A rough set presents a structural relationship between the imprecision and
noise present in the data and is described by a collection of multiple membership functions. The
main idea of rough set theory is to approximate a set by two other sets, called the lower and
upper approximations, which are defined by using an equivalence relation on the universe of
discourse. The lower approximation (1991) of a set contains all the elements that definitely
belong to the set, i.e., that show certainty of belonging, while the upper approximation contains
all the elements that possibly belong to the set, i.e., that carry some uncertainty (2006) or only
approximate membership. The difference between the upper and lower approximations is called
the boundary region, which represents elements that are indiscernible from the set.
Formally, let U be a finite non-empty set, called the universe, and let R be an equivalence
relation on U. For any subset X of U, the lower and upper approximations of X with respect to R
are given in Eq. (1) and Eq. (2):
apr_R(X) = {x ∈ U | [x]_R ⊆ X}, --------(1)
APR_R(X) = {x ∈ U | [x]_R ∩ X ≠ ∅}, --------(2)
where [x]_R denotes the equivalence class of x under R. The boundary region of X with respect to
R is then given by:
BN_R(X) = APR_R(X) - apr_R(X); the set BN_R(X) is the set of elements that cannot be classified either to X
or to -X having knowledge R.
A set X is said to be rough with respect to R if BN_R(X) ≠ ∅, and exact otherwise. The degree
of roughness of X with respect to R can be measured by the accuracy ratio given in Eq. (3):
α_R(X) = |apr_R(X)| / |APR_R(X)|, --------(3)
where |.| denotes the cardinality of a set. This ratio ranges from 0 to 1, where 0 means that X is
completely rough and 1 means that X is completely exact.
To illustrate these concepts, let us consider an example. Suppose U is a set of animals, and R is
an equivalence relation that partitions U into four classes: mammals, birds, reptiles and fish. Let
X be a subset of U that contains only warm-blooded animals, and suppose that the available
knowledge cannot separate the warm-blooded members of the reptile class from the cold-blooded
ones. Then the four distinct sets can be defined by the expressions given in Eqs. (4) to (7):
apr_R(X) = mammals ∪ birds, --------(4)
APR_R(X) = mammals ∪ birds ∪ reptiles, --------(5)
BN_R(X) = reptiles, --------(6)
U - APR_R(X) = fish. --------(7)
In this example, X is rough with respect to R, because some reptiles are indiscernible from
warm-blooded animals based on the equivalence relation R. The degree of roughness is 2/3
(assuming, for instance, classes of equal size), which means that two-thirds of the elements in the
upper approximation definitely belong to X.
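The calculation can be reproduced with a short script. The following is a minimal Python sketch, assuming a toy universe with two animals per equivalence class and, hypothetically, a single warm-blooded reptile so that the 2/3 ratio emerges; none of the concrete animal names come from the paper.

def approximations(partition, X):
    """Lower/upper approximations of X with respect to a partition of the universe."""
    lower = {x for cls in partition if cls <= X for x in cls}   # classes fully contained in X
    upper = {x for cls in partition if cls & X for x in cls}    # classes that overlap X
    return lower, upper

# Hypothetical universe with two animals per class (names are illustrative only).
mammals  = {"dog", "cat"}
birds    = {"eagle", "owl"}
reptiles = {"leatherback", "lizard"}   # assume only the leatherback counts as warm-blooded here
fish     = {"salmon", "trout"}
partition = [mammals, birds, reptiles, fish]

X = mammals | birds | {"leatherback"}  # the warm-blooded animals

lower, upper = approximations(partition, X)
boundary = upper - lower
print(sorted(boundary))                # ['leatherback', 'lizard'] -> the reptile class
print(len(lower) / len(upper))         # 0.666..., i.e. the 2/3 ratio of the example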
Rough set theory is a mathematical tool for dealing with vague, imprecise, inconsistent and
uncertain knowledge(2022). It is widely used in various fields of science and engineering
applications. Rough set theory can discover structural relationships within data and generate
decision rules based on equivalence relations and approximations of concepts
Rough set theory offers a large array of merits for many operations in engineering fields.
Among its significant advantages, it can handle incomplete and noisy data without
requiring any prior knowledge or additional information. For example, it can deal with missing
values, null values, or dynamic data in information systems. It can identify partial or total
dependencies between attributes and reduce redundant or irrelevant attributes. For example, it
can find the minimal subsets of attributes that preserve the same information as the original set of
attributes. It can extract meaningful patterns and rules from data that are easy to interpret and
understand by humans. For example, it can generate if-then rules that describe the conditions and
consequences of a concept or a decision.
It can be combined with other methods such as fuzzy sets, neural networks, probabilistic
reasoning, etc. to form hybrid systems for solving complex problems. For example, it can use
fuzzy sets to handle linguistic uncertainty or neural networks to enhance learning performance.
Rough sets display versatile applications in many allied and connected fields of research in the
engineering and technological domains. The discussion is presented in the following points. Data science
requires data in pure form in order to achieve the utmost accuracy in terms of the best score
of classification. The theory helps in detecting the missing or null attribute values. Moreover, the
relationship between the different attributes can easily be setup by means of such set theoretical
approach. The different diversified areas for application of rough set theory have been explored
in the following section.
1. Decision-making problems: Rough set theory can help to analyze decision tables and
generate optimal or suboptimal decisions based on certain criteria. For example, it can
help to select the best alternative among several options or rank the alternatives according
to their preference values.
2. Medical diagnosis: Rough set theory can help to diagnose diseases, classify patients, and
suggest treatments based on medical data. For example, it can help to identify the
symptoms and causes of a disease or recommend the most suitable drugs or therapies for
a patient.
3. Machine learning: Rough set theory can help to learn concepts, rules, and classifiers from
data and improve the accuracy and efficiency of learning algorithms. For example, it can
help to induce decision trees, neural networks, or support vector machines from data or
reduce the complexity and size of the learned models.
4. Expert systems and artificial intelligence: Rough set theory can help to model human
knowledge, reasoning, and inference processes and build intelligent systems that can
solve problems or provide advice. For example, it can help to construct knowledge
bases, inference engines, or dialogue systems that can interact with users or other
systems.
5. Pattern recognition: Rough set theory can help to recognize patterns, features, and objects
from images, signals, texts, etc. and perform tasks such as classification, clustering,
segmentation, etc. For example, it can help to detect faces, fingerprints, or characters
from images or group similar objects or texts into clusters or segments.
6. Rough set theory is a powerful and versatile tool that can deal with various types of
uncertainty and complexity in real world applications. It has many advantages over other
methods and has a wide range of applications in different domains.
LIMITATIONS
In spite of its many merits, rough set theory suffers from some limitations, which are
given below:
1. How to handle incomplete or missing data in a consistent and robust way.
2. How to develop efficient and scalable algorithms for rough set analysis and reasoning.
3. How to integrate rough set theory with other techniques, such as neural networks, fuzzy
sets, and probabilistic reasoning.
These are some of the challenges and opportunities for further research in rough set
theory. Here we have tried to overcome these unsolved limitations by citing proper
techniques, examples, working models, pseudocode, and graphs of the examples.
Some of the challenges and limitations that hinder the robustness of rough set theory towards
incomplete or inconsistent data handling are:
The dependency of rough set theory on the choice of indiscernibility relation, which may
not capture the true similarity between objects in a data set.
The difficulty of finding optimal or near-optimal reducts, which are minimal subsets of
attributes that preserve the discernibility of objects in a data set.
The lack of efficient and scalable algorithms for computing rough set approximations,
especially for large and high-dimensional data sets.
The sensitivity of rough set theory to noise and outliers, which may affect the quality and
stability of the approximations and reducts.
The limited ability of rough set theory to handle uncertainty, vagueness, and imprecision
in data, which may require extensions or modifications of the original framework.
PROCEDURE TO OVERCOME
An efficient way to handle incomplete or missing data in rough set theory is to use imputation
methods that replace missing values with estimated values based on the information available
from the existing data. For example, we can use a rough set method that imputes missing
values by utilizing the domain of the attribute concerned. Another way is to use the
rough membership function, which gives a degree of membership to each object based on its
similarity to other objects; the missing values can then be approximated by using the membership
degrees of similar objects.
The strategy that replaces missing values in a data table with estimated values based on the
information available from the data is called imputation (2023). Missing values arise due to various
factors, such as errors in data collection, transmission, or entry, or respondents' unwillingness or
inability to answer certain questions. Missing values can affect the accuracy and quality
of data analysis and interpretation and may lead to biased or erroneous results.
One of the imputation methods that uses the concepts of rough set theory is the rough set
method (2018). This method imputes missing values by using the domain of that attribute value.
The domain of an attribute is the set of possible values that the attribute can take. For example, if
an attribute is Fever, its domain could be {Yes, No}. The rough set method assigns a missing
value to one of the values in the domain based on the similarity between the object with the
missing value and other objects in the data table. The similarity is measured by using an
indiscernibility relation that groups objects with the same values for a given set of attributes. For
example, Table 1 shows the symptoms and diagnosis of some patients:
In this table, P5 has a missing value for Fever. To impute this value, we can use the
indiscernibility relation (2012) with respect to {Cough, Rash}. This relation partitions the objects
into two equivalence classes: {P1} and {P2, P3, P5}. Since P5 belongs to the same equivalence
class as P2 and P3, it is more similar to them than to P1. Therefore, we can assign the missing
value to No, which is the value of Fever for P2 and P3. The imputed data table is shown
in Table 2.
The rough set method can be applied iteratively until all missing values are imputed. It can also
be utilized with other methods such as vaguely-quantified rough sets or fuzzy-rough methods to
handle different types of vagueness and uncertainty in data. This provides us the method of
reconstruction of the table in some feasible approach without suffering of any sort of loss of
data (2020).
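A small Python sketch of this imputation step is given below. Since Table 1 is not reproduced here, the attribute values in the dictionary are assumptions, chosen only so that the indiscernibility classes with respect to {Cough, Rash} come out as {P1} and {P2, P3, P5}, as described above.

from collections import Counter

# Hypothetical reconstruction of the patient table (the actual values live in Tables 1 and 2).
patients = {
    "P1": {"Fever": "Yes", "Cough": "Yes", "Rash": "No",  "Diagnosis": "Flu"},
    "P2": {"Fever": "No",  "Cough": "No",  "Rash": "Yes", "Diagnosis": "Measles"},
    "P3": {"Fever": "No",  "Cough": "No",  "Rash": "Yes", "Diagnosis": "Measles"},
    "P5": {"Fever": None,  "Cough": "No",  "Rash": "Yes", "Diagnosis": "Measles"},
}
R = ["Cough", "Rash"]   # attributes defining the indiscernibility relation
B = "Fever"             # attribute whose missing values are imputed

def equivalence_class(obj, data, attrs):
    """All objects indiscernible from obj with respect to attrs."""
    key = tuple(data[obj][a] for a in attrs)
    return [o for o, row in data.items() if tuple(row[a] for a in attrs) == key]

def impute(data, attrs, target):
    for obj, row in data.items():
        if row[target] is None:
            known = [data[o][target] for o in equivalence_class(obj, data, attrs)
                     if data[o][target] is not None]
            if known:                                   # the most frequent known value wins
                row[target] = Counter(known).most_common(1)[0][0]

impute(patients, R, B)
print(patients["P5"]["Fever"])   # -> 'No', matching the imputed Table 2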
From a mathematical standpoint, the rough set method can be formalized as follows:
Let U be a finite and non-empty set called the universe, and let A be a non-empty set of
attributes, such that each attribute a ∈ A has a set of possible values Va. Then, each
object x ∈ U can be represented by an information vector x = (x(a))a∈A, where x(a) ∈
Va for each a ∈ A.
Let R ⊆ A be a subset of attributes that we use to group the objects into classes. Then, we
can define an equivalence relation on U called the indiscernibility relation with respect to
R, denoted by IND, as follows: for any x, y ∈ U, x IND y if and only if x(a) = y(a) for
all a ∈ R. The equivalence classes of U with respect to IND are called the basic sets of R.
Let B ⊆ A be another subset of attributes that we use to estimate the missing values.
Then, we can define two operations on P(U), the power set of U, called the lower and
upper approximations with respect to R and B, denoted by aprB and APRB, as follows:
for any X ⊆ U,
o aprB(X) = {x ∈ U|[x]R ⊆ X}, where [x]R is the equivalence class of x with
respect to IND. --------(8)
o APRB(X) = {x ∈ U|[x]R∩X ≠∅}. --------(9)
The lower approximation aprB(X) is the set of all objects that definitely belong to X
based on the attributes in R and B, while the upper approximation APRB(X) is the set of
all objects that possibly belong to X based on the attributes in R and B. The difference
between the upper and lower approximations is called the boundary region of X with
respect to R and B, denoted by BND(R,B)(X), which is defined as follows: for any X ⊆
U,
o BND(R,B)(X) = APRB(X)\aprB(X). -------(10)
The boundary region BND(R,B)(X) is the set of all objects that are uncertain or vague
about their membership in X based on the attributes in R and B. The ratio of the
cardinality of aprB(X) to the cardinality of APRB(X) is called the accuracy measure of X
with respect to R and B, denoted by GAMMA(R,B)(X), which is defined as follows: for
any X ⊆ U,
o GAMMA(R,B)(X) = |aprB(X)|/|APRB(X)|, where |.| denotes the cardinality of a
set. ------------(11)
The accuracy measure GAMMA(R,B)(X) is a number between 0 and 1 that indicates how
well the attributes in R and B can approximate X. The closer GAMMA(R,B)(X) is to 1,
the more accurate the approximation is.
The rough set method for imputing missing data can be seen as an iterative process of finding
optimal values for R and B that maximize GAMMA(R,B)(U). In each iteration, the method
assigns the missing values in B according to the most frequent values in their corresponding
basic sets of R. If there is no clear most frequent value, the method chooses any value randomly.
WORKING MODEL
The standard rough set model assumes that the data are complete and consistent, which may not
be realistic in many real-world scenarios. One of the methods of imputation that uses the concepts
of rough set theory is the rough set model. This works by imputing missing values by using the
domain of the attribute concerned (2020). The domain of an attribute is the pool of possible values
that the attribute can take. For example, if an attribute is Fever, its domain could be
{Yes, No}.
The mathematical model of rough set theory can be described as a way of filling in the blanks in
a table of data. The table has rows and columns. Each row represents an object, such as a patient.
Each column represents an attribute, such as a symptom or a diagnosis. Each attribute has a set
of possible values, such as Yes or No for Fever.
Sometimes a few values in the table are missing. For example, we may not know whether a patient
has Fever or not. We want to estimate the missing values based on the information we have from
the rest of the table.
The rough set method uses two steps to estimate the missing values:
1. First, it groups objects into classes based on some attributes that are chosen. For example, the
patients can be grouped on their Cough and Rash symptoms. The patients who have the same
Cough and Rash symptoms belong to the same class. We call these attributes R.
2. Second, it looks at the most common value of the missing attribute in each class. For
example, if Fever is to be estimated for a patient with given Cough and Rash symptoms, the other
patients with the same Cough and Rash symptoms are looked up, and the Fever value that appears
most often in that class is chosen. The attribute being estimated is called B.
The rough set method repeats these steps until all missing values are estimated. It can also use
different attributes for grouping and estimating in different iterations. It can also handle cases
where there is no clear most common value in a class by choosing any value randomly.
The rough set method is based on the idea that objects that are similar in some attributes are
likely to be similar in other attributes as well. It is a simple and effective way of dealing with
missing data without using any extra information or assumptions. The analogy between attribute
values has been utilized for the imputation of missing values for attributes.
For example, suppose we have the data table shown in Table 3, which gives the
symptoms and diagnosis of some patients.
In this table, P5 has a missing value for Fever. To impute this value, we can use the rough set
method with R = {Cough, Rash} and B = {Fever}. The equivalence classes of U with
respect to R are {P1}, {P2, P3, P5}, and {P4}. The subset of {P2, P3, P5} that has value No
for Fever is {P2, P3}, while the subset that has value Yes for Fever is empty. Therefore, the
rough set method assigns the missing value to No, which is the most frequent value in {P2, P3,
P5}. The imputed data table is shown in Table 4.
The rough set method can be applied iteratively until all missing values are imputed. It can also
be combined with other methods, such as fuzzy-rough methods or vaguely quantified rough sets,
to handle different types of uncertainty and vagueness in the data. The variation of
the input and corresponding output values can be plotted graphically. The graph in Fig. 1
shows input and output of the rough set method for imputing missing values. The input is a data
table that shows the symptoms and diagnosis of some patients. The output is the same data table
with the missing values replaced by estimated values.
The graph shown in Fig 1 has two axes, the x-axis represents the patient no and the y-axis
represents attributes. Each attribute has a domain of possible values, such as {Yes, No} for Fever
or {Flu, Measles} for Diagnosis. The graph uses different colors and shapes to indicate values of
each attribute for each patient. For example, a red circle means Yes, a blue square means No, a
green triangle means Flu, and a yellow diamond means Measles.
The graph also uses dashed lines to separate equivalence classes of patients with respect to the
subset of attributes R = {Cough, Rash}. The equivalence classes are {P1}, {P2, P3, P5}, and
{P4}. The graph shows that P5 has a missing value for Fever, denoted by a question mark.
The graph also shows imputed value for Fever for P5, which is No. This value is chosen by using
the rough set method with R = {Cough, Rash} and B = {Fever}. The method assigns most
frequent value in the equivalence class of P5 to the missing value. Since P2 and P3 have No for
Fever, and no other patient in the same equivalence class has Yes for Fever, the method imputes
No for P5. The imputed value is shown by a blue square with a dashed border.
Step 1: Define the attributes that are relevant for grouping the objects into classes. These
attributes are called R.
Step 2: Define the attribute that has the missing value to be imputed. This attribute is
called B.
Step 3: For each object with a missing value in B, find the set of objects that have the
same values in R. This set is called the equivalence class of the object with respect to R.
Step 4: For each equivalence class, find the most frequent value of B among the objects
in that class. If there is a tie, choose any value randomly. This value is called the lower
approximation of B in that class.
Step 5: Assign the lower approximation of B to the object with the missing value in that
class. Repeat this step for all objects with missing values in B.
Step 6: Return the imputed data table with no missing values in B.
# Define the attributes that are relevant for grouping the objects into classes
R = list of grouping attributes
# Define the attribute whose missing values are to be imputed
B = attribute with missing values

# For each object with a missing value in B, find the set of objects that have the same values in R
FUNCTION find_equivalence_class(object):
    equivalence_class = set()
    FOR each other_object in data_table:
        # Check if the object and the other object have the same values in R
        IF object[R] == other_object[R]:
            equivalence_class.add(other_object)
        END IF
    END FOR
    RETURN equivalence_class
END FUNCTION

# For each equivalence class, find the most frequent value of B among the objects in that class
FUNCTION find_lower_approximation(equivalence_class):
    frequency = {}
    FOR each object in equivalence_class:
        IF object[B] is not None:
            value = object[B]
            frequency[value] = frequency.get(value, 0) + 1
        END IF
    END FOR
    max_frequency = max(frequency.values())
    max_values = [value for value in frequency if frequency[value] == max_frequency]
    # Choose a random value from the max values as the lower approximation
    lower_approximation = random.choice(max_values)
    RETURN lower_approximation
END FUNCTION

# Assign the lower approximation of B to the object with the missing value in that class
FUNCTION impute_missing_value(object):
    IF object[B] is None:
        equivalence_class = find_equivalence_class(object)
        lower_approximation = find_lower_approximation(equivalence_class)
        object[B] = lower_approximation
    END IF
END FUNCTION

FUNCTION impute_data_table():
    FOR each object in data_table:
        # Call impute_missing_value to fill in any missing value in B for that object
        impute_missing_value(object)
    END FOR
    RETURN data_table
END FUNCTION
This algorithm is based on the rough set theory, which can handle incomplete and inconsistent
data by using lower and upper approximations of concepts.
The formal definitions given earlier apply here as well: the indiscernibility relation IND with
respect to R partitions U into its basic sets, and the lower and upper approximations aprB(X) and
APRB(X) of Eqs. (8) and (9) determine the boundary region BND(R,B)(X) of Eq. (10) and the
accuracy measure GAMMA(R,B)(X) of Eq. (11).
The algorithm for imputing missing values in a data table by using lower and upper
approximations can be seen as an iterative process of finding optimal values for R and B that
maximize GAMMA(R,B)(U). In each iteration, the algorithm assigns the missing values in B
according to the most frequent values in their corresponding basic sets of R. If there is no clear
most frequent value, the algorithm chooses any value randomly.
APPLICATIONS OF THE ALGORITHM AND THE OPEN CHALLENGES IT ADDRESSES
Some of the applications of this algorithm in the real world are:
Data imputation for medical data, such as symptoms and diagnosis of patients. This can
help to improve the accuracy and reliability of medical decision making and diagnosis.
For example, the algorithm can be used to impute the missing values of fever for some
patients based on their cough and rash symptoms.
Data imputation for power transformer fault detection and prediction using dissolved gas
analysis (DGA). This can help to prevent failures and accidents in power systems and
ensure the safety and stability of power supply. For example, the algorithm can be used to
impute the missing values of some gas concentrations in a DGA dataset.
Data imputation for incomplete information systems, such as customer preferences,
product ratings, or social network data. This can help to enhance the quality and
completeness of data and enable better data analysis and mining. For example, the
algorithm can be used to impute the missing values of some attributes in an incomplete
information system.
This algorithm helps to address several open challenges by:
Providing a simple and effective way of dealing with missing data without using any
extra information or assumptions. The algorithm only uses the information available in
the data table and does not require any prior knowledge or distribution of the data.
Combining the advantages of rough set theory and fuzzy set theory to handle different
types of uncertainty and vagueness in data. The algorithm can handle incompleteness,
inconsistency, and imprecision by using lower and upper approximations of concepts and
fuzzy membership functions.
Generating concise and interpretable rules from data that can explain the imputation
process and the decision making process. The algorithm uses logical operators and
quantifiers to represent the rules, which can be easily understood by humans.
Enabling a flexible and adaptive way of dealing with missing data by allowing different
attributes for grouping and estimating in different iterations. The algorithm can use
different subsets of attributes as R and B in each iteration, depending on the data
characteristics and the imputation goal. This can help to improve the imputation accuracy
and robustness.
Providing a general framework for data imputation that can be applied to various
domains and scenarios. The algorithm can handle any type of data, such as numerical,
categorical, ordinal, or mixed, and any type of missingness, such as missing completely
at random (MCAR), missing at random (MAR), or missing not at random (MNAR). The
algorithm can also be extended or modified to suit different application requirements and
constraints.
Offering a novel perspective for data analysis and mining based on the concept of rough
sets. The algorithm can reveal the hidden patterns and dependencies among the attributes
and the objects in the data, which can be useful for feature selection, classification,
clustering, association rule mining, and other tasks. The algorithm can also provide
insights into the data quality and completeness, which can help to identify and correct the
sources of errors and noise in the data.
The computation of the indiscernibility relation, which is the basis of rough set theory, is
very expensive and time-consuming. The indiscernibility relation IND is an equivalence
relation on the universe U with respect to a subset of attributes R ⊆ A. It partitions U into
equivalence classes called the basic sets of R. The computation of IND requires
comparing all pairs of objects in U based on their values in R, which has a complexity of
O(|U|^2|A|), where |.| denotes the cardinality of a set. This becomes impractical when |U|
or |A| is very large.
The determination of the optimal subset of attributes for grouping and estimating the
objects, which is essential for rough set analysis and reasoning, is very hard and complex.
The optimal subset of attributes R* ⊆ A is the one that maximizes the quality measure
QLTY(X) for any subset of objects X ⊆ U. QLTY(X) is the ratio of the cardinality of the
lower approximation apr_R(X) to the cardinality of U. It indicates how well the attributes
in R can describe or approximate X. The determination of R* requires searching through
all possible subsets of A, which has a complexity of O(2^|A|), where |.| denotes the
cardinality of a set. This becomes infeasible when |A| is very large.
The generation of concise and interpretable rules from data, which is one of the main
goals of rough set analysis and reasoning, is very challenging and uncertain. The rules are
logical expressions that can explain the imputation process and the decision making
process based on the attributes and the objects in the data. The generation of rules
requires finding the minimal subsets of attributes that are sufficient to determine the
values or classes of the objects, which are called reducts. The computation of reducts is
equivalent to solving the set covering problem, which is NP-hard . This means that there
is no efficient algorithm that can find reducts in polynomial time.
These challenges make it difficult to achieve efficient and scalable algorithms for rough set
analysis and reasoning that can handle large-scale and high-dimensional data sets in a reasonable
time and with a satisfactory accuracy. Several approaches have been proposed to overcome these
challenges, such as using extensions, preprocessing techniques, hybrid methods, parallel
computing frameworks, approximation methods, heuristic methods, etc. However, these
approaches also have some trade-offs and limitations, such as losing information, increasing
complexity, requiring extra assumptions or parameters, sacrificing accuracy or interpretability,
etc.
WAY TO OVERCOME
The computational complexity of some rough set operations, such as reduct(2019) computation
and rule induction, may be prohibitive for large-scale data sets. Rough set analysis and reasoning
is a mathematical framework for dealing with imprecise and uncertain data. It uses two basic
operations called approximations namely the lower and upper approximation to represent a set
of objects by their attribute values. The Rough sets can be utilized for knowledge discovery, data
mining, etc. One probable algorithm for rough set analysis and reasoning is QuickReduct
Algorithm(2017). It is used to find a minimal subset of attributes that preserves the dependency
between attributes and the decision attribute. Some ways to build efficient and scalable
algorithms for rough set analysis and reasoning are given as follows
For example, the Apache Spark to implement QuickReduct algorithm for finding a minimal
reduct of attributes is one such algorithms and is being described below.
1. Input: A decision table S = (U, A union {d}), where U is a set of objects, A is a set of
conditional attributes, and d is the decision attribute.
2. Output: A minimal reduct R of A that preserves the dependency between A and d.
3. Steps:
1. Load the decision table S as a Spark DataFrame and cache it in memory for faster access.
2. Initialize R as an empty set.
3. Calculate the dependency degree of A on d using a user-defined function (UDF) that computes
the positive region of A based on equivalence classes. This can be done by grouping the
DataFrame by A and d columns and counting the distinct values of d for each group.
4. For each attribute a in A, calculate the significance of a with respect to R and d using another
UDF that computes the positive region of R union {a} based on equivalence classes. This can be
done by grouping the DataFrame by R union {a} and d columns and counting the distinct values
of d for each group.
5. Find the attribute a* that has maximum significance among all attributes in A using the Spark
SQL max function.
6. If SIG( a*, R) >0, then add a* to R and remove it from A. Go to step 4.
7. Else, return R as the minimal reduct.
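For readers without a Spark cluster, the same QuickReduct logic can be sketched in plain Python. The sketch below assumes the decision table is simply a list of dictionaries; the Spark-specific parts (DataFrames, UDFs, caching) are deliberately left out, so it illustrates only the dependency-degree and greedy-selection steps.

from collections import defaultdict

def dependency_degree(rows, attrs, decision):
    """gamma(attrs): fraction of objects whose attrs-equivalence class is pure on the decision."""
    groups = defaultdict(lambda: [0, set()])          # key -> [class size, decision values seen]
    for row in rows:
        g = groups[tuple(row[a] for a in attrs)]
        g[0] += 1
        g[1].add(row[decision])
    positive = sum(size for size, decisions in groups.values() if len(decisions) == 1)
    return positive / len(rows)

def quick_reduct(rows, cond_attrs, decision):
    """Greedy QuickReduct: keep adding the most significant attribute until gamma stops improving."""
    reduct, remaining = [], list(cond_attrs)
    gamma_full = dependency_degree(rows, cond_attrs, decision)
    gamma_reduct = 0.0
    while remaining and gamma_reduct < gamma_full:
        best_attr, best_gamma = None, gamma_reduct
        for a in remaining:                            # SIG(a, R) = gamma(R u {a}) - gamma(R)
            g = dependency_degree(rows, reduct + [a], decision)
            if g > best_gamma:
                best_attr, best_gamma = a, g
        if best_attr is None:                          # no attribute improves the dependency
            break
        reduct.append(best_attr)
        remaining.remove(best_attr)
        gamma_reduct = best_gamma
    return reduct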
Some more examples to explain the concept of scalable rough set algorithms are:
1. Using a scalable and effective rough set theory-based approach for big data pre-
processing, specifically for feature selection, under the Spark framework. This approach
uses a novel data structure called compressed decision table (CDT) to store only the
essential information of the input data and reduce the memory consumption. It also uses
an optimization technique called candidate elimination algorithm (CEA) to avoid
redundant calculations and improve the performance. The approach can handle data sets
with up to 10,000 attributes and achieve a good speedup and accuracy.
2. Using a scalable feature selection method using rough set theory on Hadoop MapReduce
framework. This method uses attribute-value pairs (AVPs) to represent the input data and
reduce the data size. It also uses a parallel algorithm to compute the significance of each
attribute based on positive regions and equivalence classes. The method can handle large-
scale data sets and find a minimal reduct of attributes.
3. Using a dominance-based rough set approach (DRSA) for multi-criteria decision analysis
on large-scale data sets. DRSA is an extension of rough set theory that can handle
preference-ordered attributes and incomparable objects. It uses a parallel algorithm to
compute the lower and upper approximations of decision classes based on dominance
relations. It also uses a parallel algorithm to generate decision rules based on reducts and
discernibility matrices. The idea can deal with complex decision problems and provide
consistent and smart solutions.
Table 5 below shows the information system S = (U, A), where U is the set of objects {x1,
x2, …, x10} and A is the set of attributes {A1, A2, A3, A4, A5, D}. The set of conditional
attributes is C = {A1, A2, A3, A4, A5} and the set of decision attributes is D = {D}. The
symbol ? represents missing values in the data.
Object A1 A2 A3 A4 A5 D
x1 1 0 0 0 1 c1
x2 0 0 0 1 0 c2
x3 0 1 0 0 1 c2
x4 1 0 1 0 0 c1
x5 0 0 1 1 0 c2
x6 0 1 1 0 1 c2
x7 ? ? ? ? ? c1
x8 ? ? ? ? ? c2
x9 ? ? ? ? ? c2
x10 ? ? ? ? ? c2
The first step is to construct IND for each subset of C. IND partitions U into equivalence classes,
which are denoted by [x]C for any x ∈ U. The equivalence class [x]C contains all the objects
that are indiscernible from x with respect to C. For example, for the subset {A1}, we have three
equivalence classes: [x1]{A1} = {x1,x4,x7}, [x2]{A1} = {x2,x5}, and [x3]{A1} =
{x3,x6,x8,x9,x10}. For the subset {A2}, we have two equivalence classes: [x1]{A2} =
{x1,x2,x4,x5}, and [x3]{A2} = {x3,x6,x7,x8,x9,x10}. And so on.
The second step is to calculate B(X) and B*(X) for each subset of C and each decision class X.
B(X) contains the objects that definitely belong to X, and B*(X) contains the objects that
possibly belong to X. For example, for the subset {A1}, B(c1) = {x1,x4,x7}, and B*(c1) =
{x1,x4,x7}. B(c2) = {x2,x5}, and B*(c2) = {x2,x5,x3,x6,x8,x9,x10}.
The third step is to evaluate the quality of each subset of C using some criteria, such as
dependency degree γ(C,D), consistency degree α(C,D), significance σ(C,D), or discernibility
β(C,D). These criteria measure how well a subset of C can distinguish the decision classes and
reduce the uncertainty or ambiguity of the data. For example, the dependency degree γ(C,D) of a
subset of C is the ratio of the number of objects in B(D) to the total number of objects. The
higher the dependency degree γ(C,D), the better the subset of C. For the subset {A1}, γ({A1},D)
= 7/10, since 7 of the 10 objects are in B(D). For the subset {A2}, γ({A2},D) = 6/10, since
6 of the 10 objects are in B(D).
The fourth step is to find a minimal subset of C that has the same or similar quality as C. This
subset is called a reduct of C relative to D, which is denoted by R(C,D). There may be more than
one reduct for a given information system, and the intersection of all reducts is called the core of
C relative to D, which is denoted by CORE(C,D). The core contains the most important
attributes that cannot be removed without affecting the quality of C. There are different methods
to find reducts, such as exhaustive search, heuristic search, genetic algorithms, etc. For example,
one possible reduct for this information system is R(C,D) = {A1,A3}, which has a dependency
degree γ(R(C,D),D) = 9/10, close to the dependency degree γ(C,D) = 10/10.
Another possible reduct is R(C,D) = {A2,A4}, which also has a dependency degree
γ(R(C,D),D) = 9/10.
The fifth step is to generate decision rules from the reducts. A decision rule is an expression of
the form IF P THEN Q. For example, one rule obtained from the reduct {A1, A3} is:
IF A1 = 1 AND A3 = 0 THEN D = c1
This rule means that if an object has value 1 for attribute A1 and value 0 for attribute A3, then it
belongs to class c1.
The final step is to use the decision rules for analysis and reasoning. For example, we can use the
decision rules to classify new objects, explain existing objects, or discover new patterns in data.
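The dependency and rule-generation steps can be checked with a few lines of Python. The sketch below uses only the six complete rows of Table 5 and the reduct {A1, A3}; the rows with missing values are deliberately left out of this illustration.

from collections import defaultdict

rows = [  # the complete rows x1-x6 of Table 5
    {"A1": 1, "A3": 0, "D": "c1"},  # x1
    {"A1": 0, "A3": 0, "D": "c2"},  # x2
    {"A1": 0, "A3": 0, "D": "c2"},  # x3
    {"A1": 1, "A3": 1, "D": "c1"},  # x4
    {"A1": 0, "A3": 1, "D": "c2"},  # x5
    {"A1": 0, "A3": 1, "D": "c2"},  # x6
]
reduct = ["A1", "A3"]

# Group the objects by their reduct values; combinations with a single decision value
# lie in the lower approximation of that decision class and yield certain rules.
classes = defaultdict(set)
for row in rows:
    classes[tuple(row[a] for a in reduct)].add(row["D"])

for values, decisions in sorted(classes.items()):
    if len(decisions) == 1:
        condition = " AND ".join(f"{a} = {v}" for a, v in zip(reduct, values))
        print(f"IF {condition} THEN D = {next(iter(decisions))}")
# One of the printed rules is:  IF A1 = 1 AND A3 = 0 THEN D = c1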
WORKING MODEL
The mathematical model makes the construction of scalable and robust algorithms for
rough set analysis and reasoning more explicit. The algorithm given below illustrates the mathematical
approach.
1. Initialize the reduct R as an empty set.
2. Calculate the dependency degree γ(A) of the full attribute set A on the decision attribute d.
3. For each attribute a in A, calculate its significance with respect to R and d, denoted by SIG(a, R),
as follows:
SIG(a,R) = γ(R∪{a})−γ(R)
4. Find the attribute a* that has the maximum significance among all attributes in A.
5. If SIG( a*, R) >0, then add a* to R and remove it from A. Go to step 3.
6. Else, return R as a reduct.
Define an algorithm for generating decision rules based on reducts and approximations. The
algorithm should be scalable and robust to handle large-scale and noisy data sets. For example,
one possible algorithm is:
1. For each reduct R of A, generate all possible combinations of attribute-value pairs from R,
denoted by C.
2. For each combination c in C, find its lower approximation with respect to R, denoted by cR.
3. If cR is not empty and belongs to only one decision class d(x), then generate a decision rule of
the form:
c→d(x)
4. Return all generated decision rules.
Here is a possible pseudocode for the above algorithm:
# Define an information system and a target set
DATA = table of objects, attributes, and decision class
TARGET = subset of objects to be approximated
FUNCTION equivalence_classes(relation):
# Return the list of values in the relation dictionary as equivalence classes
RETURN list(relation.values())
END FUNCTION
# Define lower and upper approximations of a target set based on an equivalence relation
FUNCTION lower_approximation(TARGET, relation):
    # Create an empty set for the lower approximation
    lower = set()
    # Loop through all the equivalence classes of the relation
    FOR each class in equivalence_classes(relation):
        # Check if the class is a subset of the target set
        IF class.issubset(TARGET):
            # Add the class to the lower approximation
            lower = lower.union(class)
        END IF
    END FOR
    # Return the lower approximation
    RETURN lower
END FUNCTION

FUNCTION upper_approximation(TARGET, relation):
    # The upper approximation is the union of the classes that overlap the target set
    upper = set()
    FOR each class in equivalence_classes(relation):
        IF class.intersection(TARGET) is not empty:
            upper = upper.union(class)
        END IF
    END FOR
    RETURN upper
END FUNCTION
# Define boundary region and accuracy of a target set based on lower and upper approximations
FUNCTION boundary_region(upper, lower):
# Return the difference between upper and lower approximations as boundary region
RETURN upper - lower
END FUNCTION
# Define dependency degree and reduct of an attribute subset based on positive region and
heuristic criteria
FUNCTION positive_region(TARGET, classes):
# Create an empty set for the positive region
positive = set()
# Loop through all the equivalence classes
FOR each class in classes:
# Check if the class is a subset of the target set
IF class.issubset(TARGET):
# Add the class to the positive region
positive =positive.union(class)
END IF
END FOR
# Return the positive region
RETURN positive
END FUNCTION
# Define an algorithm for generating decision rules based on reducts and approximations
FUNCTION generate_rules(DATA, REDUCTS, classes):
    # Create an empty list for rules
    rules = []
    # Loop through all reducts
    FOR each reduct in REDUCTS:
        # Generate all possible combinations of attribute-value pairs from reduct
        combinations = generate_combinations(reduct)
        # Loop through all combinations
        FOR each combination in combinations:
            # Find lower approximation of combination with respect to reduct
            lower = lower_approximation(combination, reduct)
            # Check if lower approximation is not empty and belongs to only one decision class
            IF lower != set() and len(set(object[decision] for object in lower)) == 1:
                # Generate a rule of the form combination -> d(x)
                rules.append((combination, unique decision class of lower))
            END IF
        END FOR
    END FOR
    RETURN rules
END FUNCTION
The graph in the Fig 2 shows a simplified example of the rough set model for a data set with two
conditional attributes (X and Y) and one decision attribute (D). The input is a decision table with
six objects (A, B, C, D, E, F) and their attribute values. The output is a reduct of attributes (X)
and a decision rule (X >0.5 ->D = 1).
The graph in Fig .3 displays plotting of the objects on a 2D plane according to their X and Y
values. The objects are color-coded by their decision class with blue for D = 0 and red for D =
1. The graph also shows equivalence classes of the objects based on the reduct attribute X. The
equivalence classes are horizontal lines that partition the plane into regions where all the objects
have same X values. For example, the equivalence class of A is the line X = 0.2.
The graph also shows lower and upper approximations of the decision class D = 1 with respect
to the reduct attribute X. The lower approximation consists of all the objects that definitely
belong to D = 1, which are E and F. The upper approximation consists of all the objects that
possibly belong to D = 1, which are C, E and F. The boundary region consists of the object that
is uncertain whether it belongs to D = 1 or not, which is C.
The graph also shows the positive region of the decision class D = 1 with respect to the reduct
attribute X. The positive region consists of all the equivalence classes that are subsets of D = 1,
which are X = 0.8 and X = 0.9. The dependency degree of D on X is the ratio of the
cardinality of the positive region to the cardinality of the whole set, which is 2/6 = 0.333.
The graph also shows the decision rule that can be generated from the reduct attribute X and the
lower approximation of D = 1. The decision rule is X >0.5 ->D = 1, which means that if an
object has an X value greater than 0.5, then it belongs to the decision class D = 1. This rule can
correctly classify E and F, but it does not cover A, B, C and D.
Rough set theory is a mathematical framework for dealing with uncertainty and vagueness in
data analysis and decision making. It was introduced by Zdzisław Pawlak in the 1980s and has
been widely applied in various domains such as artificial intelligence, data mining, machine
learning, pattern recognition, and bioinformatics. Rough set analysis and reasoning involve two
main tasks: finding reducts and computing approximations. A reduct is a minimal subset of
attributes that preserves discernibility of objects in a data set. An approximation is a pair of
lower and upper bounds that enclose a target set of objects based on the available information.
Finding reducts and computing approximations are computationally expensive operations,
especially for large and complex data sets. Therefore, many researchers have proposed different
methods to improve the efficiency and scalability of these algorithms. Some of the main
approaches are:
1. Heuristic methods: These methods use various strategies to reduce search space or
speed up the computation of reducts and approximations. For example, genetic
algorithms, ant colony optimization, tabu search, simulated annealing (2007), etc.
This literature review aims to provide a comprehensive overview of state-of-the-art methods for
building efficient and scalable algorithms for rough set analysis and reasoning. It also discusses
the challenges and future directions for this research area.
There is no definitive answer to why the integration of rough sets with other methods cannot be
achieved still today. However, some possible reasons are:
The integration of rough sets with other methods is not a trivial task. It requires a careful
analysis of the theoretical foundations, the compatibility of the assumptions, the
computational complexity, and the practical benefits of each method. Different methods
may have different goals, perspectives, and representations of data, which may not be
easily reconciled or combined.
The integration of rough sets with other methods may not always be necessary or
desirable. Sometimes, a single method may be sufficient or preferable for a specific
problem or domain. For example, if the data is crisp and discrete, rough sets may be more
suitable than fuzzy sets or neural networks. If the data is noisy and complex, neural
networks may be more suitable than rough sets or probabilistic methods. If the data is
uncertain and incomplete, probabilistic methods may be more suitable than rough sets or
fuzzy sets.
The integration of rough sets with other methods may not always be feasible or effective.
Sometimes, a hybrid method may introduce new problems or drawbacks that outweigh
the advantages. For example, if the integration of rough sets and fuzzy sets leads to a loss
of interpretability or accuracy, it may not be worth pursuing. If the integration of rough
sets and neural networks leads to a high computational cost or overfitting, it may not be
worth pursuing. If the integration of rough sets and probabilistic methods leads to a
conflict of semantics or inconsistency, it may not be worth pursuing.
WAYS TO OVERCOME
Rough set theory is an efficient mathematical tool to handle inconsistent, imprecise and
uncertain knowledge. The scope for combining it with various methods such as fuzzy sets, neural
networks and probabilistic reasoning, by using different types of approximation spaces or rough
relations, is quite abundant. For example, rough sets and fuzzy sets are complementary
generalizations of classical sets. The approximation spaces of rough set theory are sets with
multiple memberships, while fuzzy sets are concerned with partial memberships. Neural
networks can be utilized to learn equivalence relations or membership functions for rough and
fuzzy sets respectively. Probabilistic reasoning can be combined with rough set theory by using
Bayesian networks or Dempster-Shafer theory(2006) to model uncertainty of the data.
Some examples of applications of rough set theory integration with other methods are:
1. Medical data analysis: Rough sets provide a semi-automatic approach to medical data
analysis and can combine with other complementary techniques, e.g., soft-computing, data
mining, statistics, intelligent systems, machine learning, and pattern recognition.
2. Image processing: Rough sets can be used to segment images into regions based on different
attributes or features. Fuzzy sets can be used to handle the ambiguity of the boundaries
between regions. Neural networks can be used to classify the regions into different
categories.
3. Data mining: Rough sets can be used to discover hidden patterns and rules from data. Fuzzy
sets can be used to deal with imprecise or incomplete data. Neural networks can be used to
learn complex nonlinear relationships from data. Probabilistic reasoning can be used to
measure confidence or reliability of patterns and rules.
Mathematically, these ways of integration can be expressed using different operators or functions
that combine the outputs of rough sets and other methods. For instance, let R be a rough set
approximation operator that maps a set X to a pair of lower and upper approximations (R_*(X),
R^*(X)), where R_*(X) is the union of all positive regions and R^*(X) is the union of all positive and
boundary regions. Let F be a fuzzy set operator that maps a set X to a membership function F(X)
that assigns a degree of belonging to each element of X. Then, one possible way of integrating
rough sets and fuzzy sets is to define a new operator G that maps a set X to a fuzzy set G(X) as
follows: for each element u,
G(X)(u) = F(R_*(X))(u) * F(R^*(X))(u),
where * is the product t-norm that multiplies the two membership values. This operator G can be
interpreted as applying the fuzzy set operator F to both lower and upper approximations of X,
and then taking the product of them. This way, G(X) reflects both the certainty and uncertainty
of X in terms of fuzzy sets.
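A tiny numerical sketch of the operator G is given below; the membership values are invented solely for illustration and do not come from any table in the paper.

# Combine, for each element, its fuzzy membership in the lower and in the upper
# approximation using the product t-norm. All numbers are made up for illustration.
def product_tnorm(a, b):
    return a * b

F_lower = {"u1": 1.0, "u2": 0.7, "u3": 0.0}   # F applied to the lower approximation R_*(X)
F_upper = {"u1": 1.0, "u2": 0.9, "u3": 0.4}   # F applied to the upper approximation R^*(X)

G = {u: product_tnorm(F_lower[u], F_upper[u]) for u in F_lower}
print(G)   # {'u1': 1.0, 'u2': 0.63, 'u3': 0.0}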
To integrate Rough Set with Neural Networks the following steps can be followed to achieve this
integration:
1. Discretize the continuous attributes of the data using some methods such as entropy-based or
error-based discretization.
2. Construct a decision table from the discretized data, where each row represents an object, each
column represents an attribute, and the last column represents the decision class.
3. Apply rough set theory to find the minimal reducts of the decision table, which are the
minimal subsets of attributes that preserve the same classification ability as the whole set of
attributes.
4. Use the reducts as inputs to train a neural network, such as a multilayer perceptron (MLP), to
classify the objects into their decision classes.
5. Evaluate the performance of the neural network using some metrics such as accuracy,
precision, recall, or F-measure.
Here is a possible pseudocode for the algorithm of integrating rough set with neural network:
# Define the data and the target class
DATA = table of objects, continuous attributes, and target class
TARGET = column of target class
# Apply rough set theory to find the minimal reducts of the decision table
FUNCTION reducts(decision_table):
# Use some rough set method, such as discernibility matrix or genetic algorithm, to find the
minimal reducts
# Return a list of reducts, where each reduct is a subset of attributes
RETURN list_of_reducts
END FUNCTION
# Train one neural network model (e.g., a multilayer perceptron) for each reduct
FUNCTION train_models(list_of_reducts, DATA, TARGET):
    models = []
    FOR each reduct in list_of_reducts:
        # Keep only the attributes in the reduct and train a network on them
        # (train_mlp stands for any neural network training routine)
        model = train_mlp(DATA[reduct], TARGET)
        models.append(model)
    END FOR
    RETURN models
END FUNCTION
# Evaluate the performance of the neural network models and select the best one
FUNCTION evaluate(models):
# Create an empty dictionary for storing the performance scores of each model
scores = {}
# For each model in the list of models
FOR each model in models:
# Get the performance score of the model based on some metric, such as accuracy, precision,
recall, or F-measure
score =model.score(metric)
# Store the score in the dictionary with the model as key and score as value
scores[model] = score
END FOR
# Find the maximum score among all scores
max_score = max(scores.values())
# Find all models that have the maximum score
max_models = [model for model, score in scores.items() if score = = max_score]
# Choose a random model from the max models as the best model
best_model =random.choice(max_models)
# Return the best model and its score
RETURN best_model, max_score
END FUNCTION
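A runnable and deliberately simplified Python sketch of steps 4 and 5 is given below, assuming scikit-learn is available and that rough set analysis has already returned a reduct; the data set and the reduct here are synthetic stand-ins, not taken from the paper.

import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Toy discretized decision table: four conditional attributes and a binary decision class.
rng = np.random.default_rng(0)
X_all = rng.integers(0, 3, size=(200, 4))
y = (X_all[:, 0] + X_all[:, 2] > 2).astype(int)   # the class depends only on attributes 0 and 2

reduct = [0, 2]                   # assume rough set analysis returned this reduct
X = X_all[:, reduct]              # keep only the reduct attributes as network inputs

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0)
mlp.fit(X_tr, y_tr)
print("accuracy on the reduct attributes:", accuracy_score(y_te, mlp.predict(X_te)))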
To illustrate the process of finding a solution to our topic using rough set based approach, we
will use an example problem of classifying iris flowers based on their sepal length (SL), sepal
width (SW), petal length (PL), and petal width (PW). The iris dataset is a well-known benchmark
dataset that contains 150 instances of three classes: Iris-setosa (S), Iris-versicolor (V), and Iris-
virginica (G). The table below shows the information system S = (U, A), where U is the set of
objects {x1, x2, …, x10} and A is the set of attributes {SL, SW, PL, PW, Class}. The set of
conditional attributes is C = {SL, SW, PL, PW} and the set of decision attributes is D =
{Class}. The Table 6 below shows a sample of 10 instances from the dataset:
The first step is to use rough sets to preprocess the data for other methods. The PSO
algorithm is used to discretize the continuous attributes into intervals and then construct a decision table
based on the discretized values. The PSO algorithm is a swarm intelligence technique that can
find optimal solutions by simulating the social behavior of a population of particles. The decision
table is a table that represents the data in terms of conditional attributes and decision attributes,
as defined in rough set theory. Table 7 below shows the discretized decision table for the
sample data:
SL SW PL PW Class
WORKING MODEL
A working model of rough set theory integration with other methods can be described as follows:
1. Suppose a data set contains some attributes and a decision attribute. The rough set theory
is to be used to extract some rules or patterns from the supplied data that can help us
make decisions or predictions.
2. First, an approximation space (U, R) is required to be defined, where U is the set of all
objects in the data and R is an equivalence relation over U. The equivalence relation R
can be defined by using one or more attributes of the data. For example, the relation R
can be written as R = IND(A), where A is a subset of attributes and IND(A) is the
indiscernibility relation that groups objects that have the same values for A.
3. Next, a decision attribute d is required to be defined, which is the attribute that one wants
to predict or explain. The decision attribute d can be crisp or fuzzy, depending on the
nature of the problem. For example, if objects are to be classified into discrete categories,
a crisp decision attribute requires to be defined. In case of measuring the degree of
belongingness of objects to a category, a fuzzy decision attribute can be used.
4. Then, the lower and upper approximations of the decision attribute d are required to be
defined and denoted by d↓ and d↑ respectively. These approximations are subsets of U that
contain the objects that certainly or possibly belong to d. Mathematically, d↓ = ⋃{B ∈
U/R : B ⊆ d} and d↑ = ⋃{B ∈ U/R : B ∩ d ≠ ∅}. -----------(20)
5. Finally, some rules or patterns are required to be extracted from the approximations of d.
A rule or a pattern is a logical expression that relates some attributes to the decision
attribute. For example, a rule can be of the form IF A THEN d, where A is a condition on
some attributes and d is a value of the decision attribute. A pattern can be of the form A
IMPLIES d, where A is a conjunction of some attributes and d is a value of the decision
attribute.
6. To extract rules or patterns from the approximations of d, different methods such as reducts,
discernibility matrices, decision tables, decision trees, etc. are used. These methods are
based on finding minimal subsets of attributes that preserve the information about d, or on
finding maximal subsets of objects that share the same values for d.
7. To integrate rough set theory with other methods such as fuzzy sets, neural networks,
probabilistic reasoning, etc., different types of approximation spaces or equivalence
relations that incorporate the features of these methods are used. For example, fuzzy sets
can be integrated with rough sets by using fuzzy equivalence relations or fuzzy similarity
relations to define the approximation spaces. Neural networks can be integrated with
rough sets by using neural networks to learn the equivalence relations or the membership
functions for rough and fuzzy sets respectively. Rough sets can be assembled with
probabilistic reasoning by utilizing Bayesian networks or Dempster-Shafer theory to
model unpredictability of data. Based on web search results, an example of a numerical
or graphical problem that uses rough set theory integration with other methods are as
follows:
8. Suppose a given data set contains some numerical attributes and a categorical decision
attribute. The data set is shown in Table 8.
Rough set theory is used to extract rules that help predict the value of the decision attribute Play from the values of the numerical attributes Temperature, Humidity, and Wind.
9. First, an approximation space (U, R) needs to be defined, where U is the set of all objects in the data and R is an equivalence relation over U. The equivalence relation R can be defined by using one or more attributes of the data. For example, we can use R = IND(Temperature), where IND(Temperature) is the indiscernibility relation that groups objects having the same values for Temperature. However, this relation would not be very useful because it would create too many equivalence classes, each containing only one object. Therefore, a more general relation is needed that can group objects having similar values for Temperature. One way to do this is to use fuzzy sets to define a fuzzy equivalence relation or a fuzzy similarity relation.
10. A fuzzy set is a set with partial memberships, where each object has a degree of
belongingness to the set between 0 and 1. A fuzzy equivalence relation is a reflexive,
symmetric and transitive fuzzy relation that assigns a degree of equivalence between any
two objects. A fuzzy similarity relation is a reflexive and symmetric fuzzy relation that
assigns a degree of similarity between any two objects.
11. To define a fuzzy equivalence or similarity relation for Temperature, a membership function must be specified that maps each value of Temperature to a degree of membership in a fuzzy set. For example, a triangular membership function can be used that assigns a degree of membership of 1 to the center value of the fuzzy set and decreases linearly to 0 at the left and right ends of the fuzzy set. Fig 4 shows an example of three fuzzy sets for Temperature: Low, Medium, and High.
12. Using these fuzzy sets, a fuzzy equivalence or similarity relation can be defined for Temperature by assigning a degree of equivalence or similarity between any two objects based on their degrees of membership in the same fuzzy set. For example, if a fuzzy equivalence relation is used, a degree of equivalence of 1 can be assigned between two objects that have the same value for Temperature, and a degree of equivalence of 0.5 between two objects that have different values for Temperature but belong to the same fuzzy set. Table 9 shows an example of a fuzzy equivalence matrix for Temperature, and the second sketch following the table constructs such a matrix.
      o1    o2    o3    o4    o5    o6    o7
o1    1     0     0     0.5   0     0     0
o2    -     1     0.5   0     0     0     0
o3    -     -     1     0     -     -     -
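To make steps 2 to 6 concrete, the following Python sketch builds the equivalence classes of IND(A), computes the lower and upper approximations of equation (20), and reads certain IF-THEN rules off the classes contained in the lower approximation. The weather-style decision table is hypothetical, and rule simplification through reducts is omitted.

# Sketch of IND(A), the approximations of equation (20), and certain rules.
# The decision table below is a hypothetical weather-style example.
from collections import defaultdict

table = {   # object -> (Temperature, Humidity, Wind, Play)
    "o1": ("hot",  "high",   "weak",   "no"),
    "o2": ("hot",  "high",   "strong", "no"),
    "o3": ("mild", "high",   "weak",   "yes"),
    "o4": ("cool", "normal", "weak",   "yes"),
    "o5": ("cool", "normal", "strong", "no"),
    "o6": ("mild", "normal", "strong", "yes"),
    "o7": ("mild", "high",   "strong", "no"),
}
ATTRS = {"Temperature": 0, "Humidity": 1, "Wind": 2, "Play": 3}

def ind_classes(A):
    """Equivalence classes of IND(A): objects with equal values on A."""
    groups = defaultdict(set)
    for x, row in table.items():
        groups[tuple(row[ATTRS[a]] for a in A)].add(x)
    return list(groups.values())

def approximations(A, d):
    """Lower and upper approximation of the concept d with respect to IND(A)."""
    low, up = set(), set()
    for B in ind_classes(A):
        if B <= d:          # B is wholly contained in d -> certainly in d
            low |= B
        if B & d:           # B meets d -> possibly in d
            up |= B
    return low, up

def certain_rules(A, d, decision_label):
    """One IF <conditions on A> THEN Play=<label> rule per certain class."""
    rules = []
    for B in ind_classes(A):
        if B <= d:
            x = next(iter(B))
            cond = " AND ".join(f"{a}={table[x][ATTRS[a]]}" for a in A)
            rules.append(f"IF {cond} THEN Play={decision_label}")
    return rules

play_yes = {x for x, row in table.items() if row[ATTRS["Play"]] == "yes"}
conds = ["Temperature", "Humidity", "Wind"]
low, up = approximations(conds, play_yes)
print("lower:", sorted(low), "upper:", sorted(up), "boundary:", sorted(up - low))
for rule in certain_rules(conds, play_yes, "yes"):
    print(rule)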
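Steps 10 to 12 can be sketched in the same spirit. The snippet below defines triangular membership functions for hypothetical Low, Medium and High fuzzy sets on Temperature and derives a similarity matrix of the kind shown in Table 9 by taking, for each pair of objects, the maximum over the fuzzy sets of the minimum of their memberships. The membership parameters, the temperature values and the max-min similarity choice are all assumptions made for illustration.

# Sketch of a fuzzy similarity matrix for Temperature (steps 10-12).
# Triangular membership functions for Low/Medium/High are hypothetical;
# similarity of two objects is the maximum, over the fuzzy sets, of the
# minimum of their memberships (a reflexive, symmetric fuzzy relation).
import numpy as np

def triangular(x, a, b, c):
    """Triangular membership: 0 at a and c, 1 at the centre b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

fuzzy_sets = {            # hypothetical Low / Medium / High for Temperature
    "Low":    (0.0, 10.0, 20.0),
    "Medium": (10.0, 20.0, 30.0),
    "High":   (20.0, 30.0, 40.0),
}
temps = [8.0, 12.0, 15.0, 22.0, 25.0, 31.0, 35.0]   # o1..o7, hypothetical values

def memberships(t):
    return np.array([triangular(t, *params) for params in fuzzy_sets.values()])

M = np.array([memberships(t) for t in temps])          # 7 x 3 membership matrix
# similarity(i, j) = max_k min(M[i, k], M[j, k])
S = np.max(np.minimum(M[:, None, :], M[None, :, :]), axis=2)
np.fill_diagonal(S, 1.0)                               # enforce reflexivity
print(np.round(S, 2))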
GRAPH
From the table of the working model:
The graph in Fig 4 shows the relationship between temperature, humidity, wind and play. The input data are the values of temperature, humidity and wind for each observation, and the output data are the values of play for each observation. The graph plots the input data as points in a three-dimensional coordinate system, where the x-axis represents temperature, the y-axis represents humidity and the z-axis represents wind. The output data are shown as colors on the points, where red means play is yes and blue means play is no. The graph helps to visualize how the input variables affect the output variable. It is evident from the graph in Fig 4 that higher wind speeds tend to result in no play, while lower wind speeds tend to result in play. There is no clear pattern between temperature, humidity, and play.
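A plot of this kind can be reproduced with a few lines of matplotlib, as sketched below. Since Table 8 is not reproduced here, the values are hypothetical placeholders chosen only to mimic the tendency that higher wind speeds coincide with play being no.

# Sketch of a 3-D scatter like the one described for Fig 4:
# temperature on x, humidity on y, wind on z, colour encodes Play.
# The numeric values are hypothetical placeholders for Table 8.
import matplotlib.pyplot as plt

temperature = [30, 27, 22, 18, 21, 25, 29]
humidity    = [85, 90, 70, 65, 80, 75, 60]
wind        = [10, 25, 5, 8, 30, 12, 28]          # e.g. km/h
play        = ["no", "no", "yes", "yes", "no", "yes", "no"]

colors = ["red" if p == "yes" else "blue" for p in play]   # red = yes, blue = no

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.scatter(temperature, humidity, wind, c=colors, s=60)
ax.set_xlabel("Temperature")
ax.set_ylabel("Humidity")
ax.set_zlabel("Wind")
plt.show()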
LITERATURE REVIEW
Rough sets are a mathematical tool for dealing with uncertainty and vagueness in data analysis
and decision making. They can be used to approximate a set by its lower and upper
approximations, which represent the certain and possible elements of the set respectively. For
example, if some objects are required to be classified into red or blue, but some of them have
mixed colors, roughsets can be used to define the red and blue sets by their lower and upper
bounds, which include objects that are definitely or possibly red or blue.
Fuzzy sets are another way of handling uncertainty and imprecision in data. They allow the membership of an element in a set to be expressed by a degree of truth, ranging from 0 to 1. Fuzzy sets can be seen as a generalization of classical sets, where the membership is either 0 or 1. For example, when the temperature of a room is measured with an inaccurate thermometer, fuzzy sets can be used to describe the temperature by a fuzzy number, which has a core interval and two fuzzy intervals that indicate the possible error range.
Neural networks are computational models inspired by the structure and function of biological neurons. They consist of interconnected nodes that process information and learn from data. Neural networks can be used for various tasks such as classification, regression, clustering, pattern recognition and more. For example, to recognize handwritten digits, a neural network is trained on a large dataset of labeled images, and the learned network is then used to classify new images based on their similarity to the learned patterns.
Probabilistic methods are based on the use of probability theory and statistics to model
uncertainty and randomness in data. They can be used to quantify the likelihood of events,
outcomes and hypotheses, as well as to infer unknown parameters from observed data. For example, when diagnosing a disease based on some symptoms, probabilistic methods can be used to calculate the posterior probability of having the disease given the symptoms, based on prior knowledge and evidence.
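As a small worked instance of this posterior computation, the sketch below applies Bayes' rule with a hypothetical prevalence, sensitivity and false-positive rate.

# Worked example of the posterior probability mentioned above:
# P(disease | symptom) via Bayes' rule, with hypothetical numbers.
p_disease = 0.01            # prior prevalence (assumed)
p_sym_given_d = 0.90        # P(symptom | disease), assumed sensitivity
p_sym_given_not_d = 0.10    # P(symptom | no disease), assumed false-positive rate

p_sym = p_sym_given_d * p_disease + p_sym_given_not_d * (1 - p_disease)
posterior = p_sym_given_d * p_disease / p_sym
print(f"P(disease | symptom) = {posterior:.3f}")   # ~0.083 with these numbers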
The integration of rough sets with other methods can be done in different ways, depending on the purpose and the nature of the problem. One approach combines rough sets with fuzzy sets to deal with both vagueness and uncertainty in data. For example, rough-fuzzy hybridization can be used to construct fuzzy rules from rough decision tables, which can capture both certain and possible associations between conditions and decisions. Another example is using fuzzy-rough feature selection to reduce the dimensionality of data by selecting the most relevant features that preserve the fuzzy dependency between inputs and outputs, as sketched below.
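The feature-selection idea can be sketched with a greedy, QuickReduct-style search driven by the crisp rough-set dependency degree; the fuzzy-rough variant mentioned above follows the same pattern but replaces the crisp indiscernibility classes with a fuzzy similarity relation. The toy data and attribute names below are assumptions.

# A greedy, QuickReduct-style sketch of dependency-based feature selection.
# gamma(A) is the fraction of objects in the positive region of the decision
# with respect to the attributes A (crisp rough-set dependency degree).
from collections import defaultdict

rows = [   # hypothetical rows of (Temperature, Humidity, Wind, decision)
    ("hot",  "high",   "weak",   "no"),
    ("hot",  "high",   "strong", "no"),
    ("mild", "high",   "weak",   "yes"),
    ("cool", "normal", "weak",   "yes"),
    ("cool", "normal", "strong", "no"),
    ("mild", "normal", "strong", "yes"),
]
cond_names = ["Temperature", "Humidity", "Wind"]

def gamma(attr_idx):
    """Fraction of objects whose equivalence class is decision-consistent."""
    classes = defaultdict(list)
    for r in rows:
        classes[tuple(r[i] for i in attr_idx)].append(r[-1])
    positive = sum(len(v) for v in classes.values() if len(set(v)) == 1)
    return positive / len(rows)

def quickreduct():
    selected, remaining = [], list(range(len(cond_names)))
    full = gamma(remaining)
    while gamma(selected) < full:
        best = max(remaining, key=lambda i: gamma(selected + [i]))
        selected.append(best)
        remaining.remove(best)
    return [cond_names[i] for i in selected]

print("selected attributes:", quickreduct())

On this toy table the search selects Temperature and then Wind, since together they already make every equivalence class decision-consistent.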
A second approach combines rough sets with neural networks to enhance the learning and generalization capabilities of both methods.
For example, rough-neural hybridization can be used to initialize the weights of neural networks by using rough membership functions, which can improve the convergence speed and accuracy of the learning process. Another example is using neural-rough feature extraction to extract relevant features from data by using neural networks to learn nonlinear mappings from inputs to outputs, and then applying rough set theory to select the most informative features. A third approach combines rough sets with probabilistic methods to incorporate both aleatory and epistemic uncertainty in data. For example, rough-Bayesian hybridization can be used to update the belief of rough sets based on new evidence by means of Bayesian inference, which can handle both uncertainty due to randomness and uncertainty due to lack of knowledge. Another example is probabilistic-rough clustering, which groups data into uncertain clusters by using probabilistic models such as Gaussian mixture models or hidden Markov models to estimate the cluster parameters and then applying rough set theory to obtain lower and upper approximations of the clusters, as sketched below.
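The probabilistic-rough clustering idea can be sketched as follows: a scikit-learn Gaussian mixture supplies posterior cluster probabilities, and lower and upper approximations of each cluster are then read off by thresholding those posteriors. The two-component toy data and the 0.8 threshold are assumptions, not part of any cited method.

# Sketch of probabilistic-rough clustering: a Gaussian mixture supplies the
# posteriors, and rough-set style lower/upper approximations are obtained by
# thresholding them. The threshold 0.8 is an assumption for illustration.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])  # toy data

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
post = gmm.predict_proba(X)                 # posterior P(cluster k | x)

threshold = 0.8
lower = {k: set() for k in range(2)}
upper = {k: set() for k in range(2)}
for i, p in enumerate(post):
    k_best = int(np.argmax(p))
    if p[k_best] >= threshold:
        lower[k_best].add(i)                # certainly in cluster k_best
        upper[k_best].add(i)
    else:
        for k in np.flatnonzero(p >= 1 - threshold):
            upper[int(k)].add(i)            # possibly in each plausible cluster

for k in range(2):
    print(f"cluster {k}: |lower| = {len(lower[k])}, |boundary| = {len(upper[k] - lower[k])}")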
One of the main advantages of rough set theory is that it does not require any prior knowledge or
assumptions about the data, such as probability distributions or membership functions. Instead, it
relies on the concept of indiscernibility relations, which partition the data into equivalence
classes based on their attribute values. These equivalence classes form the basic building blocks
of rough sets, which are sets that can be approximated by their lower and upper approximations.
Rough set theory has many potential applications and future directions for research and development. Potential research directions for rough set theory can be summarized as:
1. Developing new methods and algorithms for rough set-based data analysis, such as feature
selection, attribute reduction, rule induction, classification, clustering, and association
mining.
2. Integrating rough set theory with other methods and paradigms, such as fuzzy sets, neural
networks, evolutionary algorithms, quantum computing, and blockchain(2002).
3. Applying rough set theory to real-world problems and domains, such as bioinformatics,
medical diagnosis, social network analysis, image processing, natural language processing,
and cybersecurity(2018).
4. Exploring the theoretical foundations and properties of rough set theory, such as its logical
aspects, algebraic structures, topological aspects, and computational complexity (2020).
5. Extending rough set theory to deal with more complex and dynamic data types and
structures, such as temporal data, spatial data, multimedia data, streaming data, and big data
(2018).
CONCLUSION
Rough set theory offers a unique method of dealing with vagueness and uncertainty in data. Unlike a crisp set, a rough set is defined by its lower and upper approximations, separated by a boundary region that captures the uncertainty and vagueness (2012). The multifaceted application areas of rough sets are evident: domains such as data mining, knowledge discovery, machine learning, and artificial intelligence have applied rough sets with near-accurate results, as have allied fields such as pattern recognition and information processing, performance evaluation, business and finance, industrial and environmental engineering, and intelligent control systems. The main purpose of this mathematical framework is to handle incomplete, inconsistent, and noisy data by using equivalence classes and indiscernibility relations. The concept of indiscernibility has been dealt with in detail. Other salient features within the purview of rough set applications are feature selection, data reduction, rule extraction, and pattern analysis using concepts such as reducts, cores, dependencies, and decision tables. The reduct and the core are two significant notions of rough sets: a reduct is a minimal subset of attributes that preserves the indiscernibility induced by the full attribute set, while the core is the set of attributes common to all reducts. Rough sets can also be used to interpret decision tables containing decision attributes and the dependencies therein. The application of rough sets to allied fields of soft computing such as fuzzy sets, neural networks, and probabilistic reasoning has been dealt with in this paper in detail (2014).
Soft computing is a term that refers to techniques that can deal with imprecision, uncertainty, and
partial truth in data and information processing. Fuzzy sets are sets that have degrees of
membership between 0 and 1, rather than binary membership as in crisp sets. Neural networks
are computational models that mimic the structure and function of biological neurons and can
learn from data. Probabilistic reasoning is a form of logic that incorporates uncertainty and
likelihood into reasoning and inference. Though the scope of this mathematical concept is multidirectional and offers a wide range of future possibilities across research domains, much research still remains to be carried out (1997).
REFERENCES:
1. Pawlak, Z., Rough Sets, International Journal of Computer and Information Sciences, 11, pp. 341-356 (1982).
2. Iwinski, T., Algebraic Approach to Rough Sets. Bull. Polish Acad. Sci. Math., 37, pp. 187-192 (1987).
3. Bryniarski, E., A Calculus of Rough Sets of the First Order. Bull. Pol. Acad. Sci. Math., 37, pp. 71-78 (1989).
4. Nieminen, J, Rough Sets, Screens, Roundings, and Relations. Bull. Polish Acad. Sci. Tech. 37,
pp. 351-358(1990).
5. Wiweger, A, On Topological Rough Sets, Bull. Polish Acad. Sci. Math, 37, pp. 51-62(1988).
6. Pawlak, Z., On Learning – a Rough Set Approach. Lecture Notes in Computer Science, Springer Verlag, 208, pp. 197-227 (1986).
7. G.Y. Wang, Y. Wang, Fundam. Inf., 90(4), pp. 395-426 (2009).
8. Pawlak, Z., Rough Sets: Theoretical Aspects of Reasoning about Data, pp. 174-179 (1991).
9. R.W. Swiniarski, A. Skowron, Pattern Recognit. Lett., 24(6), pp. 833-849 (2003).
10. Boryczka, M. and Slowinski, R. Derivation of Optimal Decision Algorithms from Decision
Tables using Rough Sets, Bull. Polish Acad. Sci. Tech., 36, pp.252-260(1988).
11. Fuzzy-Rough Hybridization: A New Trend in Data Mining by Sushmita Mitra and Sankar
K. Pal(1999).
12. A Hierarchical Fused Fuzzy Deep Neural Network for Data Classification (2017).
13. A Hybrid Fuzzy Genetic Algorithm for Solving Traveling Salesman Problem by R. Rajesh and R.
Rajaram (2010).
14. Lower and Upper Approximation of Rough Set by S. K. Pal and A. Skowron (1991).
15. The Axiomatization of the Rough Set Upper Approximation Operations by J. T. Yao and Y. Y. Yao
(2006).
16. Intuitionistic Fuzzy Rough Sets: Theory to Practice by Shivani Singh and Tanmoy Som (2022)
17. Imputation of missing values by scikit-learn 1.3.0 documentation (2023).
18. Rough Set Theory Based Missing Value Imputation by M. Sujatha, G. Lavanya Devi, K.
Srinivasa Rao and N. Ramesh (2018).
19. Indiscernibility Relation, Rough Sets and Information System by Xibei Yang and Jingyu Yang
(2012)
20. Indiscernibility relations by interrelationships between attributes in rough set data analysis by T.
Nakashima and H. Ishibuchi(2012).
21. Rough Sets Meet Statistics - A New View on Rough Set Reasoning About Numerical Data by
Marko Palangetić, Chris Cornelis, Salvatore Greco and Roman Słowiński (2020).
22. Vagueness and Uncertainty: An F-Rough Set Perspective by Dayong Deng, Houkuan Huang (2017).
23. Managing Uncertainty in the Construction Industry through the Rough Set Theory by Cvetanka Cvetkovska (2001).
24. Multiobjective Evolution of Fuzzy Rough Neural Network via Distributed Parallelism for Stock
Prediction by Xiangyu Kong, Shangce Gao, Xin Yao (2020).
25. A scalable and effective rough set theory-based approach for big data pre-processing by Zaineb
Chelly Dagdia, Christine Zarges, Gaël Beck, Mustapha Lebbah(2018).
26. An outlier detection approach in large-scale data stream using rough set by Manmohan Singh,
Rajendra Pamula (2019).
27. Rough Set-Based Feature Selection Techniques by Muhammad Summair Raza, Usman Qamar
(2017).
28. Hadoop MapReduce Cookbook by Srinath Perera and Thilina Gunarathne (2013)
29. Learning Spark: Lightning-Fast Data Analytics by Jules S. Damji, Brooke Wenig, Tathagata Das,
and Denny Lee (2020)
30. A New Heuristic Reduct Algorithm Base on Rough Sets Theory by Liang and Wang (2007).
31. Parallel and Distributed Computation: Numerical Methods by Dimitri P. Bertsekas and John
N. Tsitsiklis(1989)
32. Rough Sets and Intelligent Systems - Professor Zdzisław Pawlak in Memoriam: Volume 1 by
Andrzej Skowron (2013).
33. A novel hybrid feature selection method based on rough set and improved harmony search by H.
Hannah Inbarani, M. Bagyamathi, Ahmad Taher Azar (2015).
34. Decision-Theoretic Rough Set Models by Yiyu Yao (2006).
35. Rough Sets: Mathematical Foundations by Lech Polkowski (2002).
36. A Survey on Rough Set Theory and Its Applications by Shuqin Fan (2018).
37. Rough Sets and Knowledge Technology by Dominik Ślęzak (2020).
38. Rough Sets: Past, Present, and Future by Andrzej Skowron and Soma Dutta (2018).
39. Rough-Fuzzy Pattern Recognition: Applications in Bioinformatics and Medical Imaging by Pradipta Maji and Sankar K. Pal, John Wiley & Sons (2012).
40. Rough Sets and Current Trends in Computing: 9th International Conference, RSCTC 2014,
Granada and Madrid, Spain (2014).
41. Rough Sets: Selected Methods and Applications in Management and Engineering by Georg Peters and Andrzej Skowron, Springer (2012).
42. Rough Sets and Data Mining: Analysis of Imprecise Data by T.Y. Lin and N. Cercone (1997).