
Expert Systems with Applications 162 (2020) 113781

Contents lists available at ScienceDirect

Expert Systems with Applications


journal homepage: www.elsevier.com/locate/eswa

Value of fuzzy logic for data mining and machine learning: A case study
Vugar E. Mirzakhanov
Department of Computer Engineering, Azerbaijan State Oil and Industry University, Baku, Azerbaijan

Article history:
Received 12 February 2020
Revised 21 June 2020
Accepted 18 July 2020
Available online 25 July 2020

Keywords:
Association rule mining
Clustering
Data mining
Fuzzy logic
Machine learning

Abstract

In this paper, a case study on the role of fuzzy logic (FL) in data mining and machine learning is carried out. It is outlined that, in order to draw more attention of the data-mining and machine-learning communities to FL, studies on FL could be focused not on the activities that fuzzy methods can perform better but rather on the activities that fuzzy methods can perform and the non-fuzzy ones can't. Such an approach takes us away from discussing quantitative differences between fuzzy and non-fuzzy methods to discussing qualitative differences, which are possibly more favorable objects of scientific curiosity. Following the outlined suggestion, a novel speed-up technique is proposed in this paper to support association rule mining (ARM). The proposed technique is a clustering-based one and provides fusion of clustering and ARM. The catchy feature of this technique is that it works well if applied in fuzzy ARM and doesn't work well if applied in non-fuzzy ARM. The proposed technique is put through experimental verification involving several real-world datasets, and the results substantiate its effectiveness.

© 2020 Elsevier Ltd. All rights reserved.

E-mail address: [email protected]

https://doi.org/10.1016/j.eswa.2020.113781
0957-4174/© 2020 Elsevier Ltd. All rights reserved.

1. Introduction

Since its introduction in 1965 by Lotfi Zadeh (Zadeh, 1965), fuzzy logic has evolved into a vast and well-developed branch of science with numerous applications in the fields of control, optimization, data analysis, etc. The current paper is mainly focused on the implementation issues of fuzzy logic (FL) in the fields of data mining and machine learning.

Data mining (DM) is an interdisciplinary field of science with the goal of data-driven knowledge acquisition, representation and application (Ertel, 2011c; Han, Kamber, & Pei, 2012c). Machine learning (ML) is a central subfield of artificial intelligence whose goal is to design effective and efficient predictive computer algorithms that are able to transform data into some intelligent action (Ertel, 2011a; Lantz, 2013a; Mohri, Rostamizadeh, & Talwalkar, 2018b). Many researchers tend to consider DM and ML as closely related fields (Alpaydın, 2010a; Austin, Tu, Ho, Levy, & Lee, 2013; Delgado, Marín, Martín-Bautista, Sánchez, & Vila, 2005; Ertel, 2011c; Harrington, 2012a; Kavakiotis et al., 2017; Lantz, 2013a; Mărginean, 2004; Sammut & Webb, 2017), sharing mostly the same methods and techniques; some researchers even consider them a single common research field under the name DMML (or ML&DM) (Clarke, Fokoue, & Zhang, 2009; Hüllermeier, 2005, 2011).

Though multiple studies have been reported on the application of FL in the fields of DM and ML, there is a further-illustrated tendency to overlook or underestimate the actual potential of FL in the data-mining and machine-learning literature. The books (Aggarwal, 2015; Alpaydın, 2010b; Azzalini & Scarpa, 2012; Bramer, 2007; Clarke et al., 2009; Ertel, 2011b; Fernández-Llatas & García-Gómez, 2015; Harrington, 2012b; Lantz, 2013b; Larose & Larose, 2015; Mellouk & Chebira, 2009; Mohri, Rostamizadeh, & Talwalkar, 2018a; North, 2012; Russell, 2014; Witten & Frank, 2005a; Xu, Zong, & Yang, 2013; Zaki & Meira, 2014), providing a broad review and analysis of the DM and/or ML fields, have no information on fuzzy methods and techniques: at best, some of these books casually mention a couple of fuzzy concepts. The books (Han, Kamber, & Pei, 2012b; Kumar, 2014; Sammut & Webb, 2017), providing a broad review and analysis of the DM and/or ML fields, have some limited information on fuzzy methods and techniques. Though the reasons for such a position can be quite diverse, we tend to focus on the issues caused by fuzzy-related research itself. With no claim for comprehensiveness, we outline the two following major issues (Couso, Borgelt, Hüllermeier, & Kruse, 2019; Hüllermeier, 2005, 2011, 2015):

• Interpretability issue. One of the major claims in fuzzy-related research is that fuzzy systems possess the property of extensive interpretability. However, though fuzzy systems do possess the property of interpretability, they also possess three corresponding shortcomings. First, in order to be interpretable, the applied linguistic terms (words) have to be defined manually by a user/expert, which makes the interpretability of the corresponding fuzzy system somewhat subjective, since the same word can be perceived differently by different users/experts. Second, the manually defined linguistic terms are obviously not adapted to statistical data, which negatively affects the accuracy of the corresponding data-driven prediction systems. One of the solutions is to tune the linguistic terms (by neural network, genetic algorithm, etc.), but this negatively affects interpretability. Third, interpretability tends to become rather questionable when dealing with complex fuzzy systems: for example, a rule-based prediction system having an extensive number of rules and/or extended rule length (e.g. as in (Mirzakhanov, 2019; Mirzakhanov & Gardashova, 2019)) is less likely to be easily interpretable.

• Extension issue. A significant part of fuzzy-related research proposes extensions of standard non-fuzzy DM and ML methods. The corresponding issue is that such extensions tend to look like an incremental upgrade, refining some properties of the method. And, since such an extension, alongside some upgrade, usually increases the complexity of the method, it may seem not too motivating for DM/ML researchers to put the extension into practice.

In this paper, we focus on the second aforementioned issue and assume that, in order to increase the recognition of FL within the DM and ML communities, the following can be performed: instead of discussing the activities that fuzzy methods can do better (in some way) than non-fuzzy ones, to put an accent on the activities fuzzy methods can do (at least, can do well) and non-fuzzy ones can't (or, at best, can do poorly). Such an approach takes us away from discussing quantitative differences between fuzzy and non-fuzzy methods to discussing qualitative differences, which are possibly more favorable objects of scientific curiosity.

In this paper, the aforementioned thesis is supported by performing the following research: a novel speed-up technique is proposed to assist association rule mining (ARM) with cluster analysis. The proposed technique provides fusion of ARM and clustering: the word fusion is deliberately used instead of combination to point out that the proposed technique doesn't simply apply clustering and ARM in a consequent manner; clustering explicitly affects ARM by updating its quality-measure formulas. The catchy feature of this speed-up technique is that it works well if applied in fuzzy ARM and doesn't work well if applied in non-fuzzy ARM.

The rest of the paper is organized as follows. Section 2 describes ARM and clustering. Section 3 proposes a novel speed-up technique in ARM. Section 4 provides the experimental verification of the proposed technique. Section 5 discusses the proposed technique and concludes the paper.

2. Association rule mining and clustering

2.1. Association rule mining

One of the major research fields in DM and ML is association rule mining¹, whose goal is to mine large amounts of data for attribute associations. It is commonly accepted that ARM was introduced by Agrawal et al. in (Agrawal et al., 1993). However, some researchers (Hájek, Holeňa, & Rauch, 2010; Todorovski, Chorbev, & Loskovska, 2010) note that the research field of ARM had already been defined by Hájek et al. in (Hájek, Havel, & Chytil, 1966) a few decades before Agrawal et al. in (Agrawal, Imieliński, & Swami, 1993).

In this paper, a dataset put through ARM is defined as follows:

D = \{ d_j^i \mid i = 1, \dots, I;\ j = 1, \dots, J \},   (1)

where d_j^i is the value of the jth attribute in the ith record, I is the number of records, and J is the number of attributes.

The goal of ARM is to process D and extract a list of reliable (strong, interesting) IF–THEN rules, describing D in the following form:

IF A_1 is/has S_1 and … and A_m is/has S_m, THEN A_{m+1} is/has S_{m+1} and … and A_{m+n} is/has S_{m+n} [Q],   (2)

where A_j (j = 1, …, m+n) is an attribute of D, S_j (j = 1, …, m+n) is a value of A_j, and Q ∈ [0, 1] is a quality measure of the association rule.

According to the type of A_j (j = 1, …, m+n), association rules can be classified in chronological order as follows². Historically, the first type of association rule to be mined during ARM was the boolean association rule (Agrawal et al., 1993), in which A_j (j = 1, …, m+n) can take only two values: 0 (no) and 1 (yes). Shortly after the introduction of the boolean association rule, in order to overcome its limitations, the quantitative association rule (Srikant & Agrawal, 1996) was proposed to be mined during ARM, in which A_j (j = 1, …, m+n) can be a categorical or quantitative attribute. Usually, when mining quantitative association rules, A_j (j = 1, …, m+n) can't take its numerical values directly from D: instead, partitioning of attribute ranges is applied, replacing attribute values from D by several crisp partitions (intervals). Shortly after the introduction of the quantitative association rule, the fuzzy association rule (Chan & Au, 1997; Kuok, Fu, & Wong, 1998) was proposed to be mined during ARM, which can be viewed as an extension of the non-fuzzy association rule: instead of crisp partitioning, fuzzy partitioning of attribute ranges is applied, allowing A_j (j = 1, …, m+n) to take values in the form of fuzzy linguistic terms (words).

There are two basic quality measures of a rule in ARM: support and confidence. Support is the relative amount of data records covered by a rule. Confidence is the ratio of the amount of data records supporting the whole rule to the amount of records supporting its antecedent part. Support (supp) and confidence (conf) of an association rule (2) can be computed as follows³:

supp = \frac{\sum_{i=1}^{I} \min\left(\mu_{S_1}(d_1^i), \dots, \mu_{S_{m+n}}(d_{m+n}^i)\right)}{I},   (3)

where \mu_{S_j}(d_j^i) (j = 1, …, m+n) is the membership degree of d_j^i in S_j;

conf = \frac{\sum_{i=1}^{I} \min\left(\mu_{S_1}(d_1^i), \dots, \mu_{S_{m+n}}(d_{m+n}^i)\right)}{\sum_{i=1}^{I} \min\left(\mu_{S_1}(d_1^i), \dots, \mu_{S_m}(d_m^i)\right)}.   (4)

¹ It can be noted that, in general, ARM is more covered by DM rather than ML (Witten & Frank, 2005a). During the 1990s, ARM was even often considered a synonym of DM (S. Zhang & Wu, 2011).
² Technically, the number of association-rule types is far more significant (Kumar, 2014): besides the ones described in the paper, there are also profile association rules, cyclic association rules, intertransaction association rules, etc. In this paper, we tend to apply the most generalized and basic interpretation and classification of association rules.
³ For the sake of conciseness, a single pair of formulas for supp and conf is shown in this paper: the formulas (3) and (4) are valid in the case of boolean, non-fuzzy quantitative and fuzzy association rules; though, in the case of the first two rule types, the definitions of supp and conf are usually simpler. Additionally, it can be noted that, though the minimum operator is the most popular choice in (3) and (4), the product can also be applied as t-norm (Dubois, Hüllermeier, & Prade, 2006).
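As an illustration, measures (3) and (4) can be sketched in Python (an illustrative sketch, not the paper's MATLAB toolbox; the dict-based record layout, the `triangular` helper and all names are assumptions):

```python
def triangular(a, b, c):
    """Return a triangular membership function with support [a, c] and peak b."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu

def supp_conf(data, antecedent, consequent):
    """supp and conf of a rule of form (2) per formulas (3)-(4), min as t-norm.

    data: list of records, each a dict attribute -> value;
    antecedent, consequent: dicts attribute -> membership function mu_Sj.
    """
    terms = {**antecedent, **consequent}
    # Degree to which each record supports the whole rule / the antecedent only.
    whole = [min(mu(r[a]) for a, mu in terms.items()) for r in data]
    ante = [min(mu(r[a]) for a, mu in antecedent.items()) for r in data]
    supp = sum(whole) / len(data)                      # formula (3)
    conf = sum(whole) / sum(ante) if sum(ante) else 0  # formula (4)
    return supp, conf
```

In the boolean and crisp quantitative cases the membership functions return only 0 or 1, and the same code reduces to the usual counting definitions of support and confidence.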

The basic ARM process can be described as follows.

First, a user specifies minsupp and minconf, which are the threshold values of supp and conf, respectively: the user seeks rules with supp ≥ minsupp and conf ≥ minconf.

Second, all possible arbitrary-length combinations of S_j (j = 1, …, m+n) are generated. Obviously, a combination can't contain two or more S_j belonging to the same attribute A_j (j = 1, …, m+n), since such a combination can't be further transformed into valid rules.

Third, supp is computed from D for each generated combination and compared to the user-defined minsupp.

Fourth, the combinations with supp ≥ minsupp are used to generate all possible association rules.

Fifth, conf is computed for each generated rule and compared to the user-defined minconf. The rules with conf ≥ minconf constitute the result of ARM.

The aforementioned ARM process is quite resource-intensive, and, in order to overcome this issue, the Apriori algorithm was proposed in (Agrawal et al., 1996; Agrawal & Srikant, 1994; Mannila, Toivonen, & Verkamo, 1994), which can be briefly described by the following rule: a combination of S_j (j = 1, …, m+n) can have supp ≥ minsupp only if all its subsets have supp ≥ minsupp. By following this rule, the total number of combinations of S_j (j = 1, …, m+n) generated during ARM can be significantly lowered by generating greater-length combinations only from shorter-length combinations having supp ≥ minsupp. Though the Apriori algorithm still remains quite popular in ARM, a lot of other ARM algorithms have been proposed since its introduction; they can be divided into two main groups: Apriori-based ones (e.g. DHP) and non-Apriori-based ones (e.g. FP-tree) (Zhang & He, 2010).

2.2. Clustering

One of the major research fields in DM and ML is clustering, whose goal is to discover groups (clusters) of highly similar records within mined data.

The amount of clustering methods available today is quite significant, and they can be classified, in a concise manner, as partitioning, hierarchical, density-based or grid-based methods (Han, Kamber, & Pei, 2012a). In this paper, we apply K-means (Lloyd, 1982)⁴, which belongs to the class of partitioning methods and is nearly the simplest clustering method available.

The clustering process by K-means can be described as follows.

First, a user specifies the number of clusters to be discovered within data.

Second, cluster centroids (centers, central points) need to be predefined. At this step, a stochastic procedure can be implemented, which defines each cluster centroid as a random point in the dataset's attribute space.

Third, each record of D is included in the nearest cluster: to measure the distance between data records and cluster centroids, the Euclidean metric is a quite common choice and is applied in this paper.

Fourth, cluster centroids are recomputed: an updated centroid is the average of the incorporated data records.

Fifth, if there is no difference between the preceding and recomputed cluster centroids, the clustering process is finished. Otherwise, the clustering process jumps to the third step.

⁴ Research (Lloyd, 1982) was actually introduced in 1957 in the form of a technical report.

3. Novel speed-up technique to assist association rule mining

3.1. Idea of the proposed speed-up technique

[Fig. 1. Illustrative example of the basic idea behind the proposed speed-up technique; panels (a)–(d).]

Before presenting the algorithm of the proposed speed-up technique, its basic idea is described by applying the following illustrative example. Let's consider a dataset R = {r1, r2, r3, r4, r5, r6, r7, r8, r9, r10} having 10 records and 2 numeric attributes. The dataset is shown in graphic form in Fig. 1(a). As seen from the figure, the dataset contains several rather similar records, causing some data redundancy, whose elimination could possibly speed up the ARM process of R. In order to eliminate the outlined redundancy, the following procedure can be performed:

1. The rather similar records can be discovered by performing clustering of the dataset: rather similar records are more likely to get into the same clusters. Let's say, in our example, the r1, r2, r3, r4 records got into the cluster C1 with the centroid c1 and the remaining records got into the cluster C2 with the centroid c2 (Fig. 1(b)).
2. For each record, the distance to the corresponding cluster center is computed. If the distance is lower than some accepted value, the record is considered to be close to the centroid. Let's say, in our example, the r1, r2, r3, r4 and r6, r7, r8, r9, r10 records are discovered to be close to the corresponding centroids (Fig. 1(c)).
3. If a record is a close-to-the-centroid one, it can be encapsulated by the corresponding centroid: original close-to-the-centroid records are replaced by added records (the corresponding centroids). In our example, r1, r2, r3, r4 are replaced by c1, and r6, r7, r8, r9, r10 are replaced by c2 in R. As the result, R is transformed into its generalized form R̂ = {c1(4), r5(1), c2(5)}, where (·) defines the number of encapsulated original records, called weight in this paper (Fig. 1(d)).
4. ARM of R̂ possesses lower computational cost, since R̂ contains less data in comparison with R: the original R contains 10 records with 2 attribute values describing a record, and R̂ contains 3 records with 2 attribute values and 1 weight index describing a record. Namely, the overall decrease in the computational cost of ARM is caused by the decrease in the cost of the supp and conf computations. In the case of applying R̂ instead of R, supp is computed as

   supp = \frac{4\,\min\left(\mu_{S_1}(c_{11}), \mu_{S_2}(c_{12})\right) + 1\,\min\left(\mu_{S_1}(r_{51}), \mu_{S_2}(r_{52})\right) + 5\,\min\left(\mu_{S_1}(c_{21}), \mu_{S_2}(c_{22})\right)}{4 + 1 + 5}

   instead of \frac{\sum_{i=1}^{10} \min\left(\mu_{S_1}(r_{i1}), \mu_{S_2}(r_{i2})\right)}{10}. Computation of conf is updated in a similar way.
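The consistency between the "wrapped" supp computation over R̂ and the plain computation over R can be checked numerically. The sketch below is illustrative: the record coordinates and the membership functions are assumptions, not values taken from Fig. 1:

```python
def weighted_supp(records, mus):
    """supp over weighted records (r, w): sum of w * min_j(mu_j(r_j)) / sum of w."""
    num = sum(w * min(mu(r[j]) for j, mu in enumerate(mus)) for r, w in records)
    return num / sum(w for _, w in records)

# Illustrative membership functions of two linguistic terms S1, S2.
mu1 = lambda v: max(0.0, 1.0 - abs(v - 0.2) / 0.5)
mu2 = lambda v: max(0.0, 1.0 - abs(v - 0.8) / 0.5)

# Generalized form R_hat = {c1(4), r5(1), c2(5)}; coordinates are assumed.
c1, r5, c2 = (0.2, 0.8), (0.5, 0.5), (0.8, 0.2)
r_hat = [(c1, 4), (r5, 1), (c2, 5)]
# "Unwrapped" version: every record carries a singular weight.
unwrapped = [(c1, 1)] * 4 + [(r5, 1)] + [(c2, 1)] * 5

# The wrapped and unwrapped computations agree exactly.
assert abs(weighted_supp(r_hat, [mu1, mu2])
           - weighted_supp(unwrapped, [mu1, mu2])) < 1e-12
```

The wrapped form evaluates the membership degrees of 3 records instead of 10, which is precisely where the decrease in the cost of the supp and conf computations comes from.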

For a more profound understanding of the aforementioned example, R̂ can be rewritten in its "unwrapped" version⁵ R̂ = {c1, c1, c1, c1, r5, c2, c2, c2, c2, c2}, which shows that the performed generalization basically replaced several original records with the closely-situated centroids. As seen from the "unwrapped" version of R̂, the supp & conf computations in the case of R̂ are [functionally] fully consistent with the supp & conf computations in the case of R: the proposed speed-up doesn't apply new quality-measure formulas but rather "wraps" the existing ones.

To conclude, the basic idea of the proposed speed-up technique is that, in order to simplify a dataset and the corresponding ARM, the rather similar records can be "moved" in the dataset's attribute space from their initial unique points to some common excessive points (centroids); where the uniqueness property means that an initial point represents only the original record itself, and the excessiveness property means that a common point represents several original records: the uniqueness property is denoted by assigning a singular-weight index (weight equal to 1) to the original records, and the excessiveness property is denoted by assigning a plural-weight index (weight equal to the number of encapsulated original records) to the added records (cluster centroids).

3.2. Algorithm of the proposed speed-up technique

The proposed speed-up technique can be viewed as a sequence of the following steps:

1. A mined dataset D is normalized for further clustering.
2. A clustering method is applied to D.
3. Centroids of the obtained clusters are added to D as new records.
4. The dataset D is supplemented with a new weight attribute W: W is set equal to 1 in original data records and 0 in added data records (cluster centroids).
5. A user specifies mindist ∈ [0, 1], which is the maximum acceptable distance (dist) between an original data record and the corresponding cluster centroid⁶.
6. Original data records with dist ≤ mindist are deleted from D, and their W values are added to the W values of the corresponding cluster centroids (added data records).
7. Added data records with the weight remaining equal to 0 are deleted from D.

As a result of applying the speed-up technique, D is transformed into its generalized form D̂ with a reduced number of records:

D̂ = \{ \hat{d}_j^i \mid i = 1, \dots, \hat{I};\ j = 1, \dots, J+1 \},   (5)

where \hat{d}_j^i is the value of the jth attribute in the ith record, \hat{I} is the number of records, and J is the total number of original attributes (without considering the added weight attribute/index W).

ARM of D̂ claims the reconsideration of formulas (3) and (4). Their updated versions are shown below:

supp = \frac{\sum_{i=1}^{\hat{I}} W^i \min\left(\mu_{S_1}(\hat{d}_1^i), \dots, \mu_{S_{m+n}}(\hat{d}_{m+n}^i)\right)}{\sum_{i=1}^{\hat{I}} W^i},   (6)

conf = \frac{\sum_{i=1}^{\hat{I}} W^i \min\left(\mu_{S_1}(\hat{d}_1^i), \dots, \mu_{S_{m+n}}(\hat{d}_{m+n}^i)\right)}{\sum_{i=1}^{\hat{I}} W^i \min\left(\mu_{S_1}(\hat{d}_1^i), \dots, \mu_{S_m}(\hat{d}_m^i)\right)}.   (7)

⁵ It should be noted that the "unwrapped" version of the generalized form of an original dataset is provided here only for illustration. In the proposed speed-up technique, application of the generalized form in its "unwrapped" version will eliminate any possible speed-up in ARM, since the "unwrapped" version contains the same number of records as the original dataset itself.
⁶ The proposed speed-up technique basically has only one parameter to be defined – mindist. However, being a clustering-based one, the technique also possesses the corresponding parameters of the applied clustering method. In this paper, we apply the K-means clustering method, which needs the number [and centers] of clusters to be predefined. Thus, the proposed speed-up technique gets two parameters to be defined – mindist and the number of clusters (cluster centers are defined in a stochastic manner).

4. Experimental verification of the proposed speed-up technique

Experimental verification in this section involves the following hardware and software:

• Dell Latitude E6420 notebook (Intel Core i5-2520M, 2 × 4 GB DDR3 SDRAM) running MS Windows 8.1.
• Toolbox (Vugar Mirzakhanov, 2019), applying the proposed speed-up technique in ARM and designed by means of MATLAB (R2017b).

4.1. Datasets and partitioning

The three following datasets are used in this paper to verify the proposed speed-up technique: "MAGIC Gamma Telescope", "HTRU 2" and "Wilt". The applied datasets are causal ones, where a causal dataset is a dataset with its attributes predetermined as antecedent (ant.) or consequent (con.) ones. The reason for using causal datasets in the experimental verification is explained in Section 4.2.

The "MAGIC Gamma Telescope" dataset (Bock & Savicky, 2007) is a set of records representing properties of images obtained by a Cherenkov gamma-ray telescope (Bock et al., 2004). The dataset's dimension is 19020 × 11, and its attributes are briefly described in Table 1.

The "HTRU 2" dataset (Lyon, 2017) is a set of records representing properties of pulsar candidates (Lyon, 2016). The dataset's dimension is 17898 × 9, and its attributes are briefly described in Table 2.

The "Wilt" dataset (Johnson, 2014) is a set of records representing properties of satellite images to detect diseased trees (Johnson, Tateishi, & Hoan, 2013). The dataset's dimension is 4839 × 6, and its attributes are briefly described in Table 3.

Attribute partitioning (fuzzy and non-fuzzy) for the "MAGIC Gamma Telescope", "HTRU 2" and "Wilt" datasets is shown in Figs. 2–4, respectively. In this paper, we apply equiwidth partitioning [with triangular terms in the fuzzy case]: such a choice conforms to the recommendations in (Hullermeier & Yi, 2007). We do not perform partitioning manually by a user/expert in order to eliminate the possibility of getting partitions deliberately hand-picked to satisfy the desired experimental-verification results. Also, we do not perform partitioning by clustering in order not to artificially suppress the "boundary effects" in the non-fuzzy case: if obtaining partitions by clustering, the major part of data is more likely to be located at the center rather than at the boundaries of a partition, which reduces the negative effect of applying crisp (not fuzzy) partition boundaries in ARM (Hullermeier & Yi, 2007).
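A minimal end-to-end sketch of steps 1–7 is shown below (an illustrative Python sketch, not the paper's MATLAB toolbox; min-max normalization and the tie-breaking details are assumptions, since the paper doesn't fix them):

```python
import math
import random

def generalize(data, k, mindist, iters=100, seed=0):
    """Steps 1-7: normalize, cluster with K-means, encapsulate close records.

    Returns the generalized dataset as a list of (record, weight) pairs.
    """
    n = len(data[0])
    # Step 1: min-max normalization of each attribute to [0, 1].
    lo = [min(r[j] for r in data) for j in range(n)]
    hi = [max(r[j] for r in data) for j in range(n)]
    norm = [tuple((r[j] - lo[j]) / ((hi[j] - lo[j]) or 1.0) for j in range(n))
            for r in data]
    # Step 2: K-means with stochastic centroid initialization (Section 2.2).
    rng = random.Random(seed)
    cents = rng.sample(norm, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in norm:
            groups[min(range(k), key=lambda i: math.dist(p, cents[i]))].append(p)
        new = [tuple(sum(v) / len(g) for v in zip(*g)) if g else cents[j]
               for j, g in enumerate(groups)]
        if new == cents:  # no centroid moved: clustering finished
            break
        cents = new
    # Steps 3-4: centroids enter with weight 0; originals carry weight 1.
    # Steps 5-6: originals with dist <= mindist transfer their weight to the
    # nearest centroid and are deleted; step 7 drops centroids left at weight 0.
    cent_w = [0] * k
    result = []
    for p in norm:
        c = min(range(k), key=lambda i: math.dist(p, cents[i]))
        if math.dist(p, cents[c]) <= mindist:
            cent_w[c] += 1
        else:
            result.append((p, 1))
    result += [(cents[c], w) for c, w in enumerate(cent_w) if w > 0]
    return result
```

Note that the total weight of the generalized dataset always equals the number of original records, which is what keeps the weighted measures (6) and (7) consistent with (3) and (4).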

Table 1
Attribute information on "MAGIC Gamma Telescope".

№   Name      Type  Range
1   fLength   ant.  [4.2835, 334.177]
2   fWidth    ant.  [0, 256.382]
3   fSize     ant.  [1.9413, 5.3233]
4   fConc     ant.  [0.0131, 0.893]
5   fConc1    ant.  [0.0003, 0.6752]
6   fAsym     ant.  [−457.9161, 575.2407]
7   fM3Long   ant.  [−331.78, 238.321]
8   fM3Trans  ant.  [−205.8947, 179.851]
9   fAlpha    ant.  [0, 90]
10  fDist     ant.  [1.2826, 495.561]
11  Class     con.  {0 – Hadron, 1 – Gamma}

Table 2
Attribute information on "HTRU 2".

№   Name   Type  Range
1   MP     ant.  [5.8125, 192.6172]
2   DP     ant.  [24.772, 98.7789]
3   KP     ant.  [−1.876, 8.0695]
4   SP     ant.  [−1.7919, 68.1016]
5   MC     ant.  [0.2132, 223.3921]
6   DC     ant.  [7.3704, 110.6422]
7   KC     ant.  [−3.1393, 34.5398]
8   SC     ant.  [−1.977, 1.191]
9   Class  con.  {0 – Non-pulsar, 1 – Pulsar}

Table 3
Attribute information on "Wilt".

№   Name      Type  Range
1   GLCM_Pan  ant.  [0, 183.2813]
2   Mean_G    ant.  [117.2105, 1848.9]
3   Mean_R    ant.  [50.5789, 1594.6]
4   Mean_NIR  ant.  [86.5, 1597.3]
5   SD_Pan    ant.  [0, 156.5084]
6   Class     con.  {0 – Other, 1 – Diseased tree}

[Fig. 2. Linguistic variables X (X = fLength, fWidth, fSize, fConc, fConc1, fAsym, fM3Long, fM3Trans, fAlpha, fDist), where [lX, rX] is the range of X in "MAGIC Gamma Telescope".]

[Fig. 3. Linguistic variables X (X = MP, DP, KP, SP, MC, DC, KC, SC), where [lX, rX] is the range of X in "HTRU 2".]

[Fig. 4. Linguistic variables X (X = GLCM_Pan, Mean_G, Mean_R, Mean_NIR, SD_Pan), where [lX, rX] is the range of X in "Wilt".]

4.2. Synthesis of experimental-verification procedure

In this paper, a speed-up technique applied in the DM and/or ML fields is expected to satisfy the following two conditions:

• The speed-up technique shouldn't have a [notable] negative impact on effectiveness.
• The speed-up technique should have a [notable] positive impact on efficiency.

Thus, in order to be able to verify the compliance of the proposed speed-up technique with the aforementioned conditions, we need to define the effectiveness and efficiency measures applied in ARM in this paper.

Effectiveness measure. Similarity measures, applied in (Hullermeier & Yi, 2007; Verlinde, De Cock, & Boute, 2006), are not applicable in our case: they can only identify whether or not ARM results are similar, but can't determine which ARM result is better/worse. In this paper, in order to measure the effectiveness of ARM, we apply ARM results to associative classification (AC), whose goal is to build a classifier based on association rules (Thabtah, 2007). A classifier usually has a predefined and constant set of inputs/outputs, which is the reason why we use causal datasets (see Section 4.1) and only full-length rules, applying all applicable data attributes, in the experimental verification. In our case, the classifier is a Mamdani-type fuzzy inference system (FIS) (Mamdani & Assilian, 1975) that functions as follows:

• In the case of non-fuzzy ARM, there can be no more than one rule "firing" at a time (because of non-overlapping partitioning), so the class value is commonly defined in a crisp way (0 or 1) by a single rule.
• In the case of fuzzy ARM, there can be several rules "firing" at a time, so the class value is commonly defined in a fuzzy way (a value within [0, 1]) by multiple rules. The predicted class, in this case, is the one closest to the class value (e.g. 0.6 is closer to 1, so the predicted class is 1).

Thus, we apply classification error as the effectiveness measure in this paper, which is the relative number of the records incorrectly classified by an ARM-based FIS.

Efficiency measure. We apply processing time as the efficiency measure in this paper. Processing time is the total amount of time spent by a computer to obtain an association-rule list. Processing time is the sum of ARM time and clustering time (if applicable).

In this paper, the experimental verification applies basic ARM with Apriori (see Section 2.1) and applies K-means (see Section 2.2) as the clustering method in the proposed speed-up technique. The parameters of the speed-up technique are defined as follows: number of clusters ∈ {10, 100, 1000} and mindist ∈ {0.1, 0.3, 0.5}. The experimental-verification procedure is synthesized as follows:

1. Minsupp is set equal to 4.8828 × 10⁻⁴, 7.6294 × 10⁻⁶ and 6.43 × 10⁻⁵ in the case of processing the "MAGIC Gamma Telescope", "HTRU 2" and "Wilt" datasets, respectively. Minconf is 0.5 in all three cases⁷.
2. Non-fuzzy and fuzzy ARM processes are performed in the case of all three datasets with/without applying the proposed speed-up technique.

⁷ The reasonableness of these choices is discussed in Section 5.

3. Obtained association-rule lists are cleared from conflicts8 and


passed to the rule bases of the corresponding FISs. Each rule in
each FIS has the weight set equal to its conf value, so the ‘‘less-
precise” rules will have less impact on performance than the
‘‘more-precise” ones.
4. Designed FISs [and, therefore, the corresponding ARM pro-
cesses] are compared on the corresponding datasets in terms
of classification error and processing time.

In the aforementioned experimental-verification procedure, the


applied K-means method possesses some performance instability
(because of stochastic initialization), and the applied efficiency
measure is also unstable (different processing-time measurements
provide slightly different results). Therefore, each experiment per-
formed in accordance with the aforementioned procedure is
repeated 10 times, and the mean values of classification error
and processing time are considered as the final ones.

4.3. Experiments

First, ARM processes with speed-up (marked in green and


denoted as S) and ARM processes without speed-up (marked in Fig. 5. Fuzzy (left) and non-fuzzy (right) ARM of the ‘‘MAGIC Gamma Telescope”
red and denoted as $) are compared in terms of effectiveness in dataset.
accordance with the experimental-verification procedure from
Section 4.2 (Figs. 5–7):

- Application of the proposed speed-up technique in fuzzy ARM doesn’t have a notable negative impact on effectiveness, since the increase in classification error is less than 1% in all the experiments.
- Application of the proposed speed-up technique in non-fuzzy ARM has a significant negative impact on effectiveness, which weakens only in the case of low-level data generalization (discussed in Section 5).

For reference purposes, the data from Figs. 5–7 are supplemented with the additional parameter Î/I × 100, which is the relative size (%) of the generalized data obtained by applying the speed-up technique (Table 4).
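For illustration, a minimal one-dimensional sketch of the generalization step and of the relative-size parameter is given below. The names and data are assumed, and K-means is replaced by fixed centroids for brevity:

```python
# Minimal 1-D sketch of the proposed generalization step (assumed names,
# made-up data): records within mindist of their nearest centroid are
# merged into that centroid, which carries a weight equal to the number
# of merged records. K-means itself is replaced by fixed centroids here.

def generalize(records, centroids, mindist):
    kept = []
    weights = {c: 0 for c in centroids}
    for r in records:
        nearest = min(centroids, key=lambda c: abs(c - r))
        if abs(nearest - r) <= mindist:
            weights[nearest] += 1     # record merged into the centroid
        else:
            kept.append((r, 1))       # record kept as-is, with unit weight
    kept += [(c, w) for c, w in weights.items() if w > 0]
    return kept

records = [0.10, 0.12, 0.15, 0.90, 0.95, 5.00]
generalized = generalize(records, centroids=[0.12, 0.92], mindist=0.1)
relative_size = len(generalized) / len(records) * 100   # relative data size (%)
```

In this toy run, six original records are reduced to three weighted ones (two centroids plus one outlier), so the relative data size is 50%.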
Second, ARM processes with speed-up and ARM processes without speed-up are compared in terms of efficiency, in accordance with the experimental-verification procedure from Section 4.2 (Figs. 8 and 9). Non-fuzzy ARM processes and ARM processes on the ‘‘Wilt” dataset aren’t covered by the figures, which is justified as follows:

- The speed-up technique has already failed to satisfy the ‘‘no-negative-impact-on-effectiveness” condition (see Section 4.2) in the case of non-fuzzy ARM, so the corresponding analysis of efficiency is rather redundant.
- Non-fuzzy ARM, in the case of the applied datasets, isn’t resource-intensive enough to properly analyze the increase in efficiency gained by applying the speed-up technique.
- ARM of the ‘‘Wilt” dataset isn’t resource-intensive enough to properly analyze the increase in efficiency gained by applying the speed-up technique.

It should be explicitly pointed out that the proposed speed-up technique does provide an increase in efficiency in the aforementioned cases; the increase is just not presentable and valid enough, due to the low computational cost of ARM in these cases.

As seen from Figs. 8 and 9, the proposed speed-up technique does have a notable positive impact on efficiency in ARM. However, the efficiency of the technique depends on its chosen parameters, and some notes are taken on this issue:

- Applying too low values of number of clusters and/or mindist may be unsuitable in some cases, since it can significantly increase the size of D̂ and, therefore, increase processing time. For example, ARM of the ‘‘MAGIC Gamma Telescope” dataset actually slows down a bit if the technique is applied with 10 clusters and mindist = 0.1 (see Fig. 8 and Table 4).
- Applying too high values of number of clusters can cause the situation where a bunch of additional records (cluster centers) appear with only 1–2 original data records incorporated by each centroid. Thus, the decrease in original-data size can be counterbalanced by the corresponding increase in additional-data size, which negatively affects processing time. For example, ARM of the ‘‘Wilt” dataset (having only 4839 records) gets more records to be processed if the technique is applied with 1000 clusters instead of 100 clusters (see Table 4).

At the end, we come to the following results of the performed experimental verification:

- In the case of fuzzy ARM, the proposed speed-up technique works well, since it satisfies both the ‘‘no-negative-impact-on-effectiveness” and the ‘‘positive-impact-on-efficiency” conditions (see Section 4.2).
- In the case of non-fuzzy ARM, the proposed speed-up technique works poorly, since it doesn’t satisfy the ‘‘no-negative-impact-on-effectiveness” condition.

Fig. 6. Fuzzy (left) and non-fuzzy (right) ARM of the ‘‘HTRU 2” dataset.
Fig. 7. Fuzzy (left) and non-fuzzy (right) ARM of the ‘‘Wilt” dataset.
Fig. 8. Fuzzy ARM of the ‘‘MAGIC Gamma Telescope” dataset.
Fig. 9. Fuzzy ARM of the ‘‘HTRU 2” dataset.

Table 4
Size of data generalized by applying the speed-up technique.

Dataset                     Number of clusters   mindist   Relative data size (%)
‘‘MAGIC Gamma Telescope”    10                   0.1       99
                                                 0.3       38
                                                 0.5       12
                            100                  0.1       89
                                                 0.3       12
                                                 0.5       2
                            1000                 0.1       58
                                                 0.3       8
                                                 0.5       5
‘‘HTRU 2”                   10                   0.1       43
                                                 0.3       7
                                                 0.5       2
                            100                  0.1       13
                                                 0.3       1
                                                 0.5       1
                            1000                 0.1       8
                                                 0.3       6
                                                 0.5       6
‘‘Wilt”                     10                   0.1       18
                                                 0.3       2
                                                 0.5       1
                            100                  0.1       5
                                                 0.3       2
                                                 0.5       2
                            1000                 0.1       21
                                                 0.3       21
                                                 0.5       21

8 Conflict, in this case, means the appearance of conflicting rules, which are the rules with the same antecedent part. In order to eliminate a conflict, only the conflicting rule with the highest conf value is retained in the rule list.

5. Discussions and conclusions

This section is organized as follows. Each discussed issue forms its own subsection and is presented in a question-and-answer format: the question is the heading, and the answer is the body of the subsection. The last subsection presents the overall conclusions of the paper.

5.1. Why do we need the proposed speed-up technique?

Though the ARM processes shown in this paper take little time to perform, the goal of ARM is to process rather large and complex datasets, like (Reiss, 2012; Whiteson, 2014a, 2014b). ARM of such datasets can take extensive amounts of time, and this is the case when the proposed speed-up technique becomes handy.

5.2. Is the proposed speed-up technique novel?

Joint application of ARM and clustering is a relatively popular topic, and, to date, a lot of corresponding research has been reported in the data-mining and machine-learning literature. A few of these works are listed below:
- In (Alhajj & Kaya, 2008; Chen, Hong, & Tseng, 2006; Kaya & Alhajj, 2003a, 2003b, 2004; Li, 2009; Mangalampalli & Pudi, 2010; Tan, 2018; Thomas & Raju, 2014; Been-Chian Chien, 2001; Jia et al., 2015; Li et al., 2015), clustering is applied to partitioning [of data attribute ranges] in ARM.
- In (Ananthanarayana, Murty, & Subramanian, 2001; Lai & Yang, 2000; Quan, Ngo, & Hui, 2009; Riaz, Arooj, Hassan, & Kim, 2014; Yotsawat & Srivihok, 2015), data are clustered, and ARM is performed separately within each cluster.
- In (Lent, Swami, & Widom, 1997; Pi, Qin, & Yuan, 2006), clustering is applied [for different purposes] to the results of ARM.
- In (Chaudhary, Papapanagiotou, & Devetsikiotis, 2010; Sobhanam & Mariappan, 2013), clustering and ARM are applied to solve tasks from different application domains.
- In (Zhao, Zhang, & Zhang, 2004), clustering is applied to discover similar/dissimilar data attributes. It is stated that association rules with dissimilar attributes are more interesting.
- In (Grissa Touzi, Thabet, & Sassi, 2011), clustering is applied to an initial dataset, and the data attributes are replaced by the obtained clusters. Thus, a new dataset is generated, with each record defining the membership of the corresponding initial-dataset record in an obtained cluster. ARM of such a dataset provides a set of so-called ‘‘meta-rules”. It is stated that the required ordinary association rules can be derived from the ‘‘meta-rules”. The goal of that paper is to reduce the number of data attributes (by replacing them with clusters) and, therefore, to reduce the computational cost of ARM.
- In (Watanabe & Takahashi, 2006), data are clustered, and supp (number of records) of each cluster is measured. So, when performing ARM, supp is not computed in the usual way but derived from supp of the corresponding clusters.

So, despite the topic itself being not novel, to the best of our knowledge, the fusion of ARM and clustering proposed in this paper hasn’t been reported before in the scientific literature.

5.3. Why does the proposed speed-up technique work poorly in non-fuzzy ARM?

The proposed speed-up technique works poorly in the case of non-fuzzy ARM mostly because of ‘‘boundary effects”. The term ‘‘boundary effects” is taken from (Hullermeier & Yi, 2007) and stands for negative anomalies caused by applying crisp partition boundaries. According to (Sudkamp, 2005), there is a major boundary anomaly called partition instability. Partition instability can be observed when a significant number of data records are located near crisp partition boundaries: if the partition boundaries are slightly changed, a significant number of records fall out of (fall into) the corresponding partitions and cause significant changes to the results of ARM (Sudkamp, 2005).

The boundary anomaly that occurs in this paper is a variation of the aforementioned partition instability. In our case, a significant number of records fall out of (fall into) the corresponding crisp partitions not because of changes in partitioning but because of changes in the data: additional records (cluster centers) that incorporate original records near partition boundaries appear and make the data fall out of (fall into) the corresponding crisp partitions. In our case, the partition instability is mostly a function of mindist and number of clusters:

- In the case of high mindist and/or low number of clusters, the difference between the original and the corresponding additional data records tends to be more significant, which enhances the partition-instability effect. This statement is proved by the performed experimental verification: as seen from Figs. 5–7, non-fuzzy ARM with the speed-up technique applied gets less and less effective when increasing mindist and/or decreasing number of clusters.
- In the case of low mindist and/or high number of clusters, the difference between the original and the corresponding additional data records tends to be less significant, which reduces the partition-instability effect. This statement is proved by the performed experimental verification: as seen from Figs. 5–7, non-fuzzy ARM with the speed-up technique applied gets more and more effective when decreasing mindist and/or increasing number of clusters.

The aforementioned case of ‘‘low mindist and/or high number of clusters” is called low-level data generalization in this paper. In the case of low-level data generalization, the speed-up technique doesn’t have a [notable] negative impact on the effectiveness of non-fuzzy ARM. However, as explained in Section 4, applying ‘‘low mindist and/or high number of clusters” is not desired, since it tends to negatively affect the increase in efficiency of ARM. Thus, despite not lowering ARM’s effectiveness in the case of low-level data generalization, the application of the speed-up technique in non-fuzzy ARM still remains rather useless.

5.4. Is applying Apriori sufficient to verify the proposed speed-up technique?

In this paper, the proposed speed-up technique is applied in ARM processes performed by using the Apriori algorithm. Such a decision causes one major issue, which is discussed as follows.

As stated in Section 2.1, the Apriori algorithm was originally proposed and applied in ARM to decrease its computational cost. Thus, the Apriori algorithm is itself a speed-up technique in ARM. From this perspective, it seems that the proposed speed-up in ARM has been verified by fusing two speed-up techniques: the proposed one and Apriori. The corresponding issue is that it is not obvious why the proposed speed-up technique, being successfully fused with one [speed-up] algorithm in ARM, should be considered as commonly applicable in ARM: there are a lot of other ARM algorithms and methods, and they may be not so compatible/consistent with the proposed speed-up technique.

The aforementioned issue is consecutively discussed as follows:

- Indeed, there are multiple ARM algorithms and methods in the field; and, technically, the existence of algorithms/methods poorly consistent with the proposed speed-up technique is rather certain. However, if a technique is consistent with the major part of ARM algorithms and methods, it can still be considered as commonly applicable in ARM.
- As stated in Section 2.1, supp and conf are two basic quality measures in ARM; and the major part of ARM methods and algorithms, despite providing different variations/modifications to the ARM process, still apply either supp & conf or their functional derivatives (Delgado et al., 2005; Delgado, Ruiz, Sánchez, & Vila, 2015; Lenca, Vaillant, Meyer, & Lallich, 2007).
- As shown in Section 3, the particularity of the proposed speed-up technique in ARM is that it directly affects not the whole ARM process but only the computation of supp and conf: it ‘‘wraps” the original quality-measure formulas to reduce the computational cost.
- Thus, if the proposed speed-up technique increases the efficiency of ARM by only ‘‘wrapping” the supp & conf formulas, and the major part of ARM methods and algorithms applies either supp & conf or their derivatives, then the proposed speed-up is rather commonly applicable in ARM.
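The ‘‘wrapping” idea from Section 5.4 can be sketched as follows: supp and conf are computed over (itemset, weight) pairs instead of raw records, so that any supp/conf-based ARM algorithm can consume the generalized data. This is an illustrative Python sketch under assumed names, not the paper's implementation:

```python
# Illustrative sketch of the ‘‘wrapped” supp & conf formulas (assumed
# names, made-up data): a centroid standing for several original records
# contributes its full weight to the counts.

def supp(itemset, weighted_records):
    total = sum(w for _, w in weighted_records)
    hit = sum(w for items, w in weighted_records if itemset <= items)
    return hit / total

def conf(antecedent, consequent, weighted_records):
    return (supp(antecedent | consequent, weighted_records)
            / supp(antecedent, weighted_records))

# three weighted records standing for eight original ones
data = [({"A", "B"}, 4), ({"A"}, 2), ({"B", "C"}, 2)]
s = supp({"A"}, data)          # (4 + 2) / 8 = 0.75
c = conf({"A"}, {"B"}, data)   # 0.5 / 0.75 = 2/3
```

Only the counting changes; the candidate generation of Apriori (or of any other supp/conf-based method) is untouched, which is the point made in the last bullet above.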
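The partition instability discussed in Section 5.3 can also be reproduced with a toy numerical example (all values made up): under a crisp partition boundary, a near-boundary record switches partitions once it is replaced by its cluster centroid.

```python
# Toy illustration of the boundary effect from Section 5.3 (made-up
# values): under a crisp partition boundary at 0.5, a near-boundary
# record changes bins once it is replaced by its cluster centroid.

def bin_of(x, boundary=0.5):
    return "low" if x < boundary else "high"

originals = [0.48, 0.49, 0.51]                  # one cluster of near-boundary records
centroid = sum(originals) / len(originals)      # ~0.4933, i.e. below the boundary

before = [bin_of(x) for x in originals]         # the 0.51 record is "high"
after = [bin_of(centroid)] * len(originals)     # after merging, all fall into "low"
```

The 0.51 record ‘‘falls out” of the high partition after generalization, which is exactly the kind of change in supp counts that hurts non-fuzzy ARM but is smoothed away by gradual fuzzy membership functions.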

5.5. Why is K-means and only K-means applied in the proposed speed-up technique?

K-means is applied in this paper for the following reasons:

- K-means is simple to perceive and understand.
- With K-means being applied, the speed-up technique gets only two parameters to be defined (mindist and number of clusters): this makes experimental verification easier, since the results can be compactly displayed in two-dimensional figures, and it simplifies the analysis of the technique.
- K-means has very low computational cost, which is quite important, since the proposed technique is intended to speed up ARM.

Additional clustering methods aren’t considered within the paper for the following reasons:

- Additional clustering methods would make the paper more sophisticated.
- In general, adding supplementary clustering methods doesn’t contribute to the goal of the paper: obviously, the better (more efficient/effective) a clustering method is, the better the corresponding speed-up is; and the goal of the paper is not the survey and comparison [in terms of effectiveness/efficiency] of clustering methods.

5.6. What is the justification of the minsupp and minconf values applied during experimental verification?

In general, the computational cost of ARM critically depends on minsupp (Witten & Frank, 2005b). Thus, in this paper, it is quite important to define an adequate minsupp: too low a minsupp can significantly increase processing time and make the experimental-verification process a bit morbid, and too high a minsupp can ruin the efficiency analysis of the proposed technique.

In this paper, we apply the technique proposed in (Mirzakhanov & Gardashova, 2019) to compute an adequate minsupp, which functions as follows: the technique assumes that minsupp of an association rule should be consistent with its coverage of the dataset’s attribute space. For example, the term MP2 (see Fig. 3) covers 0.25, and the term Pulsar (see Table 2) covers 0.5 of the attribute space: thus, the rule ‘‘IF MP is MP2, THEN Class is Pulsar” covers 0.125 (0.25 × 0.5) of the attribute space and gets the same minsupp9. In the experimental verification, in order to get a single value of minsupp for all rules in ARM, minsupp’s computation is based on the attribute-space coverage of only full-range linguistic terms (for example, MP2, MP3 and MP4 in Fig. 3 are the same-coverage full-range terms, but MP1 and MP5 are the half-range ones).

Before explaining why minconf is set equal to 0.5, two statements should be made without proof (for conciseness):

- The sum of conflicting rules’ conf values is equal to 1 in classification tasks (tasks with crisp consequent terms).
- A conflicting rule with a higher supp value has a higher conf value.

Since only the conflicting rule with the highest conf value is retained in a rule list and applied in the experimental verification (see Section 4.2), it can be derived from the aforementioned statements that no rules with conf < 0.5 can be retained in a rule list and applied in the experiments. Thus, the minconf = 0.5 choice applied in the experiments is actually identical to minconf = 0, since the rules with conf < 0.5 can’t pass the conflict-cleaning process anyway.

5.7. Is it possible to revise the proposed speed-up technique in future work to make fuzzy clustering applicable?

The proposed speed-up technique is based on non-fuzzy clustering, so we have analyzed the fusion of non-fuzzy clustering with fuzzy and non-fuzzy ARM in this paper. It might be interesting to revise the speed-up technique in a way that makes fuzzy clustering applicable, so we could analyze the fusion of fuzzy clustering with fuzzy and non-fuzzy ARM.

The proposed speed-up technique, in its current form, is not appropriate for the aforementioned future work, and the main reason is explained as follows. In the case of applying fuzzy clustering in the speed-up, each record of the dataset D gets not a single cluster but a list of membership degrees in all applicable clusters. Such clustering results require the reconsideration of the current data-generalization procedure, and some thoughts on the issue are provided below:

- At first, it seems that the aforementioned issue has a simple solution: the cluster possessing the highest membership degree of a record is to be selected as the superior one, so the record is deleted from D and its weight is encapsulated by the cluster centroid of the superior cluster (of course, only if dist ≤ mindist). However, this solution eliminates almost any possible advantage of using fuzzy clustering: if the goal is to get a single most suitable cluster for a record, then the performance of fuzzy clustering [in the proposed speed-up technique] becomes rather similar to the performance of non-fuzzy clustering.
- A possibly more relevant solution to the aforementioned issue is to select not a single but multiple superior clusters for a record of D; so, if the record is deleted from D, its weight is divided into several parts, and each weight part is encapsulated by the centroid of the corresponding superior cluster: the greater the membership degree of the record in a superior cluster is, the greater the weight part encapsulated by the corresponding centroid is.

5.8. Conclusions

The contribution of this paper can be summarized as follows:

- A clustering-based technique is proposed to speed up ARM.
- The speed-up technique is proposed and considered within the frame of FL, DM and ML relations.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

9 In the experimental verification, only full-length association rules are applied to classification (see Section 4.2), so the rule ‘‘IF MP is MP2, THEN Class is Pulsar” shown here is not actually used during verification: this rule is only applied here to illustrate the computation of minsupp.

References

Aggarwal, C. C. (2015). Data Mining. https://doi.org/10.1007/978-3-319-14142-8.
Agrawal, R., Imieliński, T., & Swami, A. (1993). Mining association rules between sets of items in large databases. Proceedings of the 1993 ACM SIGMOD
International Conference on Management of Data, 207–216. https://doi.org/10.1145/170035.170072.
Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., & Verkamo, A. I. (1996). Fast discovery of association rules. In U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, & R. Uthurusamy (Eds.), Advances in Knowledge Discovery and Data Mining (pp. 307–328). Retrieved from http://dl.acm.org/citation.cfm?id=257938.257975.
Agrawal, R., & Srikant, R. (1994). Fast algorithms for mining association rules. Proceedings of the 20th VLDB Conference, 487–499. San Francisco: Morgan Kaufmann Publishers Inc.
Alhajj, R., & Kaya, M. (2008). Multi-objective genetic algorithms based automated clustering for fuzzy association rules mining. Journal of Intelligent Information Systems, 31(3), 243–264. https://doi.org/10.1007/s10844-007-0044-1.
Alpaydın, E. (2010a). Introduction. In Introduction to Machine Learning (2nd ed., pp. 1–19). Cambridge, MA, USA: MIT Press.
Alpaydın, E. (2010b). Introduction to Machine Learning (2nd ed.). Cambridge, MA, USA: MIT Press.
Ananthanarayana, V. S., Murty, M. N., & Subramanian, D. K. (2001). Multi-dimensional semantic clustering of large databases for association rule mining. Pattern Recognition, 34(4), 939–941. https://doi.org/10.1016/S0031-3203(00)00128-X.
Austin, P. C., Tu, J. V., Ho, J. E., Levy, D., & Lee, D. S. (2013). Using methods from the data-mining and machine-learning literature for disease classification and prediction: a case study examining classification of heart failure subtypes. Journal of Clinical Epidemiology, 66(4), 398–407. https://doi.org/10.1016/j.jclinepi.2012.11.008.
Azzalini, A., & Scarpa, B. (2012). Data Analysis and Data Mining. An Introduction. New York, NY, USA: Oxford University Press.
Been-Chian Chien, Zin-Long Lin, & Tzung-Pei Hong (2001). An efficient clustering algorithm for mining fuzzy quantitative association rules. Proceedings Joint 9th IFSA World Congress and 20th NAFIPS International Conference, 3, 1306–1311. https://doi.org/10.1109/NAFIPS.2001.943736.
Bock, R. K., Chilingarian, A., Gaug, M., Hakl, F., Hengstebeck, T., Jiřina, M., & Wittek, W. (2004). Methods for multidimensional event classification: a case study using images from a Cherenkov gamma-ray telescope. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 516(2–3), 511–528. https://doi.org/10.1016/j.nima.2003.08.157.
Bock, R. K., & Savicky, P. (2007). MAGIC Gamma Telescope Dataset. Retrieved October 10, 2019, from UCI ML Repository website: https://archive.ics.uci.edu/ml/datasets/MAGIC Gamma Telescope.
Bramer, M. (2007). Principles of Data Mining. London, UK: Springer.
Chan, K., & Au, W. H. (1997). Mining fuzzy association rules. Proceedings of the 6th International Conference on Information and Knowledge Management, 209–215. New York.
Chaudhary, U. K., Papapanagiotou, I., & Devetsikiotis, M. (2010). Flow classification using clustering and association rule mining. 15th IEEE International Workshop on Computer Aided Modeling, Analysis and Design of Communication Links and Networks, 76–80. https://doi.org/10.1109/CAMAD.2010.5686959.
Chen, Chun-Hao, Hong, Tzung-Pei, & Tseng, Vincent S. (2006). A cluster-based fuzzy-genetic mining approach for association rules and membership functions. IEEE International Conference on Fuzzy Systems, 2006, 1411–1416. https://doi.org/10.1109/FUZZY.2006.1681894.
Clarke, B., Fokoue, E., & Zhang, H. H. (2009). Principles and Theory for Data Mining and Machine Learning. https://doi.org/10.1007/978-0-387-98135-2.
Couso, I., Borgelt, C., Hullermeier, E., & Kruse, R. (2019). Fuzzy sets in data analysis: from statistical foundations to machine learning. IEEE Computational Intelligence Magazine, 14(1), 31–44. https://doi.org/10.1109/MCI.2018.2881642.
Delgado, M., Marín, N., Martín-Bautista, M. J., Sánchez, D., & Vila, M.-A. (2005). Mining fuzzy association rules: an overview. In Soft Computing for Information Processing and Analysis (pp. 351–373). https://doi.org/10.1007/3-540-32365-1_15.
Delgado, M., Ruiz, M. D., Sánchez, D., & Vila, M. A. (2015). On fuzzy modus ponens to assess fuzzy association rules. In Enric Trillas: A Passion for Fuzzy Sets. Studies in Fuzziness and Soft Computing, 322, 269–276. https://doi.org/10.1007/978-3-319-16235-5_21.
Dubois, D., Hüllermeier, E., & Prade, H. (2006). A systematic approach to the assessment of fuzzy association rules. Data Mining and Knowledge Discovery, 13(2), 167–192. https://doi.org/10.1007/s10618-005-0032-4.
Ertel, W. (2011a). Introduction. In Introduction to Artificial Intelligence (pp. 1–14). London, UK: Springer.
Ertel, W. (2011b). Introduction to Artificial Intelligence. London, UK: Springer.
Ertel, W. (2011c). Machine learning and data mining. In Introduction to Artificial Intelligence (pp. 161–220). https://doi.org/10.1007/978-0-85729-299-5_8.
Fernández-Llatas, C., & García-Gómez, J. M. (Eds.). (2015). Data Mining in Clinical Medicine. https://doi.org/10.1007/978-1-4939-1985-7.
Grissa Touzi, A., Thabet, A., & Sassi, M. (2011). Efficient reduction of the number of associations rules using fuzzy clustering on the data. In Y. Tan, Y. Shi, Y. Chai, & G. Wang (Eds.), Advances in Swarm Intelligence. ICSI 2011. Lecture Notes in Computer Science, vol 6729 (pp. 191–199). https://doi.org/10.1007/978-3-642-21524-7_23.
Hájek, P., Havel, I., & Chytil, M. (1966). The GUHA method of automatic hypotheses determination. Computing, 1(4), 293–308. https://doi.org/10.1007/BF02345483.
Hájek, P., Holeňa, M., & Rauch, J. (2010). The GUHA method and its meaning for data mining. Journal of Computer and System Sciences, 76(1), 34–48. https://doi.org/10.1016/j.jcss.2009.05.004.
Han, J., Kamber, M., & Pei, J. (2012a). Cluster analysis: basic concepts and methods. In D. Cerra & H. Severson (Eds.), Data Mining (Third Edition) (pp. 443–495). https://doi.org/10.1016/B978-0-12-381479-1.00010-1.
Han, J., Kamber, M., & Pei, J. (2012b). Data Mining: Concepts and Techniques (3rd ed.). Waltham, MA, USA: Elsevier.
Han, J., Kamber, M., & Pei, J. (2012c). Introduction. In D. Cerra & H. Severson (Eds.), Data Mining (Third Edition) (pp. 1–38). https://doi.org/10.1016/B978-0-12-381479-1.00001-0.
Harrington, P. (2012a). Machine learning basics. In Machine Learning in Action (pp. 3–17). Shelter Island, NY, USA: Manning Publications.
Harrington, P. (2012b). Machine Learning in Action. Shelter Island, NY, USA: Manning Publications.
Hsiangchu Lai, & Tzyy-Ching Yang (2000). A group-based inference approach to customized marketing on the Web integrating clustering and association rules techniques. Proceedings of the 33rd Annual Hawaii International Conference on System Sciences, 1, 10. https://doi.org/10.1109/HICSS.2000.926875.
Hüllermeier, E. (2005). Fuzzy methods in machine learning and data mining: status and prospects. Fuzzy Sets and Systems, 156(3), 387–406. https://doi.org/10.1016/j.fss.2005.05.036.
Hüllermeier, E. (2011). Fuzzy sets in machine learning and data mining. Applied Soft Computing, 11(2), 1493–1505. https://doi.org/10.1016/j.asoc.2008.01.004.
Hüllermeier, E. (2015). Does machine learning need fuzzy logic? Fuzzy Sets and Systems, 281, 292–299. https://doi.org/10.1016/j.fss.2015.09.001.
Hullermeier, E., & Yi, Y. (2007). In defense of fuzzy association analysis. IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics), 37(4), 1039–1043. https://doi.org/10.1109/TSMCB.2007.895332.
Jia, J., Lu, Y., Chu, J., & Su, H. (2015). Fuzzy clustering-based quantitative association rules mining in multidimensional data set. In Y. Tan, Y. Shi, F. Buarque, A. Gelbukh, S. Das, & A. Engelbrecht (Eds.), Advances in Swarm and Computational Intelligence. ICSI 2015. Lecture Notes in Computer Science, vol 9142 (pp. 68–75). https://doi.org/10.1007/978-3-319-20469-7_9.
Johnson, B. (2014). Wilt Dataset. Retrieved October 14, 2019, from UCI ML Repository website: http://archive.ics.uci.edu/ml/datasets/wilt.
Johnson, B. A., Tateishi, R., & Hoan, N. T. (2013). A hybrid pansharpening approach and multiscale object-based image analysis for mapping diseased pine and oak trees. International Journal of Remote Sensing, 34(20), 6969–6982. https://doi.org/10.1080/01431161.2013.810825.
Kavakiotis, I., Tsave, O., Salifoglou, A., Maglaveras, N., Vlahavas, I., & Chouvarda, I. (2017). Machine learning and data mining methods in diabetes research. Computational and Structural Biotechnology Journal, 15, 104–116. https://doi.org/10.1016/j.csbj.2016.12.005.
Kaya, M., & Alhajj, R. (2003a). A clustering algorithm with genetically optimized membership functions for fuzzy association rules mining. The 12th IEEE International Conference on Fuzzy Systems, 2, 881–886. https://doi.org/10.1109/FUZZ.2003.1206547.
Kaya, M., & Alhajj, R. (2003b). Facilitating fuzzy association rules mining by using multi-objective genetic algorithms for automated clustering. Third IEEE International Conference on Data Mining, 561–564. https://doi.org/10.1109/ICDM.2003.1250977.
Kaya, M., & Alhajj, R. (2004). Integrating multi-objective genetic algorithms into clustering for fuzzy association rules mining. Fourth IEEE International Conference on Data Mining, ICDM’04, 431–434. https://doi.org/10.1109/ICDM.2004.10050.
Kumar, T. (2014). Introduction to Data Mining (1st ed.). Harlow, UK: Pearson Education Limited.
Kuok, C. M., Fu, A., & Wong, M. H. (1998). Mining fuzzy association rules in databases. ACM SIGMOD Record, 27(1), 41–46. https://doi.org/10.1145/273244.273257.
Lantz, B. (2013a). Introducing machine learning. In Machine Learning with R (pp. 5–27). Birmingham, UK: Packt Publishing.
Lantz, B. (2013b). Machine Learning with R. Birmingham, UK: Packt Publishing.
Larose, D. T., & Larose, C. D. (2015). Data Mining and Predictive Analytics (2nd ed.). Hoboken, NJ, USA: John Wiley & Sons.
Lenca, P., Vaillant, B., Meyer, P., & Lallich, S. (2007). Association rule interestingness measures: experimental and theoretical studies. In Quality Measures in Data Mining. Studies in Computational Intelligence, 43, 51–76. https://doi.org/10.1007/978-3-540-44918-8_3.
Lent, B., Swami, A., & Widom, J. (1997). Clustering association rules. Proceedings of the 13th International Conference on Data Engineering, 220–231. https://doi.org/10.1109/ICDE.1997.581756.
Li, B., Pei, Z., & Qin, K. (2015). Association rules mining based on clustering analysis and soft sets. 2015 IEEE International Conference on Computer and Information Technology; Ubiquitous Computing and Communications; Dependable, Autonomic and Secure Computing; Pervasive Intelligence and Computing, 675–680. https://doi.org/10.1109/CIT/IUCC/DASC/PICOM.2015.97.
Li, Q. (2009). An algorithm of quantitative association rule on fuzzy clustering with application to cross-selling in telecom industry. International Joint Conference on Computational Sciences and Optimization, 2009, 759–762. https://doi.org/10.1109/CSO.2009.441.
Lloyd, S. (1982). Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2), 129–137. https://doi.org/10.1109/TIT.1982.1056489.
Lyon, R. (2016). Why are pulsars hard to find? Retrieved August 3, 2019, from https://www.escholar.manchester.ac.uk/api/datastream?publicationPid=uk-ac-man-scw:305203&datastreamId=FULL-TEXT.PDF.
Lyon, R. (2017). HTRU2 Dataset. Retrieved September 17, 2017, from UCI ML Repository website: https://archive.ics.uci.edu/ml/datasets/HTRU2.
Mamdani, E. H., & Assilian, S. (1975). An experiment in linguistic synthesis with a fuzzy logic controller. International Journal of Man-Machine Studies, 7(1), 1–13. https://doi.org/10.1016/S0020-7373(75)80002-2.
Mangalampalli, A., & Pudi, V. (2010). FPrep: fuzzy clustering driven efficient automated pre-processing for fuzzy association rule mining. International Conference on Fuzzy Systems, 1–8. https://doi.org/10.1109/FUZZY.2010.5584154.
Mannila, H., Toivonen, H., & Verkamo, I. (1994). Efficient algorithms for discovering association rules. Proceedings of the AAAI Workshop on Knowledge Discovery in Databases, 181–192.
Mărginean, F. A. (2004). Soft learning: a conceptual bridge between data mining and machine learning. In Applications and Science in Soft Computing (pp. 241–248). https://doi.org/10.1007/978-3-540-45240-9_33.
Mellouk, A., & Chebira, A. (Eds.). (2009). Machine Learning. Croatia: In-teh.
Mirzakhanov, Vugar. (2019). Clustering-based speed-up technique in ARM. Retrieved October 24, 2019, from MATLAB Central File Exchange website: https://www.mathworks.com/matlabcentral/fileexchange/73104.
Mirzakhanov, Vuqar. (2019). The fuzzification issue in the Wu–Mendel approach for linguistic summarisation using IF-THEN rules. Journal of Experimental & Theoretical Artificial Intelligence, 31(1), 117–136. https://doi.org/10.1080/0952813X.2018.1544202.
Mirzakhanov, Vuqar, & Gardashova, L. (2019). Modification of the Wu-Mendel approach for linguistic summarization. Journal of Experimental & Theoretical Artificial Intelligence, 31(1), 77–97. https://doi.org/10.1080/0952813X.2018.1518998.
Mohri, M., Rostamizadeh, A., & Talwalkar, A. (2018a). Foundations of Machine Learning (2nd ed.). Cambridge, MA, USA: MIT Press.
Mohri, M., Rostamizadeh, A., & Talwalkar, A. (2018b). Introduction. In Foundations of Machine Learning (2nd ed., pp. 1–8). Cambridge, MA, USA: MIT Press.
North, M. (2012). Data Mining for the Masses. Global Text Project.
Pi, D., Qin, X., & Yuan, P. (2006). A modified fuzzy C-means algorithm for association rules clustering. In D. Huang, K. Li, & G. W. Irwin (Eds.), Computational Intelligence. ICIC 2006. Lecture Notes in Computer Science, vol 4114 (pp. 1093–1103). https://doi.org/10.1007/978-3-540-37275-2_137.
Quan, T. T., Ngo, L. N., & Hui, S. C. (2009). An effective clustering-based approach for conceptual association rules mining. 2009 IEEE-RIVF International Conference on Computing and Communication Technologies, 1–7. https://doi.org/10.1109/RIVF.2009.5174619.
Reiss, A. R. (2012). PAMAP2 Physical Activity Monitoring Dataset. Retrieved July 18, 2017, from UCI ML Repository website: http://archive.ics.uci.edu/ml/datasets/PAMAP2+Physical+Activity+Monitoring.
Riaz, M., Arooj, A., Hassan, M. T., & Kim, J.-B. (2014). Clustering based association rule mining on online stores for optimized cross product recommendation. The 2014 International Conference on Control, Automation and Information Sciences (ICCAIS 2014), 176–181. https://doi.org/10.1109/ICCAIS.2014.7020553.
Russell, M. A. (2014). Mining the Social Web (2nd ed.). Sebastopol, CA, USA: O'Reilly Media.
Sammut, C., & Webb, G. (Eds.). (2017). Encyclopedia of Machine Learning and Data Mining (2nd ed.). New York, NY, USA: Springer.
Sobhanam, H., & Mariappan, A. K. (2013). Addressing cold start problem in recommender systems using association rules and clustering technique. International Conference on Computer Communication and Informatics, 2013, 1–5. https://doi.org/10.1109/ICCCI.2013.6466121.
Srikant, R., & Agrawal, R. (1996). Mining quantitative association rules in large relational tables. Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data - SIGMOD '96, 1–12. https://doi.org/10.1145/235968.233311.
Sudkamp, T. (2005). Examples, counterexamples, and measuring fuzzy associations. Fuzzy Sets and Systems, 149(1), 57–71. https://doi.org/10.1016/j.fss.2004.07.017.
Tan, S. C. (2018). Improving association rule mining using clustering-based discretization of numerical data. International Conference on Intelligent and Innovative Computing Applications, 2018, 1–5. https://doi.org/10.1109/ICONIC.2018.8601291.
Thabtah, F. (2007). A review of associative classification mining. The Knowledge Engineering Review, 22(1), 37–65. https://doi.org/10.1017/S0269888907001026.
Thomas, B., & Raju, G. (2014). A novel unsupervised fuzzy clustering method for preprocessing of quantitative attributes in association rule mining. Information Technology and Management, 15(1), 9–17. https://doi.org/10.1007/s10799-013-0168-7.
Todorovski, V., Chorbev, I., & Loskovska, S. (2010). Overview of the GUHA method as a data mining technique. Proceedings of the Seventh Conference on Informatics and Information Technology, 11–16. Skopje: Saints Cyril and Methodius University.
Verlinde, H., De Cock, M., & Boute, R. (2006). Fuzzy versus quantitative association rules: a fair data-driven comparison. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 36(3), 679–684. https://doi.org/10.1109/TSMCB.2005.860134.
Watanabe, T., & Takahashi, H. (2006). A quantitative association rule mining algorithm based on clustering algorithm. 2006 IEEE International Conference on Systems, Man and Cybernetics, 2652–2657. https://doi.org/10.1109/ICSMC.2006.385264.
Whiteson, D. (2014a). HIGGS Dataset. Retrieved July 4, 2017, from UCI ML Repository website: http://archive.ics.uci.edu/ml/datasets/HIGGS.
Whiteson, D. (2014b). SUSY Dataset. Retrieved September 19, 2017, from UCI ML Repository website: https://archive.ics.uci.edu/ml/datasets/SUSY.
Witten, I., & Frank, E. (2005a). Data Mining. Practical Machine Learning Tools and Techniques (2nd ed.). San Francisco, CA, USA: Elsevier.
Witten, I., & Frank, E. (2005b). Mining association rules. In Data Mining. Practical Machine Learning Tools and Techniques (2nd ed., pp. 112–119). San Francisco, CA, USA: Elsevier.
Xu, G., Zong, Y., & Yang, Z. (2013). Applied Data Mining. Boca Raton, FL, USA: CRC Press.
Yotsawat, W., & Srivihok, A. (2015). Rules mining based on clustering of inbound tourists in Thailand. In H. Sulaiman, M. Othman, M. Othman, Y. Rahim, & N. Pee (Eds.), Advanced Computer and Communication Engineering Technology. Lecture Notes in Electrical Engineering, vol 315 (pp. 693–705). https://doi.org/10.1007/978-3-319-07674-4_65.
Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8(3), 338–353. https://doi.org/10.1016/S0019-9958(65)90241-X.
Zaki, M. J., & Meira, W. (2014). Data Mining and Analysis: Fundamental Concepts and Algorithms. https://doi.org/10.1017/CBO9780511810114.
Zhang, M., & He, C. (2010). Survey on association rules mining algorithms. In Advancing Computing, Communication, Control and Management. Lecture Notes in Electrical Engineering, vol 56 (pp. 111–118). https://doi.org/10.1007/978-3-642-05173-9_15.
Zhang, S., & Wu, X. (2011). Fundamentals of association rules in data mining and knowledge discovery. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 1(2), 97–116. https://doi.org/10.1002/widm.10.
Zhao, Y., Zhang, C., & Zhang, S. (2004). Discovering interesting association rules by clustering. In G. I. Webb & X. Yu (Eds.), AI 2004. Lecture Notes in Computer Science, vol 3339 (pp. 1055–1061). https://doi.org/10.1007/978-3-540-30549-1_101.
