Value of fuzzy logic for data mining and machine learning: A case study
Vugar E. Mirzakhanov
Department of Computer Engineering, Azerbaijan State Oil and Industry University, Baku, Azerbaijan
Article history:
Received 12 February 2020
Revised 21 June 2020
Accepted 18 July 2020
Available online 25 July 2020

Keywords:
Association rule mining
Clustering
Data mining
Fuzzy logic
Machine learning

Abstract

In this paper, a case study on the role of fuzzy logic (FL) in data mining and machine learning is carried out. It is outlined that, in order to draw more attention of data-mining and machine-learning communities to FL, studies on FL could be more focused not on the activities that fuzzy methods can perform better but rather on the activities that fuzzy methods can perform and the non-fuzzy ones can't. Such an approach takes us away from discussing quantitative differences between fuzzy and non-fuzzy methods to discussing qualitative differences, which are possibly more favorable objects of scientific curiosity. Following the outlined suggestion, a novel speed-up technique is proposed in this paper to support association rule mining (ARM). The proposed technique is a clustering-based one and provides fusion of clustering and ARM. The catchy feature of this technique is that it works well if applied in fuzzy ARM and doesn't work well if applied in non-fuzzy ARM. The proposed technique is put through experimental verification involving several real-world datasets, and the results substantiate its effectiveness.

© 2020 Elsevier Ltd. All rights reserved.
https://doi.org/10.1016/j.eswa.2020.113781
V.E. Mirzakhanov / Expert Systems with Applications 162 (2020) 113781
be perceived differently by different users/experts. Second, the manually defined linguistic terms are obviously not adapted to statistical data, which negatively affects the accuracy of the corresponding data-driven prediction systems. One of the solutions is to tune linguistic terms (by neural network, genetic algorithm, etc.), but it negatively affects interpretability. Third, interpretability tends to become rather questionable when dealing with complex fuzzy systems: for example, a rule-based prediction system having an extensive number of rules and/or extended rule length (e.g. as in (Mirzakhanov, 2019; Mirzakhanov & Gardashova, 2019)) is unlikely to be easily interpretable.

Extension issue. A significant part of fuzzy-related research proposes extensions of standard non-fuzzy DM and ML methods. The corresponding issue is that such extensions tend to look like an incremental upgrade, refining some properties of the method. And, since such an extension, alongside the upgrade, usually increases the complexity of the method, it may seem not too motivating for DM/ML researchers to put the extension into practice.

In this paper, we focus on the second aforementioned issue and assume that, in order to increase the recognition of FL within DM and ML communities, the following can be performed: instead of discussing the activities that fuzzy methods can do better (in some way) than non-fuzzy ones, to put an accent on the activities fuzzy methods can do (at least, can do well) and non-fuzzy ones can't (at least, can't do well). Such an approach takes us away from discussing quantitative differences between fuzzy and non-fuzzy methods to discussing qualitative differences, which are possibly more favorable objects of scientific curiosity.

In this paper, the aforementioned thesis is supported by performing the following research: a novel speed-up technique is proposed to assist association rule mining (ARM) with cluster analysis. The proposed technique provides fusion of ARM and clustering: the word fusion is deliberately used instead of combination to point out that the proposed technique doesn't simply apply clustering and ARM in a sequential manner; clustering explicitly affects ARM by updating its quality-measure formulas. The catchy feature of this speed-up technique is that it works well if applied in fuzzy ARM and doesn't work well if applied in non-fuzzy ARM.

The rest of the paper is organized as follows. Section 2 describes ARM and clustering. Section 3 proposes a novel speed-up technique in ARM. Section 4 provides the experimental verification of the proposed technique. Section 5 discusses the proposed technique and concludes the paper.

2. Association rule mining and clustering

2.1. Association rule mining

One of the major research fields in DM and ML is association rule mining¹, whose goal is to mine large amounts of data for attribute associations. It is commonly accepted that ARM was introduced by Agrawal et al. in (Agrawal et al., 1993). However, some researchers (Hájek, Holeňa, & Rauch, 2010; Todorovski, Chorbev, & Loskovska, 2010) note that the research field of ARM had already been defined by Hájek et al. in (Hájek, Havel, & Chytil, 1966) a few decades before Agrawal et al. in (Agrawal, Imieliński, & Swami, 1993).

¹ It can be noted that, in general, ARM is more covered by DM than by ML (Witten & Frank, 2005a). During the 1990s, ARM was even often considered a synonym of DM (S. Zhang & Wu, 2011).

In this paper, a dataset put through ARM is defined as follows:

D = \{ d_j^i \mid i = 1, \ldots, I;\ j = 1, \ldots, J \},   (1)

where d_j^i is the value of the jth attribute in the ith record, I is the number of records, and J is the number of attributes.

The goal of ARM is to process D and extract a list of reliable (strong, interesting) IF–THEN rules describing D in the following form:

IF A_1 is/has S_1 and ... and A_m is/has S_m, THEN A_{m+1} is/has S_{m+1} and ... and A_{m+n} is/has S_{m+n} [Q],   (2)

where A_j (j = 1, ..., m+n) is an attribute of D, S_j (j = 1, ..., m+n) is a value of A_j, and Q ∈ [0, 1] is a quality measure of the association rule.

According to the type of A_j (j = 1, ..., m+n), association rules can be classified in chronological order as follows². Historically, the first type of association rule to be mined during ARM was the boolean association rule (Agrawal et al., 1993), in which A_j (j = 1, ..., m+n) can take only two values: 0 (no) and 1 (yes). Shortly after the introduction of the boolean association rule, in order to overcome its limitations, the quantitative association rule (Srikant & Agrawal, 1996) was proposed to be mined during ARM, in which A_j (j = 1, ..., m+n) can be a categorical or quantitative attribute. Usually, when mining quantitative association rules, A_j (j = 1, ..., m+n) can't take its numerical values directly from D: instead, partitioning of attribute ranges is applied, replacing attribute values from D by several crisp partitions (intervals). Shortly after the introduction of the quantitative association rule, the fuzzy association rule (Chan & Au, 1997; Kuok, Fu, & Wong, 1998) was proposed to be mined during ARM; it can be viewed as an extension of the non-fuzzy association rule: instead of crisp partitioning, fuzzy partitioning of attribute ranges is applied, allowing A_j (j = 1, ..., m+n) to take values in the form of fuzzy linguistic terms (words).

² Technically, the number of association-rule types is far more significant (Kumar, 2014): besides the ones described in the paper, there are also profile association rules, cyclic association rules, intertransaction association rules, etc. In this paper, we tend to apply the most generalized and basic interpretation and classification of association rules.

There are two basic quality measures of a rule in ARM: support and confidence. Support is the relative amount of data records covered by a rule. Confidence is the ratio of the amount of data records supporting the whole rule to the amount of records supporting its antecedent part. Support (supp) and confidence (conf) of an association rule (2) can be computed as follows³:

supp = \frac{\sum_{i=1}^{I} \min(\mu_{S_1}(d_1^i), \ldots, \mu_{S_{m+n}}(d_{m+n}^i))}{I},   (3)

where \mu_{S_j}(d_j^i) (j = 1, ..., m+n) is the membership degree of d_j^i in S_j;

conf = \frac{\sum_{i=1}^{I} \min(\mu_{S_1}(d_1^i), \ldots, \mu_{S_{m+n}}(d_{m+n}^i))}{\sum_{i=1}^{I} \min(\mu_{S_1}(d_1^i), \ldots, \mu_{S_m}(d_m^i))}.   (4)

³ For the sake of conciseness, a single pair of formulas for supp and conf is shown in this paper: formulas (3) and (4) are valid in the case of boolean, non-fuzzy quantitative and fuzzy association rules; though, in the case of the first two rule types, the definitions of supp and conf are usually simpler. Additionally, it can be noted that, though the minimum operator is the most popular choice in (3) and (4), the product can also be applied as t-norm (Dubois, Hüllermeier, & Prade, 2006).

The basic ARM process can be described as follows.

First, a user specifies minsupp and minconf, which are the threshold values of supp and conf, respectively: the user seeks rules with supp ≥ minsupp and conf ≥ minconf.

Second, all possible arbitrary-length combinations of S_j (j = 1, ..., m+n) are generated. Obviously, a combination can't contain two or more S_j belonging to the same attribute
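The computations in formulas (3) and (4) can be sketched in a few lines of Python. The rule, dataset and membership functions below are invented for illustration (they are not taken from the paper's experiments), and the minimum is used as the t-norm:

```python
# Sketch of formulas (3) and (4) for a fuzzy rule "IF A is low, THEN B is high".
# All data and linguistic terms are illustrative.

def supp_conf(data, antecedent, consequent):
    """data: list of records (attribute -> value dicts);
    antecedent, consequent: attribute -> membership-function dicts."""
    whole = {**antecedent, **consequent}
    num = sum(min(mu(rec[a]) for a, mu in whole.items()) for rec in data)
    den = sum(min(mu(rec[a]) for a, mu in antecedent.items()) for rec in data)
    supp = num / len(data)            # formula (3): share of all I records
    conf = num / den if den else 0.0  # formula (4): rule support / antecedent support
    return supp, conf

def mu_low(x):   # linear term "low" on the range [0, 10]
    return max(0.0, min(1.0, (10.0 - x) / 10.0))

def mu_high(x):  # linear term "high" on the range [0, 10]
    return max(0.0, min(1.0, x / 10.0))

data = [{"A": 2.0, "B": 8.0}, {"A": 9.0, "B": 9.0}, {"A": 1.0, "B": 3.0}]
supp, conf = supp_conf(data, {"A": mu_low}, {"B": mu_high})
```

With boolean attributes, where every membership degree is 0 or 1, the same code reduces to the classical crisp definitions of support and confidence, in line with footnote 3.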
2.2. Clustering

computations. In the case of applying \hat{R} instead of R, supp is computed as

\frac{4 \min(\mu_{S_1}(c_1^1), \mu_{S_2}(c_2^1)) + 1 \min(\mu_{S_1}(r_1^5), \mu_{S_2}(r_2^5)) + 5 \min(\mu_{S_1}(c_1^2), \mu_{S_2}(c_2^2))}{4 + 1 + 5}

instead of \frac{\sum_{i=1}^{10} \min(\mu_{S_1}(r_1^i), \mu_{S_2}(r_2^i))}{10}. The computation of conf is updated in a similar way.

For a more profound understanding of the aforementioned example, \hat{R} can be rewritten in its "unwrapped" version⁵ \hat{R} = \{c^1, c^1, c^1, c^1, r^5, c^2, c^2, c^2, c^2, c^2\}, which shows that the performed generalization basically replaced several original records with the closely situated centroids. As seen from the "unwrapped" version of \hat{R}, the supp & conf computations in the case of \hat{R} are [functionally] fully consistent with the supp & conf computations in the case of R.

\hat{D} = \{ \hat{d}_j^i \mid i = 1, \ldots, \hat{I};\ j = 1, \ldots, J + 1 \},   (5)

where \hat{d}_j^i is the value of the jth attribute in the ith record, \hat{I} is the number of records, and J is the total number of original attributes (without considering the added weight attribute/index W).

ARM of \hat{D} requires the reconsideration of formulas (3) and (4). Their updated versions are shown below:

supp = \frac{\sum_{i=1}^{\hat{I}} W^i \min(\mu_{S_1}(\hat{d}_1^i), \ldots, \mu_{S_{m+n}}(\hat{d}_{m+n}^i))}{\sum_{i=1}^{\hat{I}} W^i},   (6)

conf = \frac{\sum_{i=1}^{\hat{I}} W^i \min(\mu_{S_1}(\hat{d}_1^i), \ldots, \mu_{S_{m+n}}(\hat{d}_{m+n}^i))}{\sum_{i=1}^{\hat{I}} W^i \min(\mu_{S_1}(\hat{d}_1^i), \ldots, \mu_{S_m}(\hat{d}_m^i))}.   (7)
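A minimal sketch of the weighted formulas (6) and (7), reusing the weights 4, 1 and 5 from the R̂ example; the records, attribute indices and boolean memberships are illustrative, chosen so the result can be checked by hand against the "unwrapped" computation:

```python
# Sketch of formulas (6) and (7): supp/conf over a generalized dataset whose
# records carry weights W^i. Records, weights and terms are illustrative.

def weighted_supp_conf(records, weights, ant, cons):
    """records: list of tuples; weights: W^i per record;
    ant, cons: lists of (attribute index, membership function) pairs."""
    whole = ant + cons
    num = sum(w * min(mu(rec[j]) for j, mu in whole)
              for rec, w in zip(records, weights))
    den = sum(w * min(mu(rec[j]) for j, mu in ant)
              for rec, w in zip(records, weights))
    return num / sum(weights), num / den   # formulas (6) and (7)

ident = float                        # boolean membership: the 0/1 value itself
records = [(1, 1), (0, 1), (1, 0)]   # two centroids and one unclustered record
weights = [4, 1, 5]                  # W^i: original records each row stands for
supp, conf = weighted_supp_conf(records, weights, [(0, ident)], [(1, ident)])
# "Unwrapping" the rows into 4 + 1 + 5 = 10 unit-weight records and applying
# formulas (3) and (4) yields the same supp = 4/10 and conf = 4/9.
```

The weighted formulas thus only "wrap" the original ones: each centroid contributes the term of formulas (3)/(4) once, multiplied by the number of records it stands for.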
4.3. Experiments

For reference purposes, the data from Figs. 5–7 are supported with the additional parameter \hat{I}/I · 100, which is the relative size (%) of the generalized data obtained by applying the speed-up technique (Table 4).

Second, ARM processes with speed-up and ARM processes without speed-up are compared in terms of efficiency in accordance with the experimental-verification procedure from Section 4.2 (Figs. 8 and 9). Non-fuzzy ARM processes and ARM processes on the "Wilt" dataset aren't covered by the figures, which is justified as follows:

Fig. 7. Fuzzy (left) and non-fuzzy (right) ARM of the "Wilt" dataset.

Table 4. Size of data generalized by applying the speed-up technique.

with 1–2 original data records being incorporated by each centroid. Thus, the decrease in original-data size can be counterbalanced by the corresponding increase in additional-data size, which negatively affects processing time. For example, ARM of the "Wilt" dataset (having only 4839 records) gets more records to be processed if applying the technique with 1000 clusters instead of 100 clusters (see Table 4).

At the end, we come to the following results of the performed experimental verification:

Though the ARM processes shown in this paper take little time to be performed, the goal of ARM is to process rather large and complex datasets, like (Reiss, 2012; Whiteson, 2014a, 2014b). ARM of such datasets can take some extensive amounts of time, and this is the case when the proposed speed-up technique becomes handy.

5.2. Is the proposed speed-up technique novel?

Joint application of ARM and clustering is a relatively popular topic, and, to this date, a lot of corresponding research has been reported in data-mining and machine-learning literature. A few of these works are listed below:
- In (Alhajj & Kaya, 2008; Chen, Hong, & Tseng, 2006; Kaya & Alhajj, 2003a, 2003b, 2004; Li, 2009; Mangalampalli & Pudi, 2010; Tan, 2018; Thomas & Raju, 2014; Been-Chian Chien, 2001; Jia et al., 2015; Li et al., 2015), clustering is applied to partitioning [of data attribute ranges] in ARM.
- In (Ananthanarayana, Murty, & Subramanian, 2001; Lai & Yang, 2000; Quan, Ngo, & Hui, 2009; Riaz, Arooj, Hassan, & Kim, 2014; Yotsawat & Srivihok, 2015), data are clustered and ARM is performed separately within each cluster.
- In (Lent, Swami, & Widom, 1997; Pi, Qin, & Yuan, 2006), clustering is applied [for different purposes] to the results of ARM.
- In (Chaudhary, Papapanagiotou, & Devetsikiotis, 2010; Sobhanam & Mariappan, 2013), clustering and ARM are applied to solve some tasks from different application domains.
- In (Zhao, Zhang, & Zhang, 2004), clustering is applied to discover similar/dissimilar data attributes. It is stated that association rules with dissimilar attributes are more interesting.
- In (Grissa Touzi, Thabet, & Sassi, 2011), clustering is applied to an initial dataset, and the data attributes are replaced by the obtained clusters. Thus, a new dataset is generated, with each record defining the membership of the corresponding initial-dataset record to an obtained cluster. ARM of such a dataset provides a set of so-called "meta-rules". It is stated that the required ordinary association rules can be derived from the "meta-rules". The goal of the paper is to reduce the number of data attributes (by replacing them with clusters) and, therefore, to reduce the computational cost of ARM.
- In (Watanabe & Takahashi, 2006), data are clustered, and supp (number of records) of each cluster is measured. So, when performing ARM, supp is not computed in the usual way but derived from supp of the corresponding clusters.

So, despite the topic itself being not novel, to the best of our knowledge, the fusion of ARM and clustering proposed in this paper hasn't been reported before in the scientific literature.

5.3. Why does the proposed speed-up technique work unwell in non-fuzzy ARM?

The proposed speed-up technique works unwell in the case of non-fuzzy ARM mostly because of the "boundary effects". The term "boundary effects" is taken from (Hullermeier & Yi, 2007) and stands for negative anomalies caused by applying crisp partition boundaries. According to (Sudkamp, 2005), there is a major boundary anomaly called partition instability. Partition instability can be observed when a significant number of data records are located near crisp partition boundaries: if the partition boundaries are slightly changed, a significant number of records fall out of (fall into) the corresponding partitions and cause significant changes to the results of ARM (Sudkamp, 2005).

The boundary anomaly that occurred in this paper is a variation of the aforementioned partition instability. In our case, a significant number of records fall out of (fall into) the corresponding crisp partitions not because of changes in partitioning but because of changes in data: additional records (cluster centers) that incorporate original records near partition boundaries appear and make the data fall out of (fall into) the corresponding crisp partitions. In our case, the partition instability is mostly a function of mindist and the number of clusters:

- In the case of high mindist and/or a low number of clusters, the difference between the original and the corresponding additional data records tends to be more significant, which enhances the partition-instability effect. This statement is proved by the performed experimental verification: as seen from Figs. 5–7, non-fuzzy ARM with the speed-up technique applied gets less and less effective when increasing mindist and/or decreasing the number of clusters.
- In the case of low mindist and/or a high number of clusters, the difference between the original and the corresponding additional data records tends to be less significant, which reduces the partition-instability effect. This statement is proved by the performed experimental verification: as seen from Figs. 5–7, non-fuzzy ARM with the speed-up technique applied gets more and more effective when decreasing mindist and/or increasing the number of clusters.

The aforementioned case of "low mindist and/or high number of clusters" is called low-level data generalization in this paper. In the case of low-level data generalization, the speed-up technique doesn't have a [notable] negative impact on the effectiveness of non-fuzzy ARM. However, as explained in Section 4, applying "low mindist and/or high number of clusters" is not desired, since it tends to negatively affect the increase in efficiency of ARM. Thus, despite not lowering ARM's effectiveness in the case of low-level data generalization, the application of the speed-up technique in non-fuzzy ARM still remains rather useless.

5.4. Is applying Apriori sufficient to verify the proposed speed-up technique?

In this paper, the proposed speed-up technique is applied in the ARM processes performed by using the Apriori algorithm. Such a decision causes one major issue, which is discussed as follows.

As stated in Section 2.1, the Apriori algorithm was originally proposed and applied in ARM to decrease its computational cost. Thus, the Apriori algorithm itself is a speed-up technique in ARM. From this perspective, it seems that the proposed speed-up in ARM has been verified by fusing two speed-up techniques: the proposed one and Apriori. The corresponding issue is that it is not obvious why the proposed speed-up technique, being successfully fused with one [speed-up] algorithm in ARM, should be considered as commonly applicable in ARM: there are a lot of other ARM algorithms and methods, and they may be not so compatible/consistent with the proposed speed-up technique.

The aforementioned issue is consecutively discussed as follows:

- Indeed, there are multiple ARM algorithms and methods in the field; and, technically, the existence of algorithms/methods poorly consistent with the proposed speed-up technique is rather certain. However, if a technique is consistent with the major part of ARM algorithms and methods, it still can be considered as commonly applicable in ARM.
- As stated in Section 2.1, supp and conf are two basic quality measures in ARM; and the major part of ARM methods and algorithms, despite providing different variations/modifications to the ARM process, still apply either supp & conf or their functional derivatives (Delgado et al., 2005; Delgado, Ruiz, Sánchez, & Vila, 2015; Lenca, Vaillant, Meyer, & Lallich, 2007).
- As shown in Section 3, the particularity of the proposed speed-up technique in ARM is that it directly affects not the whole ARM process but only the computation of supp and conf: it "wraps" the original quality-measure formulas to reduce the computational cost.
- Thus, if the proposed speed-up technique increases the efficiency of ARM by only "wrapping" the supp & conf formulas, and the major part of ARM methods and algorithms applies either supp & conf or their derivatives, then the proposed speed-up is rather commonly applicable in ARM.
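The partition-instability effect described in Section 5.3 can be reproduced with a two-record toy example (the boundary position, term shapes and values are all invented): replacing two records that straddle a crisp boundary by their centroid of weight 2 flips both of them out of the crisp partition, while an overlapping fuzzy term is barely affected:

```python
# Toy illustration of partition instability; all numbers are invented.

def crisp_low(x):    # crisp partition [0, 5)
    return 1.0 if x < 5.0 else 0.0

def fuzzy_low(x):    # overlapping linear fuzzy term "low"
    return max(0.0, min(1.0, (7.0 - x) / 4.0))

records = [4.9, 5.1]                    # one record on each side of the boundary
centroid = sum(records) / len(records)  # 5.0, carrying weight 2 after generalization

crisp_before = sum(crisp_low(r) for r in records)  # one record counted as "low"
crisp_after = 2 * crisp_low(centroid)              # both records jump out of "low"
fuzzy_before = sum(fuzzy_low(r) for r in records)
fuzzy_after = 2 * fuzzy_low(centroid)              # practically unchanged
```

Scaled up to many near-boundary records, this is why the supp of a crisp partition can change noticeably after generalization, whereas the fuzzy supp stays close to its original value.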
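The data-generalization step on which the technique rests (Section 3) can be sketched as follows; the centroids and cluster labels are assumed to be produced by K-means beforehand, and all names and numbers are illustrative:

```python
import math

# Sketch of the data-generalization step: records lying within mindist of
# their cluster centroid are removed and replaced by the centroid, whose
# weight W accumulates them. Centroids/labels are assumed to come from K-means.

def generalize(records, centroids, labels, mindist):
    """records, centroids: lists of coordinate tuples; labels: cluster index
    per record. Returns the generalized records and their weights W."""
    kept, weights = [], []
    absorbed = [0] * len(centroids)   # records absorbed per centroid
    for rec, lab in zip(records, labels):
        if math.dist(rec, centroids[lab]) <= mindist:
            absorbed[lab] += 1        # record is replaced by its centroid
        else:
            kept.append(rec)          # record stays as-is with unit weight
            weights.append(1)
    for cen, n in zip(centroids, absorbed):
        if n:
            kept.append(cen)          # centroid enters the data with weight n
            weights.append(n)
    return kept, weights

records = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (10, 10)]
centroids = [(0.5, 0.5), (10, 10)]
labels = [0, 0, 0, 0, 0, 1]    # (5, 5) belongs to cluster 0 but lies far away
generalized, weights = generalize(records, centroids, labels, mindist=1.0)
# I = 6 shrinks to I_hat = 3: (5, 5) keeps weight 1, (0.5, 0.5) absorbs four
# records, (10, 10) absorbs one; the total weight still equals I.
```

The two parameters mindist and the number of clusters are exactly the ones varied in the experiments: a larger mindist or fewer clusters lets centroids absorb more (and more distant) records, i.e. yields a stronger generalization.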
5.5. Why is K-means and only K-means applied in the proposed speed-up technique?

K-means is applied in this paper for the following reasons:

- K-means is simple to perceive and understand.
- With K-means being applied, the speed-up technique gets only two parameters to be defined (mindist and the number of clusters): this makes experimental verification easier, since the results can be compactly displayed in two-dimensional figures, and simplifies the analysis of the technique.
- K-means has very low computational cost, which is quite important, since the proposed technique is intended to speed up ARM.

Additional clustering methods aren't considered within the paper for the following reasons:

- Additional clustering methods would make the paper more sophisticated.
- In general, adding supplementary clustering methods doesn't contribute to the goal of the paper: obviously, the better (more efficient/effective) a clustering method is, the better the corresponding speed-up is; and the goal of the paper is not the survey and comparison [in terms of effectiveness/efficiency] of clustering methods.

5.6. What is the justification of the minsupp and minconf values applied during experimental verification?

In general, the computational cost of ARM critically depends on minsupp (Witten & Frank, 2005b). Thus, in this paper, it is quite important to define an adequate minsupp: too low a minsupp can significantly increase processing time and make the experimental-verification process a bit morbid, and too high a minsupp can ruin the efficiency analysis of the proposed technique.

In this paper, we apply the technique proposed in (Mirzakhanov & Gardashova, 2019) to compute an adequate minsupp, which functions as follows: the technique assumes that minsupp of an association rule should be consistent with the rule's coverage of the dataset's attribute space. For example, the term MP2 (see Fig. 3) covers 0.25, and the term Pulsar (see Table 2) covers 0.5 of the attribute space: thus, the rule "IF MP is MP2, THEN Class is Pulsar" covers 0.125 (0.25 · 0.5) of the attribute space and gets the same minsupp⁹. In the experimental verification, in order to get a single value of minsupp for all rules in ARM, minsupp's computation is based on the attribute-space coverage of only full-range linguistic terms (for example, MP2, MP3 and MP4 in Fig. 3 are the same-coverage full-range terms, but MP1 and MP5 are the half-range ones).

Before explaining why minconf is set equal to 0.5, two statements should be made without proof (for conciseness):

- The sum of conflicting rules' conf values is equal to 1 in classification tasks (tasks with crisp consequent terms).
- A conflicting rule with a higher supp value has a higher conf value.

Since only the conflicting rule with the highest conf value is retained in a rule list and applied in the experimental verification (see Section 4.2), it can be discovered from the aforementioned statements that no rules with conf < 0.5 can be retained in a rule list and applied in the experiments. Thus, the minconf = 0.5 choice applied in the experiments is actually identical to minconf = 0, since the rules with conf < 0.5 can't pass the conflict-cleaning process anyway.

5.7. Is it possible to revise the proposed speed-up technique in future work to make fuzzy clustering applicable?

The proposed speed-up technique is based on non-fuzzy clustering, so we have analyzed the fusion of non-fuzzy clustering with fuzzy and non-fuzzy ARM in this paper. It might be interesting to revise the speed-up technique in such a way as to make fuzzy clustering applicable, so we could analyze the fusion of fuzzy clustering with fuzzy and non-fuzzy ARM.

The proposed speed-up technique, in its current form, is not appropriate for the aforementioned future work, and the main reason is explained as follows. In the case of applying fuzzy clustering in the speed-up, each record of the dataset D gets not a single cluster but a list of membership degrees in all applicable clusters. Such clustering results require the reconsideration of the current data-generalization procedure, and some thoughts on the issue are provided below:

- At first, it seems that the aforementioned issue has a simple solution: the cluster possessing the highest membership degree of a record is to be selected as the superior one, so the record is deleted from D and its weight is encapsulated by the cluster centroid of the superior cluster (of course, if dist ≤ mindist). However, this solution eliminates almost any possible advantage of using fuzzy clustering: if the goal is to get a single most suitable cluster for a record, then the performance of fuzzy clustering [in the proposed speed-up technique] becomes rather similar to the performance of non-fuzzy clustering.
- A possibly more relevant solution to the aforementioned issue is to select not a single but multiple superior clusters for a record of D; so, if the record is deleted from D, its weight is divided into several parts and each weight part is encapsulated by the centroid of the corresponding superior cluster: the greater the membership degree of the record in a superior cluster is, the greater the weight part encapsulated by the corresponding centroid is.

5.8. Conclusions

The contribution of this paper can be summarized as follows:

- A clustering-based technique is proposed to speed up ARM.
- The speed-up technique is proposed and considered within the frame of FL, DM and ML relations.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

References
⁹ In the experimental verification, only full-length association rules are applied to classification (see Section 4.2), so the rule "IF MP is MP2, THEN Class is Pulsar" shown here is not actually used during verification: this rule is only applied here to illustrate the computation of minsupp.

Aggarwal, C. C. (2015). Data Mining. https://doi.org/10.1007/978-3-319-14142-8.
Agrawal, R., Imieliński, T., & Swami, A. (1993). Mining association rules between sets of items in large databases. Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, 207–216. https://doi.org/10.1145/170035.170072.
Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., & Verkamo, A. I. (1996). Fast discovery of association rules. In U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, & R. Uthurusamy (Eds.), Advances in Knowledge Discovery and Data Mining (pp. 307–328). Retrieved from http://dl.acm.org/citation.cfm?id=257938.257975.
Agrawal, R., & Srikant, R. (1994). Fast algorithms for mining association rules. Proceedings of the 20th VLDB Conference, 487–499. San Francisco: Morgan Kaufmann Publishers Inc.
Alhajj, R., & Kaya, M. (2008). Multi-objective genetic algorithms based automated clustering for fuzzy association rules mining. Journal of Intelligent Information Systems, 31(3), 243–264. https://doi.org/10.1007/s10844-007-0044-1.
Alpaydın, E. (2010a). Introduction. In Introduction to Machine Learning (2nd ed., pp. 1–19). Cambridge, MA, USA: MIT Press.
Alpaydın, E. (2010b). Introduction to Machine Learning (2nd ed.). Cambridge, MA, USA: MIT Press.
Ananthanarayana, V. S., Murty, M. N., & Subramanian, D. K. (2001). Multi-dimensional semantic clustering of large databases for association rule mining. Pattern Recognition, 34(4), 939–941. https://doi.org/10.1016/S0031-3203(00)00128-X.
Austin, P. C., Tu, J. V., Ho, J. E., Levy, D., & Lee, D. S. (2013). Using methods from the data-mining and machine-learning literature for disease classification and prediction: a case study examining classification of heart failure subtypes. Journal of Clinical Epidemiology, 66(4), 398–407. https://doi.org/10.1016/j.jclinepi.2012.11.008.
Azzalini, A., & Scarpa, B. (2012). Data Analysis and Data Mining. An Introduction. New York, NY, USA: Oxford University Press.
Been-Chian Chien, Zin-Long Lin, & Tzung-Pei Hong. (2001). An efficient clustering algorithm for mining fuzzy quantitative association rules. Proceedings Joint 9th IFSA World Congress and 20th NAFIPS International Conference, 3, 1306–1311. https://doi.org/10.1109/NAFIPS.2001.943736.
Bock, R. K., Chilingarian, A., Gaug, M., Hakl, F., Hengstebeck, T., Jiřina, M., & Wittek, W. (2004). Methods for multidimensional event classification: a case study using images from a Cherenkov gamma-ray telescope. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 516(2–3), 511–528. https://doi.org/10.1016/j.nima.2003.08.157.
Bock, R. K., & Savicky, P. (2007). MAGIC Gamma Telescope Dataset. Retrieved October 10, 2019, from UCI ML Repository website: https://archive.ics.uci.edu/ml/datasets/MAGIC Gamma Telescope.
Bramer, M. (2007). Principles of Data Mining. London, UK: Springer.
Chan, K., & Au, W. H. (1997). Mining fuzzy association rules. Proceedings of the 6th International Conference on Information and Knowledge Management, 209–215. New York.
Chaudhary, U. K., Papapanagiotou, I., & Devetsikiotis, M. (2010). Flow classification using clustering and association rule mining. 15th IEEE International Workshop on Computer Aided Modeling, Analysis and Design of Communication Links and Networks, 76–80. https://doi.org/10.1109/CAMAD.2010.5686959.
Chen, C.-H., Hong, T.-P., & Tseng, V. S. (2006). A cluster-based fuzzy-genetic mining approach for association rules and membership functions. IEEE International Conference on Fuzzy Systems, 2006, 1411–1416. https://doi.org/10.1109/FUZZY.2006.1681894.
Clarke, B., Fokoue, E., & Zhang, H. H. (2009). Principles and Theory for Data Mining and Machine Learning. https://doi.org/10.1007/978-0-387-98135-2.
Couso, I., Borgelt, C., Hullermeier, E., & Kruse, R. (2019). Fuzzy sets in data analysis:
Han, J., Kamber, M., & Pei, J. (2012a). Cluster analysis: basic concepts and methods. In D. Cerra & H. Severson (Eds.), Data Mining (Third Edition) (pp. 443–495). https://doi.org/10.1016/B978-0-12-381479-1.00010-1.
Han, J., Kamber, M., & Pei, J. (2012b). Data Mining: Concepts and Techniques (3rd ed.). Waltham, MA, USA: Elsevier.
Han, J., Kamber, M., & Pei, J. (2012c). Introduction. In D. Cerra & H. Severson (Eds.), Data Mining (Third Edition) (pp. 1–38). https://doi.org/10.1016/B978-0-12-381479-1.00001-0.
Harrington, P. (2012a). Machine learning basics. In Machine Learning in Action (pp. 3–17). Shelter Island, NY, USA: Manning Publications.
Harrington, P. (2012b). Machine Learning in Action. Shelter Island, NY, USA: Manning Publications.
Hsiangchu Lai, & Tzyy-Ching Yang. (2000). A group-based inference approach to customized marketing on the Web integrating clustering and association rules techniques. Proceedings of the 33rd Annual Hawaii International Conference on System Sciences, 1, 10. https://doi.org/10.1109/HICSS.2000.926875.
Hüllermeier, E. (2005). Fuzzy methods in machine learning and data mining: Status and prospects. Fuzzy Sets and Systems, 156(3), 387–406. https://doi.org/10.1016/j.fss.2005.05.036.
Hüllermeier, E. (2011). Fuzzy sets in machine learning and data mining. Applied Soft Computing, 11(2), 1493–1505. https://doi.org/10.1016/j.asoc.2008.01.004.
Hüllermeier, E. (2015). Does machine learning need fuzzy logic? Fuzzy Sets and Systems, 281, 292–299. https://doi.org/10.1016/j.fss.2015.09.001.
Hullermeier, E., & Yi, Y. (2007). In defense of fuzzy association analysis. IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics), 37(4), 1039–1043. https://doi.org/10.1109/TSMCB.2007.895332.
Jia, J., Lu, Y., Chu, J., & Su, H. (2015). Fuzzy clustering-based quantitative association rules mining in multidimensional data set. In Y. Tan, Y. Shi, F. Buarque, A. Gelbukh, S. Das, & A. Engelbrecht (Eds.), Advances in Swarm and Computational Intelligence. ICSI 2015. Lecture Notes in Computer Science, vol. 9142 (pp. 68–75). https://doi.org/10.1007/978-3-319-20469-7_9.
Johnson, B. (2014). Wilt Dataset. Retrieved October 14, 2019, from UCI ML Repository website: http://archive.ics.uci.edu/ml/datasets/wilt.
Johnson, B. A., Tateishi, R., & Hoan, N. T. (2013). A hybrid pansharpening approach and multiscale object-based image analysis for mapping diseased pine and oak trees. International Journal of Remote Sensing, 34(20), 6969–6982. https://doi.org/10.1080/01431161.2013.810825.
Kavakiotis, I., Tsave, O., Salifoglou, A., Maglaveras, N., Vlahavas, I., & Chouvarda, I. (2017). Machine learning and data mining methods in diabetes research. Computational and Structural Biotechnology Journal, 15, 104–116. https://doi.org/10.1016/j.csbj.2016.12.005.
Kaya, M., & Alhajj, R. (2003a). A clustering algorithm with genetically optimized membership functions for fuzzy association rules mining. The 12th IEEE International Conference on Fuzzy Systems, 2, 881–886. https://doi.org/10.1109/FUZZ.2003.1206547.
Kaya, M., & Alhajj, R. (2003b). Facilitating fuzzy association rules mining by using multi-objective genetic algorithms for automated clustering. Third IEEE International Conference on Data Mining, 561–564. https://doi.org/10.1109/ICDM.2003.1250977.
Kaya, M., & Alhajj, R. (2004). Integrating multi-objective genetic algorithms into clustering for fuzzy association rules mining. Fourth IEEE International Conference on Data Mining, ICDM'04, 431–434. https://doi.org/10.1109/ICDM.2004.10050.
Kumar, T. (2014). Introduction to Data Mining (1st ed.). Harlow, UK: Pearson Education Limited.
from statistical foundations to machine learning. IEEE Computational Intelligence Kuok, C. M., Fu, A., & Wong, M. H. (1998). Mining fuzzy association rules in
Magazine, 14(1), 31–44. https://doi.org/10.1109/MCI.2018.2881642. databases. ACM SIGMOD Record, 27(1), 41–46. https://doi.org/10.1145/
Delgado, M., Marín, N., Martín-Bautista, M. J., Sánchez, D., & Vila, M.-A. (2005). 273244.273257.
Mining fuzzy association rules: an overview. In Soft Computing for Information Lantz, B. (2013a). Introducing machine learning. In Machine Learning with R (pp. 5–
Processing and Analysis (pp. 351–373). https://doi.org/10.1007/3-540-32365- 27). Birmingham, UK: Packt Publishing.
1_15. Lantz, B. (2013b). Machine Learning with R. Birmingham, UK: Packt Publishing.
Delgado, M., Ruiz, M. D., Sánchez, D., & Vila, M. A. (2015). On Fuzzy Modus Ponens to Larose, D. T., & Larose, C. D. (2015). Data Mining and Predictive Analytics (2nd ed.).
Assess Fuzzy Association Rules. In Enric Trillas: A Passion for Fuzzy Sets. Studies Hoboken, NJ, USA: John Wiley & Sons.
in Fuzziness and Soft Computing, 322, 269–276. https://doi.org/10.1007/978-3- Lenca, P., Vaillant, B., Meyer, P., & Lallich, S. (2007). Association rule interestingness
319-16235-5_21. measures: Experimental and theoretical studies. in quality measures in data
Dubois, D., Hüllermeier, E., & Prade, H. (2006). A systematic approach to the mining. Studies Computational Intelligence, 43, 51–76. https://doi.org/10.1007/
assessment of fuzzy association rulesData Mining and Knowledge Discovery, 13(2), 978-3-540-44918-8_3.
167–192. https://doi.org/10.1007/s10618-005-0032-4. Lent, B., Swami, A., & Widom, J. (1997). Clustering association rules. Proceedings of
Ertel, W. (2011a). Introduction. In Introduction to Artificial Intelligence (pp. 1–14). the 13th International Conference on Data Engineering, 220–231. https://doi.
London, UK: Springer. org/10.1109/ICDE.1997.581756.
Ertel, W. (2011b). Introduction to Artificial Intelligence. London, UK: Springer. Li, B., Pei, Z., & Qin, K. (2015). Association rules mining based on clustering analysis
Ertel, W. (2011c). Machine learning and data mining. In Introduction to Artificial and soft sets. 2015 IEEE International Conference on Computer and Information
Intelligence (pp. 161–220). https://doi.org/10.1007/978-0-85729-299-5_8. Technology; Ubiquitous Computing and Communications; Dependable,
Fernández-Llatas, C., & García-Gómez, J. M. (Eds.). (2015). Data Mining in Clinical Autonomic and Secure Computing; Pervasive Intelligence and Computing,
Medicine. https://doi.org/10.1007/978-1-4939-1985-7. 675–680. https://doi.org/10.1109/CIT/IUCC/DASC/PICOM.2015.97.
Grissa Touzi, A., Thabet, A., & Sassi, M. (2011). Efficient reduction of the number of Li, Q. (2009). An algorithm of quantitative association rule on fuzzy clustering with
associations rules using fuzzy clustering on the data. In Y. Tan, Y. Shi, Y. Chai, & application to cross-selling in telecom industry. International Joint Conference on
G. Wang (Eds.), Advances in Swarm Intelligence. ICSI 2011. Lecture Notes in Computational Sciences and Optimization, 2009, 759–762. https://doi.org/
Computer Science, vol 6729 (pp. 191–199). https://doi.org/10.1007/978-3-642- 10.1109/CSO.2009.441.
21524-7_23. Lloyd, S. (1982). Least squares quantization in PCM. IEEE Transactions on Information
Hájek, P., Havel, I., & Chytil, M. (1966). The GUHA method of automatic Theory, 28(2), 129–137. https://doi.org/10.1109/TIT.1982.1056489.
hypotheses determination. Computing, 1(4), 293–308. https://doi.org/ Lyon, R. (2016). Why are pulsars hard to find? Retrieved August 3, 2019, from
10.1007/BF02345483. https://www.escholar.manchester.ac.uk/api/datastream?publicationPid=uk-ac-
Hájek, Petr, Holeňa, M., & Rauch, J. (2010). The GUHA method and its meaning for man-scw:305203&datastreamId=FULL-TEXT.PDF.
data mining. Journal of Computer and System Sciences, 76(1), 34–48. https://doi. Lyon, R. (2017). HTRU2 Dataset. Retrieved September 17, 2017, from UCI ML
org/10.1016/j.jcss.2009.05.004. Repository website: https://archive.ics.uci.edu/ml/datasets/HTRU2.
V.E. Mirzakhanov / Expert Systems with Applications 162 (2020) 113781 11
Mamdani, E. H., & Assilian, S. (1975). An experiment in linguistic synthesis with a Sudkamp, T. (2005). Examples, counterexamples, and measuring fuzzy
fuzzy logic controller. International Journal of Man-Machine Studies, 7(1), 1–13. associations. Fuzzy Sets and Systems, 149(1), 57–71. https://doi.org/10.1016/
https://doi.org/10.1016/S0020-7373(75)80002-2. j.fss.2004.07.017.
Mangalampalli, A., & Pudi, V. (2010). FPrep: fuzzy clustering driven efficient Tan, S. C. (2018). Improving association rule mining using clustering-based
automated pre-processing for fuzzy association rule mining. International discretization of numerical data. International Conference on Intelligent and
Conference on Fuzzy Systems, 1–8. https://doi.org/10.1109/FUZZY.2010.5584154. Innovative Computing Applications, 2018, 1–5. https://doi.org/10.1109/
Mannila, H., Toivonen, H., & Verkamo, I. (1994). Efficient algorithms for discovering ICONIC.2018.8601291.
association rules. Proceedings of the AAAI Workshop on Knowledge Discovery Thabtah, F. (2007). A review of associative classification mining. The Knowledge
in Databases, 181–192. Engineering Review, 22(1), 37–65. https://doi.org/10.1017/S0269888907
Mărginean, F. A. (2004). Soft learning: a conceptual bridge between data mining and 001026.
machine learning. In Applications and Science in Soft Computing (pp. 241–248). Thomas, B., & Raju, G. (2014). A novel unsupervised fuzzy clustering method for
https://doi.org/10.1007/978-3-540-45240-9_33. preprocessing of quantitative attributes in association rule mining. Information
Mellouk, A., & Chebira, A. (Eds.). (2009). Machine Learning. Croatia: In-teh. Technology and Management, 15(1), 9–17. https://doi.org/10.1007/s10799-013-
Mirzakhanov, Vugar. (2019). Clustering-based speed-up technique in ARM. 0168-7.
Retrieved October 24, 2019, from MATLAB Central File Exchange website: Todorovski, V., Chorbev, I., & Loskovska, S. (2010). Overview of the Guha method as
https://www.mathworks.com/matlabcentral/fileexchange/73104. a data mining technique. Proceedings of the Seventh Conference on Informatics
Mirzakhanov, Vuqar (2019). The fuzzification issue in the Wu–Mendel approach for and Information Technology, 11–16. Skopje: Saints Cyril and Methodius
linguistic summarisation using IF-THEN rules. Journal of Experimental & University.
Theoretical Artificial Intelligence, 31(1), 117–136. https://doi.org/10.1080/ Verlinde, H., De Cock, M., & Boute, R. (2006). Fuzzy versus quantitative association
0952813X.2018.1544202. rules: a fair data-driven comparison. IEEE Transactions on Systems, Man, and
Mirzakhanov, Vuqar, & Gardashova, L. (2019). Modification of the Wu-Mendel Cybernetics, Part B (Cybernetics), 36(3), 679–684. https://doi.org/10.1109/
approach for linguistic summarization. Journal of Experimental & Theoretical TSMCB.2005.860134.
Artificial Intelligence, 31(1), 77–97. https://doi.org/10.1080/ Watanabe, T., & Takahashi, H. (2006). A quantitative association rule mining
0952813X.2018.1518998. algorithm based on clustering algorithm. 2006 IEEE International Conference on
Mohri, M., Rostamizadeh, A., & Talwalkar, A. (2018a). Foundations of Machine Systems, Man and Cybernetics, 2652–2657. https://doi.org/10.1109/
Learning (2nd ed.). Cambridge, MA, USA: MIT Press. ICSMC.2006.385264.
Mohri, M., Rostamizadeh, A., & Talwalkar, A. (2018b). Introduction. In Foundations Whiteson, D. (2014a). HIGGS Dataset. Retrieved July 4, 2017, from UCI ML
of Machine Learning (2nd ed., pp. 1–8). Cambridge, MA, USA: MIT Press. Repository website: http://archive.ics.uci.edu/ml/datasets/HIGGS.
North, M. (2012). Data Mining for the Masses. Global Text Project. Whiteson, D. (2014b). SUSY Dataset. Retrieved September 19, 2017, from UCI ML
Pi, D., Qin, X., & Yuan, P. (2006). A modified fuzzy C-means algorithm for association Repository website: https://archive.ics.uci.edu/ml/datasets/SUSY.
rules clustering. In D. Huang, K. Li, & G. W. Irwin (Eds.), Computational Witten, I., & Frank, E. (2005a). Data Mining. Practical Machine Learning Tools and
Intelligence. ICIC 2006. Lecture Notes in Computer Science, vol 4114 (pp. 1093– Techniques (2nd ed.). San Francisco, CA, USA: Elsevier.
1103). https://doi.org/10.1007/978-3-540-37275-2_137. Witten, I., & Frank, E. (2005b). Mining association rules. In Data Mining. Practical
Quan, T. T., Ngo, L. N., & Hui, S. C. (2009). An effective clustering-based approach for Machine Learning Tools and Techniques (2nd ed., pp. 112–119). San Francisco,
conceptual association rules mining. 2009 IEEE-RIVF International Conference CA, USA: Elsevier.
on Computing and Communication Technologies, 1–7. https://doi.org/10.1109/ Xu, G., Zong, Y., & Yang, Z. (2013). Applied Data Mining. Boca Raton, FL, USA: CRC
RIVF.2009.5174619. Press.
Reiss, A. R. (2012). PAMAP2 Physical Activity Monitoring Dataset. Retrieved July 18, Yotsawat, W., & Srivihok, A. (2015). Rules mining based on clustering of inbound
2017, from UCI ML Repository website: http://archive.ics.uci.edu/ml/datasets/ tourists in Thailand. In H. Sulaiman, M. Othman, M. Othman, Y. Rahim, & N. Pee
PAMAP2+Physical+Activity+Monitoring. (Eds.), Advanced Computer and Communication Engineering Technology.
Riaz, M., Arooj, A., Malik Tahir Hassan, & Jeong-Bae Kim. (2014). Clustering based Lecture Notes in Electrical Engineering, vol 315 (pp. 693–705). https://doi.org/
association rule mining on online stores for optimized cross product 10.1007/978-3-319-07674-4_65.
recommendation. The 2014 International Conference on Control, Automation Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8(3), 338–353. https://doi.
and Information Sciences (ICCAIS 2014), 176–181. https://doi.org/10.1109/ org/10.1016/S0019-9958(65)90241-X.
ICCAIS.2014.7020553. Zaki, M. J., & Meira, W. (2014). Data Mining and Analysis: Fundamental Concepts
Russell, M. A. (2014). Mining the Social Web (2nd ed.). Sebastopol, CA, USA: O’Reilly and Algorithms. https://doi.org/10.1017/CBO9780511810114.
Media. Zhang, M., & He, C. (2010). Survey on association rules mining algorithms. In
Sammut, C., & Webb, G. (Eds.). (2017). Encyclopedia of Machine Learning and Data Advancing Computing, Communication, Control and Management. Lecture
Mining (2nd ed.). New York, NY, USA: Springer. Notes. Electrical Engineering, 56, 111–118. https://doi.org/10.1007/978-3-642-
Sobhanam, H., & Mariappan, A. K. (2013). Addressing cold start problem in 05173-9_15.
recommender systems using association rules and clustering technique. Zhang, S., & Wu, X. (2011). Fundamentals of association rules in data mining and
International Conference on Computer Communication and Informatics, 2013, knowledge discovery. Wiley Interdisciplinary Reviews: Data Mining and
1–5. https://doi.org/10.1109/ICCCI.2013.6466121. Knowledge Discovery, 1(2), 97–116. https://doi.org/10.1002/widm.10.
Srikant, R., & Agrawal, R. (1996). Mining quantitative association rules in large Zhao, Y., Zhang, C., & Zhang, S. (2004). Discovering interesting association rules by
relational tables. Proceedings of the 1996 ACM SIGMOD International clustering. In G. I. Webb & X. Yu (Eds.), AI 2004. Lecture Notes in Computer
Conference on Management of Data - SIGMOD ’96, 1–12. https://doi.org/ Science, vol 3339 (pp. 1055–1061). https://doi.org/10.1007/978-3-540-30549-
10.1145/235968.233311. 1_101.