
IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 18, NO. 1, FEBRUARY 2010

A Dynamically Constrained Multiobjective Genetic Fuzzy System for Regression Problems

Pietari Pulkkinen and Hannu Koivisto

Abstract—In this paper, a multiobjective genetic fuzzy system (GFS) that learns the granularities of fuzzy partitions, tunes the membership functions (MFs), and learns the fuzzy rules is presented. It uses dynamic constraints, which enable three-parameter MF tuning to improve the accuracy while guaranteeing the transparency of fuzzy partitions. The fuzzy models (FMs) are initialized by a method that combines the benefits of the Wang–Mendel (WM) and decision-tree algorithms. Thus, the initial FMs have fewer rules, rule conditions, and input variables than if WM initialization were to be used. Moreover, the fuzzy partitions of the initial FMs are always transparent. Our approach is tested against recent multiobjective and monoobjective GFSs on six benchmark problems. It is concluded that the accuracy and interpretability of our FMs are always comparable with or better than those in the comparative studies. Furthermore, on some benchmark problems, our approach clearly outperforms some comparative approaches. The suitability of our approach for higher dimensional problems is shown by studying three benchmark problems that have up to 21 input variables.

Index Terms—Genetic fuzzy systems (GFSs), initialization, accuracy, interpretability, Mamdani fuzzy models (FMs).

Manuscript received March 9, 2009; revised June 15, 2009 and September 18, 2009; accepted November 24, 2009. First published December 15, 2009; current version published February 5, 2010.
The authors are with the Department of Automation Science and Engineering, Tampere University of Technology, Tampere 33101, Finland (e-mail: [email protected]; [email protected]).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TFUZZ.2009.2038712

I. INTRODUCTION

The interpretability-accuracy tradeoff of fuzzy models (FMs) has recently attracted a lot of research interest [1]–[9]. Since it is not possible to maximize these contradicting objectives simultaneously, multiobjective evolutionary algorithms (MOEAs) have recently been used to find a Pareto optimal set of FMs that present different tradeoffs between the objectives. These approaches are also called multiobjective genetic fuzzy systems (GFSs) [10], [11].

Accuracy is often measured by the mean-squared error (MSE) when regression problems are considered. However, there is no exact measure for the interpretability of FMs [2], and it tends to be somewhat subjective. Nevertheless, the definition by Ishibuchi and Yamamoto [12] is often used. It defines interpretability by four factors: 1) transparency of fuzzy partitions; 2) complexity of FMs (e.g., the number of fuzzy rules and input variables); 3) complexity of the fuzzy-rule base (e.g., type of rules and the number of rule conditions); and 4) complexity of fuzzy reasoning (e.g., defuzzification method).

Factor 1) is often satisfied by using fixed fuzzy partitions (uniformly distributed or known by a priori knowledge) [3], [12]. However, a priori knowledge is often not available. Furthermore, if fuzzy partitions do not present the real distribution of the data, the accuracy of FMs deteriorates [13]. Thus, it is important to optimize not only the rules and rule conditions, but also the membership-function (MF) parameters. However, this increases the search space and may deteriorate the transparency of fuzzy partitions.

There are also studies in which fuzzy partitions are not fixed and factor 1) is taken into account by other means. Merging of highly similar fuzzy sets was used in [14] and [15] to improve the transparency of fuzzy partitions. Parameters of a fuzzy set that covers another fuzzy set were automatically adjusted in [4]. Penalties were issued in [5] if the intersection point of two fuzzy sets was not between user-specified boundaries. This approach not only avoided highly overlapping fuzzy sets, but also ensured that the whole universe of discourse (UoD) was strongly covered. The approach of [5] was extended in [16] to reduce the effects of relaxed covering [4]. Here, [16] is followed; however, instead of minimizing the penalties, dynamic constraints are used to ensure that the fuzzy partitions are always transparent. This increases the selection pressure and improves the search efficiency [17].

This paper deals with regression (or function estimation) problems, which have not yet received as much research effort as classification problems [6]. We apply Mamdani FMs [18], which are also called linguistic FMs. When regression problems are considered, the population is usually initialized randomly or by the Wang and Mendel (WM) method [19]. Unfortunately, random initialization does not guarantee a good starting point for further optimization, and the WM method usually leads to a high number of rules and rule conditions when high-dimensional problems and/or problems with many data points are considered. Recently, we proposed a decision-tree (DT) based initialization method for regression problems [20], which reduces the number of input variables and leads to fewer rules and rule conditions than WM initialization. However, it does not necessarily create transparent fuzzy partitions. The WM algorithm, on the other hand, creates rules for a priori given fuzzy partitions; thus, transparency of fuzzy partitions is usually high. Here, we combine the benefits of WM and DT initialization. Therefore, the initial fuzzy partitions are transparent, and the initial FMs contain fewer rules, rule conditions, and input variables than when the WM algorithm is used.

The initial population is then optimized by a multiobjective GFS that uses dynamic constraints to ensure the transparency of fuzzy partitions. It also reduces the number of rules, rule conditions, MFs, and input variables. The proposed initialization method and multiobjective GFS therefore help to satisfy the aforementioned factors 1)–3). Factor 4), the complexity of fuzzy reasoning, is taken into account by applying a simple weighted-average defuzzification method.

Our multiobjective GFS is tested on a set of nine benchmark problems having 2 up to 21 input variables. For six of them, there are results of other recently proposed GFSs available. Our results are compared to them, and it is shown that our results are comparable or better in terms of accuracy and interpretability.

This paper is organized as follows. First, a brief survey of recently proposed multiobjective GFSs is given. Based on this, the novelty of our multiobjective GFS is clearly pointed out. Then, the interpretability of FMs is discussed and special attention is paid to the transparency of fuzzy partitions. Then, in Section IV, the proposed initialization method is introduced. After this, in Section V, the dynamically constrained multiobjective GFS is presented. The results comparisons are performed in Section VI and conclusions are given in Section VII.

II. MULTIOBJECTIVE GENETIC FUZZY SYSTEMS FOR LINGUISTIC-FUZZY-MODEL IDENTIFICATION: STATE OF THE ART

Recently, several researchers have focused on designing multiobjective GFSs for the identification of compact and accurate linguistic FMs. Ishibuchi's research group has published several papers that consider fuzzy classification. Nonetheless, until recently, there were hardly any papers that considered multiobjective GFSs in regression problems [23].

TABLE I
SECOND-GENERATION MULTIOBJECTIVE GFSS APPLIED TO IDENTIFICATION OF LINGUISTIC FMS

Table I presents multiobjective GFSs for classification and regression problems. For the sake of brevity, it includes only the recent approaches that apply second-generation MOEAs (e.g., the nondominated sorting genetic algorithm II (NSGA-II), the strength Pareto evolutionary algorithm 2 (SPEA2), and the Pareto archived evolution strategy (PAES)). It also excludes those approaches that apply first-order Takagi–Sugeno FMs. In this table, rule selection means that a rule is either included or not included in an FM, whereas rule learning means that appropriate rule conditions are learned by the GFS. It is seen that usually either rule learning or rule selection is applied, and there is only one approach [27] that applies neither of them.

MFs of fuzzy rules are taken from four different fuzzy partitions in [1], which means that the resulting global fuzzy partitions are not always transparent. Granularities of global fuzzy partitions are learnt in [24], which improves the transparency. The most trivial way to obtain transparent fuzzy partitions is to use evenly distributed uniformly shaped MFs, like in [3]. However, MF tuning is often applied because it usually improves the accuracy. Unfortunately, it often deteriorates the transparency of fuzzy partitions. In the area of regression problems, there are some methods [21], [22], [25]–[27] that apply MF tuning and have appropriately considered this factor. One of them [27] is a context-adaptation approach that only performs MF tuning, requiring the whole rule base to be provided by the user. MF parameters are learnt using a linguistic two-tuple tuning scheme [9] in [21] and [22]. Piecewise-linear-transformation techniques are applied in [25], and a wrapper-based embedded process is used in [26]. The approaches [8], [20], [23] apply conventional three-parameter MF tuning with static constraints, which does not guarantee transparency of fuzzy partitions.

In this paper, three-parameter MF tuning with dynamic constraints is applied. The search space is therefore larger compared to the two-tuple representation, which only modifies the lateral displacements of the MFs. On the other hand, it is expected that the proposed approach improves the accuracy. Moreover, because of the dynamic constraints, it is guaranteed that the whole UoD is strongly covered and there are no highly overlapping MFs. Our approach also does not require that MFs are uniformly shaped, as long as the transparency conditions, which are introduced later in Section III-A, are met. In some cases, uniformly shaped MFs can actually be misleading if they do not present the real distribution of the data. It is therefore sometimes necessary that some fuzzy sets are, for example, wide, whereas some others are narrow. Finally, granularities of global fuzzy partitions are also learnt by our approach. These properties guarantee that our approach maintains the transparency of fuzzy partitions at a good level.

Input-variable selection before applying the GFS (i.e., in the initialization phase) reduces the number of parameters to be optimized. This has been applied by some approaches; however, in the field of regression, there is only one approach [20] that applies this. Usually, regression problems with 2 up to 10 input variables are studied in the literature, and therefore, the role of input-variable selection is not crucial. However, in this paper, its role becomes more important as problems with up to 21 input variables are studied.

The difference between the proposed approach and the approach of [16] is more than just a different problem type. Transparency of fuzzy partitions was obtained in [16] by minimizing a transparency index. It means that the transparency indexes of the FMs in the population may be very different.

There may be some FMs with highly transparent fuzzy partitions and some other FMs with unacceptable fuzzy partitions. Naturally, constraining the range in which the value of the transparency index can vary reduces this variation. However, in this case, the offspring population will usually contain some infeasible FMs (FMs for which the transparency index is not acceptable). This deteriorates the search efficiency of the GFS. In this paper, transparency of fuzzy partitions is guaranteed by dynamic constraints. This reduces the number of fitness objectives by one, which increases the selection pressure [17].

Based on this brief analysis, it can be concluded that the proposed multiobjective GFS is novel. Indeed, to the best of our knowledge, there exists no multiobjective GFS applicable to regression that performs rule learning and three-parameter MF tuning while preserving the transparency of fuzzy partitions. Moreover, input variables are selected in two ways: first, during the initialization phase, and second, during the multiobjective-GFS search process, which can select input variables among those remaining after the initialization.

Fig. 1. Examples of fuzzy partitions that are considered to be transparent. MF centers are marked with dotted vertical lines. (a) Gaussian MFs. (b) gbell MFs. (c) Symmetrical trapezoidal MFs. (d) Symmetrical triangular MFs.

III. INTERPRETABILITY OF FUZZY MODELS

As mentioned previously, in this paper, factors 2) and 3) of the interpretability definition [12] are satisfied by minimizing the complexity of FMs, and factor 4) by the application of simple weighted-average defuzzification. However, because the MFs are tuned, factor 1)—transparency of fuzzy partitions—requires special attention. In the next section, a definition for this is given. It applies only to input variables, because in this paper, singleton output MFs are used. Because singleton MFs can be presented with only one parameter, it is sufficient to apply static constraints, introduced later in Section V-B, to maintain the transparency of the output partition at a good level.

A. Transparency of Fuzzy Partitions

As in [27], this paper uses the transparency definition by de Oliveira [28], which states that a transparent fuzzy partition must meet the following conditions.
1) The number of MFs per variable is moderate.
2) MFs are distinguishable, i.e., two MFs do not present the same or almost the same linguistic meaning.
3) Each MF is normal. An MF is normal if it has membership value 1 at least at one point of the UoD.
4) The UoD is strongly covered. At least one MF receives a membership value β (where β > 0) at any point of the UoD.

Condition 1) is easily met by constraining the maximum number of MFs to a moderate number (for example, 9). Also, condition 3) is met by applying normal MFs and genetic operators that do not alter their normality. Meeting conditions 2) and 4) is more challenging. In this paper, it is considered that they are met if globally defined MFs are used and the following conditions are met.
1) Symmetry condition: The shapes of all MFs are symmetrical. For example, Gaussian MFs and generalized-bell (gbell) MFs are symmetrical by definition. Also, other MF types, such as triangular and trapezoidal MFs, can be easily made symmetrical.
2) α-condition: At any intersection point of two MFs, the membership value is at most α.
3) γ-condition: At the center of each MF, no other MF receives a membership value larger than γ. The center of an MF depends on which MF type is used. For a gbell MF (with parameters a, b, and c) and a Gaussian MF (with parameters c and σ), the center is the parameter c. For a triangular MF (with parameters a < b < c), b is the center. For a trapezoidal MF (with parameters a < b < c < d), the center is b + ((c − b)/2) (see also Fig. 1).
4) β-condition: The UoD is strongly covered, i.e., at each point of the UoD, at least one MF has membership value at least β.

Fig. 1 shows examples of fuzzy partitions with settings β = 0.05, γ = 0.25, and α = 0.8. Section III-B describes how β, γ, and α must generally be selected in order to apply the dynamic-tuning strategy.

In this paper, gbell MFs are used. They are defined as

µ(x; a, b, c) = 1 / (1 + |(x − c)/a|^(2b))    (1)

where a, b, and c define the width, shape, and center of an MF, respectively. As gbell MFs are symmetrical, the first of the previous conditions is met. Fulfillment of the remaining three conditions relies largely on computing the values of x for which an MF receives a certain membership value µ. Because of the symmetry of gbell MFs, any membership value µ ∈ (0, 1) is received on the left and right side of the center c. These points are denoted here by IL and IR:

IL(µ, p) = c − a (κ(µ))^(1/2b),  µ ∈ (0, 1)    (2)
IR(µ, p) = c + a (κ(µ))^(1/2b),  µ ∈ (0, 1)    (3)

where p = [a, b, c]^T is a vector containing the MF parameters and

κ(µ) = (1 − µ)/µ,  µ ∈ (0, 1).    (4)
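The gbell MF and its crossing points in (1)-(4) can be checked numerically. The short Python sketch below is our own illustration, not code from the paper; the function names gbell, kappa, and crossing_points are ours, and the example parameter vector is arbitrary. It verifies that the points returned by (2) and (3) indeed receive the requested membership value.

```python
import numpy as np

def gbell(x, a, b, c):
    """Generalized-bell MF, Eq. (1): 1 / (1 + |(x - c)/a|^(2b))."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

def kappa(mu):
    """Eq. (4): kappa(mu) = (1 - mu) / mu, for mu in (0, 1)."""
    return (1.0 - mu) / mu

def crossing_points(mu, p):
    """Eqs. (2)-(3): points left/right of the center c where the MF equals mu."""
    a, b, c = p
    offset = a * kappa(mu) ** (1.0 / (2 * b))
    return c - offset, c + offset          # (I_L, I_R)

if __name__ == "__main__":
    p = (0.25, 2.124, 0.5)                 # example gbell parameters a, b, c
    for mu in (0.05, 0.25, 0.8):           # the beta, gamma, alpha values used in Fig. 1
        il, ir = crossing_points(mu, p)
        # both crossing points reproduce the requested membership value
        print(mu, gbell(il, *p), gbell(ir, *p))
```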

Equations (2) and (3) are used to formulate the α, γ, and β conditions. For the sake of clarity, each of them is split into two parts, denoted here by left and right. They ensure the fulfillment of the conditions on the left or right side of the center of an MF, respectively. Let the active MFs of a variable be indexed as j = 1, . . . , MA, where MA is the number of currently active MFs of that variable. It will be shown later that our multiobjective GFS maintains the ordering of MFs, i.e., if i > j, then ci > cj, where ci and cj are the gbell parameters c of MFs i and j. Moreover, in this paper, fuzzy partitions with only one MF are not allowed, because they are not considered transparent. Hence, throughout this paper, it is known that if j = 1, then the MF is the leftmost MF and its neighboring MF is j + 1. If 1 < j < MA, then the MF is in the middle of neighboring MFs j − 1 and j + 1. Finally, if j = MA, then the MF is the rightmost MF and the neighboring MF is j − 1. Thus, the transparency conditions can be written as follows.

Right α-condition: IR(α, pj) ≤ IL(α, pj+1), if j < MA.

Left α-condition: IL(α, pj) ≥ IR(α, pj−1), if j > 1.

Right γ-condition: IR(γ, pj) ≤ cj+1 ∧ cj ≤ IL(γ, pj+1), if j < MA.

Left γ-condition: IL(γ, pj) ≥ cj−1 ∧ cj ≥ IR(γ, pj−1), if j > 1.

Right β-condition: IR(β, pj) ≥ IL(β, pj+1), if j < MA;  IR(β, pj) ≥ χhigh, if j = MA.

Left β-condition: IL(β, pj) ≤ IR(β, pj−1), if j > 1;  IL(β, pj) ≤ χlow, if j = 1.

Here, the variable range is χ = χhigh − χlow, where χlow and χhigh are the lower and upper bounds of the variable, respectively. These conditions are the basis of the proposed dynamic constraints, which require that the fuzzy partitions of the initial FMs are transparent. Thus, two simple partition algorithms to create transparent fuzzy partitions are introduced next.

B. Partitioning Algorithm to Create Evenly Distributed Fuzzy Partition

This algorithm creates a fuzzy partition consisting of MA evenly distributed uniformly shaped MFs, and it is only used when creating the first FM of the initial population. Because MFs are uniformly shaped, the gbell parameter a for each MF j is

aj = aeven = χ / (2(MA − 1)),  j = 1, . . . , MA.    (5)

It is required that each aj ≥ amin = 0.025χ to avoid very narrow MFs. This limits the maximum value of MA to 21; however, in practice, more than nine MFs are hardly ever assigned. The minimum value of MA is 2. Centers are distributed evenly as

c1 = χlow, and cj = cj−1 + χ / (MA − 1),  j = 2, . . . , MA.    (6)

Assigning the values for a and c according to (5) and (6) guarantees that the UoD is strongly covered and the membership value of each MF pair at their intersection point is 0.5. Thus, 0 < β < 0.5 and 0.5 < α < 1 must be selected in order to apply the dynamic-tuning strategy. Because the membership value at each intersection point is 0.5, the β and α conditions are fulfilled. Moreover, because gbell MFs are symmetrical, the symmetry condition is satisfied as well. The γ-condition requires that at the center of each MF, no other MF receives a membership value larger than γ. This algorithm selects b such that, at the center of each MF, the neighboring MF(s) receive the membership value γ* = 0.05. Thus, γ* < γ < 0.5 must be selected in order to apply the dynamic-tuning strategy. The following formula for selecting b can be derived by starting from either (2) or (3):

bj = ln κ(γ*) / (2 ln(dcenter,j / aj)),  j = 1, . . . , MA    (7)

where

dcenter,j = min(cj − cj−1, cj+1 − cj), if 1 < j < MA;  cj+1 − cj, if j = 1;  cj − cj−1, if j = MA    (8)

denotes the minimum distance from cj to the nearest center(s) of the neighboring MF(s).

Because MFs are evenly distributed, dcenter,j = χ/(MA − 1) for all j. Thus, (7) can be written as

bj = ln κ(γ*) / ln 4,  j = 1, . . . , MA.    (9)

There is no upper limit for the value of b in the sense that larger b values will not violate the transparency conditions. However, very large b values are not desired, as they make gbell MFs similar to crisp sets and because b is the exponent in (1). Therefore, the value of b for each MF is defined by (9) in this algorithm.

C. Partitioning Algorithm to Create Unevenly Distributed Fuzzy Partition

As there is no a priori knowledge about the distribution of MFs, it is also beneficial to create unevenly distributed nonuniformly shaped MFs. The following algorithm is used for this purpose, and it is applied to create the fuzzy partitions of the rest of the FMs of the initial population and as a part of the genetic operators. It selects c and a as follows:

a1 = max(amin, r1 aeven), and c1 = χlow    (10)

aj = max(amin, rj ((2j − 1)aeven − (cj−1 + aj−1)) / 2)    (11)

where j = 2, . . . , MA − 1

cj = cj−1 + aj−1 + aj,  j = 2, . . . , MA − 1    (12)

aMA = χhigh − (cMA−1 + aMA−1), and cMA = χhigh    (13)

where r1, r2, . . . , rMA−1 ∈ [0, 1] are random real numbers; aeven and amin were defined in the previous section. It can easily be verified that by selecting r1 = r2 = · · · = rMA−1 = 1, this algorithm is identical to the algorithm in the previous section.

Unlike in the previous partition algorithm, here the parameter b values are randomly selected from the interval [1, 10]. However, they are not allowed to be less than the corresponding minimum values computed according to (7). Thus, it is guaranteed that at the center of each MF, the neighboring MF(s) receive a membership value less than or equal to γ*.

It is seen from (10), (11), and (13) that narrower MFs are more likely to be located on the left side of the range and wider MFs on the right side of the range. There is, naturally, no justification for this. Hence, by uniform chance, the parameters are defined either by (10)–(13) or by their inversion as follows:

a*_j = aMA−j+1,  b*_j = bMA−j+1,  c*_j = χhigh − cMA−j+1    (14)

where j = 1, . . . , MA.

As an example, consider creating a fuzzy partition with five MFs in the range [0, 1]. From (5), it follows that aeven = 1/8. Let r1 = 1/2, r2 = 1, r3 = 1/2, and r4 = 1/2. Thus, a1 = a*5 = 1/16, c1 = 0, a2 = a*4 = 5/32, c2 = 7/32, a3 = a*3 = 1/16, c3 = 7/16, a4 = a*2 = 3/32, c4 = 19/32, a5 = a*1 = 5/16, and c5 = 1, and c*1 = 0, c*2 = 13/32, c*3 = 9/16, c*4 = 25/32, and c*5 = 1. The minimum values for b1 = b*5, b2 = b*4, b3 = b*3, b4 = b*2, and b5 = b*1, according to (7), are 1.1752, 4.3755, 1.6067, 2.8820, and 5.6114, respectively. Fig. 2(a) shows the resulting partition when centers are computed according to (10)–(13), whereas Fig. 2(b) depicts the resulting partition when (14) is used instead. It is seen that although the MFs are nonuniformly shaped and unevenly distributed, the fuzzy partitions are transparent and reasonable linguistic values could be given. In Fig. 2, β = 0.05, γ = 0.25, and α = 0.8.

Fig. 2. Example of (a) unevenly distributed fuzzy partition and (b) its inverse.

IV. POPULATION INITIALIZATION

Whenever a GFS is used, the population needs to be initialized first. In order to reduce the search space, it is desirable that the initialization method is able to select the relevant input variables. Thus, in [15], the C4.5 [29] DT-based method for classification problems was proposed. Recently, in [20], it was made suitable for regression problems. Although this method is capable of selecting relevant input variables, its main limitations are that: 1) it does not guarantee transparent fuzzy partitions and 2) it may create far more rules than necessary when applied to noisy datasets.

In this paper, DT initialization is used neither to create the rule base nor to initialize MF parameters, but to select relevant input variables and to reduce the number of input MFs and rule conditions. MF parameters are determined by the introduced partition algorithms (see Section III-B and Section III-C), which guarantee transparency of fuzzy partitions. The rule base is created by a slightly modified WM algorithm [19]. The two proposed modifications are that: 1) when a data point is matched to MFs in order to generate a rule, the data point is not always matched to MFs of all possible input variables. Instead, it is first classified by the constructed DT, and only those input variables that were used by the DT to classify the data point are used for matching; and 2) as the WM algorithm may create a large number of rules for datasets with many data points and/or input variables, the generated rules are divided among the members of the initial population and only a portion of them is allowed to be included in one FM.

A. Creation of the First Fuzzy Model of the Population

The procedure of creating the first FM is shown in Fig. 3. It starts by discretizing the continuous output data in order to apply the C4.5 algorithm. This is done by dividing the output into Mout crisp regions. Each continuous output value falls into one of these Mout regions and is replaced with the corresponding class label S ∈ {1, . . . , Mout}, which represents these regions. Then, the C4.5 algorithm can be applied and a DT constructed.

Fig. 3. Procedure of creating the first FM of the population.

All input variables which are not used by the DT are then removed. Then, fuzzy partitions for the remaining input variables and for the output are created. A user is required to provide the number of input MFs Min and the number of output MFs Mout.
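The fuzzy partitions referred to above are built by the partitioning algorithms of Sections III-B and III-C. The following Python sketch is our own reading of those algorithms (function and variable names are ours, not the paper's): it implements the unevenly distributed partition (10)–(13) together with the minimum b values from (7)–(8), reproduces the worked five-MF example of Section III-C, and reduces to the evenly distributed algorithm of Section III-B when all rj = 1.

```python
import numpy as np

def uneven_partition(M, x_low, x_high, r, gamma_star=0.05):
    """Eqs. (7)-(13): widths a, minimum shape values b, and centers c of an
    unevenly distributed gbell partition with M >= 2 MFs on [x_low, x_high]."""
    chi = x_high - x_low
    a_even, a_min = chi / (2 * (M - 1)), 0.025 * chi
    a, c = np.zeros(M), np.zeros(M)
    a[0], c[0] = max(a_min, r[0] * a_even), x_low                     # Eq. (10)
    for j in range(1, M - 1):
        a[j] = max(a_min, r[j] * ((2 * (j + 1) - 1) * a_even
                                  - (c[j - 1] + a[j - 1])) / 2)       # Eq. (11)
        c[j] = c[j - 1] + a[j - 1] + a[j]                             # Eq. (12)
    a[-1], c[-1] = x_high - (c[-2] + a[-2]), x_high                   # Eq. (13)
    gaps = np.diff(c)
    d_center = np.minimum(np.r_[np.inf, gaps], np.r_[gaps, np.inf])   # Eq. (8)
    b_min = np.log((1 - gamma_star) / gamma_star) / (2 * np.log(d_center / a))  # Eq. (7)
    return a, b_min, c

if __name__ == "__main__":
    # Worked example of Section III-C: five MFs on [0, 1], r = (1/2, 1, 1/2, 1/2)
    a, b_min, c = uneven_partition(5, 0.0, 1.0, [0.5, 1.0, 0.5, 0.5])
    print(a)      # [1/16, 5/32, 1/16, 3/32, 5/16]
    print(c)      # [0, 7/32, 7/16, 19/32, 1]
    print(b_min)  # approx. [1.1752, 4.3755, 1.6067, 2.8820, 5.6114]
    # With r = (1, 1, 1, 1) the same code yields the evenly distributed
    # partition of Section III-B, i.e., Eqs. (5), (6), and (9).
```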

However, the DT can be used to limit the number of input MFs. First, the DT is transformed into an FM, according to [15]. After this, the number of MFs for each input variable j in the resulting FM is checked and denoted by MDT,j. Then, instead of partitioning each input variable with Min MFs, each input partition is created with min(MDT,j, Min) MFs. The output is partitioned with Mout MFs. These partitions consist of uniformly shaped evenly distributed MFs and are created by the algorithm introduced in Section III-B.

Then, a slightly modified WM algorithm is used. As mentioned previously, when a rule is generated, each data point is first classified by the constructed DT, and only those input variables that were used by the DT to classify the data point are used for matching and become conditions of the generated rule. All other parts of the classical WM algorithm remain unchanged.

After creating the rule base, the number of active MFs MA,j for each input variable j is checked (an MF is active if it is part of at least one of the rules). If MA,j < min(MDT,j, Min), then there is a gap in the fuzzy partition and the whole UoD is not strongly covered. If this is the case and if MA,j ≥ 2, then a new evenly distributed partition with MA,j MFs is created. If MA,j < 2, then input variable j is removed and MA,j is set to 0. The maximum number of MFs, i.e., Mmax,j = MA,j, that each FM of the population can use in input variable j is determined by this phase. Also, all the input variables that have not been removed so far form the set of candidate input variables. The number of these remaining input variables is denoted by ns.

The generated rule base may contain a large number of rules. However, in this paper, each FM can contain at most Rmax = 30 rules. If the rule base has more than Rmax rules, then Rmax rules are randomly selected out of it. Otherwise, the rule base is taken as a whole. If rules are randomly selected, this may result in some gaps in the fuzzy partition, which is not allowed. In this case, it is required that the number of active MFs for each input variable must be Mmax,j and the number of active output MFs must be Mout. If this is not the case, then max(Mmax,1, Mmax,2, . . . , Mmax,ns, Mout) randomly selected rules are replaced with some rules, thus making all the inactive MFs active. In this paper, these rules are created such that, in the first of the rules, all antecedents and the consequent are 1. In the second rule, they are all 2. This is continued until all inactive MFs have become active. Of course, care must be taken that the antecedents for input variable j are at most Mmax,j and, for the consequent, at most Mout. This rule replacement is necessary only if the rule base contains more than Rmax rules. Otherwise, it is certain that there are no gaps in the fuzzy partition.

B. Mamdani Fuzzy Model and Its Coding for Multiobjective-Genetic-Fuzzy-System Optimization

The original dataset contains n input variables; however, the initialization method selects ns ≤ n of them. Therefore, a dataset with D data points is denoted as Z = [X y], where X is the D × ns input matrix, and y is the D × 1 output vector. The first FM and all other FMs in this paper are Mamdani FMs. Mamdani fuzzy rules are expressed as

Ri: If x1 is Bi,1 . . . and xns is Bi,ns, then Ci

where Bi,j, with j = 1, . . . , ns and i = 1, . . . , R, is an input fuzzy set, Ci is an output fuzzy set, and R is the number of rules. To reduce the computational costs, the output of FMs is computed by an approximation of the centroid-of-gravity method [3], [30] as

ŷk = ( Σi=1..R βi(xk) C̄i ) / ( Σi=1..R βi(xk) ),  k = 1, . . . , D    (15)

where C̄i is the center value of Ci, and βi(xk) = Πj=1..ns Bi,j(xk,j) is the degree of rule activation. When the slightly modified WM algorithm was used to create the rule base, gbell output MFs were used. However, at the optimization phase, the application of gbell MFs is not necessary anymore, since C̄ is the only output MF parameter affecting the outcome. Therefore, all gbell output MFs are replaced with singleton MFs as

µ(x, C̄) = 1, if x = C̄;  0, if x ≠ C̄

where C̄ is the corresponding gbell MF parameter c. For the purpose of multiobjective GFS optimization, the antecedents of the rule base are presented with an integer-coded matrix A. It specifies, for each rule i = 1, . . . , R, which MF is used for input variable j = 1, . . . , ns:

A = [ a1,1 a1,2 . . . a1,ns ;  a2,1 a2,2 . . . a2,ns ;  . . . ;  aR,1 aR,2 . . . aR,ns ]    (16)

Here, ai,j ∈ {0, 1, . . . , Mmax,j}, where Mmax,j is the maximum number of MFs in input variable j. If ai,j = 0, input variable j is not used in rule i. Input variable j is not used in an FM if ai,j = 0 for all i, and rule i is not used in an FM if ai,j = 0 for all j. The input MF parameters to which each ai,j refers are defined in a real-coded matrix P as

P = [ p1,1 p1,2 . . . p1,δ ;  p2,1 p2,2 . . . p2,δ ;  . . . ;  pρ,1 pρ,2 . . . pρ,δ ]    (17)

where ρ is the number of parameters used to define an MF. In this paper, ρ = 3, because gbell MFs are used. The maximum number of MFs in an FM is denoted by δ = Σj=1..ns Mmax,j. Thus, for any ai,j ≠ 0, the corresponding input MF parameters are p1,l, p2,l, and p3,l, where

l = ai,j, if j = 1;  l = ai,j + Σk=1..j−1 Mmax,k, if j > 1.    (18)

Similarly as A states the input MFs used in the rules, an integer-coded vector s defines the output MFs (singletons) used in the rules. Formally, s = [s1, s2, . . . , sR]^T, where si ∈ {1, . . . , Mout}, with i = 1, . . . , R. The maximum number of output MFs is denoted by Mout.
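To make the coding of Section IV-B concrete, here is a minimal Python sketch of ours (not the paper's code) that evaluates an FM from the integer-coded matrix A, the parameter matrix P, the consequent vector s, and the singleton-center vector o (introduced in the continuation below), using the gbell MF (1), the index mapping (18), and the weighted average (15). The numerical values are those of the example in Section IV-C; the fallback used when no rule fires is our own arbitrary choice.

```python
import numpy as np

def gbell(x, a, b, c):
    """Generalized-bell MF, Eq. (1)."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

def column_index(ai_j, j, M_max):
    """Eq. (18): column of P holding the parameters of MF label 'ai_j' of variable j (0-based j)."""
    return ai_j - 1 + sum(M_max[:j])       # -1 converts the 1-based MF label to a column index

def predict(x, A, P, s, o, M_max):
    """Eq. (15): weighted average of singleton centers with product rule activation."""
    y_num = y_den = 0.0
    for i in range(A.shape[0]):            # loop over rules
        beta = 1.0
        for j, ai_j in enumerate(A[i]):
            if ai_j == 0:                  # 0 = variable not used in this rule
                continue
            a, b, c = P[:, column_index(ai_j, j, M_max)]
            beta *= gbell(x[j], a, b, c)
        y_num += beta * o[s[i] - 1]        # singleton center of the rule consequent
        y_den += beta
    # arbitrary fallback (our choice) if no rule is activated at all
    return y_num / y_den if y_den > 0 else float(np.mean(o))

if __name__ == "__main__":
    # Four-rule example FM of Section IV-C: two inputs with 3 and 2 MFs, 4 output singletons
    A = np.array([[1, 1], [2, 2], [2, 1], [3, 2]])
    s = np.array([1, 2, 3, 4])
    o = np.array([0.0, 1/3, 2/3, 1.0])
    P = np.array([[0.25, 0.25, 0.25, 0.5, 0.5],
                  [2.124, 2.124, 2.124, 2.124, 2.124],
                  [0.0, 0.5, 1.0, 0.0, 1.0]])
    print(predict(np.array([0.2, 0.7]), A, P, s, o, M_max=[3, 2]))
```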

The output MF parameters to which each si refers are defined in a real-coded vector o = [o1, o2, . . . , oMout]^T. The total number of parameters to be optimized by a multiobjective GFS is θ = Rns + ρδ + R + Mout, i.e., the sum of the cardinalities of A, P, s, and o.

C. Mamdani-Fuzzy-Model Coding: An Example

Let us assume that the first FM of the initial population has four rules and uses two input variables x1 and x2, which are partitioned, respectively, with three and two gbell MFs. The output is partitioned with four singleton MFs. Both input variables and the output are in the range [0, 1]. The partitions are uniformly shaped and evenly distributed. The rule base is given as follows.
Rule1: If x1 is 1 and x2 is 1, then output is 1.
Rule2: If x1 is 2 and x2 is 2, then output is 2.
Rule3: If x1 is 2 and x2 is 1, then output is 3.
Rule4: If x1 is 3 and x2 is 2, then output is 4.
This FM is coded as

A = [ 1 1 ;  2 2 ;  2 1 ;  3 2 ],   s = [ 1 ;  2 ;  3 ;  4 ],   o = [ 0 ;  1/3 ;  2/3 ;  1 ]

P = [ 0.25  0.25  0.25  0.5   0.5 ;
      2.124 2.124 2.124 2.124 2.124 ;
      0     0.5   1     0     1 ]

where the first, second, and third rows of P contain the gbell parameters a, b, and c, respectively. The first three columns of P contain the gbell parameters of the three MFs of x1, and the remaining two columns contain the gbell parameters of the two MFs of x2. These parameters are computed according to the algorithm in Section III-B.

D. Creation of the Rest of the Population

The first FM defines the maximum number of rules, the maximum number of input variables, and the maximum number of MFs per input variable for all the remaining Npop − 1 FMs of the population, where Npop is the population size.

If the rule base generated by the slightly modified WM algorithm has more than Rmax rules, it means that some of the randomly selected rules in the first FM were replaced in order to avoid gaps in the fuzzy partition. In this case, one of the Npop − 1 FMs receives the rule base (i.e., A and s) of the first FM without any replacements. Then, A and s of the remaining Npop − 2 FMs are created by randomly selecting Rmax rules from the generated rule base.

If the generated rule base has at most Rmax rules, then the rule conditions A of the Npop − 1 FMs are created by modifying the rule conditions of the first FM, replacing them with random conditions [7]. However, do-not-care conditions (i.e., conditions that are 0) are not allowed here, as it was pointed out in [8] that it is easier to obtain compact than accurate FMs. The rule consequents s for all Npop − 1 FMs are the same as in the first FM.

After creating A and s of the remaining Npop − 1 FMs, the input MF parameters P are assigned based on A of each individual FM. For each input variable j of each FM, the number of active MFs MA,j is first checked. If MA,j ≥ 2, then a new unevenly distributed fuzzy partition with MA,j MFs is created by using the algorithm in Section III-C. If MA,j < 2, then all nonzero rule conditions, if any, of that input variable are forced to zero. After this, the input variable has no active MFs, and the MF parameters for this input variable can be assigned any value. However, if the genetic operators at a later stage cause at least two MFs to be active, then the value of these parameters is determined by the algorithm in Section III-C. Finally, the output MF parameters o for all Npop − 1 FMs are the same as in the first FM.

E. Creation of the Rest of the Population: An Example

Let us return to the example from Section IV-C and consider creating one of the remaining Npop − 1 FMs. Since the initial FM has only 4 ≤ Rmax rules, the rules are created by modifying the rules of the first FM. Assume that, as a result, the condition "If x1 is 1" of the first rule was changed to "If x1 is 3". Now, the FM has no rule in which the condition "If x1 is 1" appears. Thus, input MF 1 of x1 is inactive, and a new unevenly distributed partition is created with two MFs and assigned to input MFs 2 and 3 of x1, such that their order is maintained. Similarly, a new unevenly distributed partition with two MFs is also created for x2, which still has two active MFs. The following could be the result after these operations:

A = [ 3 1 ;  2 2 ;  2 1 ;  3 2 ],   s = [ 1 ;  2 ;  3 ;  4 ],   o = [ 0 ;  1/3 ;  2/3 ;  1 ]

P = [ 0.25  0.3  0.7  0.8  0.2 ;
      2.124 3    9    7    4 ;
      0     0    1    0    1 ]

where the modified entries are the changed rule condition a1,1 = 3 and the parameters of the repartitioned MFs (columns 2–5 of P). The parameter values of input MF 1 of x1 (the first column of P) are currently not important. If at some point of the optimization MF 1 becomes active again, the values will be assigned by the algorithm in Section III-C. Before this, none of the genetic operators will operate on these parameters.
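The active-MF check and repartitioning just illustrated in Section IV-E (and reused by the repair operator of Section V-D) can be sketched as follows. This is our own illustration with hypothetical names; it reuses the uneven_partition function from the sketch given after Section III-C, so the two blocks together form one runnable script.

```python
import numpy as np

def repair_variable(A, P, j, M_max, x_low, x_high, rng):
    """Keep the partition of input variable j transparent after A was modified:
    if fewer than two MFs are active, drop the variable from all rules; otherwise
    create a fresh unevenly distributed partition for the active MFs (Section III-C)."""
    active = sorted(set(A[:, j].tolist()) - {0})        # MF labels still used by some rule
    if len(active) < 2:
        A[:, j] = 0                                      # variable becomes unused in this FM
        return
    r = rng.random(len(active) - 1)                      # random factors r_1, ..., r_{M-1}
    a, b_min, c = uneven_partition(len(active), x_low, x_high, r)
    b = np.maximum(b_min, rng.uniform(1.0, 10.0, len(active)))   # b in [1, 10], never below (7)
    cols = [label - 1 + sum(M_max[:j]) for label in active]      # Eq. (18), order preserved
    P[:, cols] = np.vstack([a, b, c])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Example of Section IV-E: condition a_{1,1} changed from 1 to 3, so MF 1 of x1 is inactive
    A = np.array([[3, 1], [2, 2], [2, 1], [3, 2]])
    P = np.array([[0.25, 0.25, 0.25, 0.5, 0.5],
                  [2.124, 2.124, 2.124, 2.124, 2.124],
                  [0.0, 0.5, 1.0, 0.0, 1.0]])
    for j in range(2):
        repair_variable(A, P, j, M_max=[3, 2], x_low=0.0, x_high=1.0, rng=rng)
    print(P)   # columns 2-5 repartitioned; column 1 (inactive MF 1 of x1) left untouched
```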

V. DYNAMICALLY CONSTRAINED MULTIOBJECTIVE GENETIC FUZZY SYSTEM

After the initialization, further optimization is performed by the popular NSGA-II [31]. Other parts of the algorithm are left unchanged; however, the original genetic operators are replaced with operators applying dynamic constraints, thus ensuring transparency of fuzzy partitions.

A. Fitness Objectives

Two objectives are to be minimized, which are as follows.
1) MSE = (1/(2D)) Σk=1..D (yk − ŷk)^2, where yk and ŷk are the actual and predicted outputs for data point k, and D is the number of data points. This objective is actually MSE/2, but it is denoted here as MSE, which is quite common in the field of GFSs.
2) Number of active rule conditions (total rule length): Rcond.

The MSE objective is constrained such that each FM needs to have MSE ≤ 1.5 × MSEinitial, where MSEinitial is the MSE of the first FM of the initial population created in Section IV-A. This constraint is fairly easy to meet, as it will be seen in Section VI that the accuracy can be significantly improved by the multiobjective GFS optimization. However, it guarantees that the population does not contain some FMs only because they are very compact. Their accuracy must be reasonable as well.

B. Static Constraints for Output Membership Functions

As singleton output MFs are used, there is only one parameter to be optimized (lateral displacement). Therefore, they can be constrained by allowing them to move slightly left/right from their initial positions. The applied static constraints for the output MF parameters are

χlow − χ/(Mout − 1) ≤ o1 ≤ χlow + χ/(2(Mout − 1))

χlow + (2k − 3)χ/(2(Mout − 1)) ≤ ok ≤ χlow + (2k − 1)χ/(2(Mout − 1)),  where k = 2, . . . , Mout − 1

χhigh − χ/(2(Mout − 1)) ≤ oMout ≤ χhigh + χ/(Mout − 1).

This way of tuning resembles the lateral-tuning method [9], and it guarantees the transparency of the output fuzzy partition to a good level.

C. Dynamic Constraints to Ensure the Transparency of Input Fuzzy Partition

This section presents the dynamic constraints guaranteeing a transparent input fuzzy partition in case a parameter of an MF is modified. The genetic operators assuring a transparent input fuzzy partition in case the number of MFs is altered are introduced later in Section V-D. A prerequisite for these dynamic constraints is that initially (i.e., before modification) the input fuzzy partition is transparent. This is guaranteed by the two partition algorithms, which have already been introduced in Section III-B and Section III-C.

MF parameters are modified one at a time. After each modification, the resulting fuzzy partition must satisfy the transparency conditions defined in Section III-A. As the initial fuzzy partitions are created by the algorithms in Sections III-B and III-C, the ordering of MFs is initially known. The ordering is also known after each modification, because the dynamic constraints and the genetic operators in Section V-D do not allow it to change. Therefore, for any two MFs with parameters ai, bi, and ci, and aj, bj, and cj, where i, j ∈ [1, MA], with i ≠ j, of an input variable that currently has MA active MFs, it is guaranteed that if i > j, then ci > cj, and vice versa. This is beneficial for designing the dynamic constraints.

Besides the dynamic constraints, some static constraints also need to be satisfied: aj ∈ [alow, ahigh], where alow = 0.005χ and ahigh = χ. Furthermore, bj ≥ blow = 1, and it is preferred that bj ≤ bhigh = 10; however, due to the partition algorithms, it may be that initially bj > bhigh. In this case, bj is not allowed to increase anymore. Finally, cj ∈ [clow, chigh], where clow = χlow and chigh = χhigh. Next, the dynamic constraints are introduced. They can all be derived starting from (2) and (3).

1) Dynamic Constraints for Parameter a: If aj is increased (i.e., MF j becomes wider), the upper limit satisfying the γ-condition is

aγ,j = dcenter,j / (κ(γ))^(1/2bj)

where κ(γ) and dcenter,j are computed according to (4) and (8), respectively.

The upper limit satisfying the α-condition is

aα,j = dα,j / (κ(α))^(1/2bj)

where

dα,j = min(IL(α, pj+1) − cj, cj − IR(α, pj−1)), if 1 < j < MA;  cj − IR(α, pj−1), if j = MA;  IL(α, pj+1) − cj, if j = 1

is the minimum distance from cj to the point at which a neighboring MF receives membership value α. IL and IR are computed according to (2) and (3), respectively.

If aj is decreased (i.e., MF j becomes narrower), the lower limit satisfying the β-condition is

aβ,j = dβ,j / (κ(β))^(1/2bj), if dβ,j > 0;  alow, if dβ,j ≤ 0

where

dβ,j = max(IL(β, pj+1) − cj, cj − IR(β, pj−1)), if 1 < j < MA;  max(χhigh − cj, cj − IR(β, pj−1)), if j = MA;  max(cj − χlow, IL(β, pj+1) − cj), if j = 1

is computed depending on the location of MF j. If dβ,j ≤ 0, the UoD will be strongly covered regardless of the decrement in the value of aj. In this case, the lower limit satisfying the β-condition is simply the static constraint alow.

Combining the constraints yields

max(alow, aβ,j) ≤ aj ≤ min(aγ,j, aα,j, ahigh).
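As an illustration of the bounds derived above for parameter a, the following Python sketch (ours; the helper names are assumptions, and the small helpers from the earlier sketch are repeated so the block is self-contained) computes the combined lower and upper limits for one MF of a sorted gbell partition.

```python
import numpy as np

def kappa(mu):                       # Eq. (4)
    return (1.0 - mu) / mu

def I_L(mu, p):                      # Eq. (2)
    a, b, c = p
    return c - a * kappa(mu) ** (1.0 / (2 * b))

def I_R(mu, p):                      # Eq. (3)
    a, b, c = p
    return c + a * kappa(mu) ** (1.0 / (2 * b))

def a_bounds(partition, j, alpha, beta, gamma, x_low, x_high):
    """Combined dynamic bounds for the width a_j of MF j (0-based) in a sorted
    gbell partition, following Section V-C1. 'partition' is a list of (a, b, c)."""
    a_j, b_j, c_j = partition[j]
    chi = x_high - x_low
    a_low, a_high = 0.005 * chi, chi                      # static constraints
    last = len(partition) - 1
    # distance from c_j to the nearest neighboring center, Eq. (8)
    d_center = min(partition[j + 1][2] - c_j if j < last else np.inf,
                   c_j - partition[j - 1][2] if j > 0 else np.inf)
    # distance from c_j to the nearest point where a neighbor reaches alpha
    d_alpha = min(I_L(alpha, partition[j + 1]) - c_j if j < last else np.inf,
                  c_j - I_R(alpha, partition[j - 1]) if j > 0 else np.inf)
    # beta-condition distance: coverage must reach the neighbor's beta point or the UoD edge
    d_beta = max(I_L(beta, partition[j + 1]) - c_j if j < last else x_high - c_j,
                 c_j - I_R(beta, partition[j - 1]) if j > 0 else c_j - x_low)
    a_gamma = d_center / kappa(gamma) ** (1.0 / (2 * b_j))
    a_alpha = d_alpha / kappa(alpha) ** (1.0 / (2 * b_j))
    a_beta = d_beta / kappa(beta) ** (1.0 / (2 * b_j)) if d_beta > 0 else a_low
    return max(a_low, a_beta), min(a_gamma, a_alpha, a_high)

if __name__ == "__main__":
    # evenly distributed 3-MF partition on [0, 1]: a = 0.25, b = 2.124, c = 0, 0.5, 1
    part = [(0.25, 2.124, 0.0), (0.25, 2.124, 0.5), (0.25, 2.124, 1.0)]
    print(a_bounds(part, 1, alpha=0.8, beta=0.05, gamma=0.25, x_low=0.0, x_high=1.0))
```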

2) Dynamic Constraints for Parameter b: If bj is increased (i.e., MF j becomes crisper), the following upper limit guarantees the fulfillment of the α-condition:

bα,j = ln κ(α) / (2 ln(dα,j / aj)), if dα,j < aj;  bhigh, if dα,j ≥ aj.

If dα,j ≥ aj, MF j receives at most a membership value α at any intersection point, regardless of the increment in the value of bj. In this case, the upper limit is the static constraint bhigh.

The following upper limit guarantees the fulfillment of the β-condition:

bβ,j = ln κ(β) / (2 ln(dβ,j / aj)), if dβ,j > aj;  bhigh, if dβ,j ≤ aj.

If dβ,j ≤ aj, MF j receives at least a membership value β at the intersection point(s) with its neighboring MF(s), regardless of the increment in bj. In this case, the upper limit is the static constraint bhigh.

If bj is decreased (i.e., MF j becomes fuzzier), the following lower limit satisfies the γ-condition:

bγ,j = ln κ(γ) / (2 ln(dcenter,j / aj)).

Combining the constraints yields¹

max(blow, bγ,j) ≤ bj, if bj ≥ bhigh

max(blow, bγ,j) ≤ bj ≤ min(bhigh, bα,j, bβ,j), if bj < bhigh.

¹Recall that bj can be increased only if bj < bhigh.

3) Dynamic Constraints for Parameter c: If cj is increased (MF j is moving toward the right), the following upper limit guarantees the fulfillment of the α-condition (only the right α-condition needs to be taken into account):

c+α,j = chigh, if j = MA;  cj + (IL(α, pj+1) − IR(α, pj)), if j < MA.

Furthermore, the following upper limit guarantees the fulfillment of the β-condition (only the left β-condition needs to be taken into account):

c+β,j = cj + (IR(β, pj−1) − IL(β, pj)), if j > 1;  cj + (χlow − IL(β, pj)), if j = 1.

Finally,

c+γ,j = chigh, if j = MA;  cj + min(cj+1 − IR(γ, pj), IL(γ, pj+1) − cj), if j < MA

is the upper limit guaranteeing the fulfillment of the γ-condition (only the right γ-condition needs to be taken into account).

If the value of c is decreased (MF j is moving toward the left), the applied constraints are

c−α,j = cj − (IL(α, pj) − IR(α, pj−1)), if j > 1;  clow, if j = 1

c−β,j = cj − (IR(β, pj) − χhigh), if j = MA;  cj − (IR(β, pj) − IL(β, pj+1)), if j < MA

c−γ,j = cj − min(IL(γ, pj) − cj−1, cj − IR(γ, pj−1)), if j > 1;  clow, if j = 1.

Combining the constraints yields

max(c−α,j, c−β,j, c−γ,j) ≤ cj ≤ min(c+α,j, c+β,j, c+γ,j).

D. Genetic Operators

Five mutation and five crossover operators are used. Some of them are not always applicable; therefore, when mutation or crossover is applied, one of the currently applicable operators is randomly selected by uniform chance. Crossover is applied with probability Pc = 0.1 + (G/GTot), where G is the current generation and GTot is the total number of generations. If crossover was applied, mutation is applied with probability Pm = 0.1, and if crossover was not applied, mutation is always applied. This strategy is similar to the strategy applied in [3].

The upper and lower limits for each modified parameter are computed according to Sections V-B and V-C and denoted by Lupper and Llower. The number of currently active MFs in an input variable is denoted by MA and a random real number by r ∈ [0, 1].

1) Mutation Operators: Operator 1 modifies the parameters of input MFs. First, the number of input variables that have at least two active MFs is determined. This number is denoted here by nactive. Then, out of the nactive input variables, nselect of them are randomly selected, where nselect ∈ [1, nactive] is a random integer. From each of these nselect input variables, an active MF is randomly selected. Then, for each of them, a gbell parameter (a, b, or c) is randomly selected. They are denoted by pi,l, where i is 1, 2, or 3 depending upon which gbell parameter is modified, and l is the index of an active MF in P [see (17) and (18)]. Each pi,l is replaced by randomly selecting one of the following replacement formulas: pi,l ← pi,l + r(Lupper − pi,l) or pi,l ← pi,l − r(pi,l − Llower).

Operator 2: The mutation operator 1 modifies input MF parameters individually; however, sometimes a more drastic modification may be necessary. Therefore, this operator selects an input variable for which MA ≥ 2 and creates a new unevenly distributed partition with MA MFs using the algorithm defined in Section III-C.

Operator 3 modifies the rule base by randomly selecting nrulecond rule conditions ai,j [see (16)], where nrulecond ∈ [1, 10] is a random integer. The selected rule conditions are replaced with random rule conditions; however, as it is easier to obtain compact than accurate FMs [8], this operator favors nonzero replacement conditions during the first half of the total number of generations GTot. Therefore, if G < GTot/2, then the probability that a replacement condition is selected from [0, Mmax,j] is Pz = 2G/GTot, and the probability that it is selected from [1, Mmax,j] is 1 − Pz. When G ≥ GTot/2, replacement conditions are always selected from [0, Mmax,j].

The resulting input fuzzy partition may not be transparent if some MFs have become active or inactive, thus resulting in highly overlapping MFs or gaps in the fuzzy partition. Thus, the set of those input variables that use different MFs in the rules than before this operator is determined. Then, MA for each of these input variables is determined. For those input variables for which MA ≥ 2, a new unevenly distributed partition with MA MFs is created. If MA < 2, all nonzero conditions, if any, of that input variable are forced to zero. This operation is called the repair operator, and it guarantees transparency of the input fuzzy partition.
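A sketch of the bounded replacement used by mutation operator 1 (our own illustration with hypothetical names): a selected gbell parameter is moved toward its dynamically computed upper or lower limit, so the mutated value always stays within [Llower, Lupper].

```python
import numpy as np

def mutate_parameter(value, L_lower, L_upper, rng):
    """Mutation operator 1 replacement formulas:
    p <- p + r*(L_upper - p)  or  p <- p - r*(p - L_lower), chosen with equal probability."""
    r = rng.random()
    if rng.random() < 0.5:
        return value + r * (L_upper - value)
    return value - r * (value - L_lower)

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Assume the dynamic constraints of Section V-C gave these limits for a width a_j
    a_j, L_lower, L_upper = 0.25, 0.05, 0.39
    samples = [mutate_parameter(a_j, L_lower, L_upper, rng) for _ in range(5)]
    print(samples)   # every mutated value remains within [L_lower, L_upper]
```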

Operator 4 modifies a consequent si, where i = 1, . . . , R, of a randomly selected active rule by replacing it with a random consequent chosen from [1, Mout]. A rule is active if it has at least one nonzero rule condition.

Operator 5 modifies the lateral displacement of a randomly selected active output MF center (an output MF is active if it is used in at least one of the active rules). The selected output MF center oi, where i = 1, . . . , Mout, is replaced by randomly selecting one of the following formulas: oi ← oi + r(Lupper − oi) or oi ← oi − r(oi − Llower).

2) Crossover Operators: All five crossover operators randomly select two FMs as parents and produce two FMs as children. They replace their parents in the offspring population. The crossover operators 1, 4, and 5 resemble the mutation operators 1, 4, and 5.

Operator 1 modifies the parameters of active input MFs using BLX-0.5 crossover [23], [32]. It can be applied to input variables which have the same number (at least 2) of active MFs in both parents. The number of input variables meeting these requirements is denoted by nactive. Out of them, nselect are randomly selected, where nselect ∈ [1, nactive] is a random integer. For each of these nselect input variables, an active MF j ∈ [1, MA] is randomly selected (the same j from both parents). Then, from each of these selected active MFs, a gbell parameter (a, b, or c) is randomly selected (the same parameter from both parents).

Let p1_{i,l1} and p2_{i,l2} denote the selected parameters from parents 1 and 2, respectively. The index i is 1, 2, or 3 depending on which gbell parameter is selected [see (17)]. The indexes l1 and l2 are determined according to (18). The parameters are replaced by randomly selecting either pk_{i,lk} ← pk_{i,lk} + r(min(I, Lupper − pk_{i,lk})) or pk_{i,lk} ← pk_{i,lk} − r(min(I, pk_{i,lk} − Llower)), where k = 1 and 2, and I = 0.5 |p1_{i,l1} − p2_{i,l2}|.

Operator 2: First, an input variable for which at least one of the parents has at least two active MFs is randomly selected. After this, all rule conditions and input MF parameters of this input variable are pairwisely swapped. Therefore, child 1 receives all the parameters of parent 1, except the rule conditions and input MF parameters of the selected input variable, which are received from parent 2. Likewise, child 2 gets all the parameters of parent 2, except the rule conditions and input MF parameters of the selected input variable, which are received from parent 1.

Operator 3 swaps some rules of the parents. It is applicable to those rules that are active in at least one of the parents. Out of these rules, Nselect of them are selected and their rule conditions are pairwisely swapped (Nselect is a random integer chosen from [1, 5]). After this operator, the input fuzzy partitions may not be transparent. Therefore, for both children separately, the same repair operator as with the mutation operator 3 is applied.

Operator 4 modifies the rule consequents si, where i = 1, . . . , R. This operator is possible for those rules that are active in at least one of the parents. The operator selects one of these rules randomly and swaps the consequents of this rule.

Operator 5 modifies the lateral displacement of output MF centers. This operator is possible for those output MF centers that are active in both of the parents. Out of them, one is randomly selected from both parents (the same from both parents). They are denoted here by o1_i and o2_i, where i = 1, . . . , Mout. They are replaced by randomly selecting one of the following formulas: ok_i ← ok_i + r(min(I, Lupper − ok_i)) or ok_i ← ok_i − r(min(I, ok_i − Llower)), where k = 1 and 2, and I = 0.5 |o1_i − o2_i|.

VI. EXPERIMENTS

Our multiobjective GFS is validated using nine datasets, which represent different numbers of input variables and data points (see Table II). For all datasets, five-fold cross-validation was repeated six times (6 × 5CV) with different random seeds. The data partitions for the Ele1, Ele2, Abalone, Mortgage, Treasury, and Computer problems were downloaded from the KEEL website.² The MG and Lorenz datasets were generated according to [3] and [20]. Finally, the Gas dataset was obtained from the website of Greg Reinsel.³ For the Mackey–Glass (MG), Lorenz, and Gas problems, the same data partitions as in the comparative study [20] were used. C4.5 was run with its default parameters defined in [29]. The population size was fixed to 100, and the number of generations was altered such that the same number of fitness evaluations was used as in the comparative studies. The settings α = 0.8, β = 0.05, and γ = 0.25 are used in the experiments performed in Sections VI-B–F. Furthermore, in Section VI-G, experiments with α = 0.6, β = 0.4, and γ = 0.1 will be performed in order to study the tradeoff between transparency of fuzzy partitions and accuracy.

²http://sci2s.ugr.es/keel/datasets.php
³http://www.stat.wisc.edu/~reinsel/bjr-data/index.html

TABLE II
PROPERTIES OF THE DATASETS AND THE APPLIED PARAMETERS

For six of the datasets (Ele1, Ele2, MG, Lorenz, Abalone, and Gas), there exist results of one or more recent GFSs presented in Table III. For these problems, the numbers of input and output MFs (Min and Mout) were selected the same as in the comparative studies. For the Treasury, Mortgage, and Computer problems, our method is compared against a baseline method. For these higher dimensional problems, Min and Mout were both set to 3 in order to reduce the search space.

Since MOEAs are applied, it is interesting to visualize the Pareto fronts. However, it is not meaningful to visualize the Pareto fronts of all 30 CV runs for each dataset. The averaged results of the ith most accurate FMs from each of the 30 Pareto fronts were shown in [8] for five of the most accurate FMs (i.e., i = 1, . . . , 5).
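The 6 × 5CV protocol described above can be sketched as follows (our own illustration; the paper uses the pre-defined KEEL partitions for several datasets, so this only shows the generic scheme, and the dataset size in the example is illustrative).

```python
import numpy as np

def six_times_five_fold(n_samples, n_repeats=6, n_folds=5):
    """Yield (train_idx, test_idx) for 6 x 5CV: 5-fold CV repeated with 6 random seeds."""
    for seed in range(n_repeats):
        rng = np.random.default_rng(seed)
        idx = rng.permutation(n_samples)
        folds = np.array_split(idx, n_folds)
        for k in range(n_folds):
            test_idx = folds[k]
            train_idx = np.concatenate([folds[m] for m in range(n_folds) if m != k])
            yield train_idx, test_idx

if __name__ == "__main__":
    # 30 runs in total; in the paper each run trains the multiobjective GFS
    # and records MSE_trn, MSE_tst, and the Pareto front of that run.
    runs = list(six_times_five_fold(n_samples=500))   # illustrative dataset size
    print(len(runs))                                   # 30
```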

TABLE III
PROPERTIES OF THE COMPARATIVE GFSS

These averages were computed such that none of the 30 Pareto fronts was excluded from computing the averages. Thus, in each of the Pareto fronts, there were at least five distinct FMs. In this paper, the maximum value of i (imax) is not the same for all datasets, but depends on the minimum number of distinct FMs on the 30 Pareto fronts. More formally, imax = min(L1, L2, . . . , L30), where Lj, with j = 1, . . . , 30, is the number of distinct FMs on the jth Pareto front of a given dataset. Thus, the length of the averaged Pareto front equals the length of the shortest Pareto front of the 30 runs.

Besides the Pareto fronts, the number of rules R, rule conditions Rcond, input MFs, and the number of input variables F for some of the ith most accurate FMs are tabulated. Moreover, the unequal variance t-test4 (denoted by t) with 95% confidence is reported for the MSEtrn and MSEtst. The same notations as in [6], [8], and [23] are used; * stands for the best averaged result in the column, + means that the performance of the corresponding row is worse than the best result, and = means that there is no significant statistical difference compared to the best result.

4 Also called Welch's t-test [33], [34]. If our multiobjective GFS could be compared to other GFSs in all problems, nonparametric tests would be preferred.
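For illustration, the averaging over the 30 Pareto fronts and the unequal variance t-test just described could be written, for example, as follows. This is a sketch under our own naming; the structure of the fronts (one array per CV run, rows sorted from the most accurate FM onwards) is an assumption for the example, not a specification from this paper.

    import numpy as np
    from scipy import stats

    def averaged_pareto_front(fronts):
        # fronts: a list of 30 arrays, one per CV run, each sorted from the most
        # accurate FM on, with one row per distinct FM (e.g., [MSEtrn, MSEtst, Rcond]).
        i_max = min(len(front) for front in fronts)   # imax = min(L1, ..., L30)
        return np.mean([front[:i_max] for front in fronts], axis=0)

    def welch_test(mse_a, mse_b, alpha=0.05):
        # Unequal variance (Welch's) t-test at the 95% confidence level; a
        # p-value below alpha marks a significant difference from the best
        # result ('+' in the tables), otherwise '=' would be reported.
        t_stat, p_value = stats.ttest_ind(mse_a, mse_b, equal_var=False)
        return t_stat, p_value, p_value < alpha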
A. Comparative Genetic Fuzzy Systems

The comparative approaches global lateral tuning (GL), global lateral tuning with rule selection (GL+S), global lateral amplitude tuning (GLA), and global lateral amplitude tuning with rule selection (GLA+S) minimize only one objective, namely, the MSE, whereas the rest minimize two or more objectives simultaneously and obtain a set of Pareto optimal FMs. All GFSs use globally defined MFs. The approaches [6], [8], [23] create the initial populations using the WM algorithm, whereas in [20], the C4.5 algorithm is used. In this paper, the initial population is created by a method combining the benefits of the C4.5 and WM algorithms.

Performance of GFS designs depends on their individual components, such as the initialization method and the MF tuning strategy. For example, by applying different initialization methods, the performance of a GFS can be significantly improved or deteriorated. This is because appropriate initialization eases the derivation of better FMs due to the reduction in the search space [15]. Because this paper and the comparative studies apply different initialization methods, the purpose of the results comparisons is not to assess the superiority of any individual components, but to assess the superiority of the different approaches as a whole. Assessing the superiority of individual components is, of course, important, but requires another study in the future. It should be noted, however, that the results comparisons can be considered fair, because the same amount of fitness evaluations, the same data partitions, and the same amount of input and output MFs are used as in the comparative studies. Also, our approach does not require any more a priori knowledge about the datasets than the comparative methods.

To evaluate the transparency of fuzzy partitions, we follow [6], which states that the two-tuple representation leads to a more transparent fuzzy partition than the three-tuple representation. Moreover, the three-tuple representation is more transparent than the classic three-parameter representation with static constraints. In [8] and [23], static constraints were defined such that the MF parameters can vary within small intervals, whereas in [20], larger intervals were used. Therefore, we consider the transparency of fuzzy partitions in [20] the poorest among the comparative GFSs. Both the two-tuple representation and the proposed dynamic constraint approach maintain the transparency of fuzzy partitions at a good level. Since the approaches are quite different and because transparency of fuzzy partitions is a subjective matter, it is difficult to judge which one of them yields more transparent fuzzy partitions. Therefore, their interpretability is considered equal.
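To make the difference between these representations concrete, the small sketch below contrasts a two-tuple style lateral shift with classic three-parameter tuning of a triangular MF. The convention that the lateral displacement is at most half the distance between adjacent labels is our reading of the two-tuple tuning idea in [6] and [9]; the code is not taken from any of the cited papers.

    def lateral_displacement(center, alpha, label_distance):
        # Two-tuple style tuning: the MF keeps its shape and is only shifted
        # laterally by alpha (assumed here to lie in [-0.5, 0.5]) times the
        # distance between adjacent labels, so the partition stays ordered.
        return center + alpha * label_distance

    def three_parameter_tuning(left, center, right, d_left, d_center, d_right):
        # Classic three-parameter tuning: each breakpoint of a triangular MF
        # moves independently, which gives more freedom but can destroy the
        # ordering and overlap of the partition unless (static or dynamic)
        # constraints are enforced.
        return left + d_left, center + d_center, right + d_right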
B. Estimating the Length of Low-Voltage Lines (Ele1)

For this problem, 50 000 fitness evaluations were used in this paper and in [6]. Table IV shows that GLA+S has the lowest MSEtrn, and our most accurate FM (Final-1) has practically the same value. There is no statistical difference between the lowest MSEtrn and three of our most accurate FMs (Final-1, Final-2, and Final-3). The lowest MSEtst is obtained by GLA, but again, there is no statistical difference between the lowest MSEtst and three of our most accurate FMs.
TABLE IV
RESULTS COMPARISON FOR ELE1 PROBLEM
TABLE V
RESULTS COMPARISON FOR ELE2 PROBLEM
There is no clear difference between the different approaches for this problem, because the search space is small due to the small number of input variables. It is also noticed that although Min was set to 5, the initial FM uses on average nine input MFs. Therefore, one of the input variables is usually partitioned with four and the other one with five input MFs.

C. Estimating the Maintenance Costs of Medium-Voltage Lines (Ele2)

This problem is more interesting, as it contains four input variables. First, our multiobjective GFS was run for 50 000 fitness evaluations and compared to [6] and [23], which use the same amount of fitness evaluations. Table V shows that our multiobjective GFS has the lowest MSE in the training and test sets. There is also a statistical difference between our approach and all other approaches when MSEtst is considered. When MSEtrn is considered, there is a statistical difference between our approach and all other approaches, except TS-SPEA2Acc. Our FMs can also be considered the most interpretable, because they are clearly the most compact, and the transparency of fuzzy partitions is at least the same as in the comparative FMs (see also Table III).

Our approach was also run for 100 000 fitness evaluations (the same amount as in [8]). Table V shows that our FMs are the most accurate according to the t-test. They are also clearly more compact than the FMs in [8]. Finally, because in [8] three-parameter MF tuning with static constraints was used, the fuzzy partitions of our FMs can be considered more transparent.

D. Predicting the Age of Abalone

This problem has eight input variables and a very high noise level. According to [8], the learning methods usually yield similar accuracy. Thus, it may not be possible to improve the accuracy, but only to improve the interpretability, compared to existing methods in the literature. In this paper and also in the comparative study [8], the number of fitness evaluations was set to 100 000. According to Table VI, there is no clear difference in accuracy between the different GFSs. The lowest MSEtrn was obtained by TS-SPEA2Acc and the lowest MSEtst by our approach (Final-1). On the other hand, our approach presents a significant improvement in interpretability. Our FMs are clearly more compact than the comparative FMs. They have far fewer rule conditions and use far fewer input variables. Furthermore, according to Table III, our fuzzy partitions can be considered more transparent than the fuzzy partitions in [8].
TABLE VI
RESULTS COMPARISON FOR ABALONE PROBLEM
TABLE VII
RESULTS COMPARISON FOR MG, LORENZ, AND GAS PROBLEMS
E. Mackey–Glass, Lorenz, and Gas Problems

Our multiobjective GFS is compared to our former multiobjective GFS [20], which was run for 210 000 fitness evaluations on these problems. The same amount of fitness evaluations is used here. Table VII shows that our most accurate FMs are significantly more accurate than the most accurate FMs of our former study. On the other hand, they also contain many more rules and rule conditions than the FMs in [20].

The least accurate FMs of the averaged Pareto fronts for each problem are also presented and denoted by Final-8, Final-4, and Final-9. One can notice that they are still more accurate than the most accurate FMs in [20]. On the other hand, they are also more complex with regard to the number of rules and rule conditions. The number of input variables and the number of MFs is approximately the same. Table III shows that the FMs in [20] have the worst transparency of fuzzy partitions and our FMs have the best.

F. Higher Dimensional Problems: Treasury, Mortgage, and Computer Activity

Our approach was run for 100 000 fitness evaluations on these problems. To the best of our knowledge, there are no results of other GFSs available for these problems.5 Nonetheless, it is important to include a baseline method in order to have an idea about the performance of our approach. Thus, Genfis3, a fuzzy-c-means (FCM) clustering-based method, was used to identify Mamdani FMs. This method is part of MATLAB's Fuzzy Logic Toolbox 2. All settings, besides the type of FM, were kept at their default values, and 6 × 5CV with the same data partitions as with our multiobjective GFS was performed.

5 At the time of writing the final version of this paper, this statement no longer holds true. There are recently published results available for some [26] and all [22] of these problems. However, the experimental setup in those papers differs significantly from the experimental setup of this paper. Thus, our results are not compared to them.
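Since Genfis3 itself is a MATLAB routine, the sketch below only illustrates the fuzzy-c-means clustering that such a baseline builds on. It is a plain textbook FCM written by us, not the Fuzzy Logic Toolbox implementation, and it does not reproduce the Genfis3 defaults used in the experiments.

    import numpy as np

    def fuzzy_c_means(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
        # Textbook FCM: alternate between center and membership updates.
        rng = np.random.default_rng(seed)
        U = rng.random((X.shape[0], c))
        U /= U.sum(axis=1, keepdims=True)              # memberships sum to one
        for _ in range(max_iter):
            Um = U ** m
            centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
            dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
            U_new = dist ** (-2.0 / (m - 1))
            U_new /= U_new.sum(axis=1, keepdims=True)  # u_ik = d_ik^(-2/(m-1)) / sum_j d_jk^(-2/(m-1))
            if np.abs(U_new - U).max() < tol:
                return centers, U_new
            U = U_new
        return centers, U

The surrounding 6 × 5CV loop (30 runs with the same data partitions) could, for instance, be driven by scikit-learn's RepeatedKFold(n_splits=5, n_repeats=6); this is only an example of how the protocol could be reproduced, not the tooling used in the paper.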
Table VIII shows that our FMs are significantly more accurate than the comparative FMs. Moreover, they have fewer input variables and input MFs than the comparative FMs. The comparative FMs usually have fewer rules, but more rule conditions, than our FMs. By visual inspection, it was noticed that the fuzzy partitions by Genfis3 often contain many highly overlapping MFs and that the UoD may not be strongly covered.

G. Fuzzy Partition Transparency Versus Accuracy Tradeoff

The experiments in Sections VI-B–VI-F were performed with α = 0.8, β = 0.05, and γ = 0.25. If one requires higher transparency of fuzzy partitions, the settings α = 0.6, β = 0.4, and γ = 0.1 could be used. The 6 × 5CV procedures for all nine problems were repeated with these settings. The averaged results of the most accurate FMs are shown in Table IX along with the best and the worst results from Tables IV–VIII. In Fig. 4, the averaged Pareto fronts for five of the studied problems are shown. It is seen from Table IX and Fig. 4 that by improving the transparency of fuzzy partitions, accuracy is deteriorated, but remains at a reasonable level.

Transparency of fuzzy partitions is evaluated against a fuzzy partition that has three desirable properties: 1) the membership values at the intersections of neighboring MFs are always 0.5; 2) in the center of an MF, all other MFs receive membership value 0; and 3) at the extreme points χlow and χhigh of the UoD, one MF receives membership value 1. Three quality indexes are therefore computed for each fuzzy partition: 1) QInt: the maximum absolute difference from the desired intersection membership value 0.5; 2) QMid: the maximum membership value of an MF in the center of another MF; and 3) QExt: the maximum absolute difference from the desired membership value 1 at the extreme points of the UoD. For a strong fuzzy partition, QInt = QMid = QExt = 0. One must, however, note that even a strong fuzzy partition can be poorly transparent, for example, when some of the MFs are very close to each other. These quality indexes do not take such transparency aspects into account.
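As a rough illustration, the three quality indexes can be evaluated numerically on a grid, as in the sketch below. The grid search for the crossover points and the assumption that the MFs of a partition are given as callables ordered from left to right (with at least two MFs) are our own simplifications.

    import numpy as np

    def partition_quality(mfs, x_low, x_high, n_grid=1001):
        # mfs: membership functions of one input partition, ordered left to right.
        xs = np.linspace(x_low, x_high, n_grid)
        vals = np.array([[mu(x) for x in xs] for mu in mfs])     # (n_mf, n_grid)
        peaks = [int(np.argmax(row)) for row in vals]            # grid index of each MF center

        # Q_Int: deviation from 0.5 at the crossover of each pair of neighboring MFs.
        q_int = 0.0
        for a in range(len(mfs) - 1):
            lo, hi = sorted((peaks[a], peaks[a + 1]))
            k = lo + int(np.argmin(np.abs(vals[a, lo:hi + 1] - vals[a + 1, lo:hi + 1])))
            q_int = max(q_int, abs(vals[a, k] - 0.5))

        # Q_Mid: largest membership of any other MF at the center of an MF.
        q_mid = max(vals[j, peaks[i]]
                    for i in range(len(mfs)) for j in range(len(mfs)) if i != j)

        # Q_Ext: deviation from 1 of the highest membership at the UoD extremes.
        q_ext = max(1.0 - vals[:, 0].max(), 1.0 - vals[:, -1].max())
        return q_int, q_mid, q_ext

For a strong fuzzy partition, all three returned values are zero, which matches the condition QInt = QMid = QExt = 0 stated above.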
TABLE VIII
RESULTS COMPARISON FOR HIGHER DIMENSIONAL PROBLEMS
TABLE IX
AVERAGED RESULTS OF THE MOST ACCURATE FMS USING α = 0.6, β = 0.4, AND γ = 0.1
Fig. 4. Averaged Pareto fronts over 30 CV runs for Ele2 with 50 000 and 100 000 fitness evaluations, Abalone, Gas, Mortgage, and Computer problems. TP
stands for transparent fuzzy partitions obtained by α = 0.8, β = 0.05, and γ = 0.25, whereas HT stands for highly transparent fuzzy partitions obtained by
α = 0.6, β = 0.4, and γ = 0.1.
TABLE X
COMPARISON OF THE AVERAGED FUZZY-PARTITION QUALITY INDEXES OF THE MOST ACCURATE FMS AND THE AVERAGE LENGTH
OF THE PARETO FRONTS WITH DIFFERENT SETTINGS OF α, β, AND γ
Fig. 5. Ele2 (50 000 fitness evaluations). Examples of the most accurate FMs of one run using the same data partition. (Left) α = 0.8, β = 0.05, γ = 0.25, MSEtrn = 13277, MSEtst = 12884, QInt = 0.27, QMid = 0.23, and QExt = 0.00. (Right) α = 0.6, β = 0.4, γ = 0.1, MSEtrn = 18272, MSEtst = 19439, QInt = 0.09, QMid = 0.10, and QExt = 0.00.
Table X compares the averaged quality-index values of the most accurate FMs for different settings of α, β, and γ. Moreover, the average number ND and standard deviation σND of distinct FMs on a Pareto front are shown. It is clearly seen that with the settings α = 0.6, β = 0.4, and γ = 0.1, more transparent fuzzy partitions are obtained (i.e., the quality-index values are lower). The average length of the Pareto fronts is, however, not clearly affected by the settings, but depends on the characteristics of each problem. As the number of rule conditions is one of the two fitness objectives, the Pareto fronts tend to be longer if the number of rule conditions in the initial FM is high (see Tables IV–VIII).
Fig. 6. Mortgage: Examples of the most accurate FMs of one run using the same data partition. (Left) α = 0.8, β = 0.05, γ = 0.25, MSEtrn = 0.028, MSEtst = 0.072, QInt = 0.35, QMid = 0.11, and QExt = 0.00. (Right) α = 0.6, β = 0.4, γ = 0.1, MSEtrn = 0.036, MSEtst = 0.090, QInt = 0.09, QMid = 0.10, and QExt = 0.00.
Figs. 5 and 6 show examples of the most accurate FMs for the Ele2 and Mortgage problems with different settings of α, β, and γ. It is seen that the fuzzy partitions are more transparent when α = 0.6, β = 0.4, and γ = 0.1. One may notice that our approach performs input-variable selection, rule learning, granularity learning, and MF-parameter tuning. For example, it can be seen that one of the input variables for the Mortgage problem is partitioned with three MFs, whereas the others are partitioned with two MFs. Moreover, these example FMs for the Mortgage problem use only three or four input variables, even though the problem has 15 input variables.

VII. CONCLUSION

A dynamically constrained multiobjective GFS to learn the granularities of fuzzy partitions, to tune the MFs, and to learn the fuzzy rules was proposed. It uses dynamic constraints, which enable the application of three-parameter MF tuning to improve the accuracy without deteriorating the transparency of fuzzy partitions. A new initialization method was also proposed. It combines the benefits of the WM and DT algorithms, and reduces the number of rules, rule conditions, and input variables, while preserving the transparency of fuzzy partitions. Being a heuristic and suboptimal method, its main purpose is not to obtain very accurate and compact initial FMs; rather, its main purpose is to reduce the search space and, therefore, to ease the further optimization.

Nine benchmark problems having 2 to 21 input variables were studied, and our multiobjective GFS was tested against 11 recently proposed multiobjective and monoobjective GFSs on six of these nine problems. It was seen that our approach always results in at least comparable accuracy and interpretability with the comparative approaches. Moreover, on some benchmark problems, it clearly outperformed some of the comparative approaches. On the remaining three datasets, which have up to 21 input variables, it was tested against an FCM clustering method. It was seen that our FMs are more accurate and interpretable than the FMs obtained by FCM.

Our approach is suitable for both lower and higher dimensional problems. Suitability to higher dimensional problems is aided by the initialization method, which usually reduces the number of input variables. Naturally, if none of the input variables can be removed in the initialization phase, the search space will be larger. This poses a challenge to any GFS and requires further research. With our approach, fuzzy partitions with different levels of transparency can be obtained by different settings of α, β, and γ. It was shown that there exists a clear tradeoff between transparency of fuzzy partitions and accuracy. Finally, in this paper, regression problems were considered. However, our approach can be made suitable for classification problems as well [35].

REFERENCES

[1] H. Ishibuchi and Y. Nojima, "Analysis of interpretability-accuracy tradeoff of fuzzy systems by multiobjective fuzzy genetics-based machine learning," Int. J. Approx. Reason., vol. 44, no. 1, pp. 4–31, Jan. 2007.
[2] C. Setzkorn and R. C. Paton, "On the use of multi-objective evolutionary algorithms for the induction of fuzzy classification rule systems," BioSystems, vol. 81, no. 2, pp. 101–112, 2005.
[3] M. Cococcioni, P. Ducange, B. Lazzerini, and F. Marcelloni, "A Pareto-based multi-objective evolutionary approach to the identification of Mamdani fuzzy systems," Soft Comput., vol. 11, no. 11, pp. 1013–1031, Sep. 2007.
[4] H. Wang, S. Kwong, Y. Jin, W. Wei, and K. Man, "Multi-objective hierarchical genetic algorithm for interpretable fuzzy rule-based knowledge extraction," Fuzzy Sets Syst., vol. 149, no. 1, pp. 149–186, Jan. 2005.
[5] M.-S. Kim, C.-H. Kim, and J.-J. Lee, "Evolving compact and interpretable Takagi–Sugeno fuzzy models with a new encoding scheme," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 36, no. 5, pp. 1006–1023, Oct. 2006.
[6] R. Alcalá, J. Alcalá-Fdez, M. J. Gacto, and F. Herrera, "Rule base reduction and genetic tuning of fuzzy systems based on the linguistic 3-tuples representation," Soft Comput., vol. 11, no. 5, pp. 401–419, Mar. 2007.
[7] P. Pulkkinen and H. Koivisto, "Fuzzy classifier identification using decision tree and multiobjective evolutionary algorithms," Int. J. Approx. Reason., vol. 48, no. 2, pp. 526–543, Jun. 2008.
[8] M. J. Gacto, R. Alcalá, and F. Herrera, "Adaptation and application of multi-objective evolutionary algorithms for rule reduction and parameter tuning of fuzzy rule-based systems," Soft Comput., vol. 13, no. 5, pp. 419–436, Mar. 2009.
[9] R. Alcalá, J. Alcalá-Fdez, and F. Herrera, "A proposal for the genetic lateral tuning of linguistic fuzzy systems and its interaction with rule selection," IEEE Trans. Fuzzy Syst., vol. 15, no. 4, pp. 616–635, Aug. 2007.
[10] O. Cordón, F. Gomide, F. Herrera, F. Hoffmann, and L. Magdalena, "Ten years of genetic fuzzy systems: Current framework and new trends," Fuzzy Sets Syst., vol. 141, no. 1, pp. 5–31, Jan. 2004.
[11] F. Herrera, "Genetic fuzzy systems: Taxonomy, current research trends and prospects," Evol. Intell., vol. 1, no. 1, pp. 27–46, Mar. 2008.
[12] H. Ishibuchi and T. Yamamoto, "Fuzzy rule selection by multi-objective genetic local search algorithms and rule evaluation measures in data mining," Fuzzy Sets Syst., vol. 141, no. 1, pp. 59–88, Jan. 2004.
[13] H. Huang, M. Pasquier, and C. Quek, "Optimally evolving irregular-shaped membership function for fuzzy systems," in Proc. IEEE Congr. Evol. Comput., Vancouver, BC, Canada, Jul. 2006, pp. 11078–11085.
[14] M. Setnes, R. Babuška, U. Kaymak, and H. R. van N. Lemke, "Similarity measures in fuzzy rule base simplification," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 28, no. 3, pp. 376–386, Jun. 1998.
[15] J. Abonyi, J. A. Roubos, and F. Szeifert, "Data-driven generation of compact, accurate, and linguistically-sound fuzzy classifiers based on a decision-tree initialization," Int. J. Approx. Reason., vol. 32, no. 1, pp. 1–21, 2003.
[16] P. Pulkkinen, J. Hytönen, and H. Koivisto, "Developing a bioaerosol detector using hybrid genetic fuzzy systems," Eng. Appl. Artif. Intell., vol. 21, no. 8, pp. 1330–1346, Dec. 2008.
[17] H. Ishibuchi, N. Tsukamoto, and Y. Nojima, "Evolutionary many-objective optimization: A short review," in Proc. IEEE Congr. Evol. Comput., Hong Kong, Jun. 2008, pp. 2424–2431.
[18] E. H. Mamdani and S. Assilian, "An experiment in linguistic synthesis with a fuzzy logic controller," Int. J. Man–Mach. Stud., vol. 7, pp. 1–13, 1975.
[19] L.-X. Wang and J. M. Mendel, "Generating fuzzy rules by learning from examples," IEEE Trans. Syst., Man, Cybern., vol. SMC-22, no. 6, pp. 1414–1427, Nov./Dec. 1992.
[20] P. Pulkkinen and H. Koivisto, "A genetic fuzzy system with inconsistent rule removal and decision tree initialization," in Applications of Soft Computing, AISC 58, J. Mehnen, A. Tiwari, M. Köppen, and A. Saad, Eds. Berlin, Germany: Springer-Verlag, 2009, pp. 275–284.
[21] P. Ducange, R. Alcalá, F. Herrera, B. Lazzerini, and F. Marcelloni, "Knowledge base learning of linguistic fuzzy rule-based systems in a multi-objective evolutionary framework," in HAIS 2008 (Lecture Notes in Artificial Intelligence 5271), E. Corchado and A. Abraham, Eds. Berlin, Germany: Springer-Verlag, 2008, pp. 747–754.
[22] R. Alcalá, P. Ducange, F. Herrera, B. Lazzerini, and F. Marcelloni, "A multiobjective evolutionary approach to concurrently learn rule and data bases of linguistic fuzzy-rule-based systems," IEEE Trans. Fuzzy Syst., vol. 17, no. 5, pp. 1106–1122, Oct. 2009.
[23] R. Alcalá, M. J. Gacto, F. Herrera, and J. Alcalá-Fdez, "A multi-objective genetic algorithm for tuning and rule selection to obtain accurate and compact linguistic fuzzy rule-based systems," Int. J. Uncertainty, Fuzziness Knowl.-Based Syst., vol. 15, no. 5, pp. 539–557, 2007.
[24] M. Antonelli, P. Ducange, B. Lazzerini, and F. Marcelloni, "Learning concurrently partition granularities and rule bases of Mamdani fuzzy systems in a multi-objective evolutionary framework," Int. J. Approx. Reason., vol. 50, no. 7, pp. 1066–1080, Jul. 2009.
[25] M. Antonelli, P. Ducange, B. Lazzerini, and F. Marcelloni, "Learning concurrently granularity, membership function parameters and rules of Mamdani fuzzy rule-based systems," in Proc. IFSA-EUSFLAT, Lisbon, Portugal, Jul. 2009, pp. 1033–1038.
[26] J. Casillas, "Embedded genetic learning of highly interpretable fuzzy partitions," in Proc. IFSA-EUSFLAT, Lisbon, Portugal, Jul. 2009, pp. 1631–1636.
[27] A. Botta, B. Lazzerini, F. Marcelloni, and D. C. Stefanescu, "Context adaptation of fuzzy systems through a multi-objective evolutionary approach based on a novel interpretability index," Soft Comput., vol. 13, no. 5, pp. 437–449, Mar. 2009.
[28] J. V. de Oliveira, "Semantic constraints for membership function optimization," IEEE Trans. Syst., Man, Cybern. A, Syst. Humans, vol. 29, no. 1, pp. 128–138, Jan. 1999.
[29] J. R. Quinlan, C4.5: Programs for Machine Learning. San Mateo, CA: Morgan Kaufmann, 1993.
[30] B. Zhang and J. Edmunds, "On fuzzy logic controllers," in Proc. Int. Conf. Control, Edinburgh, U.K., Mar. 1991, pp. 961–965.
[31] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, "A fast and elitist multiobjective genetic algorithm: NSGA-II," IEEE Trans. Evol. Comput., vol. 6, no. 2, pp. 182–197, Apr. 2002.
[32] F. Herrera, M. Lozano, and A. M. Sánchez, "A taxonomy for the crossover operator for real-coded genetic algorithms: An experimental study," Int. J. Intell. Syst., vol. 18, pp. 309–338, 2003.
[33] B. L. Welch, "The generalization of 'Student's' problem when several different population variances are involved," Biometrika, vol. 34, no. 1/2, pp. 28–35, Jan. 1947.
[34] G. D. Ruxton, "The unequal variance t-test is an underused alternative to Student's t-test and the Mann–Whitney U test," Behav. Ecol., vol. 17, no. 4, pp. 688–690, 2006.
[35] P. Pulkkinen, "A multiobjective genetic fuzzy system for obtaining compact and accurate fuzzy classifiers with transparent fuzzy partitions," in Proc. 8th Int. Conf. Mach. Learning Appl., Miami Beach, FL, Dec. 2009, pp. 89–94.

Pietari Pulkkinen received the M.Sc. degree from Tampere University of Technology, Tampere, Finland, in 2006.
He is currently with the Tampere University of Technology. His research interests include soft-computing methods, especially multiobjective genetic fuzzy systems, and applying them to real-world problems. He has authored or coauthored five international journal articles and four international conference papers. He is a Reviewer for several international journals, such as the International Journal of Approximate Reasoning.
Mr. Pulkkinen is a Reviewer of the IEEE TRANSACTIONS ON FUZZY SYSTEMS.

Hannu Koivisto received the M.Sc. degree in electrical engineering and the Doctor of Technology degree from Tampere University of Technology (TUT), Tampere, Finland, in 1978 and 1995, respectively.
From 2002 to 2007, he was the Head of the Automation and Control Institute, TUT, where he has been a Professor since 1999, and a Professor with the Department of Automation Science and Engineering. His current research interests include applied intelligent data analysis and neurofuzzy computation, modern telecommunication-based automation, and a system theoretic approach to supply-chain modeling and control. He has authored or coauthored more than 90 publications on these topics. He was a Reviewer of various journal and conference articles.
Prof. Koivisto is a Member of the International Federation of Automatic Control Technical Committee 3.2 (Computational Intelligence in Control).