Jan G. Bazan
Institute of Computer Science, University of Rzeszów, Dekerta 2, 35-030 Rzeszów, Poland
Institute of Mathematics, University of Warsaw, Banacha 2, 02-097 Warsaw, Poland
e-mail: [email protected]
Stanisława Bazan-Socha
Second Department of Internal Medicine, Jagiellonian University Medical College,
Skawińska 8, 31-066 Cracow, Poland
e-mail: [email protected]
Sylwia Buregwa-Czuma
Institute of Computer Science, University of Rzeszów, Dekerta 2, 35-030 Rzeszów, Poland
e-mail: [email protected]
Przemysław Wiktor Pardel
Institute of Computer Science, University of Rzeszów, Dekerta 2, 35-030 Rzeszów, Poland
Institute of Mathematics, University of Warsaw, Banacha 2, 02-097 Warsaw, Poland
e-mail: [email protected]
Andrzej Skowron
Institute of Mathematics, University of Warsaw, Banacha 2, 02-097 Warsaw, Poland
e-mail: [email protected]
Barbara Sokołowska
Second Department of Internal Medicine, Jagiellonian University Medical College,
Skawińska 8, 31-066 Cracow, Poland
e-mail: [email protected]
A. Skowron and Z. Suraj (Eds.): Rough Sets and Intelligent Systems, ISRL 43, pp. 93–136.
springerlink.com
© Springer-Verlag Berlin Heidelberg 2013
7.1 Introduction
The main aim of this chapter is to present methods developed for the approximation of complex vague concepts involved in the specification of real-life problems, and for the approximate reasoning used in solving these problems. The methods presented in the chapter assume that additional domain knowledge is given in the form of a concept ontology. Concepts from the ontology are often vague and expressed in natural language. Therefore, an approximation of the ontology is used to create hints in searching for approximations of complex concepts from sensory (low-level) data.
We propose to link automatic methods of complex concept learning, and models for the detection of processes and their properties, with domain knowledge obtained in a dialog with an expert. Interaction with a domain expert helps to guide the process of discovering patterns and models of processes, and makes this process computationally feasible.
As mentioned before, our methods for approximating complex spatio-temporal concepts and the relations among them assume that the information about concepts and relations is given in the form of an ontology. For our purposes, by an ontology we understand a finite set of concepts forming a hierarchy, together with relations among these concepts which link concepts from different levels of the hierarchy. At the top of this hierarchy there are always the most complex concepts, whose approximations we are interested in for practical applications. Moreover, we assume that the ontology specification contains incomplete information about the concepts and relations occurring in the ontology; in particular, for each concept, sets of objects constituting examples and counterexamples of the concept are given. Additionally, for concepts from the lowest hierarchical level (the sensor level), it is assumed that sensor attributes are also available which make it possible to approximate these concepts on the basis of the given positive and negative examples (see the example ontology in Fig. 7.1 and [44]).
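The assumed form of ontology (a concept hierarchy with positive and negative example objects, and sensor attributes attached only at the lowest level) can be sketched as a simple data structure. The concept names and attributes below are illustrative, chosen in the spirit of Fig. 7.1, not taken from it verbatim:

```python
# A minimal sketch of the assumed ontology representation: each concept has
# subconcepts (lower-level concepts), sets of example/counterexample object
# ids, and sensor attributes only at the sensor level.
from dataclasses import dataclass, field

@dataclass
class Concept:
    name: str
    subconcepts: list = field(default_factory=list)   # lower-level concepts
    positive: set = field(default_factory=set)        # example object ids
    negative: set = field(default_factory=set)        # counterexample ids
    sensor_attrs: list = field(default_factory=list)  # only at sensor level

    def is_sensor_level(self):
        # A concept with no subconcepts sits at the lowest (sensor) level.
        return not self.subconcepts

# Toy ontology fragment (illustrative names):
speed = Concept("Safe speed", sensor_attrs=["speed", "speed_limit"])
dist = Concept("Safe distance", sensor_attrs=["distance_ahead"])
safe_driving = Concept("Safe driving", subconcepts=[speed, dist])

print(safe_driving.is_sensor_level())  # False: it sits above the sensor level
```

Only the sensor-level concepts carry attributes directly usable for classifier construction; higher concepts are approximated through their subconcepts.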
In this chapter, we present the following four types of methods for approximating
spatial or spatio-temporal complex concepts.
1. Methods of approximation of spatial concepts — when a complex concept
is a spatial concept not requiring an observation of changes over time (see
Section 7.2).
2. Methods of approximation of spatio-temporal concepts — when a complex con-
cept is a spatio-temporal concept; it requires observing changes of complex ob-
jects over time (see Section 7.3).
3. Methods of behavioral pattern identification — when a complex concept is rep-
resented as a certain directed graph which is called a behavioral graph (see
Section 7.4).
4. Methods of automated behavior planning for complex objects — when the states of objects are represented by spatio-temporal concepts requiring approximation (see Section 7.5).
A further result of this work is a software system, the Rough Set Interactive Classification Engine (RoughICE), supporting the approximation of complex spatio-temporal concepts in a given ontology in a dialog with the user.
7 Classifiers Based on Data Sets and Domain Knowledge 97
[Fig. 7.1: an example concept ontology with “Safe driving” at the top, intermediate concepts such as “Forcing the right of way” and “Safe overtaking”, and sensor data at the lowest level. A RoughICE screenshot shows the project designer, the graph editor, and the script editor.]
the street and at crossroads. During the simulation, each vehicle appearing on the simulation board behaves as an independently acting agent. On the basis of observations of its surroundings (other vehicles, its own location, weather conditions, etc.), this agent independently decides what maneuvers to perform in order to achieve its goal, which is to cross the simulation board safely and to leave it by the outbound way given in advance. At any moment of the simulation, all crucial vehicle parameters may be recorded, and from these recordings data sets for experiments can be obtained. The results of experiments with the data sets recorded in the road simulator are presented in Section 7.2.
The second collection of data sets used in the computer experiments was provided by the Second Department of Internal Medicine, Collegium Medicum, Jagiellonian University, Cracow, Poland. This data includes characteristics of patients with stable coronary heart disease: clinical status, past history, laboratory test results, electrocardiographic (ECG) recordings, applied therapeutic procedures, and coronary angiography outcomes. In this chapter we present recent results of experiments performed on this collection of data sets (see Section 7.4).
The third collection of data sets used in the computer experiments was provided by the Neonatal Intensive Care Unit, First Department of Pediatrics, Polish-American Institute of Pediatrics, Collegium Medicum, Jagiellonian University, Cracow, Poland. This data constitutes a detailed description of the treatment of 300 infants, i.e., treatment results, diagnoses, operations, and medication (see [4, 6–9]). The results for this data collection are presented in Section 7.5.
concept from the higher level of the ontology (connected with the concepts from the rule predecessor), where both the patterns from the predecessor and from the successor of the rule are chosen from patterns constructed earlier for concepts from the two adjacent levels of the ontology. In Fig. 7.3 we present an example of a production rule for concepts C1, C2, and C. This production rule has the following interpretation: if the inclusion degree in concept C1 is at least “possibly YES” and in concept C2 at least “rather YES”, then the inclusion degree in concept C is at least “rather YES”.
Although a single production rule may be used as a classifier for the concept appearing in its successor, it is not yet a complete classifier, i.e., one classifying all objects belonging to the approximated concept rather than only those matching
[Figs. 7.4–7.6: an object u1 with inclusion degrees C1 = “certainly YES” and C2 = “rather YES” matches the rule C1 ≥ “possibly YES” ∧ C2 ≥ “rather YES” → C ≥ “rather YES”, whereas an object u2 with C1 = “rather YES” and C2 = “possibly YES” does not. A table over attributes aC1, aC2, aC lists the source patterns of the production rule, with the linguistic values ordered as: certainly NO < rather NO < possibly NO < possibly YES < rather YES < certainly YES. The production shown contains rules with successors at layers such as C3 ≥ “certainly YES” and C3 ≥ “rather YES”.]
In the case of the production from Fig. 7.6, concept C is the target concept and C1, C2 are the source concepts.
Such a production makes it possible to classify many more objects than a single production rule, since objects are classified into different layers of the concept occurring in the rule successor. Both productions and individual production rules are constructed only for two adjacent levels of the ontology. Therefore, in order to make full use of the whole ontology, so-called AR-schemes are constructed, i.e., approximate reasoning schemes which are hierarchical compositions of production rules (see, e.g., [10, 14, 39]). The synthesis of an AR-scheme is carried out in such a way that to a production rule from a lower hierarchical level of the AR-scheme under construction, a production rule from a higher level may be attached, but only one in which one of the concepts for which a predecessor pattern was constructed is the concept connected with the rule successor from the previous level. Additionally, it is required that the pattern occurring in the rule predecessor from the higher level is a subset of the pattern occurring in the rule successor from the lower level (in the sense of inclusion of the sets of objects matching both patterns). To the two combined production rules, further production rules can be attached (from above, from below, or from the side), and in this way a multilevel structure is made.
[Fig. 7.7: two productions, one with target concept C5 and one with target concept C3 (source concepts among C1, C2, and C4), together with an AR-scheme obtained by composing one rule from each production; the composed scheme can be treated as a new production rule. The common pattern of the composition is C3 ≥ “possibly YES”.]
For example, in Fig. 7.7 we have two productions. The target concept of the first production is C5, and the target concept of the second production is C3. We select one production rule from the first production and one from the second. These production rules are composed, and a simple AR-scheme is obtained that can be treated as a new two-level production rule. Notice that the target pattern of the lower production rule in this AR-scheme is the same as one of the source patterns of the higher production rule. In this case, the common pattern is described as follows: the inclusion degree (of some pattern) in concept C3 is at least “possibly YES”.
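The composition above can be sketched as follows; the linguistic value ordering follows the chapter, while the rule contents and the direction of the threshold check in `composable` are our reading of the construction, not code from the chapter:

```python
# Hedged sketch: production rules over ordered linguistic inclusion degrees,
# plus the attachment check used when composing rules into an AR-scheme.
LAYERS = ["certainly NO", "rather NO", "possibly NO",
          "possibly YES", "rather YES", "certainly YES"]

def at_least(value, threshold):
    """True iff value >= threshold in the linguistic ordering."""
    return LAYERS.index(value) >= LAYERS.index(threshold)

# A production rule: if C1 >= t1 and C2 >= t2, then C >= t (illustrative).
rule_lower = {"sources": {"C1": "possibly YES", "C2": "rather YES"},
              "target": ("C3", "possibly YES")}
rule_upper = {"sources": {"C3": "possibly YES", "C4": "possibly YES"},
              "target": ("C5", "possibly YES")}

def composable(lower, upper):
    """The lower rule's target must supply one of the upper rule's source
    patterns: same concept, conclusion at least as strong as the premise."""
    concept, thr = lower["target"]
    need = upper["sources"].get(concept)
    return need is not None and at_least(thr, need)

def fire(rule, degrees):
    """Apply a rule to measured inclusion degrees; None if it does not match."""
    if all(at_least(degrees[c], t) for c, t in rule["sources"].items()):
        return rule["target"]
    return None

print(composable(rule_lower, rule_upper))  # True: common pattern on C3
print(fire(rule_lower, {"C1": "rather YES", "C2": "certainly YES"}))
```

Here `fire` plays the role of a single production rule used as a (partial) classifier, and `composable` captures the condition that the lower rule's successor pattern feeds a source pattern of the higher rule.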
In this way, we can compose AR-schemes into hierarchical, multilevel structures using productions constructed for various concepts. An AR-scheme constructed in this way can be used as a hierarchical classifier whose input is given by
Table 7.1 Results of experiments for the concept: Is the vehicle driving safely?
Table 7.2 Learning time and the rule set size for concept: Is the vehicle driving safely?
Table 7.1 shows the results of the considered classification algorithms for the concept Is the vehicle driving safely? (see Fig. 7.1), together with the standard deviations of the obtained results.
One can see that the accuracy of the ARS algorithm for the decision class NO is higher than the accuracy of the RS algorithm on the analyzed data set. The decision class NO is smaller than the class YES; it represents the atypical cases whose recognition we are most interested in (dangerous driving on a highway).
Table 7.2 shows the learning time and the number of decision rules induced for the considered classifiers. In the case of the ARS algorithm, we present the average number of decision rules over all concepts from the relationship diagram (see Fig. 7.1).
One can see that the learning time for ARS is much shorter than for RS, and that the average number of decision rules (over all concepts from the relationship diagram) for the ARS algorithm is much lower than the number of decision rules induced for RS.
The experiments showed that the classification quality obtained with classifiers based on AR-schemes is higher than that obtained with traditional classifiers based on decision rules (especially for the class NO). Apart from that, the time spent on constructing a classifier based on AR-schemes is shorter than the time needed to construct a classical rule classifier. Also, the structure of a single rule classifier (inside the ARS classifier) is less complicated than the structure of the RS classifier (a considerably smaller average number of decision rules). It is worth noticing that the performance of the ARS classifier is much more stable than that of the RS classifier with respect to differences in the data samples supplied for learning (e.g., after a change of the simulation scenario).
by a human expert using domain knowledge accumulated for the given complex
dynamical system.
On a slightly higher abstraction level, spatio-temporal concepts (also called temporal concepts) are used directly to describe complex object behaviors (see [4]). Those concepts are defined by an expert in natural language, usually as questions about the current status of spatio-temporal objects, e.g., Does the examined vehicle accelerate in the right lane?, Does the vehicle maintain a constant speed while changing lanes? The method proposed here approximates temporal concepts using temporal patterns with the help of classifiers.
In order to do this, a special decision table is constructed, called a temporal concept table (see [4]). In the case of the method presented in this chapter, the rows of this table represent the parameter vectors of lower-level ontology concepts observed in a time window. The columns of this table (apart from the last one) are determined by temporal patterns, while the last column represents the membership of the object described by the parameters (features, attributes) of a given row in the approximated temporal concept (see Fig. 7.8).
It is worth noticing that the approach to temporal concept approximation presented above can be extended to the case when the higher ontology level concepts are defined on a set of structured objects whose parts are objects (examples) of the lower ontology level concepts. This case concerns a situation in which, when observing a structured object in order to capture its behavior described by a higher ontology level concept, we must observe this object longer than is required to capture the behavior of a single part of the structured object described by lower ontology level concepts (see [4] for more details).
[Fig. 7.9: a behavioral graph for a single vehicle, with nodes such as “Acceleration on the right lane”, “Acceleration on the left lane”, “Acceleration and changing lanes from right to left”, “Deceleration on the right lane”, “Deceleration on the left lane”, and “Deceleration and changing lanes from left to right”.]
Data sets used to store information about the complex objects occurring in a given complex dynamical system may be represented using information systems (see, e.g., [4, 32]). In this representation, individual complex objects correspond to objects (rows) of an information system, and the attributes of the information system represent the properties of these objects at the current time point.
The concepts concerning properties of complex objects at the current time point
(spatial concepts) can be defined on the basis of domain knowledge by human ex-
perts and can be approximated by properties (attributes) of these objects at the cur-
rent time point (for instance, using the standard rough set approach to classifier
construction [4, 32]).
The concepts concerning properties of complex objects at the current time point in relation to the previous time point are a way of representing very simple behaviors of the objects. However, the perception of more complex types of behavior requires examining the behavior of complex objects over a longer period of time. This period is usually called a time window, which is to be understood as a sequence of objects of a given temporal information system (a kind of information system with a special attribute representing time), registered for an established complex object from an established time point, over an established period of time or until the expected number of time points is obtained. Therefore, learning to recognize complex types of behavior of complex objects from gathered data, as well as the further use of learned classifiers to identify such behaviors, requires working out mechanisms for extracting time windows and their properties from the data. Hence, if we want to predict such more complex behaviors or discover a behavioral pattern, we have to investigate the values of attributes registered in the current time window. Such an investigation can be expressed
using temporal patterns (see Section 7.3). For example, in the medical case one can consider patterns expressed by the following questions: “Did HRV increase in the time window?”, “Was the heart rate stable in the time window?”, “Did the ST interval level increase?”, or “Was the QT segment time longer than the normal time at any point in the time window?”. Notice that all such patterns ought to be defined by a human medical expert using domain knowledge accumulated for coronary heart disease.
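Time-window extraction and temporal patterns of this kind can be sketched as follows; the attribute names (`hrv`, `hr`), the window length, and the pattern definitions are illustrative assumptions, not the chapter's actual definitions:

```python
# Sketch: extracting time windows from a temporal information system (here,
# a list of per-time-point attribute dicts for one complex object) and
# computing boolean temporal-pattern features over each window.
def time_windows(series, length):
    """All consecutive windows of `length` time points for one object."""
    return [series[i:i + length] for i in range(len(series) - length + 1)]

# Temporal patterns as boolean features of a window (assumed definitions):
def hrv_increased(window):
    return window[-1]["hrv"] > window[0]["hrv"]

def heart_rate_stable(window, tol=5):
    rates = [p["hr"] for p in window]
    return max(rates) - min(rates) <= tol

# Toy series of three time points; each row of the temporal concept table
# would hold such pattern values plus an expert-provided decision column.
series = [{"hrv": 40, "hr": 78}, {"hrv": 45, "hr": 80}, {"hrv": 43, "hr": 96}]
rows = [(hrv_increased(w), heart_rate_stable(w)) for w in time_windows(series, 2)]
print(rows)  # [(True, True), (False, False)]
```

Each tuple corresponds to one row of a temporal concept table; in the actual method, the last column (membership in the temporal concept) is supplied by an expert.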
Temporal patterns can be treated as new features for approximating temporal concepts. In the case of the treatment of patients with cardiovascular failure, one can define temporal concepts such as “Is the patient's SCD risk at a low level?”, “Is the patient's SCD risk at a medium level?”, or “Was a high SCD risk detected?”.
Temporal concepts defined for objects from a complex dynamical system and approximated by classifiers can be treated as nodes of a graph called a behavioral graph, in which connections between nodes represent temporal dependencies. Fig. 7.10 presents a behavioral graph for a single patient, exhibiting a behavioral pattern related to circulatory system failure caused by coronary heart disease. This graph was created on the basis of observation of medical data sets and of known factors for SCD risk stratification. In this behavioral graph, for example, the connection between the node “Medium SCD risk” and the node “High SCD risk” indicates that after some period of progression of cardiovascular failure at a medium level, a patient can pass to a period in which the progression of cardiovascular failure is high.
This behavioral graph is an example of a risk pattern. If the patient matches the “Low SCD risk” concept in the first time window and “Medium SCD risk” in the following window, after which his state returns to the previous one, then the patient's behavior does not match this behavioral graph.
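A behavioral graph and the path-matching check can be sketched as follows; the node names follow Fig. 7.10, while the edge set is an illustrative assumption (worsening-only transitions, consistent with the example above):

```python
# Sketch: a behavioral graph as a set of directed edges between temporal
# concepts, and a check whether a sequence of concept labels (one per time
# window) follows a path in the graph.
EDGES = {
    ("Low SCD risk", "Low SCD risk"),
    ("Low SCD risk", "Medium SCD risk"),
    ("Medium SCD risk", "Medium SCD risk"),
    ("Medium SCD risk", "High SCD risk"),
    ("High SCD risk", "High SCD risk"),
}

def matches(labels, edges=EDGES):
    """True iff every pair of consecutive labels is an edge of the graph."""
    return all((a, b) in edges for a, b in zip(labels, labels[1:]))

print(matches(["Low SCD risk", "Medium SCD risk", "High SCD risk"]))  # True
print(matches(["Low SCD risk", "Medium SCD risk", "Low SCD risk"]))   # False
```

The second call reproduces the example above: a return from medium to low risk does not match this (deterioration-only) risk pattern.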
The next experiments were performed on data obtained from the Second Department of Internal Medicine, Collegium Medicum, Jagiellonian University, Cracow, Poland. The data collection contains information about 95 patients with stable coronary heart disease, collected between 2006 and 2009. It includes a detailed description of clinical status (age, sex, diagnosis), coexisting diseases, pharmacological management, laboratory test outcomes (levels of cholesterol, troponin I, and LDL — low-density lipoproteins), Holter ECG recordings (long-term, 24-hour signals), and various Holter-based indices such as ST-segment deviations, HRV, arrhythmias, and QT dispersion. Sinus (normal) rhythm was observed in 73 patients, while 22 patients had permanent AF (atrial fibrillation). Two 24-hour Holter ECG recordings were performed using Aspel's HolCARD 24W system. Coronary angiography was performed after the first Holter ECG recording.
All data were imported into the Infobright Community Edition (ICE) environment (see [48]). ICE is an open-source software solution designed to deliver a scalable data warehouse optimized for analytic queries (data volumes up to 50 TB, market-leading data compression from 10:1 to over 40:1). The database schema was designed to store all information about patients and to allow the target database to be supplemented in the future. For further processing, the data were imported into the RoughICE environment.
For the experiment, one table with 744 objects was formed. Each object (row) contains information about the parameters of one patient over one hour of observation, namely the average hourly values of the observed parameters.
The experiments were performed in order to predict the behavioral pattern related to a high risk of SCD. This pattern was defined by medical experts on the basis of well-known predictors of SCD. The evaluation of SCD risk includes: advanced age, male sex, coexisting diseases such as DM (diabetes mellitus) and HA (arterial hypertension), history of stroke, previous myocardial infarction, CRP (C-reactive protein) level, depressed LVEF (left ventricular ejection fraction), presence of arrhythmias and ischaemias, high heart rate, decreased HRV, and HRT (heart rate turbulence). Taking into account the simplicity of the example model and the temporal aspect of the patterns, only a few factors were chosen in this approach, such as the HRV index SDNN (standard deviation of NN intervals — normal-to-normal beat segments), average heart rate, ST interval decrease, and QT segment changes. The HRV parameter was calculated over a one-hour period, although it is usually analyzed over a 24-hour interval; because of the lack of appropriate data, such standard analyses were not performed in this experiment.
We applied the train-and-test method. However, because of the specificity of the analyzed data, the method of data division differed slightly from the standard one. Namely, in each experiment the whole set of patients was randomly divided into two groups (a training group with 60% of the patients and a testing group with 40%). After this division, each part was used to create time windows of 2 time points (2 hours of patient observation) and sequences of such time windows (training part: approximately 400 time windows; testing part: approximately 270 sequences of time windows). Time windows created on the basis of the training patients formed the training table for a given experiment, while time window sequences created on the basis of the tested patients formed the test table.
In order to determine the standard deviation of the obtained results, each experiment was repeated for 10 random divisions of the whole data set.
A single experiment proceeded as follows (see also Figure 7.11). First, for the training data, the family of all time windows of 2 time points was generated. Then, on the basis of the temporal patterns proposed by the experts, the behavioral graph from Figure 7.10, and additional domain knowledge (represented by expert scripts in RoughICE — see [42] for more details), temporal pattern tables were constructed for all concepts of this behavioral graph. Then, for all these tables, a family of stratifying classifiers was generated that is able to classify objects (patients) into the different concepts in a sequence of ordered layers. The first layer in this sequence represents objects which without any doubt do not belong to the concept; the next layers represent objects belonging to the concept more and more certainly; and the last layer represents objects certainly belonging to the concept (see [4] for more details). Next, a complex classifier was constructed on the basis of the family of stratifying classifiers that allows us to predict the membership of a particular time window in the various temporal concepts of the behavioral graph (see Figure 7.11). The main idea of working
Fig. 7.11 A general scheme of experiments for the risk pattern of SCD
for every time window separately. In this way, we obtain a sequence of concept labels, which can also be treated as a potential path of nodes in the behavioral graph. We call this method the classifier method, and we call the sequence of concept labels it generates the classifier-based sequence of concept labels.
Our evaluation of the presented approach is based on comparing the results of the expert method and the classifier method. For a given sequence of time windows stw, the accuracy of identification of the sequence stw is computed in the following way:
• if the expert sequence of concept labels computed for stw matches a path in the behavioral graph and the classifier-based sequence of concept labels also matches a path in the graph, the accuracy of identification of the sequence stw is equal to 1,
• if the expert sequence matches a path in the graph but the classifier-based sequence does not, the accuracy of identification of the sequence stw is equal to 0,
• if the expert sequence does not match a path in the graph but the classifier-based sequence does, the accuracy of identification of the sequence stw is equal to 0,
• if neither the expert sequence nor the classifier-based sequence matches a path in the graph, the accuracy of identification of the sequence stw is equal to 1.
The accuracy of identification of the whole family of time window sequences is computed as the average of the accuracies computed for each sequence separately.
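The four cases above reduce to a single agreement check, which can be sketched as follows; `matches_graph` stands for an assumed path-membership test (such as the edge check discussed for the behavioral graph), and the toy data is illustrative:

```python
# Sketch of the identification-accuracy measure: a sequence scores 1 when
# the expert and classifier label sequences agree on whether they match a
# path of the behavioral graph, and 0 otherwise; the family accuracy is
# the mean of these per-sequence scores.
def identification_accuracy(pairs, matches_graph):
    """pairs: list of (expert_labels, classifier_labels), one per sequence."""
    scores = [1.0 if matches_graph(expert) == matches_graph(classifier) else 0.0
              for expert, classifier in pairs]
    return sum(scores) / len(scores)

# Toy check with a trivial graph criterion under which every sequence
# "matches"; then expert and classifier always agree.
pairs = [(["a", "b"], ["a", "b"]), (["a"], ["b"])]
print(identification_accuracy(pairs, lambda seq: True))  # 1.0
```

Note that the measure rewards agreement on the match/no-match outcome, not label-by-label equality of the two sequences.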
Table 7.3 shows the results of applying this algorithm for the concept related to the risk pattern of SCD. We present the accuracy, the coverage, the accuracy for positive examples (where the expert sequence of concept labels matches a path in the behavioral graph) and for negative examples (where it does not), the coverage for positive and negative examples, and the real accuracy (Real accuracy = Accuracy × Coverage), together with the standard deviations of the obtained results.
Notice that the accuracy of the decision class Yes is called, in medical statistics [2], the sensitivity (the proportion of true positive test results among all positive cases tested), whereas the accuracy of the decision class No is called the specificity (the proportion of true negatives among all negative samples tested). We can see that both main parameters of our classifier (i.e., sensitivity and specificity) are sufficiently high.
The experimental results showed that the suggested method of behavioral pattern identification gives good results, also in the opinion of medical experts (it is sufficiently compatible with medical experience), and may be applied in medical practice as a supporting tool for medical diagnosis and treatment evaluation.
Finally, let us notice that a specific feature of the methods considered here is not only their high accuracy (with low standard deviation) but also their very high coverage (equal to 1.0).
All planning rules may be represented in the form of a so-called planning graph, whose nodes are state descriptions (occurring in the predecessors and successors of planning rules) and the action names occurring in planning rules. Let us consider the planning graph in Fig. 7.13, where states are represented by ovals and actions by rectangles. Each link between nodes of this graph represents a temporal dependency. For example, the link between state s1 and action a1 tells us that in state s1 of the complex object, action a1 may be performed, whereas the link between action a1 and state s3 means that after performing action a1 the state of the complex object may change to s3. An example of a path in this graph is the sequence (a2, s2, a3, s4), whereas the path (s1, a2, s2, a3, s4) is an exemplary plan in this graph.
[Fig. 7.13: a planning graph with initial state s1, actions a1, a2, a3, intermediate states s2, s3, and target state s4; the highlighted plan is s1, a2, s2, a3, s4.]
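A planning graph can be sketched as a bipartite successor map (states lead to applicable actions, actions lead to possible resulting states). The edge set below is our reading of Fig. 7.13 and is an assumption rather than the chapter's exact graph; in particular, action a3 is given two possible outcomes to illustrate the nondeterminism of planning rules:

```python
# Sketch: Fig. 7.13 as a successor map, plus a check that a plan is a valid
# alternating state/action path ending in the target state.
SUCC = {
    "s1": ["a1", "a2"],   # actions applicable in state s1
    "a1": ["s3"],         # possible results of action a1
    "a2": ["s2"],
    "s2": ["a3"],
    "a3": ["s3", "s4"],   # a nondeterministic action outcome (assumed)
}

def valid_plan(plan, target):
    """plan alternates states and actions, e.g. ['s1','a2','s2','a3','s4']."""
    for cur, nxt in zip(plan, plan[1:]):
        if nxt not in SUCC.get(cur, []):
            return False
    return plan[-1] == target

print(valid_plan(["s1", "a2", "s2", "a3", "s4"], "s4"))  # True
print(valid_plan(["s1", "a1", "s4"], "s4"))              # False
```

Where an action has several possible next states, some extra mechanism must choose among them; this is exactly the role of the resolving classifier discussed next.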
be compatible with the treatment schemes used there), an additional mechanism has also been proposed which makes it possible to resolve the nondeterminism occurring in the application of planning rules. This mechanism is an additional classifier based on data sets and domain knowledge. Such a classifier (called a resolving classifier) suggests the action to be performed in a given state and indicates the state which results from the suggested action. A resolving classifier is a kind of stratifying classifier and is constructed on the basis of a resolving table (see Fig. 7.15 and [4] for more details).
Fig. 7.15 The scheme of construction of the resolving table for a given state
Fig. 7.16 A planning graph for the treatment of infants during respiratory failure [nodes shown include “Severe respiratory failure” and “Moderate respiratory failure”]
that for any part of the structured object, a sequence of actions should be planned in order to obtain a meta-action at the level of the structured object.
The plan of execution of a single meta-action, consisting of the short plans which execute this meta-action at the levels of the individual parts of the structured object, is called a g-plan (see [4]). A g-plan is thus a family of plans assigned to be executed for all parts of the established structured object.
Let us notice that determining a plan for a structured object requires not only determining plans for all parts of the structured object but also synchronizing them in time. In practice, all the plans constructed for the objects (parts) belonging to a given structured object should be compatible. Therefore, while planning a meta-action for a structured object, we use a special tool for verifying the compatibility of the plans generated for all members of the structured object. This verification can be performed using special decision rules that we call elimination rules. Such rules make it possible to eliminate combinations of plans that are not compatible relative to the domain knowledge. This is possible because elimination rules describe all the important dependencies between the plans that are joined together. If a combination of plans is not consistent with some elimination rule, then it is eliminated. A set of elimination rules can be specified by human experts or computed from data sets. In both cases, we need a set of attributes (features) defined for a single plan, which are used to express the elimination rules. Such attributes are specified by human experts on the basis of domain knowledge, and they describe important features of a plan (generated for some part of the structured object) with respect to its proper joining with the plans generated for the other parts of the structured object. These features are used as the set of attributes of a special table that we call an elimination table. Each row of an elimination table represents information about the features of the plans assigned to structured objects from the training data. For example,
respiratory failure may be treated as a result of the following four diseases: RDS, PDA, sepsis, and Ureaplasma. Therefore, treating respiratory failure requires the simultaneous treatment of all of these diseases. This means that a treatment plan for respiratory failure comes into existence by joining the treatment plans for RDS, PDA, sepsis, and Ureaplasma, and the synchronization of these plans is very important. In this chapter, one of the synchronizing tools for such plans is the elimination table. In constructing the elimination table for the treatment of respiratory failure, patterns describing the properties of the joined plans are needed; moreover, the planning graphs for all four diseases are necessary. In Fig. 7.17 the planning graph for RDS treatment is shown. The features of the treatment plans for the PDA, sepsis, and Ureaplasma diseases may be defined in a very similar way.
On the basis of the elimination table, a set of elimination rules can be computed
and used to eliminate inappropriate plan arrangements for individual parts of
the structured object. Thus, the set of elimination rules can be used as a filter of
inconsistent combinations of plans generated for members of groups. A combination
of plans is eliminated when there exists an elimination rule whose predecessor
matches the combination, while the successor of the rule is not supported by the
features of the combination.
In other words, a combination of plans is eliminated when the combination matches
the predecessor of some elimination rule and does not match the successor of this rule.
7 Classifiers Based on Data Sets and Domain Knowledge 121
[Figure: a planning graph with the states RDS with very severe hypoxemia, RDS with moderate hypoxemia, RDS with mild hypoxemia, and RDS excluded, and the actions mechanical ventilation in the MAP1, MAP2, MAP3, and CPAP modes (with the possibility of additional Surfactant administration in the MAP1 and MAP2 modes).]
Fig. 7.17 A planning graph for the treatment of infants during the RDS
Fig. 7.18 shows the scheme of construction of elimination rules for not-acceptable
g-plans in the case of the treatment of respiratory failure, which is a result of the
four following diseases: sepsis, Ureaplasma, RDS, and PDA.
As we can see, for any attribute from the elimination table, we compute the set of
rules with a minimal number of descriptors, treating this attribute as the decision
attribute. In this way, we obtain a set of dependencies in the elimination table
explained by decision rules. In practice, it is necessary to filter the elimination rules
to remove those with low support, because such rules can be matched too strongly
to the training data.
On the basis of the set of elimination rules, an elimination classifier may be constructed
that enables the elimination of inappropriate plan arrangements for individual
parts of the structured object.
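A minimal sketch of the elimination step, assuming a simple representation in which an elimination rule is a (predecessor, successor) pair of partial feature descriptions; the attribute names and the rule itself are invented for illustration and are not taken from the medical data:

```python
# A hedged sketch of an elimination classifier: an elimination rule is a
# pair (predecessor, successor) of partial descriptions over the features
# of a combination of plans.  A combination is eliminated when it matches
# the predecessor of some rule but does not match the successor.

def matches(partial, features):
    """A partial description matches when all of its descriptors agree."""
    return all(features.get(attr) == value for attr, value in partial.items())

def is_eliminated(features, rules):
    return any(matches(pred, features) and not matches(succ, features)
               for pred, succ in rules)

# Invented example rule: if MAP2 ventilation is planned for RDS, the
# sepsis plan must include antibiotic treatment.
rules = [({"RDS_ventilation": "MAP2"}, {"sepsis_antibiotic": "yes"})]

consistent = {"RDS_ventilation": "MAP2", "sepsis_antibiotic": "yes"}
inconsistent = {"RDS_ventilation": "MAP2", "sepsis_antibiotic": "no"}
print(is_eliminated(consistent, rules))    # False: combination is kept
print(is_eliminated(inconsistent, rules))  # True: combination is filtered out
```

In this representation, filtering a set of candidate combinations is simply discarding every combination for which `is_eliminated` returns `True`.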
If the combination of plans for parts of the structured object is consistent (i.e., it was
not eliminated by the elimination rules), we should check whether the execution of this
combination allows us to realize the expected meta-action on the level of structured
objects. This can be done by a special classifier constructed for a table called a
meta-action table. The structure of a meta-action table is similar to the structure
of an elimination table, i.e., attributes are defined by human experts, and rows
represent information about features of plans assigned to parts of exemplary structured
objects from the training data. In addition, we add to this table a decision attribute.
Values of this decision attribute represent names of meta-actions that are realized by
the execution of a given combination of plans.
122 J.G. Bazan et al.
Fig. 7.18 The scheme of construction of elimination rules for group of four diseases: sepsis,
Ureaplasma, RDS and PDA
[Figure residue (caption lost): a scheme of automated planning for a structured object, showing collections of plans for the meta-actions Maintenance of the efficient respiratory system, Maintenance of the severe respiratory failure, Maintenance of the mild respiratory failure, Improvement of respiratory failure from severe to moderate, and Recovery from the mild respiratory failure, for the states Severe respiratory failure, Moderate respiratory failure, Mild respiratory failure, and Efficiency of the respiratory system, together with the elimination of not-acceptable plans and the collection of acceptable plans for a meta-action.]
A separate task is the verification of the quality of the automated planning methods,
that is, to compare the plan generated automatically with the plan suggested by
experts from a given domain.
The problem of inducing classifiers for similarity relations is one of the challeng-
ing problems in data mining and knowledge discovery (see bibliography from [4]).
The existing methods are based on building models for similarity functions using
simple strategies for fusion of local similarities. The optimization of the assumed
parameterized similarity formula is performed by tuning parameters relative to local
similarities and their fusion. For instance, if we want to compare two medical
plans of treatment, e.g., one plan generated automatically by our computer system
and another one proposed by a medical expert, we need a tool to estimate the similarity.
This problem can be solved by introducing a function measuring the similarity
between medical plans. For example, in the case of our medical data, a formula is
used to compute the similarity between two plans as the arithmetic mean of the
similarities between all corresponding pairs of actions (nodes) from both plans, where
the similarity for a single corresponding pair of actions is defined by a consistency
measure of the medicines and medical procedures comprised in these actions. For example,
let M = {m1, ..., mk} be a set consisting of k medicines. Let us assume that actions in
medical plans are specified by subsets of M. Hence, any medical plan P determines
a sequence of actions A(P) = (A1, ..., An), where Ai ⊆ M for i = 1, ..., n and n is the
number of actions in P. In our example, the similarity between plans is defined by
a similarity function Sim established on pairs of medical plans (P1, P2) (of the same
length) with the sequences of actions A(P1) = (A1, ..., An) and A(P2) = (B1, ..., Bn),
respectively, as follows:
Sim(P1, P2) = (1/n) · ∑_{i=1}^{n} (|Ai ∩ Bi| + |M \ (Ai ∪ Bi)|) / |M|.
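To make the formula concrete, the following sketch computes Sim directly from its definition, for an invented three-medicine set (the plans and medicine names are illustrative only):

```python
# Sim(P1, P2): the arithmetic mean, over corresponding actions, of the
# fraction of medicines on which both actions agree (either both include
# a medicine or both omit it).

def sim(plan1, plan2, medicines):
    """plan1, plan2: equal-length sequences of sets A_i, B_i of medicines."""
    assert len(plan1) == len(plan2)
    n = len(plan1)
    total = 0.0
    for a, b in zip(plan1, plan2):
        agree = len(a & b) + len(medicines - (a | b))
        total += agree / len(medicines)
    return total / n

M = {"m1", "m2", "m3"}             # invented set of medicines
P1 = [{"m1"}, {"m1", "m2"}]        # plan generated automatically
P2 = [{"m1", "m2"}, {"m1", "m2"}]  # plan proposed by an expert
print(sim(P1, P2, M))              # (2/3 + 3/3) / 2, i.e. about 0.833
```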
However, such an approach seems to be very abstract and ad hoc, because it does
not take into account any deeper knowledge about the similarity of plans, e.g.,
domain knowledge. Meanwhile, the similarity relations for real-life problems are
usually more complex objects, i.e., their construction from local similarities cannot be
obtained by simple fusion functions. Hence, such similarity relations cannot be
approximated with satisfactory quality by employing the existing simple strategies.
For this reason, we treat this similarity measure Sim only as an example and do not
take it into account in our further research (and in our proposed method). Instead,
to support the process of similarity relation approximation, we propose to use
domain knowledge represented by a concept ontology expressed in natural language.
The ontology consists of concepts used by an expert in his explanation of similarity
and dissimilarity cases. Approximation of the ontology makes it possible to obtain
relevant concepts for the approximation of the similarity relation.
According to the domain knowledge, it is quite common that there are many aspects
of similarity between plans. For example, in the case of comparison of medical
plans used for the treatment of infants with respiratory failure, we should take into
consideration, e.g., the similarity of the antibiotics use, the ventilation mode, and the
similarity of PDA closing (see [4] for more medical details). Moreover, every aspect
of the similarity should be understood in a different way. For example, in estimating
the similarity of the antibiotic treatment, one should evaluate the kind of antibiotic
as well as the time of administration. Therefore, it is necessary to investigate
and take into account all incompatibilities of the antibiotic use between corresponding
pairs of nodes from both plans. Excessive doses are rather acceptable (based on
expert knowledge), whilst the lack of a medicine (if it is necessary) should be treated
as a very serious mistake. In such a situation, the difference in our assessment is
estimated as very significant. A slightly different interpretation of similarity should
be used in the case of the ventilation. As in the antibiotic use, we investigate all
incompatibilities of the ventilation mode between corresponding pairs of nodes from
both plans. However, sometimes, according to expert knowledge, we simplify our
assessments, e.g., unsupported respiration and CPAP are estimated as similar (see [4]
for more medical details). A more complicated situation arises if we want to judge the
similarity of the treatment of PDA. We have to take into account the ventilation mode
as well as the similarity of the PDA closing procedure. In summary, any aspect of the
similarity between plans should be taken into account in a specific way, and the domain knowledge
is necessary for joining all these similarities (obtained for all aspects). Therefore,
the similarity between plans should be determined on the basis of a special ontology
specified in a dialog with human experts. We call such an ontology a similarity ontology.
Using such a similarity ontology, we developed methods for inducing classifiers
predicting the similarity between two plans (generated automatically and proposed by
human experts).
In this chapter, we assume that each similarity ontology between plans has a tree
structure. The root of this tree is always one concept representing the general similarity
between plans. In each similarity ontology, there may exist concepts of two
types. Concepts of the first type will be called internal concepts of the ontology.
They are characterized by the fact that they depend on other ontology concepts.
Concepts of the second type will be called input concepts of the ontology (in other
words, the concepts of the lowest ontology level). The input concepts are characterized
by the fact that they do not depend on other ontology concepts.
Fig. 7.20 shows an exemplary ontology of similarity between plans of the treatment
of newborn infants with respiratory failure. This ontology has been provided by
human experts. However, it is also possible to present other versions of such an
ontology, instead of the one presented above, according to the opinions of other
groups of human experts.
Fig. 7.20 An exemplary ontology of similarity between plans of the treatment of newborn
infants with respiratory failure
Using the similarity ontology (e.g., the ontology presented in Fig. 7.20), we devel-
oped methods for inducing classifiers predicting the similarity between two plans
(generated automatically and proposed by human experts).
[Fig. 7.21: an exemplary similarity table. Each row corresponds to a pair of plans, the first generated automatically and the second proposed by experts; the condition columns represent the concepts C1, ..., Ck and the decision column represents the concept C:

Pairs of plans | C1 | ... | Ck | C
(pa1, pe1) | 0.2 | ... | 0.3 | 0.1
(pa2, pe2) | 0.4 | ... | 0.5 | 0.5
(pa3, pe3) | 0.2 | ... | 0.8 | 0.8
(pa4, pe4) | 0.8 | ... | 0.1 | 0.2
(pa5, pe5) | 0.3 | ... | 0.2 | 0.6 ]
The method for the construction of such a classifier can be based on a similarity table
of plans. The similarity table of plans is a decision table which may be constructed
for any concept from the similarity ontology. The similarity table is created in order
to approximate the concept for which the table has been constructed. The approximation
of the concept takes place with the help of classifiers generated for the similarity
table. However, because in the similarity ontology there occur two types of concepts
(internal and input), there are also two types of similarity tables. Similarity tables of
the first type are constructed for internal concepts, whereas tables of the second type
are constructed for input concepts.
Similarity tables for internal concepts of the similarity ontology are constructed for a
certain fragment of the similarity ontology which consists of a concept of this ontology
and the concepts on which this concept depends. In the case of the ontology from
Fig. 7.20, it may be, for instance, the concept Similarity of a symptom treatment of
sepsis and the concepts Similarity of corticosteroid use, Similarity of catecholamin
use, and Similarity of hemostatic agents use. To simplify further discussion, let us
assume that it is the concept C that depends in the similarity ontology on the concepts
C1, ..., Ck. The aim of constructing a similarity table is the approximation of the concept
C using the concepts C1, ..., Ck (see Fig. 7.21). Condition columns of such a similarity
table represent the concepts C1, ..., Ck. Any row corresponds to a pair of plans: one
generated automatically and one proposed by experts. The values of all attributes have
been provided by experts from the set {0.0, 0.1, ..., 0.9, 1.0}. Finally, the decision
column represents the concept C.
The stratifying classifier computed for a similarity table (called a similarity clas-
sifier) can be used to determine the similarity between plans (generated by our
[Figure: a tree of concepts with root C1, its children C2 and C3, and the concepts C4, C5, and C6 at the lowest level.]
value μC3(u). Finally, the values μC2(u) and μC3(u) are used as the values of the
conditional attributes of the similarity table constructed for the concept C1. Thus, the
object u may be classified by the classifier μC1 to the layer μC1(u).
The complex classifier described above can be used to determine the general
similarity between plans generated by our methods of automated planning and
plans proposed by human experts, e.g., during the real-life clinical treatment (see
Section 7.5.5).
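The composition described above, in which the layer values of child-concept classifiers become the inputs of the parent-concept classifier, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the concept names, the feature names of the pair of plans u, and the averaging rule standing in for an induced classifier are all invented:

```python
# Composing stratifying classifiers along a similarity ontology tree:
# the classifier of each internal concept takes the layer values assigned
# by the classifiers of its child concepts and returns a layer value
# from {0.0, 0.1, ..., 1.0}.

def make_internal(classifier, children):
    """Return a callable u -> layer value for an internal concept."""
    def evaluate(u):
        return classifier([child(u) for child in children])
    return evaluate

# Input concepts read features of the pair of plans u directly:
similarity_c2 = lambda u: u["antibiotic_agreement"]
similarity_c3 = lambda u: u["ventilation_agreement"]

# The internal concept is approximated here by a rounded mean of the
# child layers; the real system would use a classifier induced from a
# similarity table filled in by experts.
similarity_c1 = make_internal(
    lambda layers: round(sum(layers) / len(layers), 1),
    [similarity_c2, similarity_c3])

u = {"antibiotic_agreement": 0.8, "ventilation_agreement": 0.6}
print(similarity_c1(u))  # 0.7
```

Deeper ontologies compose the same way: an internal node built by `make_internal` can itself appear in the `children` list of its parent.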
We have applied the train-and-test method. In each experiment, the whole set
of patients was randomly divided into two groups (a training one and a test one). Each
of these groups allowed creating approximately 4000 time windows with a duration
of 7 time points. Time windows created on the basis of patients from the training
part formed the training table for a given experiment (where plans of treatment
have been assigned), whereas time windows created on the basis of patients from the
test part formed the test table for the experiment (where plans have been generated
by the automated method and expert plans are known, in order to compare both plans).
In the discussed experiments, the distance between time points recorded for a
specific patient was constant (one day). In a single experiment concerning a patient's
treatment, a 7-point sequence of time points was used. In terms of planning
the treatment, each such sequence may be written as s1, a1, s2, a2, s3, a3, s4, a4, s5,
a5, s6, a6, s7, where si (for i = 1, ..., 7) is a patient state and ai (for i = 1, ..., 6) is
a complex medical action performed in the state si. The first part of the above
sequence of states and actions, that is, from state s1 to state s3, was used by the method
of automated planning as the input information (corresponding to the values of
conditional attributes in the classic approach to constructing classifiers). The remaining
actions and states were generated automatically to create the plan (s3, a3, s4, a4, s5,
a5, s6, a6, s7). This plan may be treated as a certain type of complex decision value.
Verification of the quality of the generated plan consisted in comparing the automatically
generated plan (s3, a3, s4, a4, s5, a5, s6, a6, s7) with the corresponding plan actually
proposed by the experts. It is worth adding
that a single complex action concerned one time point, a meta-action concerned two
time points, and a single experiment consisted in planning two meta-actions. Hence,
in a single experiment, four actions were planned (the patient's treatment for 4 days). In
other words, at the beginning of the automated planning procedure, the information
about the patient's state in the last 3 days of hospitalization was used (s1, s2,
s3), together with the information about the complex medical actions undertaken one
and two days before (a1, a2). The generated plan included information about the suggested
complex medical action on a given day of hospitalization (a3), information about the
actions which should be undertaken in the 3 following days of hospitalization (a4,
a5, a6), and information about the patient's state anticipated as a result of the planned
treatment in the four following days of hospitalization (s4, s5, s6, s7).
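The episode layout described above can be sketched as follows; this is only a toy illustration of the input/plan split, not the authors' code:

```python
# The 7-point episode s1, a1, s2, a2, ..., s6, a6, s7: the prefix up to
# s3 (states s1..s3 with actions a1, a2) serves as input information, and
# the tail starting at s3 is the plan to be generated.
states = [f"s{i}" for i in range(1, 8)]   # s1 .. s7
actions = [f"a{i}" for i in range(1, 7)]  # a1 .. a6
episode = [x for pair in zip(states, actions) for x in pair] + [states[-1]]

input_part = episode[:5]  # s1, a1, s2, a2, s3  (known history)
plan_part = episode[4:]   # s3, a3, s4, a4, s5, a5, s6, a6, s7  (to generate)
print(input_part)
print(plan_part)
```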
As a measure of planning success (or failure) in our experiments, we use a special
classifier that can predict the similarity between two plans as a number between
0.0 (very low similarity between two plans) and 1.0 (very high similarity between
two plans) (see Section 7.5.4). We use this classifier to determine the similarity
between plans generated by our methods of automated planning and plans proposed
by human experts during the real-life clinical treatment. In order to determine the
standard deviation of the obtained results, each experiment was repeated for 10 random
divisions of the whole data set.
The average similarity between plans for all tested situations was 0.802. The
corresponding standard deviation was 0.041. The coverage of tested situations by
generated plans was 0.846, with a standard deviation of 0.018.
Because the average similarity is not very high (less than 0.9) and the standard
deviation is relatively high for our algorithm, we also present the
Table 7.4 The average percent of plans belonging to the specified interval and the average
similarity of plans in this interval
distribution of the results. We describe the results in such a way that we present how
many generated plans belong to a specified interval of similarity. For this purpose,
we divided the interval [0.0, 1.0] into 5 equal intervals, i.e., [0.0, 0.2], [0.2, 0.4], [0.4,
0.6], [0.6, 0.8], and [0.8, 1.0]. Table 7.4 shows the average percent of the plans
belonging to each interval and the average similarity of the plans in this interval.
It is easy to see that some group of the automatically generated plans is not
sufficiently similar to the plans proposed by the experts. If we assume that inadequate
similarity is lower than 0.6, this group contains about 25% of all plans (see Table 7.4). To
explain this issue, we should examine more carefully the plans which are incompatible
with the proposals prepared by the experts. In practice, the main medical actions
influencing the similarity of plans, in accordance with the similarity ontology from
Fig. 7.20, are mechanical ventilation, antibiotics, antimycotic agents, and macrolide
antibiotics. Therefore, it may be interesting to see how the treatment similarity changed
in the range of applying these actions in the individual intervals of similarity between
the plans.
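The bucketing behind this kind of distribution can be sketched as follows, with invented similarity scores; note that the bin boundaries here follow a [x, x + 0.2) convention, which differs slightly from the half-open intervals of Fig. 7.23:

```python
# Assign each plan-similarity score to one of five equal-width intervals
# and report, per interval, the percentage of plans and their average
# similarity (as in Table 7.4, but with invented data).
from collections import defaultdict

def distribution(similarities, n_bins=5):
    bins = defaultdict(list)
    for s in similarities:
        idx = min(int(s * n_bins), n_bins - 1)  # clamp 1.0 into the top bin
        bins[idx].append(s)
    total = len(similarities)
    return {i: (100.0 * len(v) / total, sum(v) / len(v))
            for i, v in sorted(bins.items())}

scores = [0.15, 0.55, 0.75, 0.85, 0.90, 0.95]  # invented similarities
for i, (percent, average) in distribution(scores).items():
    print(f"interval {i}: {percent:.1f}% of plans, mean similarity {average:.2f}")
```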
In Fig. 7.23, we can see that a significant incompatibility of treatment plans most
often concerns mechanical ventilation and perhaps antibiotic therapy, i.e., the situation
when a patient develops a sudden and severe infection (e.g., sepsis). Such
circumstances cause rapid exacerbation of respiratory failure and require a higher
level of mechanical ventilation and immediate antibiotic treatment. For example,
although microbiological confirmation of the current infection is achieved after 2-3
days, the physician starts treatment after the first symptoms of the suspected disease
and often intensifies the mechanical ventilation mode. It would seem that the algorithms
of automated planning presented in this chapter may imitate the strategy of
treatment described above. Unfortunately, in practice, these algorithms are not able
to learn this strategy, because a lot of information was either not introduced into the
database records or was introduced with a delay. For instance, hemoglobin saturation,
which is measured all the time as a dynamic marker of the patient's respiratory status,
was not found in the data, whilst the results of arterial blood gases were introduced
irregularly, with many missing values. Thus, the technical limitations of the current
data collection led to intensive work on modifying and extending both the equipment
and the software used for gathering clinical data. It may be expected that in several years the
[Fig. 7.23 is a bar chart: the vertical axis shows the average similarity of plans (from 0.0 to 1.0) and the horizontal axis shows the threshold intervals (0.0, 0.2], (0.2, 0.4], (0.4, 0.6], (0.6, 0.8], and (0.8, 1.0].]
Fig. 7.23 The average similarity of plans in the specified interval for medical actions
automated planning algorithms described in this chapter will achieve much better
and more useful results.
A separate problem is the relatively low coverage of the algorithms described in
this chapter, which equals 0.846 on average. Such a low coverage results from the
specificity of the automated planning method used, which synchronizes the treatment
of four diseases (RDS, PDA, sepsis, and Ureaplasma). We may identify two
reasons for the low coverage. First, because of data shortage, the algorithm in many
situations may not synchronize the treatment of the above-mentioned diseases. This
happens because each proposed combination of plans may be debatable in
terms of the knowledge gathered in the system. Therefore, in these cases, the system
does not suggest any treatment plan and answers "I do not know". The second reason
for the low coverage is the fact that the automated planning method used requires the
application of a complex classifier which consists of many classifiers of lesser complexity.
Combining these classifiers together often decreases the coverage of the complex
classifier. For instance, let us assume that making a decision for a tested object u
requires the application of a complex classifier μ, which consists of two classifiers
μ1 and μ2. We apply the classifier μ1 directly to u, whereas the classifier μ2 is applied
to the results of classification of the classifier μ1. In other words, to make the classifier
μ2 work for a given tested object u, we need the value μ1(u). Let us assume that the
coverage for the classifiers μ1 and μ2 equals 0.94 and 0.95, respectively. Hence, the
coverage of the classifier μ is equal to 0.94 · 0.95 = 0.893; that is, the coverage of the
classifier μ is smaller than the coverage of the classifier μ1 as well as the coverage of
the classifier μ2.
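The worked example above generalizes to any chain of classifiers; the following is a minimal sketch, under the text's implicit assumption that coverages combine multiplicatively:

```python
# Coverage of a sequential composition of classifiers: an object is
# covered only if every classifier in the chain classifies it, so the
# individual coverages multiply.

def composed_coverage(coverages):
    result = 1.0
    for c in coverages:
        result *= c
    return result

print(round(composed_coverage([0.94, 0.95]), 3))  # 0.893
```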
In summary, we conclude that the experimental results showed that the proposed
automated planning method gives good results, also in the opinion of medical
experts (being compatible enough with the plans suggested by the experts), and may
be applied in medical practice as a supporting tool for planning the treatment of
infants suffering from respiratory failure.
7.6 Conclusion
The aim of this chapter was to present new methods of approximating complex
concepts on the basis of experimental data and domain knowledge which is mainly
represented using concept ontology.
At the beginning of the chapter, a method of approximation of spatial complex
concepts was presented (see Section 7.2). Next, in Section 7.3, we presented a method
of approximation of spatio-temporal complex concepts. In the further part of the
chapter, a method of behavioral pattern identification was overviewed (see Section 7.4).
Finally, in Section 7.5, we described a method of automated planning of the behavior
of complex objects whose states are represented by spatio-temporal concepts which
require approximation.
We have also described the results of computer experiments conducted on real-
life data sets which were obtained from the road traffic simulator (see [44]) and
on medical data sets which were made available by Neonatal Intensive Care Unit,
First Department of Pediatrics, Polish-American Institute of Pediatrics, Collegium
Medicum, Jagiellonian University, Cracow, Poland and by Second Department of
Internal Medicine, Collegium Medicum, Jagiellonian University, Cracow, Poland.
In light of theoretical discourse and the results of computer experiments
presented in the chapter the following conclusions may be drawn:
1. The method of approximation of complex spatial concepts described in the
chapter (see Section 7.2), with the help of approximate reasoning schemes (AR-schemes),
leads to better results than the classical methods based on decision
rules induced directly from sensor data, because the quality of classification
based on AR-schemes is higher than the quality of classification obtained
by classifiers based on decision rules, particularly for small decision classes
representing atypical cases, in the recognition of which we are most interested,
e.g., a vehicle driving dangerously on a highway. Moreover, for larger data sets,
the time of constructing classifiers based on AR-schemes is much shorter than
the time of inducing classifiers based on decision rules, and the structure of
classifiers based on AR-schemes is less complex than the structure of classifiers
based on decision rules. It is also worth mentioning that classifiers based
on AR-schemes are more robust (stable or tolerant) with respect to changes
in the training data sets used for the construction of classifiers; that is, a classifier
based on AR-schemes constructed for one data set often proves good for
another data set. For example, a classifier constructed for data generated from
the traffic simulator with one simulation scenario proves useful in the classification
of objects generated by the simulator with the use of another simulation
scenario.
2. The methodology of modeling complex object behavior with the use of behavioral
graphs of these objects, proposed in the chapter (see Section 7.4), is a
convenient and effective tool for identifying behavioral or risk patterns of complex
objects. On the one hand, this methodology enables representing concepts
on a high abstraction level, and on the other hand, owing to the use of domain
knowledge, it enables approximating these concepts on the basis of sensor data.
3. The methods of automated planning of complex object behavior proposed in
the chapter facilitate an effective planning of the behavior of objects whose states
are defined in a natural language using vague spatio-temporal conditions (see
Section 7.5). The truth of conditions of this type usually cannot be verified
on the basis of a simple analysis of the available information about the object;
that is why these conditions must be treated as spatio-temporal complex concepts,
and their approximation requires the methods described in this chapter, which are
based on data sets and domain knowledge.
In summary, it may be concluded that in executing real-life projects related to the
construction of intelligent systems supporting decision-making, apart from data
sets, it is necessary to apply domain knowledge. Without its application, the successful
execution of many such projects becomes extremely difficult or impossible. On the
other hand, appropriate space must be found for the automated methods of classifier
construction wherever it is feasible. It means, thus, finding a certain type of "golden
mean" to apply appropriate proportions of domain knowledge usage and automated
methods of data analysis. Certainly, this will determine the success or failure of many
projects.
Acknowledgement. This work was supported by the grant N N516 077837 from the Ministry
of Science and Higher Education of the Republic of Poland, the Polish National Science
Centre (NCN) grant 2011/01/B/ST6/03867, and by the Polish National Centre for Research
and Development (NCBiR) grant No. SP/I/1/77065/10 in the frame of the strategic scientific
research and experimental development program: "Interdisciplinary System for Interactive
Scientific and Scientific-Technical Information".
References
1. van der Aalst, W.M.P.: Process Mining: Discovery, Conformance and Enhancement of
Business Processes. Springer, Berlin (2011)
2. Altman, D.G.: Practical Statistics for Medical Research. Chapman and Hall/CRC, Lon-
don (1997)
3. Bar-Yam, Y.: Dynamics of Complex Systems. Addison-Wesley, New York (1997)
4. Bazan, J.G.: Hierarchical Classifiers for Complex Spatio-temporal Concepts. In: Peters,
J.F., Skowron, A., Rybiński, H. (eds.) Transactions on Rough Sets IX. LNCS, vol. 5390,
pp. 474–750. Springer, Heidelberg (2008)
5. Bazan, J.G.: Rough sets and granular computing in behavioral pattern identification and
planning. In: Pedrycz, W., Skowron, A., Kreinovich, V. (eds.) Handbook of Granular
Computing, pp. 777–799. John Wiley & Sons, The Atrium (2008)
6. Bazan, J.G., Kruczek, P., Bazan-Socha, S., Skowron, A., Pietrzyk, J.: Automatic plan-
ning based on rough set tools: Towards supporting treatment of infants with respiratory
failure. In: Proceedings of the Workshop on Concurrency, Specification, and Program-
ming (CS&P 2006), Wandlitz, Germany, September 27-29. Informatik-Bericht, vol. 170,
pp. 388–399. Humboldt University, Berlin (2006)
7. Bazan, J.G., Kruczek, P., Bazan-Socha, S., Skowron, A., Pietrzyk, J.: Automatic Plan-
ning of Treatment of Infants with Respiratory Failure Through Rough Set Modeling. In:
Greco, S., Hata, Y., Hirano, S., Inuiguchi, M., Miyamoto, S., Nguyen, H.S., Słowiński,
R. (eds.) RSCTC 2006. LNCS (LNAI), vol. 4259, pp. 418–427. Springer, Heidelberg
(2006)
8. Bazan, J.G., Kruczek, P., Bazan-Socha, S., Skowron, A., Pietrzyk, J.: Risk pattern iden-
tification in the treatment of infants with respiratory failure through rough set modeling.
In: Proceedings of the Eleventh Conference of Information Processing and Management
of Uncertainty in Knowledge-Based Systems (IPMU 2006), Paris, France, July 2-7, pp.
2650–2657 (2006)
9. Bazan, J.G., Kruczek, P., Bazan-Socha, S., Skowron, A., Pietrzyk, J.: Rough set approach
to behavioral pattern identification. Fundamenta Informaticae 75(1-4), 27–47 (2007)
10. Bazan, J.G., Skowron, A.: Classifiers based on approximate reasoning schemes. In:
Dunin-Kęplicz, B., Jankowski, A., Skowron, A., Szczuka, M. (eds.) Monitoring, Secu-
rity, and Rescue Techniques in Multiagent Systems. Advances in Soft Computing, pp.
191–202. Springer, Heidelberg (2005)
11. Borrett, S.R., Bridewell, W., Langley, P., Arrigo, K.R.: A method for representing and
developing process models. Ecological Complexity 4, 1–12 (2007)
12. Breiman, L.: Statistical modeling: the two cultures. Statistical Science 16(3), 199–231
(2001)
13. Desai, A.: Adaptive complex enterprises. Communications ACM 5(48), 32–35 (2005)
14. Doherty, P., Łukaszewicz, W., Skowron, A., Szałas, A.: Knowledge Engineering: A
Rough Set Approach. Springer, Heidelberg (2006)
15. Domingos, P.: Toward knowledge-rich data mining. Data Mining and Knowledge Dis-
covery 1(15), 21–28 (2007)
16. Friedman, J., Hastie, T., Tibshirani, R.: The Elements of Statistical Learning: Data Min-
ing, Inference, and Prediction, vol. I-V. Springer, Heidelberg (2001)
17. Guarino, N.: Formal ontology and information systems. In: Proceedings of the First Inter-
national Conference on Formal Ontology in Information Systems (FOIS 1998), Trento,
Italy, June 6-8, pp. 3–15. IOS Press (1998)
18. Hillerbrand, R., Sandin, P., Peterson, M., Roeser, S. (eds.): Handbook of Risk Theory:
Epistemology, Decision Theory, Ethics, and Social Implications of Risk. Springer, Berlin
(2012)
19. Hoen, P.J., Tuyls, K., Panait, L., Luke, S., Poutré, J.A.L.: An Overview of Cooperative
and Competitive Multiagent Learning. In: Tuyls, K., ’t Hoen, P.J., Verbeeck, K., Sen, S.
(eds.) LAMAS 2005. LNCS (LNAI), vol. 3898, pp. 1–46. Springer, Heidelberg (2006)
20. Ignizio, J.P.: An Introduction to Expert Systems. McGraw-Hill, New York (1991)
21. Jarrar, M.: Towards Methodological Principles for Ontology Engineering. Ph.D. thesis,
Vrije Universiteit Brussel (2005)
22. Keefe, R.: Theories of Vagueness. Cambridge University Press, New York (2000)
23. Kloesgen, E., Zytkow, J. (eds.): Handbook of Knowledge Discovery and Data Mining.
Oxford University Press, Oxford (2002)
24. Kriegel, H.P., Borgwardt, K.M., Kröger, P., Pryakhin, A., Schubert, M., Zimek, A.: Fu-
ture trends in data mining. Data Mining and Knowledge Discovery 1(15), 87–97 (2007)
25. Langley, P.: Cognitive architectures and general intelligent systems. AI Magazine 27,
33–44 (2006)
26. Liu, J., Jin, X., Tsui, K.: Autonomy Oriented Computing: From Problem Solving to
Complex Systems Modeling. Kluwer/Springer, Heidelberg (2005)
27. Luck, M., McBurney, P., Shehory, O., Willmott, S.: Agent Technology: Computing as Interaction. A Roadmap for Agent-Based Computing. AgentLink III, the European Coordination Action for Agent-Based Computing. University of Southampton, UK (2005)
28. Michalski, R., et al. (eds.): Machine Learning, vol. I-IV. Morgan Kaufmann, Los Altos
(1983, 1986, 1990, 1994)
29. Michie, D., Spiegelhalter, D.J., Taylor, C.C.: Machine Learning, Neural and Statistical Classification. Ellis Horwood Limited, England (1994)
30. Mitchell, T.M.: Machine Learning. McGraw-Hill, Boston (1997)
31. Pancerz, K., Suraj, Z.: Rough sets for discovering concurrent system models from data
tables. In: Hassanien, A.E., Suraj, Z., Ślęzak, D., Lingras, P. (eds.) Rough Computing:
Theories, Technologies and Applications. Idea Group, Inc. (2007)
32. Pawlak, Z.: Rough Sets: Theoretical Aspects of Reasoning about Data. System Theory, Knowledge Engineering and Problem Solving, vol. 9. Kluwer Academic Publishers, Dordrecht (1991)
33. Pawlak, Z., Skowron, A.: Rudiments of rough sets. Information Sciences 177, 3–27
(2007)
34. Peters, J.F.: Rough Ethology: Towards a Biologically-Inspired Study of Collective Be-
havior in Intelligent Systems with Approximation Spaces. In: Peters, J.F., Skowron, A.
(eds.) Transactions on Rough Sets III. LNCS, vol. 3400, pp. 153–174. Springer, Heidel-
berg (2005)
35. Peters, J.F., Skowron, A.: Zdzisław Pawlak life and work (1926–2006). Information Sci-
ences 177, 1–2 (2007)
36. Bazan, J.G., Szczuka, M.S.: The Rough Set Exploration System. In: Peters, J.F.,
Skowron, A. (eds.) Transactions on Rough Sets III. LNCS, vol. 3400, pp. 37–56.
Springer, Heidelberg (2005)
37. Peters, J.F., Skowron, A., Rybiński, H. (eds.): Transactions on Rough Sets IX. LNCS,
vol. 5390. Springer, Heidelberg (2008)
38. Poggio, T., Smale, S.: The mathematics of learning: Dealing with data. Notices of the American Mathematical Society (AMS) 50(5), 537–544 (2003)
39. Polkowski, L., Skowron, A.: Rough Mereology. In: Raś, Z.W., Zemankova, M. (eds.)
ISMIS 1994. LNCS, vol. 869, pp. 85–94. Springer, Heidelberg (1994)
40. Read, S.: Thinking about Logic: An Introduction to the Philosophy of Logic. Oxford
University Press, New York (1994)
41. Ripley, B.D.: Pattern Recognition and Neural Networks. Cambridge University Press,
Cambridge (1996)
42. Rough ICE: Project web site, http://logic.mimuw.edu.pl/~bazan/roughice/
43. RSES: Project web site, http://logic.mimuw.edu.pl/~rses
44. Simulator: Project web site, http://logic.mimuw.edu.pl/~bazan/simulator
45. Skowron, A., Stepaniuk, J., Swiniarski, R.W.: Modeling rough granular computing based
on approximation spaces. Information Sciences 184(1), 20–43 (2012)
46. Stone, P., Sridharan, M., Stronger, D., Kuhlmann, G., Kohl, N., Fidelman, P., Jong, N.K.:
From pixels to multi-robot decision-making: A study in uncertainty. Robotics and Au-
tonomous Systems 54(11), 933–943 (2006)
47. Suraj, Z.: Discovering concurrent data models and decision algorithms from data: A
rough set approach. International Journal on Artificial Intelligence and Machine Learn-
ing, IRSI, 51–56 (2004)
48. The Infobright Community Edition (ICE) Homepage, http://www.infobright.org/
49. Unnikrishnan, K.P., Ramakrishnan, N., Sastry, P.S., Uthurusamy, R.: Service-oriented science: Scaling eScience impact. In: Proceedings of the Fourth KDD Workshop on Temporal Data Mining: Network Reconstruction from Dynamic Data, The Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2006), Philadelphia, USA, August 20-23 (2006)
50. Urmson, C., et al.: High speed navigation of unrehearsed terrain: Red Team technology for Grand Challenge 2004. Report CMU-RI-TR-04-37, The Robotics Institute, Carnegie Mellon University (2004)
51. Vapnik, V.: Statistical Learning Theory. Wiley, New York (1998)
52. Zadeh, L.A.: From computing with numbers to computing with words – from manipulation of measurements to manipulation of perceptions. IEEE Transactions on Circuits and Systems – I: Fundamental Theory and Applications 45(1), 105–119 (1999)
53. Zadeh, L.A.: Toward a generalized theory of uncertainty (GTU) - an outline. Information
Sciences 171, 1–40 (2005)