jCOLIBRI2 Tutorial
Juan A. Recio-García
Belén Díaz-Agudo
Pedro González-Calero
This work is supported by the MID-CBR project of the Spanish Ministry of Education
& Science TIN2006-15140-C03-02 and the G.D. of Universities and Research of the
Community of Madrid (UCM-CAM-910494 research group grant).
Technical Report IT/2007/02
Department of Software Engineering and Artificial Intelligence.
University Complutense of Madrid
ISBN: 978-84-691-6204-0
Contents
1 Introduction
13 Retrieval
13.1 Similarity functions
13.2 Cases selection
13.3 Using the k-NN retrieval
13.4 Retrieval in the Travel Recommender application
14 Reuse
14.1 Adapting the Travel Recommender retrieved trips
15 Revise
15.1 Travel Recommender revision
16 Retain
16.1 Saving the new trips of the Travel Recommender
17 Shutting down a CBR application
18 Textual CBR
18.1 Semantic retrieval
18.1.1 Representation of the texts
18.1.2 IE methods implementation
18.1.3 Computing similarity
18.1.4 The Restaurant Recommender example
18.2 Statistical retrieval
18.2.1 The Restaurant Recommender using statistical methods
20 Recommenders
20.1 Templates guided design of recommendation systems
20.1.1 One-Off Preference Elicitation
20.1.2 Retrieval
20.1.3 Iterated Preference Elicitation
20.2 Methods for recommender systems
20.2.1 User interaction methods
20.2.2 New Nearest Neighbor similarity measures
20.2.3 Conditional methods
20.2.4 Navigation by Asking methods
20.2.5 Navigation by Proposing
20.2.6 Profile management methods
20.2.7 Collaborative Recommendations
20.2.8 Retrieval methods
20.2.9 Cases selection
21 Other Features
21.1 Visualization of a Case Base
21.2 Classification and Maintenance
22 Getting support
23 Contributing to jCOLIBRI
23.1 Required elements
23.2 Example applications
Listings
1 StandardCBRApplication interface
2 TravelRecommender initial code
3 TravelRecommender singleton
4 TravelRecommender GUI code
5 TravelRecommender main() method
6 Travel Recommender data base schema
7 TravelRecommender configure() method (version 1)
8 CaseComponent interface
9 Bean example code
10 Using Attribute example code
11 CBRQuery code
12 CBRCase code
13 TravelDescription initial code
14 TravelSolution initial code
15 TypeAdaptor interface
16 Connector interface
17 CBRCaseBase interface
18 TravelRecommender configure() (version 2) and precycle()
19 databaseconfig.xml
20 hibernate.cfg.xml
21 TravelSolution.hbm.xml
22 TravelDescription.hbm.xml
23 Mapping template for user-defined types
24 Mapping template for enumerates
25 TravelRecommender configure() method (final version)
26 NNScoringMethod signature
27 Local and Global similarity interfaces
28 Cases selection methods
29 Example code for the k-NN Retrieval
30 TravelRecommender.cycle() code (step1)
31 TravelRecommender.cycle() code (step2)
32 TravelRecommender.cycle() code (step3)
33 TravelRecommender.cycle() code (step4)
34 TravelRecommender.postCycle() code
35 The Restaurant Recommender precycle using semantic TCBR methods
36 The Restaurant Recommender cycle using semantic TCBR methods
37 The Restaurant Recommender using statistical TCBR methods
38 Evaluation code
39 Example of an evaluable application
40 Visualization code
41 Config file for the Examples application
1 Introduction
Case-based reasoning (CBR) has become a mature and established subfield of Artificial
Intelligence (AI), both as a means of addressing AI problems and as a basis for fielded
AI technology.
Now that the fundamental principles of CBR have been established and numerous
applications have demonstrated that CBR is a useful technology, many researchers agree on
the increasing need to formalise this kind of reasoning, define application analysis
methodologies, and provide design and implementation assistance through software
engineering tools [5, 4, 27, 16, 20]. While the underlying ideas of CBR can be applied
consistently across application domains, the specific implementation of the CBR
methods, in particular retrieval and similarity functions, is highly customised to the
application at hand. Two factors have become critical: the availability of tools to build
CBR systems, and the accumulated practical experience of applying CBR techniques to
real-world problems.
Our work addresses these needs. We have designed a tool that helps application
designers develop and quickly prototype CBR systems. We also want to provide a software
tool that is useful for students who have little experience with the development of
different types of CBR systems. jCOLIBRI has been designed as a wide-spectrum framework
able to support several types of CBR systems, from simple nearest-neighbor approaches
based on flat or simple structures to more complex knowledge-intensive ones. It also
supports the development of textual and conversational CBR applications [41, 21]. Other
features of the framework, such as ontology integration, visualization of case bases,
evaluation of CBR applications, and classification and maintenance methods, are also
explained in this tutorial.
The framework implementation evolves as new methods are included. Our (ambitious) goal
is to provide a reference framework for CBR development that grows with contributions
from the community. We invite jCOLIBRI developers to send us their methods to enrich
the functionality of the framework and make them available to the whole CBR community.
success, e.g. by being applied to the real world environment or evaluated by a teacher,
and repaired if failed. During RETAIN (or REMEMBER), useful experience is retained
for future reuse, and the case base is updated by a new learned case, or by modification
of some existing cases.
As indicated in the figure, general knowledge usually plays a part in this cycle, by supporting the CBR processes. This support may range from very weak (or none) to very
strong, depending on the type of CBR method. By general knowledge we mean general
domain-dependent knowledge, as opposed to specific knowledge embodied by cases.
For example, in diagnosing a patient by retrieving and reusing the case of a previous patient, a model of anatomy together with causal relationships between pathological states
may constitute the general knowledge used by a CBR system. A set of rules may have
the same role.
3.1 jCOLIBRI2 Architecture
The bottom layer contains the basic components of the framework with well defined and
clear interfaces. This layer does not contain any kind of graphical tool for developing
CBR applications; it is simply a white-box object-oriented framework that must be used
by programmers. The top layer contains semantic descriptions of the components and
several tools that aid users in the development of CBR applications (black-box with
visual builder framework).
The bottom layer has new features that solve most of the problems identified in the
first version. It takes advantage of the new possibilities offered by the newest versions
of the Java language. The most important change is the representation of cases as Java
Group for Artificial Intelligence Applications
These shortcuts link to the files located in the installation directory.
There is an application tester that can be launched using the following scripts:
In Windows: jCOLIBRI2-Tester.bat
In UNIX: jCOLIBRI2-Tester.sh
The tester lets you run each of the 30 test applications included in the release to
exemplify the main features of jCOLIBRI. From the Tester you can access the javadocs
of the tests, which are the main source of documentation, as shown in Figure 4.
which features are explained by each test (see Figure 5). This table is also shown
through the Map button of the Tester application.
The first 16 examples illustrate the behavior of general CBR applications, while the
following 14 show how to implement recommender systems. Reading these tests is
recommended while studying this tutorial, as many sections cite them for details.
There is also an example of using the jCOLIBRI 2 jar library in a stand-alone application
named "Travel Recommender". It can be launched using the following scripts:
In Windows: TravelRecommender.bat
In UNIX: TravelRecommender.sh
This tutorial shows you how to develop the Travel Recommender application by following
some simple steps.
Define Query. In this step the user defines her query to the system by giving values
to the different attributes of a trip: duration, season, transportation, etc.
Configure Similarity. Here the user configures the similarity measure used to retrieve
the cases most similar to the query. jCOLIBRI 2 implements several similarity functions
that can be used depending on the type of the attribute (integers, strings, etc.). Moreover,
developers can define their own similarity measures.
You can also assign a weight to each attribute of the query that will be taken into account
when computing the average of all attributes. Also, some similarity functions can have
parameters used to configure the similarity measure.
Finally, the k value indicates how many cases must be retrieved. We are using an algorithm named K Nearest Neighbor (k-NN) that computes the similarity of the query with
all the cases and then orders the result depending on that similarity value. Then the first
k most similar cases are returned.
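The retrieval step described above can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions, not the jCOLIBRI 2 implementation: the TripCase class, the interval-based local similarity, and the interval widths (21 days, 10 persons) are hypothetical choices made for this example. The sketch computes a weighted average of per-attribute similarities, sorts all cases by that value, and returns the first k.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for a trip case with two numeric attributes. */
final class TripCase {
    final String id;
    final int duration;
    final int persons;

    TripCase(String id, int duration, int persons) {
        this.id = id;
        this.duration = duration;
        this.persons = persons;
    }
}

final class KnnSketch {
    /** Local similarity for integers: 1 - |a - b| / interval, never below 0. */
    static double intervalSimilarity(int a, int b, double interval) {
        return Math.max(0.0, 1.0 - Math.abs(a - b) / interval);
    }

    /** Global similarity: weighted average of the local similarities. */
    static double similarity(TripCase query, TripCase c,
                             double wDuration, double wPersons) {
        // The interval widths below are assumptions made for this sketch.
        double weighted =
                wDuration * intervalSimilarity(query.duration, c.duration, 21.0)
              + wPersons * intervalSimilarity(query.persons, c.persons, 10.0);
        return weighted / (wDuration + wPersons);
    }

    /** Scores every case, sorts by decreasing similarity and keeps the first k. */
    static List<TripCase> retrieve(TripCase query, List<TripCase> caseBase, int k,
                                   double wDuration, double wPersons) {
        List<TripCase> sorted = new ArrayList<>(caseBase);
        sorted.sort(Comparator.comparingDouble(
                (TripCase c) -> similarity(query, c, wDuration, wPersons)).reversed());
        return new ArrayList<>(sorted.subList(0, Math.min(k, sorted.size())));
    }
}
```

Setting a weight to a larger value makes the corresponding attribute count more in the average, which is the effect of the per-attribute weights mentioned above.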
Retrieved Cases. This step shows the cases retrieved by the k-NN method. It
shows each case and its similarity to the query.
Adaptation. In this step the system adapts the retrieved cases to the requirements of the
user, depending on the values defined in the query. This stage is usually very
domain-dependent and will be different in other CBR systems. In our Travel Recommender
application we adapt the price of the trips depending on the number of persons and
the duration defined in the query.
Our system uses simple direct proportions to perform the adaptation. For example,
imagine that the retrieved case has a duration of 7 days and costs 1000. If the user is
looking for a 14-day trip, we adapt the price using a direct proportion: if a 7-day
trip costs 1000, then a 14-day trip costs 2000. This process is repeated for
the number of persons in an analogous way.
After adapting the solution of the cases, their description can be replaced by the
description of the query. At this point, the system manages a list of working cases that
are different from the cases in the case base. These working cases represent possible
solutions to the problem described in the query.
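The direct-proportion rule can be written down as a small worked example. The following is a minimal sketch in plain Java; the class and method names are hypothetical and it does not use the jCOLIBRI 2 API. It scales the price first by duration and then by number of persons, rounding to the nearest integer.

```java
final class PriceAdaptationSketch {
    /** Scales a price by newAmount / oldAmount (direct proportion). */
    static int scale(int price, int oldAmount, int newAmount) {
        if (oldAmount <= 0) {
            throw new IllegalArgumentException("oldAmount must be positive");
        }
        return (int) Math.round((double) price * newAmount / oldAmount);
    }

    /** Adapts the case price to the duration and number of persons of the query. */
    static int adaptPrice(int casePrice, int caseDuration, int casePersons,
                          int queryDuration, int queryPersons) {
        int byDuration = scale(casePrice, caseDuration, queryDuration);
        return scale(byDuration, casePersons, queryPersons);
    }
}
```

With the figures from the text, a 7-day trip that costs 1000 becomes 2000 for 14 days when the number of persons is unchanged.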
Revise Cases. Once the cases have been adapted, the user (or a domain expert, in this
case the trip agent) can adjust the values of the working cases manually.
For example, imagine that the hotel has no rooms available and the clients have to go to
a similar one. Another situation is that the retrieved cases are not similar enough to the
requirements of the query and the trip agent has to define the solution of the case
manually.
Retain Cases. Finally, the trip agent can save the new trip case into the case base so
that it can be used in future queries. If the case is saved, a new Id must be assigned to the trip.
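One simple way to assign such an Id is sketched below. The naming scheme (a prefix followed by a number) and the RetainSketch class are assumptions made for illustration only; the text does not prescribe how Ids are generated.

```java
import java.util.Set;

final class RetainSketch {
    /**
     * Returns an Id of the form prefix + number that is not yet used.
     * The "prefix + counter" scheme is a hypothetical choice for this sketch.
     */
    static String nextId(Set<String> existingIds, String prefix) {
        int n = existingIds.size() + 1;
        while (existingIds.contains(prefix + n)) {
            n++;
        }
        return prefix + n;
    }
}
```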
Once the project is imported, you can navigate through its contents and source files
using the "Package Explorer"
The contents of your project before beginning the tutorial are also shown in Figure 15.
     * the storage.
     * @throws ExecutionException
     */
    public CBRCaseBase preCycle() throws ExecutionException;

    /**
     * Executes a CBR cycle with the given query.
     * @throws ExecutionException
     */
    public void cycle(CBRQuery query) throws ExecutionException;

    /**
     * Runs the code to shutdown the application. Typically
     * it closes the connector.
     * @throws ExecutionException
     */
    public void postCycle() throws ExecutionException;
}
jcolibri.examples.TravelRecommender.
method in the class:
public class TravelRecommender implements StandardCBRApplication {

    public void configure() throws ExecutionException {
    }

    public void cycle(CBRQuery query) throws ExecutionException {
    }

    public void postCycle() throws ExecutionException {
    }

    public CBRCaseBase preCycle() throws ExecutionException {
    }

    public static void main(String[] args) {
    }
}
To ease the access to this class from the different GUI frames and to ensure that there is
only one instance of this class we are going to implement a singleton pattern. Include
the following code into the class:
    private static TravelRecommender _instance = null;

    public static TravelRecommender getInstance() {
        if (_instance == null)
            _instance = new TravelRecommender();
        return _instance;
    }

    private TravelRecommender() {
    }
This way, the unique instance of the TravelRecommender class must be accessed
through the getInstance() method.
As the application uses several frames, we need a main frame that acts as the parent
of all of them. We are going to create a simple main frame that only contains the
jCOLIBRI 2 logo. Then we will have a separate frame for each step of the application.
Include the following code in the class:
    ...
    SimilarityDialog similarityDialog;
    ResultDialog resultDialog;
    AutoAdaptationDialog autoAdaptDialog;
    RevisionDialog revisionDialog;
    RetainDialog retainDialog;
    ...
    public void configure() throws ExecutionException {
        try {
            // Create the dialogs
            similarityDialog = new SimilarityDialog(main);
            resultDialog     = new ResultDialog(main);
            autoAdaptDialog  = new AutoAdaptationDialog(main);
            revisionDialog   = new RevisionDialog(main);
            retainDialog     = new RetainDialog(main);
        } catch (Exception e) {
            throw new ExecutionException(e);
        }
    }
    ...
    static JFrame main;

    void showMainFrame() {
        main = new JFrame("Travel Recommender");
        main.setResizable(false);
        main.setUndecorated(true);
        JLabel label = new JLabel(new ImageIcon(
                jcolibri.util.FileIO.findFile("/jcolibri/test/main/jcolibri2.jpg")));
        main.getContentPane().add(label);
        main.pack();
        Dimension screenSize = java.awt.Toolkit.getDefaultToolkit().getScreenSize();
        main.setBounds((screenSize.width - main.getWidth()) / 2,
                       (screenSize.height - main.getHeight()) / 2,
                       main.getWidth(),
                       main.getHeight());
        main.setVisible(true);
    }
Now let's create the main method of the Travel Recommender application. It configures
the application, executes the precycle (only once), executes the cycle several times
and finally calls the postcycle code:
Listing 5: TravelRecommender main() method
    public static void main(String[] args) {
        // Obtain TravelRecommender object
        TravelRecommender recommender = getInstance();
        // Show the main frame
        recommender.showMainFrame();
        try {
            // Configure the application
            recommender.configure();
            // Execute the precycle
            recommender.preCycle();
            // Create the frame that obtains the query
            QueryDialog qf = new QueryDialog(main);
            // Main CBR cycle
            boolean cont = true;
            while (cont) {
                // Show the query frame
                qf.setVisible(true);
                // Obtain the query
                CBRQuery query = qf.getQuery();
                // Call the cycle
                recommender.cycle(query);
                // Ask if continue
                int ans = javax.swing.JOptionPane.showConfirmDialog(null,
                        "CBR cycle finished, query again?", "Cycle finished",
                        javax.swing.JOptionPane.YES_NO_OPTION);
                cont = (ans == javax.swing.JOptionPane.YES_OPTION);
            }
            // Execute postcycle
            recommender.postCycle();
        } catch (Exception e) {
            // Errors
            org.apache.commons.logging.LogFactory.getLog(
                    TravelRecommender.class).error(e);
            javax.swing.JOptionPane.showMessageDialog(null, e.getMessage());
        }
        System.exit(0);
    }
This code shows how to access the unique instance of the TravelRecommender
class through the getInstance() method. Then the main frame is shown. After that
the CBR application is configured and the precycle is executed. That code is executed
only once.
Then we create the frame that obtains the query: QueryDialog. As explained before,
this class must receive the main frame in the constructor. This frame (shown in Figure
6) returns the query defined by the user.
jCOLIBRI 2 defines a class to store the queries named
jcolibri.cbrcore.CBRQuery. It will be explained later in Section 9.2.
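Finally, the error handling in this code shows how jCOLIBRI 2 manages logging messages:
the call goes through the Apache Commons Logging API
(org.apache.commons.logging.LogFactory), and the framework uses the Apache Log4j
library for its logging.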
create table travel(
    caseId VARCHAR(15),
    HolidayType VARCHAR(20),
    Price INTEGER,
    NumberOfPersons INTEGER,
    Region VARCHAR(30),
    Transportation VARCHAR(30),
    Duration INTEGER,
    Season VARCHAR(30),
    Accommodation VARCHAR(30),
    Hotel VARCHAR(50));
To simplify the use of the examples, jCOLIBRI 2 includes the HSQLDB Data Base Manager (www.hsqldb.org). This DBM is completely implemented in Java and can be easily
included in any other Java project. This way, HSQLDB is used by the examples of the
framework and by the Travel Recommender application.
To launch the DBM you can use the jcolibri.test.database.HSQLDBserver
class. Its init() method initializes the DBM loading the data bases used by the
examples. This initialization also includes the tables used by our Travel Recommender
defined in travel.sql.
Now you have to add the following code to the TravelRecommender class to
launch and stop the DBM:
    public void configure() throws ExecutionException {
        try {
            // Emulate data base server
            jcolibri.test.database.HSQLDBserver.init();
            // Create the dialogs
            ...
        } catch (Exception e) {
            throw new ExecutionException(e);
        }
    }

    public void postCycle() throws ExecutionException {
        _connector.close();
        jcolibri.test.database.HSQLDBserver.shutDown();
    }
Note that this code is not the final code for those methods. They will be extended in
following sections.
public interface CaseComponent {
    /**
     * Returns the attribute that identifies the component.
     * An id attribute must be unique for each component.
     */
    jcolibri.cbrcore.Attribute getIdAttribute();
}
public class MyBean {
    int data;

    int getData() {
        return data;
    }

    void setData(int _data) {
        this.data = _data;
    }
}
You can create an Attribute object that represents the data attribute of MyBean. It is
done with the following code:
The Attribute class allows jCOLIBRI 2 to manage the contents of the cases (Java Beans).
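The idea of addressing a bean field through a name and its declaring class can be illustrated without the framework. Elsewhere in this tutorial an Attribute is built as new Attribute("caseId", this.getClass()), so for MyBean the equivalent would presumably be new Attribute("data", MyBean.class). The sketch below is not the jCOLIBRI 2 Attribute class: BeanAttribute is a hypothetical stand-in that uses standard Java reflection to read and write the field through its getter and setter, and MyBean is re-declared here with public accessors so that the example is self-contained.

```java
import java.lang.reflect.Method;

/** Same shape as the MyBean example, with public accessors. */
class MyBean {
    private int data;

    public int getData() {
        return data;
    }

    public void setData(int data) {
        this.data = data;
    }
}

/** Hypothetical stand-in for an attribute: a field name plus its bean class. */
final class BeanAttribute {
    private final String name;
    private final Class<?> beanClass;

    BeanAttribute(String name, Class<?> beanClass) {
        this.name = name;
        this.beanClass = beanClass;
    }

    /** "data" becomes "Data", used to build getData / setData. */
    private String suffix() {
        return Character.toUpperCase(name.charAt(0)) + name.substring(1);
    }

    /** Reads the value through the public getter. */
    Object getValue(Object bean) {
        try {
            Method getter = beanClass.getMethod("get" + suffix());
            return getter.invoke(bean);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot read attribute " + name, e);
        }
    }

    /** Writes the value through the first public one-argument setter found. */
    void setValue(Object bean, Object value) {
        try {
            for (Method m : beanClass.getMethods()) {
                if (m.getName().equals("set" + suffix()) && m.getParameterCount() == 1) {
                    m.invoke(bean, value);
                    return;
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot write attribute " + name, e);
        }
        throw new IllegalStateException("No setter for attribute " + name);
    }
}
```

This shows why cases can be ordinary Java Beans: a generic method only needs the attribute name and the class to get at the value.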
public class CBRQuery {
    CaseComponent description;

    /**
     * Returns the description component.
     */
    public CaseComponent getDescription() {
        return description;
    }

    /**
     * Sets the description component.
     */
    public void setDescription(CaseComponent description) {
If a query contains a description, we can say that a case is a query plus a solution, a
result and a justification of the solution. This way, the jcolibri.cbrcore.CBRCase class
extends jcolibri.cbrcore.CBRQuery, adding those components:
public class CBRCase extends CBRQuery {
    CaseComponent solution;
    CaseComponent justificationOfSolution;
    CaseComponent result;

    /**
     * Returns the justificationOfSolution.
     */
    public CaseComponent getJustificationOfSolution() {
        return justificationOfSolution;
    }

    /**
     * Sets the Justification of Solution component.
     * @param justificationOfSolution to set.
     */
    public void setJustificationOfSolution(CaseComponent justificationOfSolution) {
        this.justificationOfSolution = justificationOfSolution;
    }

    /**
     * Returns the result.
     */
    public CaseComponent getResult() {
        return result;
    }

    /**
     * Sets the Result component
     */
    public void setResult(CaseComponent result) {
        this.result = result;
    }

    /**
     * Returns the solution.
     */
    public CaseComponent getSolution() {
        return solution;
    }

    /**
     * Sets the solution component
     */
    public void setSolution(CaseComponent solution) {
        this.solution = solution;
    }

    public String toString() {
        return super.toString() + "[Solution: " + solution + "][Sol. Just.: "
                + justificationOfSolution + "][Result: " + result + "]";
    }
}
Each CaseComponent bean can have attributes that are themselves CaseComponent
beans. This way, developers can create case structures with nested attributes. This
feature is shown in Test 3 of the examples.
The UML diagram in Figure 16 shows the relationship between cases, queries and case
components.
public class TravelDescription implements jcolibri.cbrcore.CaseComponent {

    public enum AccommodationTypes { OneStar, TwoStars, ThreeStars, HolidayFlat,
            FourStars, FiveStars };
    public enum Seasons { January, February, March, April, May, June, July,
            August, September, October, November, December };

    String caseId;
    String HolidayType;
    Integer NumberOfPersons;
    Instance Region;
    String Transportation;
    Integer Duration;
    Seasons Season;
    AccommodationTypes Accommodation;

    public Attribute getIdAttribute() {
        return new Attribute("caseId", this.getClass());
    }

    public String toString() {
        return "(" + caseId + ";" + HolidayType + ";" + NumberOfPersons + ";"
                + Region + ";" + Transportation + ";" + Duration + ";" + Season
                + ";" + Accommodation + ")";
    }
    ...
Now you have to add a get() and set() method for each attribute. Don't worry:
Eclipse generates them automatically through the menu item Source - Generate Getters and Setters....
In this description we use two enumerated types to define the accommodation and
the season. There is also an unusual type named Instance that defines the region. This type
is defined in jcolibri.datatypes.Instance and represents an instance of an
ontology. We do not go into detail here because it is explained in
Section 12.4.
The solution bean must be created in a similar way. Its name is TravelSolution
and this is the code without the getters and setters (that you have to generate):
Listing 14: TravelSolution initial code
public class TravelSolution implements jcolibri.cbrcore.CaseComponent {

    String id;
    Integer price;
    String hotel;

    public String toString() {
        return "(" + id + ";" + price + ";" + hotel + ")";
    }

    public Attribute getIdAttribute() {
        return new Attribute("id", this.getClass());
    }
    ...
public interface TypeAdaptor {
    /**
     * Returns a string representation of the type.
     */
    public abstract String toString();

    /**
     * Reads the type from a string.
     */
    public abstract void fromString(String content) throws Exception;

    /**
     * You must define this method to avoid problems with the
     * data base connector (Hibernate)
     */
    public abstract boolean equals(Object o);
}
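A user-defined type that follows the three-method contract listed above might look like the following sketch. The PriceRange type and its "min-max" text format are hypothetical, and the interface is re-declared locally under the name TypeAdaptorSketch so that the example compiles without the jCOLIBRI 2 library. The sketch also overrides hashCode(), which standard Java practice requires whenever equals() is overridden.

```java
/** Local re-declaration of the three-method contract, for self-containment. */
interface TypeAdaptorSketch {
    String toString();
    void fromString(String content) throws Exception;
    boolean equals(Object o);
}

/** Hypothetical user-defined type: a price range written as "min-max". */
final class PriceRange implements TypeAdaptorSketch {
    private int min;
    private int max;

    /** No-argument constructor, so an empty object can be filled by fromString. */
    public PriceRange() {
    }

    PriceRange(int min, int max) {
        this.min = min;
        this.max = max;
    }

    @Override
    public String toString() {
        return min + "-" + max;
    }

    /** Parses "min-max"; negative values are not supported by this simple format. */
    @Override
    public void fromString(String content) {
        String[] parts = content.trim().split("-");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected min-max but got: " + content);
        }
        min = Integer.parseInt(parts[0].trim());
        max = Integer.parseInt(parts[1].trim());
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PriceRange)) return false;
        PriceRange other = (PriceRange) o;
        return min == other.min && max == other.max;
    }

    @Override
    public int hashCode() {
        return 31 * min + max;
    }
}
```

The round trip through toString() and fromString() is what lets such a value be stored as text and read back as an equal object.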
jcolibri.connectors.OntologyConnector.
bases stored into ontologies.
The obvious interface for a connector must include methods to read the Case Base into
memory and update it back into persistent media. More specifically, jCOLIBRI 2 includes
an interface named Connector that belongs to package jcolibri.cbrcore. Every connector is supposed to implement the methods defined by this interface:
Listing 16: Connector interface
public interface Connector {
    /**
     * Initializes the connector with the given XML file
     */
    public void initFromXMLfile(java.net.URL file) throws InitializingException;

    /**
     * Cleans up any resource that the connector might be using,
     * and suspends the service
     */
    public void close();

    /**
     * Stores given cases on the storage media
     */
    public void storeCases(Collection<CBRCase> cases);

    /**
     * Deletes given cases from the storage media
     */
    public void deleteCases(Collection<CBRCase> cases);

    /**
     * Returns all the cases in the storage media
     */
    public Collection<CBRCase> retrieveAllCases();

    /**
     * Retrieves some cases depending on the filter. TODO.
     */
    public Collection<CBRCase> retrieveSomeCases(CaseBaseFilter filter);
}
Connectors are configured through XML configuration files. Each jCOLIBRI 2 connector
defines the XML schema of its configuration file. These schemas can be found in the
documentation.
Such an interface assumes that the whole Case Base can be read into memory for
the CBR processes to work with it. However, in a real-sized CBR application this
approach may not be feasible. For that reason, we are working to extend the Connector
interface to retrieve those cases that satisfy a query expressed in a subset of SQL
(retrieveSomeCases(CaseBaseFilter)). This way the designer can decide
what part of the Case Base is loaded into memory.
If a developer requires a specific connector, she can create her own by implementing the
Connector interface. This is shown in Test 13 of the code examples of the framework.
public interface CBRCaseBase {
    /**
     * Initializes the case base. This method receives
     * the connector that manages the persistence media.
     */
    public void init(Connector connector) throws jcolibri.exception.InitializingException;

    /**
     * Deinitializes the case base.
     */
    public void close();

    /**
     * Returns all the cases available on this case base
     */
    public Collection<CBRCase> getCases();

    /**
     * Returns some cases depending on the filter
     */
    public Collection<CBRCase> getCases(CaseBaseFilter filter);

    /**
     * Adds a collection of new CBRCase objects to
     * the current case base
     */
    public void learnCases(Collection<CBRCase> cases);

    /**
     * Removes a collection of CBRCase objects from the
     * current case base
     */
    public void forgetCases(Collection<CBRCase> cases);
}
Analogous to the Connector interface, developers can create their own in-memory
organizations of cases by implementing the CBRCaseBase interface. jCOLIBRI 2 includes the
following Case Bases:
- jcolibri.casebase.LinealCaseBase: basic linear case base that stores cases in a List.
- jcolibri.casebase.CachedLinealCaseBase: cached case base that only persists cases
  when closing the application.
- jcolibri.casebase.IDIndexedLinealCaseBase: extension of LinealCaseBase that also
  keeps an index of cases using their IDs.
...
/** Connector object */
Connector _connector;
/** CaseBase object */
CBRCaseBase _caseBase;
...
public void configure() throws ExecutionException {
    try {
        // Emulate data base server
        jcolibri.test.database.HSQLDBserver.init();
        // Create a data base connector
        _connector = new DataBaseConnector();
        // Init the ddbb connector with the config file
        _connector.initFromXMLfile(jcolibri.util.FileIO
                .findFile("jcolibri/examples/TravelRecommender/databaseconfig.xml"));
        // Create a Lineal case base for in-memory organization
        _caseBase = new LinealCaseBase();
        // Create the dialogs
        ...
    } catch (Exception e) {
        throw new ExecutionException(e);
    }
}

public CBRCaseBase preCycle() throws ExecutionException {
    // Load cases from connector into the case base
    _caseBase.init(_connector);
    // Print the cases
    java.util.Collection<CBRCase> cases = _caseBase.getCases();
    for (CBRCase c : cases)
        System.out.println(c);
    return _caseBase;
}
First, we create the two variables that will contain the connector and the case base:
_connector and _caseBase. They are declared with the interface types
Connector and CBRCaseBase, but in the configure() method we assign them
instances of DataBaseConnector and LinealCaseBase, which implement these
interfaces. As explained before, each connector can use an XML file that defines its
configuration. In this case, we are using the file
jcolibri/examples/TravelRecommender/databaseconfig.xml to configure the Data Base
connector. This file is explained in the following subsection.
In the preCycle() method we initialize the case base object with the connector
through the init() method. This action loads the cases from the persistence media into
memory. Then we can access the cases in the case base object (here, to print them to the
console).
<DataBaseConfiguration>
    <HibernateConfigFile>
        jcolibri/examples/TravelRecommender/hibernate.cfg.xml
    </HibernateConfigFile>
    <DescriptionMappingFile>
        jcolibri/examples/TravelRecommender/TravelDescription.hbm.xml
    </DescriptionMappingFile>
    <DescriptionClassName>
        jcolibri.examples.TravelRecommender.TravelDescription
    </DescriptionClassName>
    <SolutionMappingFile>
        jcolibri/examples/TravelRecommender/TravelSolution.hbm.xml
    </SolutionMappingFile>
    <SolutionClassName>
        jcolibri.examples.TravelRecommender.TravelSolution
    </SolutionClassName>
</DataBaseConfiguration>
To use Hibernate with other DBMSs, developers should modify the following properties
(described in the Hibernate documentation at
http://www.hibernate.org/hib_docs/v3/reference/en/html/session-configuration.html#configuration-hibernatejdbc):
- connection.driver_class: JDBC driver class of your DBMS (must be included in the
  classpath).
- connection.url: JDBC connection URL.
- username and password.
- dialect: choose one from the table at
  http://www.hibernate.org/hib_docs/v3/reference/en/html/session-configuration.html#configuration-optional-dialects
These are the small changes required to use the Hibernate connector with another DBMS.
Now, we must define the mapping files.
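As an illustration of the properties listed above, the following sketch collects the values one might use for a MySQL database in a java.util.Properties object. The URL, user name and password are placeholders invented for the example, and the keys are written with the "hibernate." prefix used in property files (inside hibernate.cfg.xml the prefix may be omitted, as in the list above). The same values would normally be written in the Hibernate configuration file rather than in code.

```java
import java.util.Properties;

class HibernateSettingsSketch {

    // Builds the connection-related Hibernate properties for a MySQL database.
    // URL, user and password are placeholders, not values from the tutorial.
    static Properties mysqlSettings() {
        Properties p = new Properties();
        p.setProperty("hibernate.connection.driver_class", "com.mysql.jdbc.Driver");
        p.setProperty("hibernate.connection.url", "jdbc:mysql://localhost:3306/travel");
        p.setProperty("hibernate.connection.username", "user");
        p.setProperty("hibernate.connection.password", "secret");
        p.setProperty("hibernate.dialect", "org.hibernate.dialect.MySQLDialect");
        return p;
    }

    public static void main(String[] args) {
        // Print every setting so it can be copied into the configuration file.
        mysqlSettings().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```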
Let's begin with the mapping file of the solution. You must create the file TravelSolution.hbm.xml in jcolibri.example.travelrecommender with the following content:
The
into
            enumerate_class_that_defines_attribute_type
        </param>
    </type>
</property>
http://jena.sourceforge.net/
http://pellet.owldl.com
To explain in detail how the OntologyConnector works, let's use the example shown
in Figure 20. One concept of the ontology defines the cases (VACATION_CASE)
and the other related concepts define the attributes (CATEGORY, PRICE and
HOLIDAY_TYPE). This way, the OntologyConnector obtains the instances of the
VACATION_CASE concept (case1 and case2) and follows their relationships to obtain the
values of the attributes (which will be instances of the concepts that define the attributes:
CATEGORY, PRICE and HOLIDAY_TYPE).
based similarity that depends on the location of the cases in the ontology. Of course,
other similarity functions can easily be included. These similarity functions are shown
in Figure 21, whereas Figure 22 explains how they work with a small ontology taken as
example.
These similarity measures are implemented in the package
jcolibri.method.retrieve.NNretrieval.similarity.local.ontology. The
use of these measures will be explained in Section 13.
\[
\mathrm{fdeep\_basic}(i_1, i_2) =
  \frac{\max(\mathrm{prof}(\mathrm{LCS}(i_1, i_2)))}
       {\max_{C_i \in CN}(\mathrm{prof}(C_i))}
\]

\[
\mathrm{fdeep}(i_1, i_2) =
  \frac{\max(\mathrm{prof}(\mathrm{LCS}(i_1, i_2)))}
       {\max(\mathrm{prof}(i_1), \mathrm{prof}(i_2))}
\]

\[
\mathrm{cosine}(i_1, i_2) = \mathrm{sim}(t(i_1), t(i_2)) =
  \frac{\left| \left( \bigcup_{d_i \in t(i_1)} \mathrm{super}(d_i, CN) \right) \cap
               \left( \bigcup_{d_i \in t(i_2)} \mathrm{super}(d_i, CN) \right) \right|}
       {\sqrt{\left| \bigcup_{d_i \in t(i_1)} \mathrm{super}(d_i, CN) \right|} \cdot
        \sqrt{\left| \bigcup_{d_i \in t(i_2)} \mathrm{super}(d_i, CN) \right|}}
\]

\[
\mathrm{detail}(i_1, i_2) = \mathrm{detail}(t(i_1), t(i_2)) =
  1 - \frac{1}
           {2 \cdot \left| \left( \bigcup_{d_i \in t(i_1)} \mathrm{super}(d_i, CN) \right) \cap
                           \left( \bigcup_{d_i \in t(i_2)} \mathrm{super}(d_i, CN) \right) \right|}
\]

Where:
- CN is the set of all the concepts in the current knowledge base
- super(c, C) is the subset of concepts in C which are superconcepts of c
- LCS(i1, i2) is the set of the least common subsumer concepts of the two
  given individuals
- prof(c) is the depth of concept c
- t(i) is the set of concepts the individual i is an instance of
Figure 21: Concept based similarity functions in jCOLIBRI2
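The cosine and detail measures of Figure 21 only need the sets of superconcepts of the two individuals, and fdeep only needs concept depths, so they can be sketched without an ontology library. The code below is an illustration under that assumption: the superconcept sets are passed in already computed, and the concept names are a toy taxonomy invented for the example, not the ontology of Figure 22.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ConceptSimilaritySketch {

    // Number of concepts shared by the two superconcept sets.
    static int commonCount(Set<String> a, Set<String> b) {
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return common.size();
    }

    // cosine = |A intersect B| / (sqrt(|A|) * sqrt(|B|))
    static double cosine(Set<String> a, Set<String> b) {
        return commonCount(a, b) / (Math.sqrt(a.size()) * Math.sqrt(b.size()));
    }

    // detail = 1 - 1 / (2 * |A intersect B|).
    // Assumes the sets share at least one concept (for example a common root).
    static double detail(Set<String> a, Set<String> b) {
        return 1.0 - 1.0 / (2.0 * commonCount(a, b));
    }

    // fdeep = depth of the least common subsumer / max depth of the individuals.
    static double fdeep(int lcsDepth, int depth1, int depth2) {
        return (double) lcsDepth / Math.max(depth1, depth2);
    }

    public static void main(String[] args) {
        // Toy superconcept sets (assumed for the example).
        Set<String> madrid = new HashSet<>(Arrays.asList("Thing", "Europe", "Spain"));
        Set<String> paris = new HashSet<>(Arrays.asList("Thing", "Europe", "France"));
        System.out.println("cosine = " + cosine(madrid, paris));
        System.out.println("detail = " + detail(madrid, paris));
        System.out.println("fdeep  = " + fdeep(1, 2, 2));
    }
}
```

With two sets of three concepts that share two of them, cosine evaluates to 2/3 and detail to 3/4.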
Example: In the travel domain, let's suppose we have an existing case base in which the
Region attribute is defined as an enumerated type whose allowed values are
countries: Spain, France, Italy and others. Suppose that case_i is a case whose destination
is Spain. We do not want to restrict the query vocabulary to the same type, but to allow
broader queries, for example:
- Query1: "I want to go to Madrid"
- Query2: "My favorite destination is Europe"
- Query3: "I would like to travel to Spain"
For all three queries, using the ontology of Figure 22, we could find case_i as a suitable
candidate.
                       fdeep_basic   fdeep   cosine   detail
Sim(Mad, Bcn)              3/4         1       1       5/6
Sim(Mad, Paris)            1/2        1/2     2/3      3/4
Sim(NY, Seattle)            1          1       1       7/8
Sim(Seattle, Vanc)         3/4        3/4     3/4      5/6
Sim(Bcn, Bogotá)            0          0     1/√12     1/2
Sim(Mad, Bogotá)            0          0     1/√12     1/2
Sim(Vanc, Bogotá)          1/2        1/2     1/2      3/4
http://protege.stanford.edu/

public void configure() throws ExecutionException {
    try {
        // Emulate data base server
        jcolibri.test.database.HSQLDBserver.init();
        // Create a data base connector
        _connector = new DataBaseConnector();
        // Init the ddbb connector with the config file
        _connector.initFromXMLfile(jcolibri.util.FileIO
                .findFile("jcolibri/examples/TravelRecommender/databaseconfig.xml"));
        // Create a Lineal case base for in-memory organization
        _caseBase = new LinealCaseBase();

        // Obtain a reference to OntoBridge
        OntoBridge ob = jcolibri.util.OntoBridgeSingleton.getOntoBridge();
        // Configure it to work with the Pellet reasoner
        ob.initWithPelletReasoner();
        // Setup the main ontology
        OntologyDocument mainOnto = new OntologyDocument(
                "http://gaia.fdi.ucm.es/ontologies/traveldestinations.owl",
                FileIO.findFile("jcolibri/examples/TravelRecommender/traveldestinations.owl")
                        .toExternalForm());
        // There are no subontologies
        ArrayList<OntologyDocument> subOntologies = new ArrayList<OntologyDocument>();
        // Load the ontology
        ob.loadOntology(mainOnto, subOntologies, false);

        // Create the dialogs
        similarityDialog = new SimilarityDialog(main);
        resultDialog     = new ResultDialog(main);
        autoAdaptDialog  = new AutoAdaptationDialog(main);
        revisionDialog   = new RevisionDialog(main);
        retainDialog     = new RetainDialog(main);
    } catch (Exception e) {
        throw new ExecutionException(e);
    }
}
The remaining steps to use the ontology have already been implemented. In the
TravelDescription code (Listing 13) we defined the type of the Region attribute
as Instance. This way, the jCOLIBRI 2 methods will connect to OntoBridge to manage
that attribute. Moreover, we need to tell the connector how to manage this
attribute, but we did that when defining the mapping file of TravelDescription in
TravelDescription.hbm.xml. Then, our connector will use OntoBridge to link
the values in the data base with the instances of the loaded ontology.
In this case we are not storing the whole case base in the ontology. The ontology
only defines the type and values of the Region attribute. The data base connector will
read the values in the table and look for an instance with the same name in the ontology
(through OntoBridge). Once found, the connector will fill the value of the Region
attribute of each case with the corresponding instance of the ontology. This is shown
in Figure 24, and the code required to map the data base to the ontology appears in
Listing 22.
13 Retrieval
At this point, we have completed the configure() and preCycle() methods of
our Travel Recommender application. Those methods allow us to load the cases from
the persistence media into memory. The preCycle() method loads the cases and stores
them in the _caseBase object of our main application (review Listing 18). This
section and the following ones show how to fill in the cycle() method to perform the
4Rs tasks (retrieve, reuse, revise, and retain).
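The order in which these methods are invoked can be sketched with a small stand-alone program. This is an assumption-laden illustration rather than framework code: the CbrApplication interface below is hypothetical, it uses plain strings instead of CBRQuery and CBRCaseBase, and it assumes that configure() and preCycle() run once, cycle() runs once per query, and postCycle() runs when the application closes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LifecycleSketch {

    // Hypothetical four-phase contract; the method names follow the text,
    // the String-based types are simplifications made for the example.
    interface CbrApplication {
        void configure();
        List<String> preCycle();
        void cycle(String query);
        void postCycle();
    }

    // Application that only records which phase was executed.
    static final class LoggingApp implements CbrApplication {
        final List<String> log = new ArrayList<>();

        @Override
        public void configure() {
            log.add("configure");
        }

        @Override
        public List<String> preCycle() {
            log.add("preCycle");
            return Arrays.asList("case1", "case2");
        }

        @Override
        public void cycle(String query) {
            log.add("cycle:" + query);
        }

        @Override
        public void postCycle() {
            log.add("postCycle");
        }
    }

    // Runs configure and preCycle once, cycle per query, and postCycle once.
    static List<String> run(List<String> queries) {
        LoggingApp app = new LoggingApp();
        app.configure();
        app.preCycle();
        for (String query : queries) {
            app.cycle(query);
        }
        app.postCycle();
        return app.log;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("q1", "q2")));
    }
}
```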
The retrieval step obtains the most similar cases given a query.
The main method of jCOLIBRI 2 for computing the retrieval is the
jcolibri.method.retrieve.NNretrieval.NNScoringMethod class.
This method performs a Nearest Neighbor numeric scoring by comparing attributes. It
uses global similarity functions to compare compound attributes (CaseComponents)
and local similarity functions to compare simple attributes.
For example, the TravelDescription case component of our cases is a compound
attribute composed of several simple attributes (season, duration, etc.). So, we assign
a global similarity function, such as the average function, to the description. Then, local
similarity functions such as equal, numeric interval or enumerate distance are configured
for each simple attribute. The NNScoringMethod will compute the similarity of
each simple attribute and then compute the global similarity: the average of the local
similarities.
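The combination of local and global similarity functions can be sketched in a few lines. The code below is a stand-alone illustration, not the framework implementation: the method names are invented for the example, and the interval function assumes the common definition 1 - |a - b| / interval, where interval is the width of the attribute's value range.

```java
class NNScoringSketch {

    // Local similarity: 1 if both values are equal, 0 otherwise.
    static double equal(Object caseValue, Object queryValue) {
        return caseValue.equals(queryValue) ? 1.0 : 0.0;
    }

    // Local similarity for numbers: 1 - |a - b| / interval.
    // The interval is the assumed width of the attribute's value range.
    static double interval(double caseValue, double queryValue, double interval) {
        return 1.0 - Math.abs(caseValue - queryValue) / interval;
    }

    // Global similarity: weighted average of the local similarities.
    static double average(double[] similarities, double[] weights) {
        double weightedSum = 0.0;
        double totalWeight = 0.0;
        for (int i = 0; i < similarities.length; i++) {
            weightedSum += similarities[i] * weights[i];
            totalWeight += weights[i];
        }
        return weightedSum / totalWeight;
    }

    public static void main(String[] args) {
        // Case: season "July", duration 7 days. Query: season "July", duration 14 days.
        double seasonSim = equal("July", "July");
        double durationSim = interval(7, 14, 28);
        double global = average(new double[] {seasonSim, durationSim},
                                new double[] {1.0, 1.0});
        System.out.println("season=" + seasonSim + " duration=" + durationSim
                + " global=" + global);
    }
}
```

In this example the season similarity is 1.0, the duration similarity is 0.75, and the equally weighted average is 0.875.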
The configuration of those similarity functions is stored in the
jcolibri.method.retrieve.NNretrieval.NNConfig object. The
values configured in this object are:
/**
 * Performs the NN scoring over a collection of cases
 * comparing them with a query.
 * This method is configured through the NNConfig object.
 */
public static Collection<RetrievalResult> evaluateSimilarity(
        Collection<CBRCase> cases, CBRQuery query, NNConfig simConfig)
The parameters of this method are very intuitive: the cases to compare with the
query, the query and the similarity configuration. The method returns a collection
of jcolibri.method.retrieve.RetrievalResult objects. These objects
contain the retrieved case and a double that represents the similarity of that case to the
query.
    public double compute(Object caseObject, Object queryObject)
            throws jcolibri.exception.NoApplicableSimilarityFunctionException;

    /**
     * Indicates if the function is applicable to two objects.
     */
    public boolean isApplicable(Object caseObject, Object queryObject);
}
As the previous listing shows, the local similarity functions cannot access the
other attributes of the case to compute their measures. This can be a problem in
some applications where the similarity of two attributes depends on the value of
other attributes in the same case component. To overcome this drawback, jCOLIBRI 2
includes the
jcolibri.method.retrieve.NNretrieval.similarity.InContextLocalSimilarityFunction
abstract class, which incorporates information about the context (CaseComponent) of the
attribute. Extended information about this class can be found in its documentation.
/**
 * Selects all cases
 * @param cases to select
 * @return all cases
 */
public static Collection<CBRCase> selectAll(Collection<RetrievalResult> cases)

/**
 * Selects top K cases
 * @param cases to select
 * @param k is the number of cases to select
 * @return top k cases
 */
public static Collection<CBRCase> selectTopK(Collection<RetrievalResult> cases, int k)

/**
 * Selects all cases but returns them into RetrievalResult objects
 * @param cases to select
 * @return all cases into RetrievalResult objects
 */
public static Collection<RetrievalResult> selectAllRR(Collection<RetrievalResult> cases)

/**
 * Selects top k cases but returns them into RetrievalResult objects
 * @param cases to select
 * @return top k cases into RetrievalResult objects
 */
public static Collection<RetrievalResult> selectTopKRR(Collection<RetrievalResult> cases, int k)
The previous listing shows that there are two versions of each method: one returning
CBRCase objects and another returning RetrievalResult objects.
There are other, more sophisticated ways to select cases. Sometimes
it is important to select cases that are similar to the query but also diverse among
themselves. jCOLIBRI2 includes some of these methods in the
jcolibri.method.retrieve.selection package. As these methods were included
in the recommenders extension of version 2.1, they are detailed in Section 20.2.9.
Note that there are important changes in the k-NN retrieval implementation from version
2.0 to version 2.1. These changes are detailed in Section 24.
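The top-k selection described above amounts to sorting the scored results by similarity and keeping the first k. The following sketch shows both flavours (results and plain cases) using a hypothetical Scored class in place of RetrievalResult and case identifiers in place of CBRCase objects; it assumes k is not negative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class TopKSketch {

    // Hypothetical counterpart of a retrieval result: a case id plus its similarity.
    static final class Scored {
        final String caseId;
        final double similarity;

        Scored(String caseId, double similarity) {
            this.caseId = caseId;
            this.similarity = similarity;
        }
    }

    // Returns the k most similar results, best first (k is assumed to be >= 0).
    static List<Scored> selectTopKRR(List<Scored> results, int k) {
        List<Scored> sorted = new ArrayList<>(results);
        sorted.sort(Comparator.comparingDouble((Scored s) -> s.similarity).reversed());
        return new ArrayList<>(sorted.subList(0, Math.min(k, sorted.size())));
    }

    // Same selection, but returning only the case identifiers.
    static List<String> selectTopK(List<Scored> results, int k) {
        List<String> ids = new ArrayList<>();
        for (Scored s : selectTopKRR(results, k)) {
            ids.add(s.caseId);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Scored> results = Arrays.asList(
                new Scored("a", 0.2), new Scored("b", 0.9), new Scored("c", 0.5));
        System.out.println(selectTopK(results, 2));
    }
}
```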
public void cycle(CBRQuery query) throws ExecutionException
{
    // First configure the NN scoring
The code is very simple: the NNConfig class has methods to set the similarity function
and the weight for each attribute. The attributes of a bean are represented using the
jcolibri.cbrcore.Attribute class, as explained in Section 9.1. Once configured,
NNScoringMethod.evaluateSimilarity() is executed, obtaining a list
of RetrievalResult objects that contain the most similar cases to the query.
public void cycle(CBRQuery query) throws ExecutionException {
    // Obtain configuration for kNN
    similarityDialog.setVisible(true);
    NNConfig simConfig = similarityDialog.getSimilarityConfig();
    simConfig.setDescriptionSimFunction(new Average());

    // Execute NN
    Collection<RetrievalResult> eval = NNScoringMethod.evaluateSimilarity(
            _caseBase.getCases(), query, simConfig);

    // Select k cases
    Collection<CBRCase> selectedcases = SelectCases.selectTopK(
            eval, similarityDialog.getK());

    // Show result
    resultDialog.showCases(eval, selectedcases);
    resultDialog.setVisible(true);
    ...
}
With this code we have implemented the functionality shown in Figures 7 and 8.
14 Reuse
The reuse step (also called the adaptation step) adapts the solution of the retrieved cases
to the requirements of the query. This step is very domain dependent and usually varies
depending on the application. jCOLIBRI 2 leaves this step open to developers, who
should create their own adaptation methods customized for the application.
Nevertheless, jCOLIBRI 2 offers two basic adaptation methods:
- jcolibri.method.reuse.DirectAttributeCopyMethod. This
  method copies the value of an attribute in the query to an attribute of a case.
- jcolibri.method.reuse.NumericDirectProportionMethod.
  Performs a numerical direct proportion among attributes of the query and the
  case.
Besides these methods, the jcolibri.method.reuse.classification package includes some
classification reuse methods implemented by Lisa Cummins & Derek
Bridge (University College Cork, Ireland).
Ontologies can be used to guide the adaptation of the cases. There are some experimental
methods (not included in the framework) that are explained in [43].
Once the retrieved cases are adapted, their description can be substituted
by the description of the query. This way, we obtain a list of cases
with the same description as the query. This step is performed by the
jcolibri.method.reuse.CombineQueryAndCasesMethod method, and is
optional depending on the application.
After adapting the cases, the CBR system proposes them as a suggested solution to the
problem (review Figure 1). Then, the system user (or a domain expert) can revise
these solutions in the following step.
To perform these adaptations in the Travel Recommender application append the following code to the cycle() method:
...
// Show adaptation dialog
autoAdaptDialog.setVisible(true);

// Adapt depending on user selection
if (autoAdaptDialog.adapt_Duration_Price())
{
    // Compute a direct proportion between the "Duration" and "Price" attributes.
    NumericDirectProportionMethod.directProportion(
            new Attribute("Duration", TravelDescription.class),
            new Attribute("price", TravelSolution.class),
            query, selectedcases);
}
if (autoAdaptDialog.adapt_NumberOfPersons_Price())
{
    // Compute a direct proportion between the "NumberOfPersons" and "Price" attributes.
    NumericDirectProportionMethod.directProportion(
            new Attribute("NumberOfPersons", TravelDescription.class),
            new Attribute("price", TravelSolution.class),
            query, selectedcases);
}
...
This code shows the dialog in Figure 9 and performs the required adaptations.
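The arithmetic behind a direct proportion adaptation is simple enough to show on its own. The sketch below is an assumption about what such a rule computes, not a copy of the framework method: the solution value of the retrieved case is scaled by the ratio between the query's and the case's description values.

```java
class DirectProportionSketch {

    // Scales a solution value of the retrieved case in direct proportion to the
    // ratio between the query's and the case's description values.
    static double adapt(double caseSolution, double caseDescription, double queryDescription) {
        return caseSolution * (queryDescription / caseDescription);
    }

    public static void main(String[] args) {
        // A retrieved 7-day trip costs 1000; the query asks for 14 days.
        double adaptedPrice = adapt(1000.0, 7.0, 14.0);
        System.out.println("Adapted price: " + adaptedPrice);
    }
}
```

For a retrieved 7-day trip priced at 1000 and a query for 14 days, the adapted price is 2000.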
15 Revise
In the revise step the proposed solution is tested for success, e.g. by being applied to the
real world environment or evaluated by a domain expert, and repaired if it fails.
This step is also very domain dependent and may change among applications.
jCOLIBRI 2 only includes a method to assign new IDs to the cases,
because they will be stored in the data base during the following step and
cannot reuse the original ID of the retrieved cases. This method is
jcolibri.method.revise.DefineNewIdsMethod.
There are also some classification revise methods implemented by Lisa
Cummins & Derek Bridge (University College Cork, Ireland) in the
jcolibri.method.revise.classification package.
...
// Revise
revisionDialog.showCases(selectedcases);
revisionDialog.setVisible(true);
...
16 Retain
In the retain step, useful new cases are stored in the case base for future reuse. This way
the CBR system learns a new experience.
jCOLIBRI 2 includes the jcolibri.method.retain.StoreCasesMethod to
add new cases to the case base. When the newly added cases reach the
persistence media depends on the chosen implementation of CBRCaseBase. The
LinealCaseBase class stores the cases directly in the persistence layer, whereas
the CachedLinealCaseBase keeps the new cases in memory and saves them
only when closing the CBR application (that is, when the postCycle() method is
invoked).
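The difference between the two storage policies can be illustrated with a small stand-alone sketch. The classes below are hypothetical simplifications (case identifiers instead of CBRCase objects, a list instead of a connector) and only model when learned cases reach the persistence layer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RetainSketch {

    // Stand-in for the persistence layer: just a list of stored case ids.
    static final class Store {
        final List<String> persisted = new ArrayList<>();
    }

    // Writes every learned case through to the store immediately.
    static final class DirectCaseBase {
        private final Store store;
        private final List<String> cases = new ArrayList<>();

        DirectCaseBase(Store store) {
            this.store = store;
        }

        void learnCases(List<String> newCases) {
            cases.addAll(newCases);
            store.persisted.addAll(newCases);
        }
    }

    // Keeps learned cases in memory and flushes them only when closed.
    static final class CachedCaseBase {
        private final Store store;
        private final List<String> cases = new ArrayList<>();
        private final List<String> pending = new ArrayList<>();

        CachedCaseBase(Store store) {
            this.store = store;
        }

        void learnCases(List<String> newCases) {
            cases.addAll(newCases);
            pending.addAll(newCases);
        }

        void close() {
            store.persisted.addAll(pending);
            pending.clear();
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        CachedCaseBase cached = new CachedCaseBase(store);
        cached.learnCases(Arrays.asList("trip42"));
        System.out.println("Persisted before close: " + store.persisted.size());
        cached.close();
        System.out.println("Persisted after close: " + store.persisted.size());
    }
}
```

The cached variant trades durability for speed: cases learned during a session are lost if the application terminates before close() is called.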
// Retain
retainDialog.showCases(selectedcases, _caseBase.getCases().size());
retainDialog.setVisible(true);
Collection<CBRCase> casesToRetain = retainDialog.getCasestoRetain();
_caseBase.learnCases(casesToRetain);
This code shows the window illustrated in Figure 11.
With this retain step we have completed the code of the cycle() method of the Travel
Recommender application.
public void postCycle() throws ExecutionException {
    _connector.close();
    jcolibri.test.database.HSQLDBserver.shutDown();
}
This method completes the tutorial. Now the Travel Recommender code should
compile and work properly. If you find any problems, read the complete source
code of the application, located in the example subfolder of the framework and compressed
as travelrecommender-source.zip.
18 Textual CBR
Textual Case-Based Reasoning (TCBR) is a subfield of CBR where the cases are available in
textual format. This type of CBR system is very interesting because in most of the
domains where CBR can be applied the available experiences are in textual format.
Some examples are law, medicine and help desks.
There does not appear to be a standard or consensus about the structure of a textual
CBR system. This is mainly due to the different knowledge requirements in application
domains. For classification applications, typically only a basic stemming algorithm and
a cosine similarity function are needed, while other applications employ more intensive
NLP-derived structures (see [9] and [10]). Although a common functionality
for TCBR systems is difficult to establish, several researchers have attempted to define
the different knowledge requirements for TCBR ([24], [55]).
To support the development of TCBR systems, jCOLIBRI 2 includes a complete extension
with useful methods for this kind of application. As adaptation remains
a domain-specific task, the framework only includes methods for the retrieval step.
There are two big groups of retrieval algorithms that can be used in TCBR:
- The first group is based on IE methods that capture features from the text and then
  perform a standard Nearest Neighbor similarity computation over those features.
  With these methods developers extract the information contained in the
  texts and represent it in a structured way, which makes it possible to apply the typical
  retrieval and reuse methods of common CBR applications.
- The second group is composed of the broadly applied IR algorithms used by search
  engines and based on the Vector Space Model. These methods can achieve very
  good results in the retrieval step, but make adaptation difficult because there
  is no structured representation of the cases.
We could refer to the first group as semantic retrieval because it tries to capture the
semantics of the texts, and to the second as statistical retrieval because it only takes into
account the frequency of terms.
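To make the statistical approach concrete, the following sketch builds term-frequency vectors for two texts and compares them with the cosine of the angle between the vectors, which is the basic operation of the Vector Space Model. It is a stand-alone illustration with invented method names; it does not use the jCOLIBRI statistical retrieval methods, and it applies no stop-word removal, stemming or term weighting.

```java
import java.util.HashMap;
import java.util.Map;

class VectorSpaceSketch {

    // Builds a term-frequency vector from lower-cased, whitespace-separated words.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> frequencies = new HashMap<>();
        for (String term : text.toLowerCase().trim().split("\\s+")) {
            if (!term.isEmpty()) {
                frequencies.merge(term, 1, Integer::sum);
            }
        }
        return frequencies;
    }

    // Cosine of the angle between two term-frequency vectors.
    // Both texts are assumed to contain at least one term.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            normA += entry.getValue() * entry.getValue();
            Integer other = b.get(entry.getKey());
            if (other != null) {
                dot += entry.getValue() * other;
            }
        }
        for (int value : b.values()) {
            normB += value * value;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        Map<String, Integer> query = termFrequencies("cheap italian pizza");
        Map<String, Integer> review = termFrequencies("italian pizza and italian pasta");
        System.out.println("similarity = " + cosine(query, review));
    }
}
```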
1. Keyword Layer. This layer separates texts into terms, removes stop-words, stems
terms and calculates statistics about the frequency of terms. It also proposes a
part-of-speech tagger in this layer that could be useful for the following ones. This
layer is domain-independent, so it can be shared between applications.
2. Phrase Layer. Recognises domain-specific phrases using a dictionary. Here, the
problems are that some parts of the phrase can be separated and that the dictionary
must be built manually.
3. Thesaurus Layer. This layer identifies synonyms and related terms. Methods
implemented in this layer must be reusable in the query stage of the CBR cycle.
WordNet can be used as an English thesaurus. This phase is domain-independent.
4. Glossary Layer. This is the domain-specific version of the thesaurus layer, so it is
desirable to define a common interface for both layers. The main difficulty with
this layer resides in the glossary acquisition.
5. Feature Value Layer. With semi-structured cases, this layer extracts features of
the case and stores them as <attribute,value> pairs in the case representation. It is also
domain-specific.
6. Domain Structure Layer. Uses the previous layer to classify documents at a high
level. It assigns "topic" features to the cases that can be useful in the indexing
process.
7. Information Extraction Layer. Some parts of the texts can be better represented
with a structured approximation. This layer accomplishes this task (note that this
functionality can overlap with the two previous layers).
The last IE layer applies user-defined rules to extract the features using the information
obtained in the previous layers. This way, developers obtain a structured representation of
cases that can be managed by the standard similarity matching techniques of CBR.
18.1.1 Representation of the texts
jCOLIBRI 2 has the generic jcolibri.datatypes.Text object to store
texts in cases. These objects are managed by the methods of the
jcolibri.extensions.textual package.
This object is not enough to manage the required information defined by the Lenz steps.
So, there is a subclass named
jcolibri.extensions.textual.IE.representation.IEText (texts
for Information Extraction) that allows it.
An IE text receives its content as a String, and later a method will organize this content.
This way, a text is composed of paragraphs, paragraphs of sentences and sentences of
tokens, as shown in Figure 25.
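The paragraph, sentence and token organization can be sketched with nested lists. This is a deliberately naive illustration, not the IEText implementation: it assumes paragraphs are separated by blank lines, sentences end in '.', '!' or '?', and tokens are separated by whitespace (so punctuation stays attached to the neighbouring word).

```java
import java.util.ArrayList;
import java.util.List;

class TextStructureSketch {

    // Splits raw text into paragraphs (blank-line separated), sentences
    // (ending in '.', '!' or '?') and tokens (whitespace separated).
    static List<List<List<String>>> organize(String content) {
        List<List<List<String>>> paragraphs = new ArrayList<>();
        for (String paragraph : content.split("\\n\\s*\\n")) {
            List<List<String>> sentences = new ArrayList<>();
            for (String sentence : paragraph.split("(?<=[.!?])\\s+")) {
                List<String> tokens = new ArrayList<>();
                for (String token : sentence.trim().split("\\s+")) {
                    if (!token.isEmpty()) {
                        tokens.add(token);
                    }
                }
                if (!tokens.isEmpty()) {
                    sentences.add(tokens);
                }
            }
            if (!sentences.isEmpty()) {
                paragraphs.add(sentences);
            }
        }
        return paragraphs;
    }

    public static void main(String[] args) {
        String review = "Great pizza. Cheap wine!\n\nOpen late.";
        System.out.println(organize(review));
    }
}
```

Real splitters, such as the OpenNLP and GATE based ones mentioned in this section, use trained models or rules instead of regular expressions and handle abbreviations and punctuation properly.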
Tokens represent a word in the text. These objects store information like:
http://opennlp.sourceforge.net
http://gate.ac.uk
Implementation            OpenNLP                      GATE                     Generic
Textual Object            IETextOpenNLP                IETextGate               IETextOpenNLP, IETextGate, IEText
Package                   IE.opennlp                   IE.gate                  IE.common
Layers
Organize Text             OpennlpSplitter              GateSplitter             -
Keyword: StopWords        -                            -                        StopWordsDetector
Keyword: Stemmer          -                            -                        TextStemmer
Keyword: POS tagging      OpennlpPOStagger             GatePOStagger            -
Keyword: Main Names       OpennlpMainNamesExtractor    -                        -
Phrase                    -                            GatePhrasesExtractor     PhrasesExtractor
Glossary                  -                            -                        GlossaryLinker
Thesaurus                 -                            -                        ThesaurusLinker
Feature Value             -                            GateFeaturesExtractor    FeaturesExtractor
Domain Structure          -                            -                        DomainTopicClassifier
Information Extraction    -                            -                        BasicInformationExtractor
Textual attributes can also be compared using specific similarity functions located in
the package jcolibri.method.retrieve.NNretrieval.similarity.local.textual. Most of them
are applicable to any Text attribute, but a few are only applicable to IEText
objects (or its subclasses) because they require information stored in the tokens.
- Cosine (similarity.local.textual.CosineCoefficient)
  \[ \mathrm{cosine}(t_1, t_2) = \frac{|t_1 \cap t_2|}{\sqrt{|t_1| \cdot |t_2|}} \]
- Dice (similarity.local.textual.DiceCoefficient)
  \[ \mathrm{dice}(t_1, t_2) = \frac{2 \, |t_1 \cap t_2|}{|t_1| + |t_2|} \]
- Jaccard (similarity.local.textual.JaccardCoefficient)
  \[ \mathrm{jaccard}(t_1, t_2) = \frac{|t_1 \cap t_2|}{|t_1 \cup t_2|} \]
- Overlap (similarity.local.textual.OverlapCoefficient)
  \[ \mathrm{overlap}(t_1, t_2) = \frac{|t_1 \cap t_2|}{\min(|t_1|, |t_2|)} \]
- Compression (similarity.local.textual.compressionbased.CompressionBased)
  \[ \mathrm{CDM}(x, y) = \frac{C(xy)}{C(x) + C(y)} \]
  where C(x) is the size of string x after compression (and C(y) similarly) and
  C(xy) is the size, after compression, of the string that comprises y concatenated
  to the end of x.
  Developed and implemented by Derek Bridge. See the papers [30, 14] and the
  framework documentation.
- Normalised Compression
  (similarity.local.textual.compressionbased.NormalisedCompression)
  \[ \mathrm{NCD}(x, y) = \frac{C(xy) - \min(C(x), C(y))}{\max(C(x), C(y))} \]
  where C(x) is the size of string x after compression (and C(y) similarly) and
  C(xy) is the size, after compression, of the string that comprises y concatenated
  to the end of x.
  Developed and implemented by Derek Bridge. See the papers [34, 15] and the
  framework documentation.
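The set-based coefficients and the compression-based measure can be sketched with the Java standard library alone. The code below is an illustration, not the framework implementation: texts are reduced to sets of lower-cased, whitespace-separated tokens, and java.util.zip.Deflater is assumed as the compressor (the framework may use a different one).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.zip.Deflater;

class TextualSimilaritySketch {

    // Reduces a text to its set of lower-cased, whitespace-separated tokens.
    static Set<String> tokens(String text) {
        return new HashSet<>(Arrays.asList(text.toLowerCase().trim().split("\\s+")));
    }

    // Number of tokens shared by both sets.
    static int common(Set<String> a, Set<String> b) {
        Set<String> shared = new HashSet<>(a);
        shared.retainAll(b);
        return shared.size();
    }

    static double cosine(Set<String> a, Set<String> b) {
        return common(a, b) / Math.sqrt((double) a.size() * b.size());
    }

    static double dice(Set<String> a, Set<String> b) {
        return 2.0 * common(a, b) / (a.size() + b.size());
    }

    static double jaccard(Set<String> a, Set<String> b) {
        int shared = common(a, b);
        // |A union B| = |A| + |B| - |A intersect B|
        return (double) shared / (a.size() + b.size() - shared);
    }

    static double overlap(Set<String> a, Set<String> b) {
        return (double) common(a, b) / Math.min(a.size(), b.size());
    }

    // Size in bytes of the string after DEFLATE compression.
    static int compressedSize(String s) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(s.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        byte[] buffer = new byte[1024];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    // CDM(x, y) = C(xy) / (C(x) + C(y)); lower values mean more shared content.
    static double cdm(String x, String y) {
        return (double) compressedSize(x + y) / (compressedSize(x) + compressedSize(y));
    }

    public static void main(String[] args) {
        Set<String> a = tokens("cheap italian pizza");
        Set<String> b = tokens("cheap italian pasta bar");
        System.out.println("cosine=" + cosine(a, b) + " dice=" + dice(a, b)
                + " jaccard=" + jaccard(a, b) + " overlap=" + overlap(a, b));
        System.out.println("cdm=" + cdm("cheap italian pizza", "cheap italian pasta bar"));
    }
}
```

Note that the set coefficients grow with similarity, whereas CDM is a dissimilarity: concatenating two related texts compresses better than concatenating unrelated ones, which lowers the ratio.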
18.1.4 The Restaurant Recommender example
To illustrate the textual methods of jCOLIBRI 2, we have developed a restaurant adviser
system. The entire case base contains roughly 100 different restaurants extracted from
an online magazine. This recommender is implemented using both the semantic and
statistical methods. The complete implementation of the Restaurant Recommender application can be found in the Tests 13a and 13b of the framework examples.
Group for Artificial Intelligence Applications
The following listings contain the most important code of the Restaurant Recommender
implementation that uses the semantic textual methods of jCOLIBRI 2. In this case we
are using the OpenNLP version of the methods.
In the precycle, the recommender application executes the textual methods over the
complete case base. This illustrates the advantage of having a precycle in the CBR system:
the computation takes a long time, but it is only executed once.
Listing 35: The Restaurant Recommender precycle using semantic TCBR methods
public CBRCaseBase preCycle() throws ExecutionException
{
    // In the precycle we pre-compute the information extraction in the case base

    // Initialize Wordnet
    ThesaurusLinker.loadWordNet();
    // Load user-specific glossary
    GlossaryLinker.loadGlossary("jcolibri/test/test13/glossary.txt");
    // Load phrases rules
    PhrasesExtractor.loadRules("jcolibri/test/test13/phrasesRules.txt");
    // Load features rules
    FeaturesExtractor.loadRules("jcolibri/test/test13/featuresRules.txt");
    // Load topic rules
    DomainTopicClassifier.loadRules("jcolibri/test/test13/domainRules.txt");

    // Obtain cases
    _caseBase.init(_connector);
    Collection<CBRCase> cases = _caseBase.getCases();

    // Perform IE methods in the cases
    // Organize cases into paragraphs, sentences and tokens
    OpennlpSplitter.split(cases);
    // Detect stopwords
    StopWordsDetector.detectStopWords(cases);
    // Stem text
    TextStemmer.stem(cases);
    // Perform POS tagging
    OpennlpPOStagger.tag(cases);
    // Extract main names
    OpennlpMainNamesExtractor.extractMainNames(cases);
    // Extract phrases
    PhrasesExtractor.extractPhrases(cases);
    // Extract features
    FeaturesExtractor.extractFeatures(cases);
In the cycle(), the application only has to execute the TCBR methods over the query.
However, there are some methods that relate query and case terms that can only be
applied in this cycle instead of the precycle.
The semantic TCBR methods extract the information from the text into the case (or
query). This way, we can compute a typical k-NN similarity measure to obtain the most
suitable restaurant for the query. The following listing shows the code:
Listing 36: The Restaurant Recommender cycle using semantic TCBR methods
public void cycle(CBRQuery query) throws ExecutionException
{
    Collection<CBRCase> cases = _caseBase.getCases();

    // Perform IE methods on the query
    // Organize the query into paragraphs, sentences and tokens
    OpennlpSplitter.split(query);
    // Detect stopwords
    StopWordsDetector.detectStopWords(query);
    // Stem query
    TextStemmer.stem(query);
    // Perform POS tagging in the query
    OpennlpPOStagger.tag(query);
    // Extract main names
    OpennlpMainNamesExtractor.extractMainNames(query);

    // Now that we have the query we relate cases tokens with the query tokens
    // Using the user defined glossary
    GlossaryLinker.LinkWithGlossary(cases, query);
    // Using wordnet
    ThesaurusLinker.linkWithWordNet(cases, query);

    // Extract phrases
    PhrasesExtractor.extractPhrases(query);
    // Extract features
    FeaturesExtractor.extractFeatures(query);
    // Classify with a topic
    DomainTopicClassifier.classifyWithTopic(query);
    // Perform IE copying extracted features or phrases into other attributes of the query
    BasicInformationExtractor.extractInformation(query);

    // Now we configure the kNN retrieval with some user defined similarity measures
    NNConfig nnConfig = new NNConfig();
    nnConfig.setDescriptionSimFunction(new Average());
    nnConfig.addMapping(new Attribute("location", RestaurantDescription.class), new Equal());
    // To compare text we use the OverlapCoefficient
    nnConfig.addMapping(new Attribute("description", RestaurantDescription.class), new OverlapCoefficient());
    // This function takes a string with several numerical values and computes the average
    nnConfig.addMapping(new Attribute("price", RestaurantDescription.class), new AverageMultipleTextValues(1000));
    // This function takes a string with several words separated by whitespaces, converts it to a set of tokens and
    // computes the size of the intersection of the query set and the case set normalized with the case set
    nnConfig.addMapping(new Attribute("foodType", RestaurantDescription.class), new TokensContained());
    nnConfig.addMapping(new Attribute("food", RestaurantDescription.class), new TokensContained());
    nnConfig.addMapping(new Attribute("alcohol", RestaurantDescription.class), new Equal());
    nnConfig.addMapping(new Attribute("takeout", RestaurantDescription.class), new Equal());
    nnConfig.addMapping(new Attribute("delivery", RestaurantDescription.class), new Equal());
    nnConfig.addMapping(new Attribute("parking", RestaurantDescription.class), new Equal());
    nnConfig.addMapping(new Attribute("catering", RestaurantDescription.class), new Equal());

    Collection<RetrievalResult> res = NNScoringMethod.evaluateSimilarity(cases, query, nnConfig);
    res = SelectCases.selectTopKRR(res, 5);

    // Show the result
    ...
}
18.2 Statistical retrieval
http://www.carrot2.org
public CBRCaseBase preCycle() throws ExecutionException
{
    _caseBase.init(_connector);
    // Here we create the Lucene index
    luceneIndex = jcolibri.method.precycle.LuceneIndexCreator.createLuceneIndex(_caseBase);
    return _caseBase;
}

public void cycle(CBRQuery query) throws ExecutionException
{
    Collection<CBRCase> cases = _caseBase.getCases();
    NNConfig nnConfig = new NNConfig();
    nnConfig.setDescriptionSimFunction(new Average());

    // We only compare the "description" attribute using Lucene
    Attribute textualAttribute = new Attribute("description", RestaurantDescription.class);
    nnConfig.addMapping(textualAttribute, new LuceneTextSimilarity(luceneIndex, query, textualAttribute, true));
    ...
}
...
HoldOutEvaluator eval = new HoldOutEvaluator();
eval.init(new EvaluableApp());
eval.HoldOut(5, 1);
System.out.println(Evaluator.getEvaluationReport());
jcolibri.evaluation.tools.EvaluationResultGUI.show(Evaluator.getEvaluationReport(), "Test8 Evaluation", false);
...
For example, Test 8 evaluates the similarity value of the most similar
case:
Listing 39: Example of an evaluable application
public class EvaluableApp implements StandardCBRApplication {
    ...
    public void cycle(CBRQuery query) throws ExecutionException
    {
        NNConfig simConfig = new NNConfig();
        ...
        Collection<RetrievalResult> eval = NNScoringMethod.evaluateSimilarity(_caseBase.getCases(), query, simConfig);
        // Now we add the similarity of the most similar case to the series "Similarity".
        Evaluator.getEvaluationReport().addDataToSeries("Similarity", new Double(eval.iterator().next().getEval()));
        ...
    }
    ...
}
20 Recommenders
jCOLIBRI 2 includes an extension to implement recommendation systems. This extension has been designed with the future design process of CBR systems in
jCOLIBRI in mind.
We propose a flexible way to design CBR systems in future versions of jCOLIBRI
using a library of templates obtained from a previously designed set of CBR systems.
In case-based fashion, jCOLIBRI will retrieve templates from a library of templates (i.e.
a case base of CBR design experience); the designer will choose one and adapt it.
We represent templates graphically, as shown in Figures 29 and 30. Each rectangle in the template is a subtask. Simple tasks (shown as blue or pale grey rectangles)
can be solved directly by a method included in this extension. Complex tasks (shown
as red or dark grey rectangles) are solved by decomposition methods that have other associated templates. There may be multiple alternative methods to solve any given task.
These methods are ordinary Java methods of the classes in the framework.
Before implementing this extension, we developed templates for recommender systems
and then generated the methods that solve every task. That allowed us to develop many
recommender systems, which are included in the jcolibri.test.recommenders
package. For now, templates are only a graphical representation of CBR systems, although in the near future they will be used to generate them.
The jCOLIBRI team thanks Derek Bridge for his collaboration and supervision during
the development of this extension.
After retrieving items, Conversational Systems (Figure 30) may invite or allow
the user to refine his/her current preferences, typically based on the recommended
items. Iterated Preference Elicitation might be done by allowing the user to select and critique a recommended item thereby producing a modified query, which
requires that one or more retrieved items be displayed (Figure 30 left). Alternatively, it might be done by asking the user a further question or questions thereby
refining the query, in which case the retrieved items might be displayed every
time (Figure 30 left) or might be displayed only when some criterion is satisfied
(e.g. when the size of the set is small enough) (Figure 30 right). Note that
both templates share the One-Off Preference Elicitation and Retrieval tasks with
single-shot systems.
20.1.1 One-Off Preference Elicitation
We can identify three templates by which the user's initial preferences may be elicited:
One possibility is Profile Identification where the user identifies him/herself, e.g.
by logging in, enabling retrieval of a user profile from a profile database. This
profile might be a content-based profile (e.g. keywords describing the user's long-term interests, or descriptions of items consumed previously by this user, or descriptions of sessions this user has engaged in previously with this system); or it
might be a collaborative filtering style of profile (e.g. the user's ratings for items).
The template allows for the possibility that the user is a new user, in which case
there may be some registration process followed by a complex task that elicits an
initial profile.
An alternative is Initial Query Elicitation. This is itself a complex task with multiple alternative decompositions. The decompositions include: Form-Filling (Figure 31 left) and Navigation-by-Asking (i.e. choosing and asking a question) (Figure 31 center). Various versions of the Entrée system [11] offered interesting
further methods: Identify an Item (where the user gives enough information to
identify an item that s/he likes, e.g. a restaurant in his/her home town, whose
description forms the basis of a query, e.g. for a restaurant in the town being visited); and Select an Exemplar (where a small set of contrasting items is selected
and displayed, the user chooses the one most similar to what s/he is seeking, and
its description forms the basis of a query).
The third possibility is Profile Identification & Query Elicitation, in which the
previous two tasks are combined.
20.1.2 Retrieval
Because we are focussing on case-based recommender systems (and related memory-based recommenders including collaborative filters), Retrieval is common to all our recommender systems. Retrieval is a complex task, with many alternative decompositions.
The choice of decomposition is, of course, not independent of the choice of decomposition for One-Off Preference Elicitation and Iterated Preference Elicitation. For example,
if One-Off Preference Elicitation delivers a ratings profile, then the method chosen for
achieving the Retrieval task must be some form of collaborative recommendation.
The following is a non-exhaustive list of papers that define methods that can achieve
the Retrieval task: Wilke et al. 1998 [56] (similarity-based retrieval using a query of
preferred values); Smyth & McClave 2001 [53] (diversity-enhanced similarity-based
retrieval); McSherry 2002 [36] (diversity-conscious retrieval); Bridge & Ferguson 2002
[7] (order-based retrieval); McSherry 2003 [37] (compromise-driven retrieval); Bradley
& Smyth 2003 [6] (where user profiles are mined and used); Herlocker et al. 1999 [26]
(user-based collaborative filtering); Sarwar et al. 2001 [47] (item-based collaborative
filtering). In all these ways of achieving Retrieval, a scoring process is followed by a
selection process. For example, in similarity-based retrieval (k-NN), items are scored by
their similarity to the user's preferences and then the k highest-scoring items are selected
for display; in diversity-enhanced similarity-based retrieval, items are scored in the same
way and then a diverse set is selected from the highest-scoring items; and so on. Note
also that there are alternative decompositions of the Retrieval task that would not have
this two-step character. For example, filter-based retrieval, where the user's preferences
are treated as hard constraints, conventionally does not decompose into two such steps.
On the other hand, there are recommender systems in which Retrieval decomposes into
more than two steps. For example, in some forms of Navigation-by-Proposing (see
below), first a set of items that satisfy the user's critique is obtained by filter-based
retrieval, then these are scored for similarity to the user's selected item, and finally a
subset is chosen for display to the user.
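The two-step character of similarity-based retrieval (scoring followed by selection) can be sketched in a few lines of Java. This is a generic illustration, not the framework code: the class name ScoreThenSelectSketch and the method topK are hypothetical, and the similarity function is supplied by the caller.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

final class ScoreThenSelectSketch {

    /** Step 1 scores every item; step 2 keeps the k highest-scoring items. */
    static <T> List<T> topK(List<T> items, ToDoubleFunction<T> score, int k) {
        List<T> sorted = new ArrayList<>(items);
        sorted.sort(Comparator.comparingDouble(score).reversed());
        return new ArrayList<>(sorted.subList(0, Math.min(k, sorted.size())));
    }

    public static void main(String[] args) {
        // Items are prices; the query asks for a price close to 28.
        List<Integer> prices = Arrays.asList(10, 50, 30);
        List<Integer> best = ScoreThenSelectSketch.<Integer>topK(
                prices, p -> 1.0 - Math.abs(p - 28) / 100.0, 2);
        System.out.println(best);
    }
}
```

A diversity-enhanced variant would keep the scoring step unchanged and replace only the selection step.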
20.1.3 Iterated Preference Elicitation
In Iterated Preference Elicitation the user, who may or may not have just been shown
some products (Figure 30), may, either voluntarily or at the system's behest, provide
further information about his/her preferences. Alternative decompositions of this task
include:
Form-Filling where the user enters values into a form that usually has the same
structure as items in the database (Figure 31 left). We have seen that Form-Filling
is also used for One-Off Preference Elicitation. When it is used in Iterated Preference Elicitation, it is most likely that the user edits values s/he previously entered
into the form.
Navigation-by-Asking is another method that can be used for both One-Off Preference Elicitation and for Iterated Preference Elicitation. The system refines the
query with the user's answer to a question about his/her preferences. The system
uses a heuristic to choose the next best question to ask. Bergmann [3] reviews
some of the methods that have been used to choose this question.
Navigation-by-Proposing (also known as tweaking and as critiquing) requires that
the user has been shown a set of candidate items. S/he selects the one that comes
closest to satisfying his/her requirements but then offers a critique (e.g. "like this
but cheaper"). A complex query is constructed that is intended to retrieve items
that are similar to the selected item but which also satisfy the critique. The selection of the candidate item and its critiques must be performed during the Display
Item List task. Therefore, the Create Complex Query task will receive that information and modify the query according to the user selection. Burke reviews early
work on this topic [11]. We note that there has been a body of new work since
then, some of it cited in [8] and [52]. (Although we describe Navigation-by-Proposing only as a decomposition of Iterated Preference Elicitation, this need
not be so. We could additionally, for example, define a template for One-Off
Preference Elicitation that uses Select an Exemplar followed by Navigation-by-Proposing.)
Note that the templates for Conversational Systems do not preclude the possibility that
the system uses a different method for Iterated Preference Elicitation on different iterations. ExpertClerk is an example of such a recommender. It uses Navigation-by-Asking
for One-Off Preference Elicitation, and then it chooses between Navigation-by-Asking
and Navigation-by-Proposing for Iterated Preference Elicitation, preferring the former
while the set of candidate items remains large [51]. In fact, we have confirmed that
we can build a version of ExpertClerk based on the template in Figure 30 right and
using methods included in the recommenders extension. This informally illustrates the
promise of our templates approach: complex systems (such as ExpertClerk) can be constructed by adapting existing templates.
McSherry More is Better : (note that the user query value is ignored)
\[ \mathrm{sim}(c.a, q.a) = 1 - \frac{\max(a) - c.a}{\max(a) - \min(a)} \]
Table : reads tabular data from a CSV file. Axes values must be strings or enumerations.
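As a small worked example of the More is Better measure, the following sketch assumes that the lost minus signs of the formula are restored, i.e. sim = 1 - (max(a) - c.a) / (max(a) - min(a)). The class and method names are hypothetical, and returning 1 when max(a) = min(a) is a choice made for this example.

```java
final class MoreIsBetterSketch {

    /** Similarity grows linearly with the case value; the query value is ignored. */
    static double moreIsBetter(double caseValue, double min, double max) {
        if (max == min) {
            return 1.0; // all cases share the same value (assumption of this sketch)
        }
        return 1.0 - (max - caseValue) / (max - min);
    }

    public static void main(String[] args) {
        // Attribute values range from 50 to 100; a case with value 75 lies halfway.
        System.out.println(moreIsBetter(75, 50, 100));
    }
}
```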
20.2.3 Conditional methods
Continue : Receives a UserChoice and returns true if the value is Edit Query.
BuyOrQuit : Receives a UserChoice object and returns true or false depending on
its value (Quit or Buy).
DisplayCasesIfNumber : Returns true if the number of cases received is inside a
range. Optionally, this method can show a message. Useful in conversational B
systems when it is used with FilterBased retrieval (with k-NN it makes no sense).
DisplayCasesIfSimil : Returns true if the retrieved cases have a minimum similarity.
Useful only with k-NN.
20.2.4 Navigation by Asking methods
ObtainQueryWithAttributeQuestionMethod Asks the user for the value of an attribute. This method is used in Navigation by Asking. It only shows the available
options, removing the values that do not appear in the working cases set. Custom
labels are allowed.
Information Gain Returns the attribute with the highest information gain in a set of cases
[3][50]. Used in Navigation by Asking with ObtainQueryWithAttributeQuestionMethod.
\[ \mathrm{Gain}(A) = -\sum_{j=1}^{m} \frac{|C_j|}{|C|} \log_2 \frac{|C_j|}{|C|} \]
where C is partitioned according to the attribute A into m subsets $C = C_1 \cup \dots \cup C_m$
such that the attribute value of all cases in $C_j$ is $v_j$.
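The information gain of an attribute can be computed directly from the values that the cases take for it. The sketch below assumes the usual entropy form with a leading minus sign, so that an attribute splitting the cases evenly obtains the highest value; the class name and the representation of cases as a list of attribute values are assumptions of this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class InformationGainSketch {

    /** Gain(A) = - sum_j (|C_j| / |C|) * log2(|C_j| / |C|), one subset C_j per attribute value. */
    static double gain(List<String> attributeValues) {
        Map<String, Integer> counts = new HashMap<>();
        for (String value : attributeValues) {
            counts.merge(value, 1, Integer::sum);
        }
        double total = attributeValues.size();
        double gain = 0.0;
        for (int count : counts.values()) {
            double p = count / total;
            gain -= p * (Math.log(p) / Math.log(2));
        }
        return gain;
    }

    public static void main(String[] args) {
        // "location" splits the four cases evenly, "alcohol" does not split them at all.
        System.out.println(gain(Arrays.asList("north", "north", "south", "south")));
        System.out.println(gain(Arrays.asList("yes", "yes", "yes", "yes")));
    }
}
```

In Navigation by Asking, the attribute with the highest value would be the next one the user is asked about.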
Similarity Influence Measure Selects the attribute that has the highest influence on
the k-NN similarity. The influence on the similarity can be measured by the expected variance of the similarities of a set of selected cases. See [3][31][49] for
details.
\[ \mathrm{SimVar}(q, A, C) = \sum_{v} p_v \cdot \mathrm{Var}(q_{A \leftarrow v}, C) \]
where:
v ranges over the possible values of the attribute A.
$q_{A \leftarrow v}$ is the query once v has been assigned to the attribute A.
$p_v = |C_v| / |C|$.
$\mathrm{Var}(q, C) = \frac{1}{|C|} \sum_{c \in C} (\mathrm{sim}(q, c) - \mu)^2$, where $\mu$ is the mean of sim(q, c) over C.
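The similarity influence measure can be sketched as follows. Several points are assumptions of this example and are not confirmed by the text: the variance is the usual mean squared deviation from the mean similarity, it is taken over the whole case set C for every candidate value v, and the query is modified through a caller-supplied assign function. All names are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.ToDoubleBiFunction;

final class SimilarityInfluenceSketch {

    /** Var(q, C): variance of sim(q, c) over all cases c in C. */
    static <Q, T> double variance(Q query, List<T> cases, ToDoubleBiFunction<Q, T> sim) {
        if (cases.isEmpty()) {
            return 0.0;
        }
        double mean = 0.0;
        for (T c : cases) {
            mean += sim.applyAsDouble(query, c);
        }
        mean /= cases.size();
        double sum = 0.0;
        for (T c : cases) {
            double d = sim.applyAsDouble(query, c) - mean;
            sum += d * d;
        }
        return sum / cases.size();
    }

    /** SimVar(q, A, C) = sum over v of p_v * Var(q with A := v, C), with p_v = |C_v| / |C|. */
    static <Q, T, V> double simVar(Q query, List<T> cases,
                                   Function<T, V> valueOfA,
                                   BiFunction<Q, V, Q> assign,
                                   ToDoubleBiFunction<Q, T> sim) {
        if (cases.isEmpty()) {
            return 0.0;
        }
        Map<V, Integer> counts = new HashMap<>();
        for (T c : cases) {
            counts.merge(valueOfA.apply(c), 1, Integer::sum);
        }
        double result = 0.0;
        for (Map.Entry<V, Integer> e : counts.entrySet()) {
            double pv = (double) e.getValue() / cases.size();
            result += pv * variance(assign.apply(query, e.getKey()), cases, sim);
        }
        return result;
    }
}
```

An attribute selection strategy would compute this value for every unanswered attribute and ask about the one with the highest result.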
When using Attribute Selection and Filter based retrieval, the numerical attributes must
be transformed into ranges, and the Filter based retrieval only uses EqualTo() predicates.
20.2.5 Navigation by Proposing
DisplayCasesTableWithCritiquesMethod : Displays the cases in a table allowing
the user to buy one item, finish, or critique the selected item. Critiques are configured by the designer providing a list of
CritiqueOptions. This method returns a CriticalUserChoice that is
an extension of UserChoice storing also the critiques of the selected item.
This method enables or disables the critiques depending on the values of the working cases. For example, it makes no sense to show a "cheaper" button if there are
no cheaper cases. Usually, displayed cases are the same as working cases, but
when using diversity algorithms only three of the working cases are displayed.
CritiqueOption : Object that encapsulates the possible critiques of a case. It stores the
label of the button, the criticized attribute, and a Filter-Based Retrieval predicate
to perform the critique.
CriticalUserChoice : Extension of UserChoice that also stores the critiques over
the selected case.
In the Iterated preference elicitation of Navigation by Proposing there are several methods to modify the query (see [35]):
More Like This replaces current query with the description of the selected case.
Partial More Like This partially replaces current query with the description of the
selected case. It only transfers a feature value from the selected case if none of
the rejected cases have the same feature value.
Weighted More Like This transfers all attributes from the selected case to the query
but weights them, giving preference to attributes that are diverse among the proposed cases.
The new weights are stored into a NNConfig object, so this strategy should be
used with NN retrieval.
Less Like This is a simple one: if all the rejected cases have the same feature-value combination, which is different from that of the preferred case, then this combination can be added as a negative condition. This negative condition is coded
as a NotEqualTo(value) predicate in a FilterConfig object. The
query is not modified. Therefore, this method should be used together with
FilterBasedRetrieval.
More + Less Like This combines both More Like This and Less Like This. It copies
the values of the selected case into the query and returns a FilterConfig
object with the negative conditions. This method should be used together with
Filtered and NN retrieval.
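The first two query modification strategies can be illustrated with plain maps from attribute names to values. This is a sketch under simplifying assumptions: cases and queries are modelled as Map<String, String> instead of framework objects, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

final class MoreLikeThisSketch {

    /** More Like This: the new query is the description of the selected case. */
    static Map<String, String> moreLikeThis(Map<String, String> selected) {
        return new HashMap<>(selected);
    }

    /**
     * Partial More Like This: a value is copied from the selected case only if
     * none of the rejected cases has the same value for that attribute.
     */
    static Map<String, String> partialMoreLikeThis(Map<String, String> query,
                                                   Map<String, String> selected,
                                                   List<Map<String, String>> rejected) {
        Map<String, String> result = new HashMap<>(query);
        for (Map.Entry<String, String> e : selected.entrySet()) {
            boolean sharedWithRejected = false;
            for (Map<String, String> r : rejected) {
                if (Objects.equals(r.get(e.getKey()), e.getValue())) {
                    sharedWithRejected = true;
                    break;
                }
            }
            if (!sharedWithRejected) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }
}
```

If the selected restaurant is a cheap sushi place and a rejected one is a cheap Thai place, only the food type is transferred, because the price does not distinguish the selected case from the rejected one.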
20.2.6 Profile management methods
CreateProfile : Obtains a user profile (query) using the FormFilling method and
stores it in an XML file. This method is not part of the CBR cycle. It is executed in
the PreCycle or in a separate application.
ObtainQueryFromProfile : Obtains the query from the XML file generated by CreateProfile.
20.2.7 Collaborative Recommendations
This kind of recommendation is based on past experiences of other users. It needs
a specific case base implementation that manages cases composed of a description, a
solution and a result. The description usually contains the information about the user
that made the recommendation, the solution contains the information about the recommended item, and the result stores the rating value for the item (the value that the user
assigns to an item after evaluating it). This way, this case base implementation stores
cases as a table:
        Item1     Item2     Item3     Item4     Item5     ...   ItemM
User1             rating12            rating14
User2   rating21            rating23                            rating2M
User3                       rating33  rating34                  rating3M
...
UserN             ratingN2                      ratingN5        ratingNM
The values of the first column and row contain the ids of the description and solution
components of the case. These ids must be integer values. The ratings are obtained from
an attribute of the result component.
Note that this case base allows different cases to have the same description (because each user can make several recommendations).
The behavior of collaborative recommenders can be split into three steps [26]:
1. Weight all users with respect to similarity with the active user.
2. Select a subset of users as a set of predictors.
3. Normalize ratings and compute a prediction from a weighted combination of selected neighbors' ratings.
The case base implementation of the collaborative package performs the first step. The
remaining two steps are performed by the collaborative retrieval method.
See [29] for further details.
MatrixCaseBase : Specific implementation of CBRCaseBase to allow collaborative recommendations. As there are several ways to compute the first step of the
behavior of collaborative recommenders, this class provides most of the code, but
it must be specialized, as in PearsonMatrixCaseBase. The subclasses must
implement the abstract methods defined here. These methods return the similarity among neighbors. This similarity value is stored in SimilarityTuple
objects.
Ratings must be sorted by neighbors to allow an efficient comparison. Looking
at the previous table, this means that cases are organized by rows. This process is
performed internally in this class.
There are also two internal classes (CommonRatingTuple and
CommonRatingsIterator) that make it possible to obtain the common ratings of two users efficiently. They are used by subclasses when computing
the neighbors' similarity. The code of these classes is an adaptation of the one
developed by Jerome Kelleher and Derek Bridge for the Collaborative Movie
Recommender project at University College Cork (Ireland).
PearsonMatrixCaseBase : Extension of the MatrixCaseBase that computes
similarities among neighbors using the Pearson Correlation:
\[ \mathrm{sim}(a, u) = \frac{\sum_{i=1}^{m} (r_{a,i} - \bar{r}_a)(r_{u,i} - \bar{r}_u)}{\sigma_a \, \sigma_u} \cdot \frac{s}{f} \]
where a and u are the compared neighbors and m is the number of common items. $\bar{r}$
denotes a mean value and $\sigma$ denotes a standard deviation, and these are computed
on co-rated items only. The Pearson correlation is weighted by a factor s/f, where
s is the number of co-rated items and f is defined by the designer. This decreases
the similarity between users who have fewer than f co-rated items.
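A minimal sketch of this weighted Pearson correlation follows. It is not the code of PearsonMatrixCaseBase. Two assumptions are made and should be checked against the framework: the correlation is normalized in the usual way (by the square root of the product of the summed squared deviations), and the weight is capped at 1, i.e. min(s, f) / f, so that only users with fewer than f co-rated items are penalized. Ratings are modelled as maps from item id to rating.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PearsonSketch {

    /** Pearson correlation on co-rated items, weighted by min(s, f) / f. */
    static double similarity(Map<Integer, Double> a, Map<Integer, Double> u, int f) {
        List<Integer> common = new ArrayList<>();
        for (Integer item : a.keySet()) {
            if (u.containsKey(item)) {
                common.add(item);
            }
        }
        int s = common.size();
        if (s == 0) {
            return 0.0;
        }
        double meanA = 0.0;
        double meanU = 0.0;
        for (Integer item : common) {
            meanA += a.get(item);
            meanU += u.get(item);
        }
        meanA /= s;
        meanU /= s;
        double cov = 0.0;
        double varA = 0.0;
        double varU = 0.0;
        for (Integer item : common) {
            double da = a.get(item) - meanA;
            double du = u.get(item) - meanU;
            cov += da * du;
            varA += da * da;
            varU += du * du;
        }
        if (varA == 0.0 || varU == 0.0) {
            return 0.0; // a user with constant ratings carries no correlation information
        }
        double pearson = cov / Math.sqrt(varA * varU);
        return pearson * Math.min(s, f) / (double) f;
    }

    /** Convenience helper for the example: ratings for items 1..n. */
    static Map<Integer, Double> ratings(double... values) {
        Map<Integer, Double> result = new HashMap<>();
        for (int i = 0; i < values.length; i++) {
            result.put(i + 1, values[i]);
        }
        return result;
    }
}
```

With f = 6, two users who agree perfectly on only three items obtain a similarity of 0.5 instead of 1.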
CollaborativeRetrievalMethod : This method returns cases depending on the recommendations of other users. It uses a PearsonMatrix case base to compute
the similarity among neighbors. Then, cases are scored according to a rating that
is estimated using the following formula:
\[ p_{a,i} = \bar{r}_a + \frac{\sum_{u=1}^{n} (r_{u,i} - \bar{r}_u) \, \mathrm{sim}(a, u)}{\sum_{u=1}^{n} \mathrm{sim}(a, u)} \]
where n is the number of users.
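The rating prediction can be written as a short function over parallel arrays, one entry per neighbor. The names are hypothetical, and returning the active user's mean rating when the similarities sum to zero is a choice of this sketch that the text does not specify.

```java
final class RatingPredictionSketch {

    /**
     * p(a, i) = mean(a) + sum_u (r(u, i) - mean(u)) * sim(a, u) / sum_u sim(a, u).
     * The three arrays hold, for each neighbor u, the rating of item i,
     * the neighbor's mean rating and the similarity to the active user a.
     */
    static double predict(double meanActive, double[] neighborRatings,
                          double[] neighborMeans, double[] sims) {
        double numerator = 0.0;
        double denominator = 0.0;
        for (int u = 0; u < sims.length; u++) {
            numerator += (neighborRatings[u] - neighborMeans[u]) * sims[u];
            denominator += sims[u];
        }
        if (denominator == 0.0) {
            return meanActive; // no usable neighbors: fall back to the user's mean
        }
        return meanActive + numerator / denominator;
    }

    public static void main(String[] args) {
        // One neighbor rated the item one point above his or her own mean.
        System.out.println(predict(3.0, new double[] {5.0}, new double[] {4.0}, new double[] {0.8}));
    }
}
```

Subtracting each neighbor's mean normalizes the ratings, so a user who rates everything highly does not inflate the prediction.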
20.2.8 Retrieval methods
Filter Based Retrieval : Retrieves cases whose attributes satisfy some conditions.
It computes the boolean AND operator over the condition of each attribute. The
evaluation of each attribute is configured with predicates:
Equal.
NotEqual.
QueryLessOrEqual.
QueryLess.
QueryMoreOrEqual.
QueryMore.
OntologyCompatible: To use with instances. Returns true if the case
instance is under the query instance.
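The conjunction of per-attribute predicates can be sketched with the standard BiPredicate interface. This is not the FilterBasedRetrieval code: cases are modelled as maps with integer values, the helper names are hypothetical, and reading QueryMoreOrEqual as "the query value is greater than or equal to the case value" is an assumption based on the predicate name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

final class FilterRetrievalSketch {

    // Hypothetical predicates: the first argument is the case value, the second the query value.
    static final BiPredicate<Integer, Integer> EQUAL = (c, q) -> c.equals(q);
    static final BiPredicate<Integer, Integer> QUERY_MORE_OR_EQUAL = (c, q) -> q >= c;

    /** Keeps the cases for which every configured predicate holds (boolean AND). */
    static List<Map<String, Integer>> filter(List<Map<String, Integer>> cases,
                                             Map<String, Integer> query,
                                             Map<String, BiPredicate<Integer, Integer>> predicates) {
        List<Map<String, Integer>> result = new ArrayList<>();
        for (Map<String, Integer> c : cases) {
            boolean accepted = true;
            for (Map.Entry<String, BiPredicate<Integer, Integer>> e : predicates.entrySet()) {
                Integer caseValue = c.get(e.getKey());
                Integer queryValue = query.get(e.getKey());
                if (queryValue == null) {
                    continue; // attribute not constrained by the query
                }
                if (caseValue == null || !e.getValue().test(caseValue, queryValue)) {
                    accepted = false;
                    break;
                }
            }
            if (accepted) {
                result.add(c);
            }
        }
        return result;
    }

    /** Convenience helper for the example: an item described by price and stars. */
    static Map<String, Integer> item(int price, int stars) {
        Map<String, Integer> m = new HashMap<>();
        m.put("price", price);
        m.put("stars", stars);
        return m;
    }
}
```

With a price predicate of QUERY_MORE_OR_EQUAL and a query price of 20, an item that costs 10 is kept and an item that costs 30 is filtered out.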
Expert Clerk Retrieval : This is the method of the ExpertClerk system [51]. This
algorithm first chooses the case that is closest to the median of all cases. Then the
remaining cases are selected taking into account negative and positive characteristics.
A characteristic is an attribute that exceeds a predefined threshold with respect
to the median case. It is positive if it is greater than the value of the median, and
negative otherwise. The number of positive plus negative characteristics is
used to rank the cases and obtain the retrieved cases.
The first sample product (1st-SP) is the case closest to the median of the cases.
Let $C = \{c_1, c_2, \ldots, c_k\}$ be the set of cases and let $c_i = (v_{i1}, v_{i2}, \ldots, v_{in})$ be the
attribute values of a case $c_i$. Then, the median $c_{med}$ of C is calculated by:
\[ c_{med} = \left( \frac{1}{k}\sum_{j=1}^{k} v_{j1},\; \frac{1}{k}\sum_{j=1}^{k} v_{j2},\; \ldots,\; \frac{1}{k}\sum_{j=1}^{k} v_{jn} \right) \]
The distance (D) between $c_{med}$ and a case $c_i$ is given by:
\[ D(c_{med}, c_i) = \frac{\sum_{j=1}^{n} W_j \, d(c_{med.j}, c_{i.j})}{\sum_{j=1}^{n} W_j} \]
where $c_{med.j}$ is the j-th attribute of $c_{med}$, $c_{i.j}$ is the j-th attribute value of $c_i$, and
$W_j$ is the j-th attribute weight.
In case an attribute is an enumerative type, the most dominant value among attribute values is chosen as a median.
1st-SP is a record whose distance $D(c_{med}, c_i)$ from $c_{med}$ is the shortest among
$c_i$ ($1 \le i \le k$).
Then positive and negative characteristics of retrieved records are generated, and
the following cases are selected. For each retrieved record, the attribute distance $AD_j$
between each attribute value and that of $c_{med}$ is calculated by:
\[ AD_j(c_{i.j}, c_{med.j}) = W_j \, (c_{i.j} - c_{med.j}) \]
If the absolute value of $AD_j(c_{i.j}, c_{med.j})$ exceeds a predefined threshold, $c_{i.j}$ is
regarded as a characteristic of the record $c_i$, and is called a positive characteristic
if the value of $c_{i.j}$ is more highly ranked than that of $c_{med.j}$; otherwise it is called
a negative characteristic. The total number of characteristics is the sum of positive
characteristics and negative characteristics.
The k-1 records having the maximum number of characteristics are also retrieved
together with 1st-SP.
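The selection steps described above can be put together in a compact sketch for purely numeric attributes. It is not the framework implementation: the local distance d is assumed to be the absolute difference, enumerated attributes are not handled, the case set is assumed to be non-empty, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ExpertClerkSketch {

    /** c_med: attribute-wise mean of all cases (numeric attributes only). */
    static double[] median(double[][] cases) {
        double[] med = new double[cases[0].length];
        for (double[] c : cases) {
            for (int j = 0; j < med.length; j++) {
                med[j] += c[j] / cases.length;
            }
        }
        return med;
    }

    /** D(c_med, c): weighted mean of the local distances, with d assumed to be |a - b|. */
    static double distance(double[] med, double[] c, double[] w) {
        double num = 0.0;
        double den = 0.0;
        for (int j = 0; j < med.length; j++) {
            num += w[j] * Math.abs(med[j] - c[j]);
            den += w[j];
        }
        return num / den;
    }

    /** Number of attributes whose weighted distance to c_med exceeds the threshold. */
    static int characteristics(double[] c, double[] med, double[] w, double threshold) {
        int count = 0;
        for (int j = 0; j < med.length; j++) {
            if (Math.abs(w[j] * (c[j] - med[j])) > threshold) {
                count++;
            }
        }
        return count;
    }

    /** Indices of the 1st-SP followed by the k-1 cases with the most characteristics. */
    static List<Integer> retrieve(double[][] cases, double[] w, double threshold, int k) {
        double[] med = median(cases);
        int first = 0;
        for (int i = 1; i < cases.length; i++) {
            if (distance(med, cases[i], w) < distance(med, cases[first], w)) {
                first = i;
            }
        }
        List<Integer> others = new ArrayList<>();
        for (int i = 0; i < cases.length; i++) {
            if (i != first) {
                others.add(i);
            }
        }
        others.sort((x, y) -> Integer.compare(
                characteristics(cases[y], med, w, threshold),
                characteristics(cases[x], med, w, threshold)));
        List<Integer> result = new ArrayList<>();
        result.add(first);
        result.addAll(others.subList(0, Math.max(0, Math.min(k - 1, others.size()))));
        return result;
    }
}
```

For the cases (10, 1), (20, 2) and (30, 9) with unit weights, the median is (20, 4), the second case is the first sample product, and with a threshold of 4 the third case has two characteristics while the first has only one.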
GreedySelection This method (see [53]) incrementally builds a retrieval set, R. During each step the remaining cases are ordered according to their quality, and the
highest-quality case is added to R.
The quality metric combines diversity and similarity. The quality of a case c is
proportional to the similarity between c and the query, and to the diversity of c
relative to those cases so far selected in $R = \{r_1, \ldots, r_m\}$.
\[ \mathrm{Quality}(t, c, R) = \mathrm{Similarity}(t, c) \cdot \mathrm{RelDiversity}(c, R) \]
\[ \mathrm{RelDiversity}(c, R) = \begin{cases} 1 & \text{if } R = \{\} \\ \dfrac{\sum_{i=1}^{m} (1 - \mathrm{Similarity}(c, r_i))}{m} & \text{otherwise} \end{cases} \]
The pseudocode of the algorithm is:
t: query, C: case-base, k: #results
GreedySelection(t,C,k){
    R = {}
    for(i=1 to k){
        Sort C by Quality(t,c,R) for each c in C
        R = R + First(C)
        C = C - First(C)
    }
    return R
}
This algorithm is very expensive. It should be applied to small case bases.
BoundedGreedySelection Tries to reduce the complexity of the greedy selection
algorithm by first selecting the best b·k cases according to their similarity to the query
and then applying the greedy selection method to these cases (see [53]).
t: query, C: case-base, k: #results, b: bound
BoundedGreedySelection(t,C,k,b){
    C = b*k cases in C that are most similar to t
    R = {}
    for(i=1 to k){
        Sort C by Quality(t,c,R) for each c in C
        R = R + First(C)
        C = C - First(C)
    }
    return R
}
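The two pseudocode listings translate almost directly into Java. The sketch below is generic and independent of the framework: the similarity function is supplied by the caller, the class name is hypothetical, and RelDiversity is taken to be 1 for an empty R (as in the definition by Smyth and McClave), so that the first case selected is simply the most similar one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleBiFunction;

final class BoundedGreedySketch {

    /** Mean dissimilarity between c and the cases already selected; 1 when R is empty. */
    static <T> double relDiversity(T c, List<T> r, ToDoubleBiFunction<T, T> sim) {
        if (r.isEmpty()) {
            return 1.0;
        }
        double sum = 0.0;
        for (T ri : r) {
            sum += 1.0 - sim.applyAsDouble(c, ri);
        }
        return sum / r.size();
    }

    /** Greedy selection: repeatedly moves the highest-quality remaining case into R. */
    static <T> List<T> greedy(T t, List<T> cases, int k, ToDoubleBiFunction<T, T> sim) {
        List<T> pool = new ArrayList<>(cases);
        List<T> r = new ArrayList<>();
        while (r.size() < k && !pool.isEmpty()) {
            int bestIndex = 0;
            double bestQuality = Double.NEGATIVE_INFINITY;
            for (int i = 0; i < pool.size(); i++) {
                T c = pool.get(i);
                double quality = sim.applyAsDouble(t, c) * relDiversity(c, r, sim);
                if (quality > bestQuality) {
                    bestQuality = quality;
                    bestIndex = i;
                }
            }
            r.add(pool.remove(bestIndex));
        }
        return r;
    }

    /** Bounded greedy: restricts the greedy selection to the b*k cases most similar to t. */
    static <T> List<T> boundedGreedy(T t, List<T> cases, int k, int b,
                                     ToDoubleBiFunction<T, T> sim) {
        List<T> sorted = new ArrayList<>(cases);
        sorted.sort((x, y) -> Double.compare(sim.applyAsDouble(t, y), sim.applyAsDouble(t, x)));
        List<T> candidates = sorted.subList(0, Math.min(b * k, sorted.size()));
        return greedy(t, candidates, k, sim);
    }
}
```

A small bound b keeps the selection cheap but limits how much diversity can be gained, because diverse cases outside the b·k most similar ones are never considered.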
21 Other Features
jCOLIBRI 2 includes several other features implemented by external contributors. This
section briefly describes these features.
...
// Configure connector and case base
Connector _connector = new PlainTextConnector();
_connector.initFromXMLfile(jcolibri.util.FileIO.findFile("jcolibri/test/test9/plaintextconfig.xml"));
CBRCaseBase _caseBase = new LinealCaseBase();
// Load cases
_caseBase.init(_connector);
// Configure NN
NNConfig simConfig = new NNConfig();
...
the packages jcolibri.method.reuse.classification and
jcolibri.method.revise.classification.
Regarding maintenance, the jcolibri.method.maintenance package includes methods that decide which cases should be removed from a case base
to improve the accuracy of the CBR application. Besides these methods, there
are other classes that evaluate the maintenance process. They are located in the
jcolibri.extensions.maintenance_evaluation package.
Read the documentation of those packages for details. Moreover, Tests 7, 14 and 15
show how to use these features.
22 Getting support
The first place you should consult for support is the documentation supplied with
jCOLIBRI 2 in the doc directory. There you will find the complete API documentation
with descriptions of all classes and files in the framework. This folder also contains
class and sequence UML diagrams of the most important components of jCOLIBRI 2.
The other major source of information is the set of tests included in the framework.
These tests serve as programming recipes and show how to use the components of
jCOLIBRI 2 to implement CBR applications with different characteristics.
If the provided documentation does not resolve your problem or question, the preferred way of getting support is the developers mailing list:
http://sourceforge.net/mailarchive/forum.php?forum_name=jcolibri-cbr-developers
You can post by sending your question to:
[email protected]
Finally, you can subscribe to the list and receive messages from other developers (do
not worry, it is a low volume list):
https://lists.sourceforge.net/lists/listinfo/jcolibri-cbr-developers
23 Contributing to jCOLIBRI
A contribution is a set of methods or classes that extend the functionality of the framework and are not bundled with the main release of jCOLIBRI. In earlier versions of
the framework, contributions were called extensions and were distributed with the main
release. However, the increasing interest of the community has led us to create this
new way of including third-party code, which keeps its own authorship and license.
In this section you will find some simple steps to submit new contributions.
23.3
If you develop a new contribution, you should create some simple examples that will be
incorporated into this application.
An example is defined with a text file containing:
Name.
Short description. HTML tags may be used to decorate the text.
Class to execute.
Path to the most important documentation files (several lines).
This information must be included in a text file where each example is separated by
a line containing the <example> tag. Here is an example (extracted from the
examples.config file in jcolibri/test/main):
Test 1 Database kNN
Test 1 shows how to use a simple database (Hibernate)
...
jcolibri.test.test1.Test1
doc/api/srchtml/jcolibri/test/test1/Test1.html
doc/api/jcolibri/test/test1/Test1.html
doc/api/jcolibri/test/test1/TravelDescription.html
doc/api/jcolibri/connector/DataBaseConnector.html
doc/api/jcolibri/method/retrieve/NNretrieval/NNScoring...
<example>
Test 2 Enumerated types
Test 2 extends Test 1 to show the use of enumerated...
jcolibri.test.test2.Test2
doc/api/srchtml/jcolibri/test/test2/Test2.html
doc/api/jcolibri/test/test2/Test2.html
doc/api/jcolibri/test/test2/TravelDescription.html
doc/api/jcolibri/test/test2/MyStringType.html
doc/api/jcolibri/connector/DataBaseConnector.html
doc/api/jcolibri/method/retrieve/NNretrieval/NNScoring...
<example>
Test 3 Compound attributes
...
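To check that a file follows the format described above before submitting it, a small reader along the following lines may help. It is only a sketch and not the loader used by jCOLIBRI: the class ExampleConfigParser and its Example holder are hypothetical names, and the sketch assumes that the name, the description and the class to execute each occupy exactly one line, that blank lines can be ignored, and that every remaining line of an entry is a documentation path.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical reader for the example description format (not part of jCOLIBRI). */
final class ExampleConfigParser {

    /** One entry of the file: name, description, class to execute and documentation paths. */
    static final class Example {
        final String name;
        final String description;
        final String className;
        final List<String> docPaths;

        Example(String name, String description, String className, List<String> docPaths) {
            this.name = name;
            this.description = description;
            this.className = className;
            this.docPaths = docPaths;
        }
    }

    private ExampleConfigParser() {}

    /** Splits the lines at each <example> separator and builds one Example per block. */
    static List<Example> parse(List<String> lines) {
        List<Example> examples = new ArrayList<>();
        List<String> block = new ArrayList<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.equals("<example>")) {
                addExample(block, examples);
                block = new ArrayList<>();
            } else if (!line.isEmpty()) {
                block.add(line); // ASSUMPTION: blank lines carry no meaning
            }
        }
        addExample(block, examples);
        return examples;
    }

    /** ASSUMPTION: the first three lines are name, description and class; the rest are paths. */
    private static void addExample(List<String> block, List<Example> out) {
        if (block.isEmpty()) {
            return;
        }
        if (block.size() < 3) {
            throw new IllegalArgumentException("Incomplete example entry: " + block);
        }
        List<String> docs = new ArrayList<>(block.subList(3, block.size()));
        out.add(new Example(block.get(0), block.get(1), block.get(2), docs));
    }
}
```

The lines can be obtained, for instance, with java.nio.file.Files.readAllLines and passed directly to parse.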
24 Versions ChangeLog
This section summarizes the main changes among versions.
24.2 Version 2.0
References
[1] A. Aamodt. Knowledge intensive case-based reasoning and sustained learning. In Proceedings of the Ninth European Conference on Artificial Intelligence (ECAI-90), pages 1-6, August 1990.
[2] A. Aamodt and E. Plaza. Case-based reasoning: Foundational issues, methodological variations, and system approaches. AI Communications, 7(1), 1994.
[3] R. Bergmann. Experience Management: Foundations, Development Methodology, and Internet-Based Applications. Springer-Verlag New York, Inc., Secaucus, NJ, USA, 2002.
[4] R. Bergmann, S. Breen, E. Fayol, M. Göker, M. Manago, S. Schmitt, J. Schumacher, A. Stahl, S. Wess, and W. Wilke. Collecting experience on the systematic development of CBR applications using the INRECA methodology. Lecture Notes in Computer Science, 1488:460-470, 1998.
[5] R. Bergmann, S. Breen, M. Göker, M. Manago, J. Schumacher, A. Stahl, E. Tartarin, S. Wess, and W. Wilke. The INRECA-II methodology for building and maintaining CBR applications, 1998.
[6] K. Bradley and B. Smyth. Personalized information ordering: a case study in online recruitment. Knowledge-Based Systems, 16(5-6):269-275, 2003.
[7] D. Bridge and A. Ferguson. An expressive query language for product recommender systems. Artif. Intell. Rev., 18(3-4):269-307, 2002.
[8] D. Bridge, M. H. Göker, L. McGinty, and B. Smyth. Case-based recommender systems. Knowledge Engineering Review, 20(3):315-320, 2006.
[9] M. Brown, C. Förtsch, and D. Wissmann. Feature extraction - the bridge from case-based reasoning to information retrieval. In Proceedings of the 6th German Workshop on Case-Based Reasoning '98 (GWCBR'98), 1998.
[10] S. Brüninghaus and K. D. Ashley. The role of information extraction for textual CBR. In Proceedings of the 4th International Conference on Case-Based Reasoning, ICCBR '01, pages 74-89. Springer-Verlag, 2001.
[11] R. Burke. Interactive critiquing for catalog navigation in e-commerce. Knowledge Engineering Review, 18(3-4):245-267, 2002.
[12] D. Leake, editor. Case-Based Reasoning: Experiences, Lessons and Future Directions. AAAI Press / MIT Press, USA, 1997.
[13] R. L. de Mantaras and E. Plaza. Case-based reasoning: An overview. AI Communications, 10(1), 1997.
[14] S. J. Delany and D. Bridge. Feature-based and feature-free textual CBR: A comparison in spam filtering. In Procs. of the 17th Irish Conference on Artificial Intelligence and Cognitive Science, pages 244-253, Belfast, Northern Ireland, 2006.
[15] S. J. Delany and D. Bridge. Catching the drift: Using feature-free case-based reasoning for spam filtering. In Procs. of the 7th International Conference on Case Based Reasoning, Belfast, Northern Ireland, 2007.
[16] B. Díaz-Agudo and P. A. González-Calero. An architecture for knowledge intensive CBR systems. In E. Blanzieri and L. Portinale, editors, Advances in Case-Based Reasoning (EWCBR'00). Springer-Verlag, Berlin Heidelberg New York, 2000.
[17] B. Díaz-Agudo and P. A. González-Calero. Knowledge intensive CBR through ontologies. In Procs. of the UK CBR Workshop, 2001.
[18] B. Díaz-Agudo and P. A. González-Calero. CBROnto: a task/method ontology for CBR. In S. Haller and G. Simmons, editors, Procs. of the 15th International FLAIRS'02 Conference. AAAI Press, 2002.
[19] B. Díaz-Agudo and P. A. González-Calero. Ontologies in the Context of Information Systems, chapter An ontological approach to develop Knowledge Intensive CBR systems, page 45. Springer-Verlag, 2006.
[20] P. Funk and P. A. González-Calero, editors. Advances in Case-Based Reasoning, 7th European Conference, ECCBR 2004, Madrid, Spain, August 30 - September 2, 2004, Proceedings, volume 3155 of Lecture Notes in Computer Science. Springer, 2004.
[21] H. Gomez-Gauchia, B. Díaz-Agudo, P. P. Gomez-Martin, and P. A. González-Calero. Supporting conversation variability in COBBER using causal loops. In Case-Based Reasoning Research and Development - Proc. of the ICCBR'05. Springer Verlag LNCS/LNAI, 2005.
[22] P. A. González-Calero, M. Gómez-Albarrán, and B. Díaz-Agudo. Applying DLs for retrieval in case-based reasoning. In Procs. of the 1999 Description Logics Workshop (DL'99). Linköpings universitet, Sweden, 1999.
[23] P. A. González-Calero, M. Gómez-Albarrán, and B. Díaz-Agudo. A substitution-based adaptation model. In Challenges for Case-Based Reasoning - Proc. of the ICCBR'99 Workshops. University of Kaiserslautern, 1999.
[24] K. M. Gupta and D. W. Aha. Towards acquiring case indexing taxonomies from text. In Proceedings of the 17th Int. FLAIRS Conference, pages 307-315, Miami Beach, FL, 2004. AAAI Press.
[25] E. Hatcher and O. Gospodnetic. Lucene in Action (In Action series). Manning Publications Co., Greenwich, CT, USA, 2004.
[26] J. L. Herlocker, J. A. Konstan, A. Borchers, and J. Riedl. An algorithmic framework for performing collaborative filtering. In SIGIR '99: Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, pages 230-237, New York, NY, USA, 1999. ACM.
[27] M. Jaczynski and B. Trousse. An object-oriented framework for the design and
[41] J. A. Recio-García, B. Díaz-Agudo, M. A. Gómez-Martín, and N. Wiratunga. Extending jCOLIBRI for textual CBR. In H. Muñoz-Avila and F. Ricci, editors, Proceedings of Case-Based Reasoning Research and Development, 6th International Conference on Case-Based Reasoning, ICCBR 2005, volume 3620 of Lecture Notes in Artificial Intelligence, subseries of LNCS, pages 421-435, Chicago, IL, US, August 2005. Springer.
[42] J. A. Recio-García, B. Díaz-Agudo, and P. A. González-Calero. Textual CBR in jCOLIBRI: From Retrieval to Reuse. In D. C. Wilson and D. Khemani, editors, Workshop Proceedings of the 7th International Conference on Case-Based Reasoning (ICCBR'07), pages 217-226, Belfast, Northern Ireland, August 13-16 2007.
[43] J. A. Recio-García, B. Díaz-Agudo, P. A. González-Calero, and A. Sánchez-Ruiz-Granados. Ontology based CBR with jCOLIBRI. In R. Ellis, T. Allen, and A. Tuson, editors, Applications and Innovations in Intelligent Systems XIV. Proceedings of AI-2006, the Twenty-sixth SGAI International Conference on Innovative Techniques and Applications of Artificial Intelligence, pages 149-162, Cambridge, United Kingdom, December 2006. Springer.
[44] J. A. Recio-García, B. Díaz-Agudo, A. Sánchez, and P. A. González-Calero. Lessons learnt in the development of a CBR framework. In M. Petridis, editor, Proceedings of the 11th UK Workshop on Case Based Reasoning, pages 60-71. CMS Press, University of Greenwich, 2006.
[45] J. A. Recio-García, A. Sánchez, B. Díaz-Agudo, and P. A. González-Calero. jCOLIBRI 1.0 in a nutshell. A software tool for designing CBR systems. In M. Petridis, editor, Proceedings of the 10th UK Workshop on Case Based Reasoning, pages 20-28. CMS Press, University of Greenwich, 2005.
[46] S. Salotti and V. Ventos. Study and formalization of a case-based reasoning system using a description logic. In B. Smyth and P. Cunningham, editors, Advances in Case-Based Reasoning (EWCBR'98). Springer-Verlag, 1998.
[47] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl. Item-based collaborative filtering recommendation algorithms. In WWW '01: Proceedings of the 10th international conference on World Wide Web, pages 285-295, New York, NY, USA, 2001. ACM.
[48] R. C. Schank. Dynamic Memory. Cambridge Univ. Press, 1983.
[49] S. Schmitt, P. Dopichaj, and P. Domínguez-Marín. Entropy-based vs. similarity-influenced: Attribute selection methods for dialogs tested on different electronic commerce domains. In S. Craw and A. Preece, editors, Proceedings of the 6th European Conference on Case-Based Reasoning, pages 380-394, Aberdeen, Scotland, 2002. Springer-Verlag.
[50] S. Schulz. CBR-works: A state-of-the-art shell for case-based application building. In E. Melis, editor, Proceedings of the 7th German Workshop on Case-Based Reasoning, GWCBR'99, Würzburg, Germany, pages 166-175. University of Würzburg, 1999.
[51] H. Shimazu. ExpertClerk: A Conversational Case-Based Reasoning Tool for Developing Salesclerk Agents in E-Commerce Webshops. Artif. Intell. Rev., 18(3-4):223-244, 2002.
[52] B. Smyth. Case-based recommendation. In P. Brusilovsky, A. Kobsa, and W. Nejdl, editors, The Adaptive Web, pages 342-376. Springer, 2007.
[53] B. Smyth and P. McClave. Similarity vs. diversity. In ICCBR '01: Proceedings of the 4th International Conference on Case-Based Reasoning, pages 347-361, London, UK, 2001. Springer-Verlag.
[54] I. Watson. Applying case-based reasoning: techniques for enterprise systems. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1998.
[55] R. Weber, D. W. Aha, N. Sandhu, and H. Muñoz-Avila. A textual case-based reasoning framework for knowledge management applications. In Proceedings of the 9th German Workshop on Case-Based Reasoning. Shaker Verlag, 2001.
[56] W. Wilke, M. Lenz, and S. Wess. Intelligent sales support with CBR. In Case-Based Reasoning Technology, From Foundations to Applications, pages 91-114, London, UK, 1998. Springer-Verlag.
[57] I. H. Witten and E. Frank. Data mining: practical machine learning tools and techniques with Java implementations. Morgan Kaufmann, USA, 2000.