Transfer Learning by Inductive Logic Programming
1 Introduction
An important property of a learning process is generalization. Consequently, a
learning process is seen as an intelligent system able to generalize knowledge. For
example, if a person has learned to play a game well, then that person is able to
transfer his¹ knowledge to similar games. This means that the person is able to
learn general knowledge about one game and then apply it as specific knowledge
in another game. Based on this observation, we formulate the following research
question: How do we construct a game-playing AI that has the same ability to
adapt to new games as a human being? In other words, how can we transfer
knowledge which is learned from one game to another game?
For this purpose, General Game Playing (GGP) is an appropriate research
topic. A General Game Player is able to play, in principle, all discrete, finite, perfect-information games (defined as General Games) without any human intervention [1]. There exist many successful implementations of a General Game
Player [2–4]. GGP is a good test bed of algorithms generating game knowl-
edge automatically. An example of such a generation for alpha-beta search and
UCT-MC is reported by Walȩdzik and Mańdziuk [5]. Moreover, game knowl-
edge generation for Heuristic functions that are produced by neural networks
is described by Michulke and Thielscher [6]. However, these studies generate game knowledge for a specific game, with the aim of playing that game well. What we are trying to achieve is (1) learning general knowledge from a game, and then
¹ For brevity, we use ‘he’ and ‘his’, whenever ‘he or she’ and ‘his or her’ are meant.
© Springer International Publishing Switzerland 2015
A. Plaat et al. (Eds.): ACG 2015, LNCS 9525, pp. 223–234, 2015.
DOI: 10.1007/978-3-319-27992-3_20
224 Y. Sato et al.
(2) applying the acquired knowledge as specific knowledge to another game. This
is called Transfer Learning. Transfer Learning is a learning strategy that trans-
fers previously learned general knowledge to improve the learning speed of a
new game [7]. In GGP, Hinrichs and Forbus have reported Transfer Learning by
analogy [8].
A telling example is learning the power of an additional square. This knowl-
edge can be transferred to another domain. For instance, consider the domain
of Tic-tac-toe. The game theoretical value of this game is draw. An interest-
ing question is: What is the game theoretical value when we add an additional
square as shown in Fig. 1? When using this board, the game is a win for the first player (start at square 9, with the threat of playing on square 8; the idea is to use the diagonal 2-6-10 as an additional threat).
After this learning example, we consider the game of Chess. It is well-known
that a king and two knights are unable to force mate. The highest goal to reach
is stalemate. Assume that we augment the chess-board by an additional square
e0. Then the question again reads: What is the game theoretical outcome of the
KNNK endgame on this board? The answer reads: It is a win for the KNN side.
The end position is shown in Fig. 2. The important point for transfer learning is
that the power of an additional square in one game (Tic-tac-toe) may also unex-
pectedly change the original game theoretical outcome in another game (Chess).
See also [8]. We invite readers to find analogous transfer ideas of this kind.
In this paper we apply the Inductive Logic Programming (ILP) approach
to learn general knowledge for General Games. ILP is a successful approach,
e.g., learning Chess variants and rules has been reported to be possible [9]. Some ILP algorithms are able to make a reasonable specialization from general knowledge using winning examples only. If the examples represent normal winning situations, a winning strategy is expected to be learned. In our method, the general knowledge consists of Boolean functions which represent patterns in a game position. The patterns may represent winning positions and losing positions. The generated general patterns are
then made specific for incorporation in Heuristic functions that apply to another
game. This is an example of Transfer Learning between games.
Fig. 1. A Tic-tac-toe game board with an additional square
Fig. 2. A Chess game board with an additional square
In this paper, Tic-tac-toe is chosen as the source game, and Connect4 and Connect5 are chosen as target games. We attempt to transfer general knowledge that is learned from Tic-tac-toe to Connect4 and Connect5. In Sect. 2, we define
the general source concepts in such a way that they are suitable for transfer.
In Sect. 3, we generate the concepts that will be transferred from Tic-tac-toe. In
Sect. 4, we explain how ILP and Transfer Learning work. In Sect. 5, we transfer
concepts that are learned from Tic-tac-toe to Connect4 and Connect5 in order
to generate Heuristic functions for the game involved. In Sect. 6, we test the per-
formance of the generated Heuristic functions. Section 7 provides a discussion.
Section 8 concludes the paper.
piece(cell,1,1,blank).
piece(cell,1,2,blank).
These two pieces are adjacent since the x coordinate is the same and the y
coordinate differs by one. Let us introduce four variables (viz. C, X, Y , and S)
to generalize a pattern in this position. Next to the above propositions, we also
have arithmetic propositions such as Y2 is Y + 1 (notation is in Prolog). Now
consider the following pattern.
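A pattern of this kind generalizes the two adjacent pieces by replacing the constants with the four variables C, X, Y, and S introduced above; in Prolog it can be sketched as follows (the head name pattern is an assumption):

pattern(C, X, Y, S) :-
    piece(C, X, Y, S),
    Y2 is Y + 1,
    piece(C, X, Y2, S).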
In games, there also exist other types of patterns. Examples are (1) a disjunction of propositions, and (2) patterns of the time evolution of positions, e.g., a sequence of changes of positions. For simplicity, we focus only on non-complex patterns in a position. From now on, let us concentrate on straightforward patterns in a game position (i.e., whether a square is occupied by x, o, or blank) and consider them as concepts in the games.
disjunction :- concept1.
disjunction :- concept2.
disjunction :- concept3.
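Each concept in such a disjunction is itself a conjunction of propositions. For instance, a concept for two vertically adjacent o pieces could be sketched as follows (the concept name and its body are illustrative, not taken from the learned set):

concept1 :-
    piece(cell, X, Y, o),
    Y2 is Y + 1,
    piece(cell, X, Y2, o).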
The concepts (i.e., conjunctions and disjunctions) that are generated in this
section are used to make a Heuristic function by ILP.
piece(cell, 1, 1, o).
piece(cell, 1, 2, o).
piece(cell, 1, 3, o).
piece(cell, 1, 1, x).
piece(cell, 1, 2, x).
piece(cell, 1, 3, x).
The Heuristic function is made for the first player when playing Connect4.
Let us have a closer look at the specifications. We take concept 11. This concept obtained the specification that the type of piece is cell, the x coordinate is specified as 1, the y coordinate as 3, and the occupation as w, which is the symbol for the white player (first player) in Connect4 (the second player is r).
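As a sketch (the predicate name concept11 is assumed for illustration), the specialized concept then amounts to a ground test on the Connect4 position:

concept11 :- piece(cell, 1, 3, w).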
For specialization toward Connect4, all conjunctions and the disjunction that were obtained from Tic-tac-toe were used as background knowledge. However, not all of them were used for the specialization. This means that some concepts are useful for this type of game, but others are not. Here, we may anticipate the difference in complexity between Connect4 and Connect5. For instance, we may state that, in a Tic-tac-toe specialization process toward Connect5, only concepts which appear in the specialization for Connect4 are used as background knowledge. This is a meta-concept, i.e., a relationship between concepts. The meta-concept is suitable for reducing the computation time.
Once general concepts are specialized to a target game, the specialized concepts can be used to make a Heuristic function for that game. Our Heuristic functions are sets of specialized concepts. If a specialized concept is true in a position, it contributes a positive constant value to the evaluation of that position. If several specialized concepts are true in a position, the evaluation value of the position is the sum of their constant values.
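Assuming each specialized concept is paired with a positive constant value, this evaluation scheme can be sketched in Prolog as follows (the predicate names weight and heuristic, and the unit weights, are assumptions for illustration):

% each specialized concept paired with its positive constant value
weight(concept11, 1).
weight(concept12, 1).

% the evaluation of the current position is the sum of the values
% of all specialized concepts that hold in it
heuristic(Value) :-
    findall(W, (weight(Concept, W), call(Concept)), Ws),
    sum_list(Ws, Value).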
Fig. 9. Game simulation with a player using a Heuristic function vs a random player
for Connect4
Fig. 10. Game simulation with heuristics player vs random player for Connect5, search depth 1

Fig. 11. Game simulation with heuristics player vs 1-depth alpha-beta player for Connect4

The winning ratio tends to increase as the index of the Heuristic functions increases (see Figs. 9, 10 and 11). This means that Heuristic functions that were generated from many positive and negative examples perform better than Heuristic functions that were generated from a smaller number of positive and negative examples. The tendency appears clearly in the depth-3 search case. We note that depth-1 searches rely totally on Heuristic functions. However, our Heuristic functions are not perfect; they have inaccuracies. Therefore we surmise that they guide the middle game successfully, but sometimes miss a win in the endgame. This is why our Heuristic functions perform better in 3-depth search than in 1-depth search, even though the same Heuristic function is used.
7 Discussion
8 Conclusions
Acknowledgments. We would like to express our great thanks to Aske Plaat for his advice on this research, and to Siegfried Nijssen for his advice on Inductive Logic Programming.
References
1. Love, N., Hinrichs, T., Haley, D., Schkufza, E., Genesereth, M.: General Game Playing: Game Description Language Specification. Technical report LG-2006-01, Stanford Logic Group (2006)
2. Schiffel, S., Thielscher, M.: Fluxplayer: a successful general game player. In: The
Twenty-Second AAAI Conference on Artificial Intelligence, pp. 1191–1196 (2007)
3. Björnsson, Y., Finnsson, H.: CadiaPlayer: a simulation-based general game player. IEEE Trans. Comput. Intell. AI Games 1(1), 4–15 (2009)
4. Méhat, J., Cazenave, T.: Ary, a general game playing program. In: Board Games Studies Colloquium (2010)
5. Walȩdzik, K., Mańdziuk, J.: An automatically-generated evaluation function in
general game playing. IEEE Trans. Comput. Intell. AI Games 6(3), 258–270 (2014)
6. Michulke, D., Thielscher, M.: Neural networks for state evaluation in general game
playing. In: Buntine, W., Grobelnik, M., Mladenić, D., Shawe-Taylor, J. (eds.)
ECML PKDD 2009, Part II. LNCS, vol. 5782, pp. 95–110. Springer, Heidelberg
(2009)
7. Taylor, M.E., Stone, P.: Transfer learning for reinforcement learning domains: a survey. J. Mach. Learn. Res. 10, 1633–1685 (2009)
8. Hinrichs, T.R., Forbus, K.D.: Transfer learning through analogy in games. AI Mag. 32(1), 70–83 (2011)
9. Muggleton, S., Paes, A., Santos Costa, V., Zaverucha, G.: Chess revision: acquiring the rules of chess variants through FOL theory revision from examples. In: De Raedt, L. (ed.) ILP 2009. LNCS, vol. 5989, pp. 123–130. Springer, Heidelberg (2010)
10. Mitchell, T.M., Keller, R.M., Kedar-Cabelli, S.T.: Explanation-based generalization: a unifying view. Mach. Learn. 1(1), 47–80 (1986)
11. Aleph. http://www.cs.ox.ac.uk/activities/machlearn/Aleph/aleph.html
12. Muggleton, S.H., De Raedt, L.: Inductive logic programming: theory and methods.
J. Logic Program. 19–20, 629–679 (1994)
13. Muggleton, S.: Inverse entailment and Progol. New Gener. Comput. 13(3–4), 245–286 (1995)