
Perception and Experience in Problem Solving

Edmund Furse
Department of Computer Studies, University of Glamorgan, Pontypridd, Mid Glamorgan CF37 1DL, UK
[email protected]

Rod Nicolson
Department of Psychology, University of Sheffield, Sheffield S10 2TN, UK
[email protected]

Abstract
Whilst much emphasis in AI has been placed on the use of goals in problem solving, less emphasis has been placed on the role of perception and experience. In this paper we show that in the domain that may be considered the most abstract, namely mathematics, perception and experience play an important role. The mathematician has a vast amount of mathematical knowledge, and yet is able to utilise the appropriate knowledge without difficulty. We argue that it is essential to model how well the knowledge is grasped, so that mathematical knowledge can grow from partial knowledge to important results that are easily accessed. Not all knowledge is equal in its importance, and we argue that perception and experience play a key role in ordering our knowledge. Features play a role both in representing the information from the environment and in indexing the knowledge of our memories, but a key requirement is that the features should be dynamic and not built in. This research is implemented in the program MU, the Mathematics Understander, which utilises the CMS, the Contextual Memory System. MU has successfully "read" university-level texts in pure mathematics, checking the proofs and solving the simple problems.

1. Introduction
Problem solving has been thought of as the primary exemplar of intelligence, and has been central to work in artificial intelligence from the early work of Newell and Simon's GPS to expert systems and theorem proving. Laird, Rosenbloom and Newell's SOAR (1987) has problem solving as the cornerstone of its architecture. However, whilst expert systems, and in particular Lenat and Feigenbaum (1991), have shown the importance of a large amount of knowledge in giving systems power, theorem proving in contrast has tended to concentrate on general methods such as resolution, tempered by metalevel reasoning (e.g. Bundy 1983). Given the evidence that expert problem solvers tend to have access to large amounts of knowledge and tend to use shallow search methods, it is surprising that much of the AI community continues to pursue methods which are knowledge-thin rather than knowledge-rich.

It can be argued that if one is concerned with discovering powerful machine problem-solving methods, these do not necessarily have to be similar to human problem-solving methods. Furthermore, it is clearly more elegant to have neat, general problem-solving methods rather than a collection of special-purpose methods. However, Schank (1981) has argued that even if one's goal is to build intelligent machines, it is a good first step to model how humans perform the task. We argue that the neat approaches to problem solving do not sufficiently model genuine ecological tasks, are of insufficient power, and, since they take little account of learning, are inadequate accounts of human cognition. Instead we present an alternative thesis on cognition which places learning, rather than problem solving, at the centre of what it is to be intelligent. Interestingly, this is a view much nearer to Alan Turing's thought (1953) than his Turing-machine conception of computation. Turing wanted to know not only how machines could solve problems, but how they could learn. Certainly SOAR places great importance on learning, but it is in our view an impoverished view of learning to see it only as search within a problem space.

For too long perception has been seen as almost a separate faculty from the mainstream of AI and cognition, and yet deGroot (1966) and Chase and Simon (1973) have convincingly shown that expert knowledge in chess is very largely a matter of some 50,000 perceptual features. One could argue that board games are naturally prone to perceptual processing, but we show that the same arguments apply in the domain of pure mathematics. Given the highly abstract nature of pure mathematics, if perceptual features play an important part in problem solving in this domain, it may well be the case that they play an important role in many other types of problem solving.
It is easy to demonstrate that perception plays an important part in problem solving. Consider the following proposition:

[1]

[1] is difficult to recognise, but by using the usual letter names as in [2]:

Furse and Nicolson

181

[2]

we have a proposition that is easily recognised by mathematicians as the definition of "the limit of the sequence x_n as n tends to infinity is x". However, since [1] is obtained from [2] simply by a change of letters, the two are logically equivalent. Thus there is more to problem solving than logic. What we are arguing is that such expressions act as perceptual features which enable the expert to recognise the mathematical result rapidly.

It is obvious that experience plays a role in an expert's ability to solve problems. However, apart from our own work, we believe there is no convincing account of how this might work. Too many models of expertise are relatively static, and yet all expertise must have been learned at some time, and experts continue to grow in expertise. Through the study of education, and in particular mathematics textbooks, one can see how knowledge is slowly acquired. We argue that in textbooks, any particular mathematical result goes through three stages of usage in problem solving:

1. The result is stated explicitly.
2. The result is used implicitly (i.e. the same as stage 1, but without the explanation).
3. The result is used in compound inferences.

Not all mathematical results get beyond stage (1): some results may be used only a few times, and may then be forgotten. In contrast, results which are frequently used become so well known through their stage (2) usage that eventually they are used at stage (3). Thus, ironically, the most important results are the ones which are not mentioned at all, precisely because they are so well known. If all of an expert's knowledge were of equal importance, and therefore equally easy or difficult to access, most experts could not function at all. Yet most truth maintenance systems work on this basis. In contrast, we argue that the expert has a perceptual system and memory so organised that important results are easily recognised and retrieved.
Furthermore, we will show how the Contextual Memory System, CMS (Furse 1992, Furse and Nicolson 1992), allows a continuing change of the perceptual and memory systems, so that the novice can become the expert through sufficient learning experience. The CMS is a network architecture of features and items. The features are used both to encode the external object in the environment and to index items in memory. Both features and items have an energy level, and the links between features and items have a strength. The novel characteristic of this architecture is that the features are generated dynamically from the environment, and the configuration of connections is frequently changed during memory processes.
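As an illustrative sketch only (the authors' implementation is in Macintosh Common LISP, and all names and the scoring rule below are our own assumptions), the feature-item network just described might be modelled as follows:

```python
from dataclasses import dataclass, field

@dataclass
class Feature:
    """A perceptual feature, e.g. 'rhs-has-form-[normal-subgroup_a_b]'."""
    name: str
    energy: float = 1.0
    # link strengths from this feature to the items it indexes
    links: dict = field(default_factory=dict)   # item name -> strength

@dataclass
class Item:
    """A stored mathematical result."""
    name: str
    content: str
    energy: float = 1.0

class CMS:
    """Network of features and items; connections change during retrieval."""
    def __init__(self):
        self.features: dict[str, Feature] = {}
        self.items: dict[str, Item] = {}

    def store(self, item: Item, feature_names: list[str]) -> None:
        """Store an item, linking it to a mixture of old and new features."""
        self.items[item.name] = item
        for fname in feature_names:
            f = self.features.setdefault(fname, Feature(fname))
            f.links.setdefault(item.name, 1.0)

    def retrieve(self, probe_features: list[str]) -> list[str]:
        """Rank items indexed by the probe's existing features,
        scoring by feature energy times link strength."""
        scores: dict[str, float] = {}
        for fname in probe_features:
            f = self.features.get(fname)
            if f is None:          # novel features cannot index old knowledge
                continue
            for iname, strength in f.links.items():
                scores[iname] = scores.get(iname, 0.0) + f.energy * strength
        return sorted(scores, key=scores.get, reverse=True)
```

Retrieval here simply ranks the items reachable from the probe's existing features; features not already in the store are ignored, as in the CMS.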

2. Pure Mathematics
Pure mathematics is a good domain in which to study problem solving since, although it is a genuinely ecologically valid task, it uses little common-sense knowledge, and all the mathematical knowledge has been acquired through learning. Indeed, some branches of pure mathematics, such as group theory, can be learned by arts undergraduates, illustrating the point that the subject can be learned with little prior knowledge. Within a course a large body of knowledge is built up which is used to understand proofs and solve problems. This research has concentrated on modelling the understanding of mathematics texts. Since about the beginning of this century, pure mathematics texts have taken the form of definitions, theorems and lemmas (a lemma is a little theorem), their proofs, and exercises. Clearly it would be a large-scale exercise to develop a program to read verbatim mathematics texts, and so the natural language component has been factored out by rewriting the texts by hand in the language FEL (Formal Expression Language), Furse (1990). FEL has a formal syntax and is very expressive, as the example in Figure 1 shows.

There are many levels of understanding a mathematics text, but this research has focused on the ability to check proofs, by giving explanations of the steps, and to solve the simple problems. The problem solver uses a number of built-in general heuristics, and uses the CMS to filter the mathematical results so that there is no combinatorial explosion. Many theorem provers model an artificial task by careful feeding of only the results needed to solve the problem. In contrast, MU has access to the whole body of mathematical knowledge it has learned at the time, and uses its experience to focus on the problem in hand.

182

Cognitive Modeling

But the mathematician does not re-represent the definition in his or her head at this lower level. Rather, the representation is at the original level, and can even be utilised at this level without further unpacking; for instance, the lemma can be used in a step such as

is a homomorphism of G onto H with kernel K => K is a normal subgroup of G

simply by pattern matching. Instead, we represent the proposition in terms of its component concepts, namely "homomorphism", "kernel" and "normal-subgroup". If the student, on encountering this lemma, was already very familiar with normal subgroups, then it should be easy to encode, and the student should already have developed features for this purpose. In the CMS, features are generated dynamically from the environment using built-in feature-generating mechanisms. There are no built-in features (see Furse and Nicolson 1992, Furse 1993b). The program MU utilises a dynamic parser which converts the FEL propositions into a parse tree, which is then fed to the feature-generating processes. Thus, for example, the above proposition is represented as the tree shown in Fig. 3.

Figure 3: Extracting features from a tree

Here, the leaves of the tree have been replaced by a canonical representation using the letters a, b, c, ... By using mechanisms which construct different parts of this tree, a very large number of features can be built. In the following we use LHS to represent the left-hand side of the tree, RHS the right-hand side, and RHS-LHS to represent the RHS of the LHS. Thus in this case the RHS-LHS is the proposition kernel(c). It is also possible to abstract a node, so for example if we abstract the whole of the left-hand side we obtain the proposition a => normal-subgroup(b,c), where again the letters have been replaced by their canonical form. Using such methods one obtains features such as:

has-form-[homomorphism_a_b]
has-form-[kernel_a]
rhs-has-form-[normal-subgroup_a_b]
rhs-lhs-is-form-[kernel_a]
is-form-[=>_[and_a_b]_[normal-subgroup_c_d]]
is-form-[=>_[and_[homomorphism_a_b]_c]_[normal-subgroup_d_a]]

The last feature captures the notion that the student can remember that the lemma was something about being a


homomorphism implying that there was a normal subgroup. Thus these features enable a rich representation of partial knowledge. With sufficient features the original proposition can be reconstructed, and given the development of sufficiently specialised features, it can be represented exactly and compactly.
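The knowledge-free feature generation described above can be sketched as follows; the tree encoding as nested lists, the canonical relabelling, and the choice of which subtrees to visit are our own simplifications of the mechanisms in MU:

```python
def canonical(tree):
    """Replace leaf symbols by canonical letters a, b, c, ... left to right,
    keeping operator names (the first element of each list) literal."""
    letters = iter("abcdefghijklmnopqrstuvwxyz")
    mapping = {}
    def walk(t):
        if isinstance(t, list):
            return [t[0]] + [walk(s) for s in t[1:]]
        if t not in mapping:
            mapping[t] = next(letters)
        return mapping[t]
    return walk(tree)

def form(tree):
    """Linear 'form' notation, e.g. [=>_[and_a_b]_c]."""
    if not isinstance(tree, list):
        return tree
    return "[" + "_".join([tree[0]] + [form(s) for s in tree[1:]]) + "]"

def features(tree):
    """Generate is-form / has-form features, prefixed by the path
    (lhs/rhs) of the subtree they were extracted from."""
    out = {"is-form-" + form(canonical(tree))}
    def walk(t, path):
        if not isinstance(t, list):
            return
        if path:
            out.add("-".join(path) + "-has-form-" + form(canonical(t)))
            out.add("has-form-" + form(canonical(t)))  # position-free variant
        if len(t) == 3:                 # binary node: descend into lhs, rhs
            walk(t[1], path + ["lhs"])
            walk(t[2], path + ["rhs"])
    walk(tree, [])
    return out
```

Applied to a tree for "homomorphism(f, G) and kernel(K) => normal-subgroup(K, G)", this yields features such as rhs-has-form-[normal-subgroup_a_b] and has-form-[kernel_a], in the notation used above.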

4. Learning and Experience


Not only does the expert mathematician learn a large body of mathematical results, but also a large number of features of these results. This ensures that results are easily recognised and retrieved. Given a mathematical step, the mathematician has little difficulty in noticing that a known result has been used, or at least in checking the forward reasoning on applying this inference rule to produce the intermediate step. Given that the mathematician has hundreds or even thousands of results that might be relevant at any particular step, it is an important computational problem how the appropriate result is retrieved without difficulty. We argue that it is patterns or features that the mathematician learns to recognise. For example, the mathematician has no difficulty in seeing the pattern (x - y)(x + y) as being the left-hand side of a well-known result. If this result has been used sufficiently often, then a specialised feature such as has-form-[*_[-_a_b]_[+_a_b]] may have been stored, with which the result is indexed and retrieved.

It remains to explain how items are first stored in memory in terms of features, and how these features change through experience. As explained in the previous section, when a result is first processed by the CMS it is broken up into a large number of features using knowledge-free methods. Some of these features may already have been stored; others will be completely novel. The CMS stores a mixture of old and new features, where the old ones are selected from those with highest energy, all features being given an energy value which is adjusted with utilisation. If only old features were used, then we would soon be in a closed-box representation, but at the time of storage one does not know which of the new features may be useful. For example, consider the definition of a normal subgroup:

Definition.
N isa normal-subgroup of G iff N isa subgroup of G and, for every g in G and n in N, g * (n * inv(g)) is in N.

Here, on initial encoding, features that might be used could include:

has-form-[*_a_b]
has-form-[subgroup_a_b]

but the crucial feature is:

has-form-[*_a_[*_b_[inv_a]]]

but it is only through experience that the mathematician learns of the importance of recognising the feature gng^-1. Within the CMS, when a result is retrieved (for example in proof checking or problem solving), the features are adjusted to ensure that retrieval is more efficient in future. Recall involves first computing the features of the probe. For example, in trying to prove that:

log[(x - y)(x + y)] + 2log(y) = 2log(x)

the system first searches for a whole matching result before trying to reason from the left-hand side. In reasoning from the left-hand side, the LHS acts as the probe for the CMS, generating features such as:

lhs-has-form-[log_a]
has-form-[*_a_b]
lhs-has-form-[-_a_b]
lhs-lhs-has-form-[*_a_b]

All these features must be existing ones, for clearly it is pointless to use a novel feature as an index to old knowledge. The feature-generating mechanisms generate many features, and a subset is chosen of those of highest energy. In the second stage of recall, this set of features is used to index a number of mathematical results. Results which do not match the probe are considered failures. Ideally the features should index only relevant results, which can be applied, but this only comes through experience as the feature space changes. The matching results are then applied, and the outcome which has the highest plausibility, in terms of being nearer to the goal and a simpler expression, is chosen. In the third stage of recall, learning takes place to ensure that the found item is easier to retrieve in future. This means ensuring that the same set of input features should be more likely to retrieve the found item than the failures.
This is achieved by four processes:

storing the uncomputed features
encouraging the useful features
discouraging the unuseful features
creating new distinguishing features

Most of the features used in the probe will already index the found item, but some of them may have arisen from storing other items, and may not be connected to the found item. These are known as uncomputed features, and provided the found item possesses the feature, they are now stored. By this means new connections between features and items are regularly being made during retrieval. Useful features are features which index the found item but not the failures. These features are encouraged by increasing both their energy and the link strength between feature and found item. Conversely, unuseful features index the failures but not the found item, and have their energy decreased. The final method of adjusting the CMS involves the dynamic creation of novel features which index the found item but not the failures, and do not already exist. This is achieved by comparing the large feature sets of the found item and the failures generated


by bottom-up methods. By such a mechanism, specialised features can be generated. By these means the CMS ensures that results which are important have a number of high-energy specialised features, so that such results are recognised easily. Thus as a result is used through the stages (1)-(3) described above, its representation changes from initially being encoded only partially in terms of its features, and likely to be retrieved with other similar results, to ultimately being a result with specialised features, easy to recall and use. See Furse (1992) for more details of the CMS.
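The four adjustment processes might be sketched as follows; the data layout and the learning rates are our own illustrative assumptions, since no numerical detail is given here:

```python
# Feature store: {name: {"energy": float, "links": {item_name: strength}}}.
ENCOURAGE = 0.1   # increment for useful features (our choice)
DISCOURAGE = 0.1  # decrement for unuseful features (our choice)

def adjust(features, probe, found, found_feats, failures, failure_feats):
    """Third-stage learning: make `found` easier to retrieve than `failures`.

    features      -- the feature store, mutated in place
    probe         -- feature names computed from the probe
    found         -- name of the successfully applied result
    found_feats   -- full feature set of the found item (bottom-up)
    failures      -- names of retrieved-but-inapplicable results
    failure_feats -- union of the failures' feature sets
    """
    for name in probe:
        f = features.get(name)
        if f is None:
            continue
        hits_found = found in f["links"]
        hits_fail = any(x in f["links"] for x in failures)
        if not hits_found and name in found_feats:
            f["links"][found] = 1.0              # store an uncomputed feature
        elif hits_found and not hits_fail:
            f["energy"] += ENCOURAGE             # encourage a useful feature
            f["links"][found] += ENCOURAGE
        elif hits_fail and not hits_found:
            f["energy"] -= DISCOURAGE            # discourage an unuseful feature
    # create novel distinguishing features: those which would index the
    # found item but no failure, and which do not already exist
    for name in found_feats - failure_feats:
        if name not in features:
            features[name] = {"energy": 1.0, "links": {found: 1.0}}
```

Each call thus strengthens the paths from the probe's features to the found result, weakens the paths to the failures, and grows specialised features that distinguish the found result from its competitors.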

5. The Mathematics Understander
The Mathematics Understander (MU) (Furse 1993a) is a computer program which utilises the CMS to "understand" mathematics texts expressed in FEL. The original text is translated by hand into FEL, which is then input to the program. MU parses the input using a parser whose syntactic knowledge is built up dynamically from definitions and stored in the CMS. The parsed input is fed to the proof checker and problem solver, which utilises the acquired knowledge in the CMS to produce explanations of the steps and solutions to the problems (see Fig. 4). MU thus has parsing, proof-checking and problem-solving knowledge built in, but no knowledge of mathematical results. All the mathematical results are acquired through the reading of texts. Fig. 2 shows an example of a problem solution generated by MU. The built-in heuristics used in problem solving are:

expand a definition
break a compound step into parts
suppose the left hand side
simplify

Simplification involves the application of known results, using the CMS to retrieve only relevant results rather than all possibly applicable results. The results retrieved are applied and their plausibility measured in terms of nearness to the goal and simplicity. The result which achieves the highest plausibility is the one chosen and, if necessary, MU will backtrack, but this is very rarely found to be necessary. MU is implemented in Macintosh Common LISP and runs on Apple Macintosh computers. The CMS can be displayed in a graphical format and navigated through. It is also possible to set parameters for individual modelling (see Fig. 5). MU has successfully read texts in group theory and classical analysis, and solved simple problems in group theory.
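The simplify heuristic might be sketched as follows; the string-level rewrite rules and the plausibility measure (symbol overlap with the goal, minus a length penalty) are our own crude stand-ins for MU's term rewriting and plausibility judgement:

```python
def plausibility(expr, goal):
    """Score an outcome: prefer expressions nearer the goal and simpler.
    Illustrative measure only: shared-character overlap minus a length
    penalty, standing in for MU's real nearness/simplicity judgement."""
    overlap = len(set(expr) & set(goal))
    return overlap - 0.1 * len(expr)

def simplify(expr, goal, rules):
    """One simplification step: apply every retrieved rule (a pattern ->
    replacement pair over substrings) and keep the most plausible outcome.
    The CMS would supply only the relevant `rules`, not all known results."""
    candidates = []
    for lhs, rhs in rules:
        if lhs in expr:
            candidates.append(expr.replace(lhs, rhs, 1))
    if not candidates:
        return expr                # no retrieved result applies
    return max(candidates, key=lambda c: plausibility(c, goal))
```

For instance, with the single retrieved rule rewriting (x - y)(x + y) to x^2 - y^2, the probe expression from Section 4 is rewritten one step towards the goal; with several candidates, the plausibility score arbitrates between them.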

Education in pure mathematics is often a process of the student understanding the proofs of theorems and solving the problems. This experience enables the student to develop appropriate features so that relevant results can easily be retrieved when necessary. The lazy student who skips the proofs and the simple problems runs the risk of not being able to solve the harder problems. If MU "reads" only the definitions and theorems, and skips the proofs and simple problems, it too cannot retrieve the results needed when trying to solve the problem in Figure 2, and gives up.

6. Discussion

In some sense an expert "sees" that in solving a problem a particular step should be taken. It seems implausible that a large-scale search takes place of knowledge which might be useful. Rather, the knowledge required almost springs out from the problem. It is this notion, placing perception and learning centre stage in cognition, which this research attempts to address. We have shown how in the domain of pure mathematics expertise can be captured with a large body of knowledge indexed by features. Further, the CMS allows the modelling of how these features are learned through experience.


Naturally the CMS uses several ideas which are well known in the literature. For instance, the feature-recognition mechanism has similarities with EPAM (Feigenbaum and Simon, 1986). The energy mechanism resembles the activation used by Anderson in ACT-R and ACT* (1983), though there is no spreading activation in the CMS. We believe, however, that the CMS is distinctive in combining all these ideas within an integrated environment. The CMS can, in principle, be applied to domains other than mathematics, since there is no built-in mathematics in its construction. The major requirement is the development of appropriate feature-generating mechanisms. Work is in hand in applying the CMS to the learning of board games.

Research in neural networks has brought learning back to the centre of work in artificial intelligence, and there is a clear emphasis in their application to perceptual classification problems. However, to our knowledge there has not been much success in applying such architectures to problem solving in domains as complex as pure mathematics. Furthermore, it could be argued that a neural network is restricted in its internal representations by the features used as the input to the network. In contrast, the CMS generates its features dynamically from the environment. The other main competitors to the CMS model are Anderson's ACT* model and Laird, Rosenbloom and Newell's SOAR.

Whilst both architectures model problem solving, and in particular model goal-based reasoning, which the CMS does not model, the focus is quite different. Both models are essentially based on a production-systems architecture, and this ultimately limits their scope for adaptation. Although both ACT* and SOAR model learning, they do not appear to model the declarative learning for which the CMS is designed. Anderson has concentrated on modelling how knowledge becomes proceduralised, but does not address the question of how the initial knowledge is learned in the first place. Newell defined learning as search within a problem space, but as both Norman (1991) and Boden (1988) have remarked, this seems an impoverished view of learning. Pure mathematics cannot be represented as a problem space, since new constructs are continually being introduced in a relatively ad hoc manner. SOAR appears to want the domain to be precharacterised as a problem space in advance, before learning can take place. A similar problem occurs with explanation-based generalisation, whereby a domain theory is required in advance. However, human learning is much more piecemeal, and teachers almost never map out in advance a characterisation of the domain to be learned.

In conclusion, perception and experience play an essential part in problem solving. It is not sufficient for an expert to know a lot of facts. The expert also has to be able to recognise when they are relevant, and be able to retrieve them.

7. References
Anderson, J.R. (1983), The Architecture of Cognition, Harvard University Press.
Boden, M. (1988), Computer Models of Mind, Cambridge: Cambridge University Press.
Bundy, A. (1983), The Computer Modelling of Mathematical Reasoning, Academic Press.
Chase, W.G. & Simon, H.A. (1973), Perception in Chess, Cognitive Psychology 4, pp. 55-81.
DeGroot, A.D. (1966), Perception and Memory versus Thought: Some Old Ideas and Recent Findings, in B. Kleinmuntz (ed), Problem Solving: Research, Method and Theory, Wiley.
Feigenbaum, E.A. and Simon, H.A. (1986), EPAM-like Models of Recognition and Learning, Cognitive Science, 8: 305-336.
Furse, E. (1990), A Formal Expression Language for Pure Mathematics, Technical Report CS-90-2, Department of Computer Studies, The University of Glamorgan.
Furse, E. (1992), The Contextual Memory System: A Cognitive Architecture for Learning Without Prior Knowledge, Cognitive Systems 3-3, September 1992, pp. 305-330.
Furse, E. and Nicolson, R.I. (1992), Declarative Learning: Cognition without Primitives, Proceedings of the 14th Annual Conference of the Cognitive Science Society, pp. 832-837.
Furse, E. (1993a), The Mathematics Understander, in Artificial Intelligence in Mathematics, (eds) J.H. Johnson, S. McKee, A. Vella, Clarendon Press, Oxford (in press).
Furse, E. (1993b), Escaping from the Box, in Prospects for Intelligence: Proceedings of AISB93, (eds) Aaron Sloman, David Hogg, Glyn Humphreys, Allan Ramsay, Derek Partridge, IOS Press, Amsterdam.
Laird, J.E., Newell, A. & Rosenbloom, P.S. (1987), SOAR: An Architecture for General Intelligence, Artificial Intelligence 33, No. 1, Elsevier Science Publishers.
Laird, J.E., Rosenbloom, P.S. & Newell, A. (1986), Chunking in SOAR: The Anatomy of a General Learning Mechanism, Machine Learning 1, pp. 11-46.
Lenat, D.B. and Feigenbaum, E.A. (1991), On the Thresholds of Knowledge, Artificial Intelligence, 47: 185-250.
Norman, D.A. (1991), Approaches to the Study of Intelligence, Artificial Intelligence 47, 327-346.
Schank, R.C. and Riesbeck, C.K. (1981), Inside Computer Understanding, Lawrence Erlbaum.
Turing, A.M. (1953), Digital Computers Applied to Games, in B.V. Bowden (ed), Faster than Thought, London: Pitman, 286-310.

