ANDERSON. Problem Solving and Learning. 1993

John R. Anderson discusses the integration of problem-solving and learning theories, emphasizing the need for a stronger framework to account for variability in problem-solving behavior. He critiques traditional learning theories for neglecting problem-solving analysis and highlights the ACT* theory as a means to bridge the gap between these fields. The article also reviews the canonical conception of problem-solving, particularly through the lens of Newell and Simon's work, and explores methods like means-ends analysis and subgoaling in understanding human cognition.

Science Watch

Problem Solving and Learning


John R. Anderson

Newell and Simon (1972) provided a framework for understanding problem solving that can provide the needed bridge between learning and performance. Their analysis of means-ends problem solving can be viewed as a general characterization of the structure of human cognition. However, this framework needs to be elaborated with a strength concept to account for variability in problem-solving behavior and improvement in problem-solving skill with practice. The ACT* theory (Anderson, 1983) is such an elaborated theory that can account for many of the results about the acquisition of problem-solving skills. Its central concept is the production rule, which plays an analogous role to the stimulus-response bond in earlier learning theories. The theory has provided a basis for constructing intelligent computer-based tutoring systems for the instruction of academic problem-solving skills.

Donald J. Foss served as action editor for this article.
This research was supported by National Science Foundation Grant BNS-8705811 and Office of Naval Research Contract N00014-90-J-1489. I would like to thank Allen Newell and Lynne Reder for their comments.
Correspondence concerning this article should be addressed to John R. Anderson, Department of Psychology, Carnegie Mellon University, Pittsburgh, PA 15213.

January 1993, American Psychologist, Vol. 48, No. 1, 35-44
Copyright 1993 by the American Psychological Association, Inc. 0003-066X/93/$2.00

Thorndike's (1898) original learning experiments involved cats learning to solve the problem of getting out of a puzzle box. As most introductory psychology texts recount, Thorndike concluded that his cats managed to get out of the puzzle box by a trial and error process. In Thorndike's conception there was really nothing happening that could be called problem solving. What was happening was the gradual strengthening of successful responses. Thorndike's research is often cited as the beginning of the analysis of learning that occupied American psychology for much of this century. It could also be cited as the beginning of the neglect of problem solving as a topic worthy of analysis.

Although Köhler (e.g., 1927) and the other Gestalt psychologists used problem-solving tasks to demonstrate the inadequacies in the behaviorist conceptions of learning, they failed to offer an analysis of the problem-solving process. Tolman (1932) saw the critical role of goals in learning and behavior but failed to put that insight into a coherent theory, leaving him vulnerable to Guthrie's (1952) famous criticism that he left his rat buried in thought and inaction.

Problem solving finally was given a coherent program of analysis by Newell and Simon (1972) in a line of research that culminated in their book Human Problem Solving. The basic conception of problem solving they set forth continues to frame research in the field. Their conception had its foundation in artificial intelligence and computer simulation of human thought and was basically unconnected to research in animal and human learning.

Research on human learning and research on problem solving are finally meeting in the current research on the acquisition of cognitive skills (Anderson, 1981; Chi, Glaser, & Farr, 1988; Van Lehn, 1989). Given nearly a century of mutual neglect, the concepts from the two fields are ill prepared to relate to each other. I will argue in this article that research on human problem solving would have been more profitable had it attempted to incorporate ideas from learning theory. Even more so, research on learning would have borne more fruit had Thorndike not cast out problem solving.

This article will review the basic conception of problem solving that is the legacy of the Newell and Simon tradition. It will show how this conception solves the general problem of the relationship between learning and performance that has haunted learning theory. In particular, it provides a concrete realization of Tolman's insights. I will also present the case for problem solving as the structure that organizes human thought and means-ends analysis as the principal realization of that structure. I will argue, however, that this research has been stunted because of its inability to deal with variability and change in behavior.

Then I will turn to the more recent research on acquisition of cognitive skills. I will discuss the critical role of the production rule, a computational improvement over the stimulus-response bond, in organizing that research. I will show how the acquisition of complex skills can be accounted for by the separate acquisition of these rules, thus realizing the goal of learning theory to account for complex learning in terms of the acquisition of simple units. I will close by discussing the implications of this analysis for education, one of Thorndike's great concerns. Here I will describe my own research on intelligent tutoring systems, which has been based on the recent insights into problem solving and learning. We have been able to greatly accelerate and improve the acquisition of
complex skills, such as proof skills in geometry or computer programming skill. This serves to illustrate the powerful practical applications that can be achieved if only the fields of problem solving and learning listen to each other.

Canonical Conception of Problem Solving

In this section I will try to sketch the canonical conception of problem solving that has its origins with the work of Newell and Simon.

Problem Space

The concept of a problem-solving state is probably the most basic term in the Newell and Simon characterization of problem solving. A problem solution can be characterized as the solver beginning in some initial state of the problem, traversing through some intermediate states, and arriving at a state that satisfies the goal. If the problem is finding one's way through a maze, the states might be the various locations in the maze. If the problem is solving the Tower of Hanoi problem (see Figure 1), the states would be various configurations of disks and pegs.¹ The actual reference of state is ambiguous. It could mean either some external state of affairs or some internal coding of that state of affairs. Newell and Simon, with their emphasis on problem solving by computer, typically took it to mean the internal coding.

Figure 1
Tower of Hanoi Problem

The second key construct is that of a problem-solving operator. An operator is an action that transforms one state into another state. In the maze the obvious operators are going from one location to another, whereas in Tower of Hanoi they are various movements of disks. An operator can be characterized by what must be true for it to apply and what change it produces in the state. In the case of the maze, there must be a path between the two locations for the move operator, and its effect is to change the location of the organism. In the case of Tower of Hanoi, the disk to be moved must be on top of the source peg and must be smaller than the smallest disk at the destination peg. Its effect is to change the location of the disk. Newell and Simon conceived of the problem solver as having an internal representation of the operators, their preconditions, and their effects.

Together the concepts of state and operator define the concept of a problem space. At any state some number of operators apply, each of which will produce a new state, from which various operators can apply producing new states, and so forth. Figure 2 illustrates the complete problem space for the three-disk Tower of Hanoi problem, one of the smaller of the problem spaces. As can be seen, many problem spaces are closed with only a finite set of reachable states and loops among those states. Within the problem-space conception, the problem in problem solving is search, which is to find some sequence of problem-solving operators that will allow traversal in the problem space between the current state and a goal state.

Figure 2
Problem Space for the Three-Disk Tower of Hanoi Problem
Note. Adjacent configurations can be reached by a single, legal move of the disk.

In contrast to states and operators, Newell and Simon did not hold that there is an internal representation of an entire problem space. Rather problem solvers can dynamically generate paths in this space by applying their operators. This generation process can either be done externally, in which case direct actions are taken, or internally, in which case the problem solver imagines some sequence of actions to evaluate them.
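
The notions of state, operator, and problem space described above can be made concrete with a short sketch. The Python below is an illustration only, not code from Newell and Simon or from this article. The state encoding (a tuple of three pegs, each a tuple of disk sizes from bottom to top), the names legal_moves, apply_move, and search, and the choice of breadth-first search are all assumptions made for brevity. The sketch generates the states reachable in the three-disk Tower of Hanoi problem and finds a shortest sequence of operators from the start state to the goal state.

```python
from collections import deque

# Hypothetical encoding: a state is a tuple of three pegs; each peg is a
# tuple of disk sizes listed from bottom to top (larger number = larger disk).
START = ((3, 2, 1), (), ())
GOAL = ((), (), (3, 2, 1))


def legal_moves(state):
    """Yield (source, destination) peg pairs whose preconditions hold."""
    for src, src_peg in enumerate(state):
        if not src_peg:
            continue  # precondition: there must be a disk to move
        disk = src_peg[-1]  # only the top disk of a peg can be moved
        for dst, dst_peg in enumerate(state):
            # precondition: the moved disk is smaller than the top disk
            # (and therefore every disk) on the destination peg
            if dst != src and (not dst_peg or dst_peg[-1] > disk):
                yield src, dst


def apply_move(state, move):
    """Return the new state produced by applying the operator."""
    src, dst = move
    pegs = [list(p) for p in state]
    pegs[dst].append(pegs[src].pop())
    return tuple(tuple(p) for p in pegs)


def search(start, goal):
    """Breadth-first search over the problem space.

    Returns (a shortest list of moves from start to goal, all states reached).
    """
    paths = {start: []}
    frontier = deque([start])
    while frontier:
        state = frontier.popleft()
        for move in legal_moves(state):
            nxt = apply_move(state, move)
            if nxt not in paths:
                paths[nxt] = paths[state] + [move]
                frontier.append(nxt)
    return paths.get(goal), set(paths)


solution, reachable = search(START, GOAL)
print(len(reachable))  # 27 states in the three-disk problem space
print(len(solution))   # 7 moves on a shortest path
```

The whole space is generated here only because it is small. The point that problem solvers generate paths dynamically, rather than holding the entire space, corresponds to expanding states on demand as the search proceeds.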

Note to Figure 1 (pegs labeled start peg and finish peg). The goal is to move all the disks from the start peg to the finish peg. Only one disk may be moved at a time, and one cannot place a larger disk on a smaller disk.

¹ The Tower of Hanoi task is one of a number of "toy" tasks that had an important role in the early development of ideas about problem solving. Studies of problem solving have now extended to complex and important problem-solving tasks. However, the Tower of Hanoi task and others like it remain useful both for exposition of the basic concepts and as paradigms for studying these concepts in relative isolation.

Problem-Solving Methods

Whether one is performing operators externally or imagining them, the critical issue is how to select the next operator. The term problem-solving method refers to the principles used for selecting operators. The method chosen can vary from blind search to executing an algorithm that is guaranteed to find a minimum-step solution. Problem solvers' behavior in a particular situation can be understood by knowing which method is being used. Artificial intelligence textbooks (e.g., Nilsson, 1971) frequently recount a large array of often exotic methods. Anderson (1990b) can be consulted for evidence that humans at various times use some of the simpler methods. For instance, people tend to select operators that create states more similar to the goal state (this method is called hill climbing). The next subsection discusses in some detail the method of means-ends analysis, which seems to be the premier human problem-solving method.

Although problem solving can be typically understood as some method applying in a fixed problem space, occasionally problem solving can progress by changing the problem space by re-representing the problem states or the operators or by adding new operators. These tend to be thought of as the more insightful problem solutions. Research on functional fixedness (e.g., Duncker, 1945) can be thought of in these terms, as can research on problem-solving representation (e.g., Kaplan & Simon, 1990).

Newell and Simon in their 1972 monograph showed how to apply their method of analysis to a number of problem-solving situations. By characterizing a subject's representation of states, his or her operators, and the problem-solving method, one is able to simulate the behavior of subjects down to the point of predicting every (or nearly every) move they make in a complex problem-solving episode. One can walk away from such an analysis with the claim of having understood the episode in a fairly rich and detailed way. Although the issue of evaluating the fit of such a simulation model to the episode has always been a sore point, often the qualitative fit can be quite compelling.

It is of interest to consider the outlines of the application of this analysis to some classic learning task, such as an animal learning to run a maze. Under this analysis the learning that takes place is effectively operator learning: learning that moving along a path will get the animal from one location to another. The performance that takes place would use this operator knowledge through some problem-solving method to achieve the goal. Thus, as Tolman (1932) insisted, learning is separate from performance, and it is goals that trigger the conversion of what has been learned into performance. Tolman was criticized for not unpacking how that conversion took place. It is the problem-solving method that converts what is learned into performance in service of a goal. Thus, the rat is no longer left lost in thought, and there is nothing nonmechanical guiding the animal through the maze.

Means-Ends Analysis

Two key features often observed of human problem solving are difference reduction and subgoaling. Difference reduction refers to the tendency of problem solvers to select operators that produce states more similar to the goal state. People are very reluctant to pursue paths that temporarily take them in the direction of states less similar to the goal (see Anderson, 1990b). One of Köhler's (1927) interests was to understand the difficulties various species of animals have with detour problems that require them to take a nondirect path to the goal. So the reliance on similarity is hardly unique to humans. Anderson (1990a) can be consulted for arguments that this reliance on similarity is adaptive in that most problems can be effectively solved by moving in the direction of the goal. Of course, how one measures similarity can be tricky, and some kinds of problem-solving learning take the form of developing more useful ways of assessing similarity to the goal state. This is often characterized as problem solvers going beyond the surface features of a problem to its deep features (e.g., Chi, Feltovich, & Glaser, 1981).

Subgoaling can be nicely illustrated in the Tower of Hanoi problem. For instance, consider the following protocol of one of Neves's (1977) subjects who was faced with the Tower of Hanoi problem in Figure 3:

The 4 has to go to the 3,
But the 3 is in the way.
So you have to move the 3 to the 2 post.
The 1 is in the way there.
So you move the 1 to the 3.

Figure 3
State of Tower of Hanoi Problems Facing the Subject Whose Protocol is Reported in the Article

As in this case, subgoaling can involve creating a stack of such subgoals. Simon (1975) discussed the difficulty in remembering these subgoals. Anderson and Kushmerick (in press) showed that the time to make a move in the Tower of Hanoi task is strongly correlated with the number of subgoals that must be set before that move.

Means-ends analysis provides a way of understanding why difference reduction and subgoaling are so pervasive in human problem solving and how they relate to one another. Figure 4 illustrates the logic of means-ends analysis. The basic cycle of the problem solver is to look for the biggest difference between the current state and the goal state and try to reduce that difference. The problem solver makes a subgoal of eliminating that difference. Thus, if a problem solver correctly perceives the Tower of Hanoi problem, he or she would consider the biggest difference to be the largest disk out of place, as did Neves's (1977) subject. The problem solver searches for some operator relevant to removing that difference. If the operator can be applied, it is, and problem solving progresses forward. However, if it cannot (as when a disk blocks the move of another disk in Tower of Hanoi), the problem solver sets the subgoal of eliminating the blocking condition. Thus, for instance, Neves's subject set the subgoal of removing Disk 3, which was blocking the move of Disk 4. The problem solver no longer is working on the original goal but is working on a subgoal, which is only a means to the ultimate end. The three key features of means-ends analysis are the focus on eliminating a single large difference, the selection of operators by what differences they reduce, and the subgoaling of the preconditions of the operator if they are not met in the current state. Anderson (1990a) can be consulted for a general analysis of why this problem-solving method can lead to optimal problem solving in novel situations.

Figure 4
Application of Means-Ends Analysis
Flowchart I. Goal: Transform current state into goal state. Boxes: "Match current state to goal state to find the most important difference"; "Subgoal: Eliminate the difference". Arrow labels: DIFFERENCE DETECTED, NO DIFFERENCES, SUCCESS, FAIL.
Flowchart II. Goal: Eliminate the difference. Boxes: "Search for operator relevant to reducing the difference"; "Match condition of operator to current state to find most important difference"; "Subgoal: Eliminate the difference". Arrow labels: OPERATOR FOUND, NONE FOUND, DIFFERENCE DETECTED, NO DIFFERENCE, APPLY OPERATOR, SUCCESS, FAIL.
Note. Flowchart I breaks a problem down into a set of differences and tries to eliminate each. Flowchart II searches for an operator relevant to eliminating a difference.

Means-ends analysis does not just apply to exotic laboratory puzzles. Newell and Simon (1972) emphasized that it is found in all aspects of life. Consider, for instance, their following example:

I want to take my son to nursery school. What's the difference between what I have and what I want? One of distance. What changes distance? My automobile. My automobile won't work. What is needed to make it work? A new battery. What has new batteries? An auto repair shop. I want the repair shop to put in a new battery; but the shop doesn't know I need one. What is the difficulty? One of communication. What allows communication? A telephone . . . and so on. (p. 416)

Whereas it would be incorrect to assert that all human problem solving is organized by means-ends analysis, this problem-solving method has played the largest role in accounting for behavior in puzzles like Tower of Hanoi, academic problem solving (Larkin, McDermott, Simon, & Simon, 1980), and everyday problem solving (Klahr, 1978). Often because of the structure of the problem, all the aspects of the underlying means-ends method do not manifest themselves. Thus, problem solving on certain puzzles may look like hill climbing (e.g., Jeffries, Polson, Razran, & Atwood, 1977) because the operators for the problem do not have the kind of prerequisite structure that leads to subgoaling, and so we only see difference reduction. Conversely, a problem may look like pure subgoal decomposition (Anderson, Farrell, & Sauers, 1984) because there is no similarity structure to guide the choice of subgoals.

It is of interest to speculate how far means-ends analysis is found down the phylogenic scale and developmental scales. Klahr (1978) has argued that children are quite capable of means-ends analysis. Their problem solving is often ineffective because of inadequate representation of the problem, and they become more effective means-ends problem solvers when their representations of the problem and the operators become sophisticated enough to enable means-ends problem solving to apply. Köhler's (1927) characterization of chimpanzee problem solving would seem to imply a means-ends capacity for them, even as his more dismal characterization of lower organisms would imply they do not have a means-ends capacity. There should be a very strong connection between tool manufacture and use and means-ends problem solving. A tool is a concrete means to an end. My own
belief is that the means-ends problem-solving method is an innate part of the cognitive machinery of humans and other primates.

Central Role of Problem Solving in Cognition

The remark above about the possible innate status of the means-ends method raises the issue of how to conceive of the place of problem solving in cognition generally. There is a tendency of some psychologists to view research on problem solving as a narrow domain approximately equivalent to research on mathematical behavior. That is, it is an intellectual activity that we may engage in a few times a day and that can be understood in terms of principles of cognition more general than problem solving. This is far from how some researchers on problem solving (e.g., Newell, 1980) have viewed the matter. For them, all higher level cognition is problem solving. This is an implication of the proposal made above for how problem solving provides the bridge between learning and performance. The problem-solving methods provide the mechanisms for converting knowledge into behavior, including cognitive behavior. They provide this bridge everywhere and not just with esoteric puzzles.

One problem with the claim for the central role of problem solving is that much of human cognition does not feel like problem solving. Some activities, like solving a Tower of Hanoi problem or solving a new kind of physics problem, feel like problem solving, whereas other more routine activities, such as using a familiar computer application or adding up a restaurant bill, do not. This reflects the difference between the reference of problem solving in everyday speech and its use by researchers. In everyday speech the term problem solving refers to activities that are novel and effortful. The theorist's claim is that the underlying organization of these activities is no different from the underlying organization of the more routine.

Newell (1980) argued that the dimension of difference between routine problem solving and real problem solving is the amount of search involved. When we become familiar with a problem domain, we learn which operators apply without having to search among them. The experience of effort is correlated with the amount of problem-solving search. Newell argued that we are always in a search space, as witnessed by what happens when we hit on some novel problem state in an otherwise routine problem space. Newell claimed that we transit smoothly into problem-solving search and indeed that much of human cognition is a mixture of routine problem solving and problem solving that involves search. This claim is realized in his Soar model of cognition (Newell, 1990).

Complications With the Canonical Conception

In this section, I consider problems with the canonical conception of problem solving that arise because of its failure to incorporate the perspective of a learning theory.

Variability in Problem Solving

One of the things that is apparent when human problem solving is considered is that the solutions produced vary across replications of the problem with different individuals or indeed for the same individual on different occasions. This variability shows up in subjects taking different paths of solutions to solve a problem and in terms of their making occasional errors in their problem solving. It is not much noted, but if one looks at the latencies one sees considerable variability in the times required to perform the same step of a solution (Anderson, Kushmerick, & Lebiere, in press). Such variability has been observed by many researchers in human problem solving but is perhaps best documented in our research on LISP programming where we observed more than 100 students solving more than 100 LISP programming problems (Anderson, Conrad, & Corbett, 1989). The canonical problem-solving framework with its emphasis on deterministic behavior is not well prepared to handle this variability.

There are two basic ways that such variability has been approached within the canonical framework. One is to attribute the differences to differences among the cognitive models of different people (and sometimes among the cognitive models of the same person at different times). In the standard framework, this comes down to differences in problem-solving representations, operators, and methods. This leads to a style of theorizing in which separate models are proposed for each subject, which creates a frustrating problem of generality in the claims that can be made.

Perhaps the most hopeful effort of this sort has been the attempt to account for errors in problem solving in terms of bugs or misconceptions about the problem domain (e.g., Brown & Van Lehn, 1980). In one notable effort, Burton (1982) accounted for a large fraction of subtraction errors by assuming over 100 different bugs. The term bugs comes from analogy to programming where a program can have an error that leads to a systematic mistake. It was hoped that we could come up with a theory of the origins of these bugs in terms of the learning history of the students (e.g., Van Lehn, 1989). A learning account of variability would be a way to achieve generality. Unfortunately, subsequent research has cast doubt on the systematicity of these errors (Anderson & Jeffries, 1985; Anderson & Reder, 1992; Katz & Anderson, 1988; Payne & Squibb, 1990). Often students are best characterized as doing the right thing most of the time and, when they make errors, being unsystematic in the errors they make.

The second approach is simply to assume a certain randomness in which alternative operators (perhaps some buggy) are indiscriminately chosen among. This has not been a popular move but can be found in some attempts to deal with the statistical distribution of solutions across subjects (e.g., Atwood & Polson, 1976; Jeffries et al., 1977). This approach certainly has a grain of truth to it, but it fails to reflect the systematicity that does exist in
the choices that are made. Anderson et al. (in press) were cases, the expert is adopting approaches that are effective
able to show that the distribution of choices among op- for that problem domain. In the case of programming.
erators was strongly correlated with the optimality of the the strategy is explicitly taught as the structured pro-
operators. Also, the frequencies of erroneous choices de- gramming methodology in programming courses; in the
crease gradually (within a single subject) with experience. case of physics, it appears to be induced.
Variability in behavior and the gradual improvement Experts also appear to use better problem represen-
of the distribution of responses with experience are, of tations. In particular, experts appear to represent prob-
course, the bread and butter of typical learning theories. lems in terms of deeper features, which are connected to
This suggests that problem-solving approaches would do problem-solving success, rather than superficial features.
well to incorporate into their analyses some of the stan- For instance, Chi et al. (1981) found that novices sorted
dard ideas from learning theory. The trick is to do this problems on superficial features, such as whether they
and maintain the computational power of existing ap- involved inclined planes, whereas experts sorted them
proaches that is clearly needed to deal with the complex, according to Newton’s laws.
coordinated structure of a problem-solving sequence.
Increased Problem-Solving Capacity
Learning in Knowledge-Rich Domains
In contrast to these improvements that seem to be cap-
In the last decades, there has been a surge of research on tured by changes in the problem space, other changes
how the transition is made from novel to routine problem seem to reflect a fundamental increase in capacity for
solving as one gathers experience with a problem domain. solving problems within a fixed problem space. For in-
This reflects a shift in research interest both toward stance, there is evidence for improved memory for prob-
learning and toward knowledge-rich, real problem-solving lem states. This was first well documented with respect
domains, such as physics, and away from knowledge-lean to chess, where it was shown that chess experts were able
toy tasks like the Tower of Hanoi. This effort has identified to reproduce much more of a chessboard given a brief
both strengths and weaknesses in the canonical theory. exposure than were chess novices (Chase & Simon, 1973).
A great deal of this research has taken the form of The same phenomenon has been shown subsequently in
comparing subjects who are relative experts at a problem- a large number of domains. It was first thought that this
solving task with subjects who are relative novices at the task. Inferences are made about learning on the basis of the comparisons. Perhaps the most significant single observation is that no one achieves a high level of performance in any domain without a great investment of time. Hayes (1985) estimated that it takes 10 years to achieve master’s levels of performance in most professional domains. This indicates that problem-solving expertise does not come from superior problem-solving ability but rather from domain learning.

Not surprisingly, there are great differences between problem-solving experts and novices as a function of the extensive learning experiences of the experts. These differences are reviewed in Anderson (1990b) and Van Lehn (1989).

Some of these differences appear to be nicely captured within the canonical model. For instance, there are changes in how experts go about solving problems. It is possible to separate these changes into what has been called tactical learning and strategic learning. Tactical learning refers to the acquisition of new, often more complex problem-solving operators. So, for instance, with practice geometry students learn to recognize vertical angle configurations involving triangles they are trying to prove congruent (e.g., Anderson, 1990b). Strategic learning refers to wholesale changes in the methods students use to organize their problem solving. So, novice problem solvers in physics work backwards from what they are trying to find to the givens of the problem, whereas experts work in the opposite direction (e.g., Larkin et al., 1980). In programming, more expert students will use top-down, breadth-first progressive refinement, whereas novices will not (Jeffries, Turner, Polson, & Atwood, 1981). In all

could be accounted for by the fact that experts had learned a great many complex problem patterns and so could store in a single chunk information that novices required many chunks to store. That is, it was thought it could be accounted for by changes in problem-solving representations. However, newer evidence and analysis now indicate that experts can store more information (more chunks) in long-term memory (Charness, 1976; Van Lehn, 1989). This increased long-term memory capacity is something outside the canonical theory. It does not contradict the canonical theory, but the canonical theory does not provide the terms to explain it.

One of the most straightforward effects of increased practice of a particular skill is that it is performed more quickly and more accurately. The form of the reduction of time or errors with practice can be shown (Newell & Rosenbloom, 1981) to be a power function of the form

$$P = AN^{-b},$$

where P is the performance measure (time or errors), A is a scaling constant, N is the number of trials of practice, and b is a constant usually less than one that reflects learning rate. The fact that learning satisfies this functional form is not altogether trivial. The typical learning function that has been proposed in most learning theories is exponential:

$$P = Ab^{N},$$

where b is again less than one. The exponential learning function has the intuitively appealing property that for each unit of practice, performance improves by a constant fraction b. This predicts much more rapid learning than

40 January 1993 • American Psychologist


what is observed. The fact that power-law learning is ubiquitous creates an interesting connection between learning theory and problem solving because the power law also describes simple learning situations, such as learning paired associates, as well as extremely complex problem solving, such as learning to do proofs in geometry. Newell and Rosenbloom (1981) developed a theory of power-law learning that holds this result derives from learning more and more complex operators. However, this explanation applies only in the case of combinatorially complex tasks, and it does not seem to apply to simple tasks like paired-associate learning. Rather, this learning appears to reflect general associative strengthening mechanisms. Anderson (1982) argued that it is a simple strengthening process that accounts for all power-law learning including that which is occurring in combinatorially complex problem-solving tasks.

The ACT* Theory of the Acquisition of Problem-Solving Skills

The list of changes that occur with experience (only partially reviewed above) is probably too challenging to account for with a single theoretical proposal. Certainly, no one-factor theory has been forthcoming. I describe here my ACT* theory (Anderson, 1982, 1987, 1989) of the learning process, which captures some of the major empirical trends and offers some straightforward connections to more traditional research on human learning. This section concludes with a description of the application of this theory to the development of intelligent tutors.

Basic Concepts in the ACT* Theory

The ACT* theory of cognition (Anderson, 1983) makes a distinction between declarative knowledge, which encodes our factual knowledge, and procedural knowledge, which encodes much of cognitive skill including problem-solving skill. The theory assumes that problem solving takes place basically within a means-ends problem-solving structure. ACT* is a theory of the origin and nature of the problem-solving operators that feed the means-ends engine. It assumes that when a problem solver reaches a state for which there are no adequate problem-solving operators, the problem solver will search for an example of a similar problem-solving state and try to solve the problem by analogy to that example. There is substantial evidence that a subject’s early problem solving is strongly influenced by analogy to similar examples (e.g., Pirolli, 1985; Ross, 1984). Anderson and Thompson (1989) have developed a simulation model of this analogy process.

This initial stage of problem solving is called the interpretative stage. It often requires recalling specific problem-solving examples and interpreting them. The memories retrieved are declarative memories. However, there is no necessary long-term memory involvement. For instance, students use examples in a mathematics section to guide solution to a problem given at the end of the section without ever committing the examples to memory. It is interesting in this regard to consider how amnesia patients who suffer serious deficits to long-term declarative memory might acquire a problem-solving skill. Phelps (1989) has argued that this can happen only when the examples from which they work are present in the environment and do not have to be recalled from long-term memory.

The interpretive stage can involve substantial verbalization as the learner rehearses the critical aspects of the example from which the analogy derives. There is a dropout of verbalization that is associated with the transition from this interpretive stage to a stage where the skill is encoded procedurally. Knowledge compilation is the term given to the process of transiting from the interpretive stage to the procedural stage.

Procedural knowledge is encoded in terms of production rules that are condition-action pairs, such as the following two from geometry:

IF the goal is to prove two triangles congruent,
THEN try to prove corresponding parts are congruent.

IF segment AB is congruent to segment DE, and segment BC is congruent to segment EF, and segment AC is congruent to segment DF,
THEN conclude triangle ABC is congruent to triangle DEF because of the side-side-side postulate.

These rules are basically encodings of the problem-solving operators in an abstract form that can apply across a range of situations. The Anderson and Thompson (1989) model shows how one can extract such problem-solving operators in the process of doing problem solving by analogy. Knowledge, once in production form, will apply much more rapidly and reliably.

Strength of Knowledge Encoding

According to the ACT* theory, a critical factor that determines both the accessibility of declarative knowledge and the performance of procedural knowledge is the strength of encoding of this knowledge, which basically reflects amount of practice. According to the ACT* theory, this strength grows as a power function of practice. (For an in-depth analysis of why it is a power function, see Anderson & Schooler, 1991.) It is this growth of strength that controls the power-function improvements occurring in skill learning. Anderson (1982) showed that, although other learning processes such as knowledge compilation are at work, the factor that controls rate of learning is strength. For instance, to compile a production rule from an example, the example has to be retrieved and maintained in working memory, which will depend on its strength of encoding. Thus, according to ACT* the ubiquitous power law of learning reflects the ubiquitous growth of strength of knowledge with practice. It is curious to note that the growth of strength in ACT* is just a particular instantiation of Thorndike’s law of exercise, which he later rejected. However, there is good evidence for a law of exercise with respect to dependent measures

like speed of performance of a problem-solving skill (even in the absence of external feedback).

The concept of strength in ACT* is much like other strength concepts that have appeared in other theories of learning and memory over this century. In particular, the probability of a particular production rule applying is a function of its strength. This probabilistic manifestation of strength accounts for the gradual disappearance of errors and for the variation in how people solve problems. There can be multiple productions (some correct, some not) that might apply at a particular time, and the probability of each will reflect their strength. Thus, the ACT* theory has no problem dealing with the phenomenon of variability in problem-solving behavior. More recently, Anderson et al. (in press) reported considerable success applying the theory to the specific distribution of problem choices.

Intelligent Tutoring Research

I conclude with a discussion of the work we (Anderson, Boyle, Corbett, & Lewis, 1990) are doing on intelligent tutoring, both as an indication of the application of this approach and as a source of further evidence for the theory of problem-solving skill outlined above. Work on intelligent tutoring (for a review see Polson & Richardson, 1988) refers to efforts to create computer-based systems for instruction using artificial intelligence approaches. The approach to development of intelligent tutors that we take is called the model-tracing approach. It involves developing a cognitive model of the skill that should be learned (e.g., doing proofs in geometry or writing computer programs in the language LISP). This model takes the form of a set of production rules that can solve the class of problems the student is being asked to solve in the same way that the student should solve the problems. Our approach is relatively unique in the field in terms of the strong emphasis it places on use of a real-time cognitive model in instruction.

Our tutors interact with the students while they try to solve a problem on the computer. It is assumed that the student is taking an overall means-ends approach and that learning involves acquiring production rules that encode operators to use within this problem-solving organization. The tutor tries to interpret the student’s problem solving in terms of the firing of a set of production rules in its cognitive model. The instruction and help it delivers to the student is determined by its interpretation of the student’s problem-solving state; furthermore, its choice of subsequent problems to present to the student is determined by its interpretation of which rules the student has not mastered. One of the major technical accomplishments of our work has been the development of a set of methods for actually diagnosing the student’s behavior and attributing segments of the problem-solving behavior to the operation of specific production rules. The various evaluations of the tutor have been generally positive, and we attribute our success at instruction to our success at interpreting the student’s behavior. Typical evaluations have students performing approximately one standard deviation better than control classrooms (if given same amount of time on task) or taking one half to one third the time to reach the same achievement levels as control students. Currently we are working with the Pittsburgh Public Schools (Anderson, 1992) to revolutionize and greatly accelerate their high school mathematics curriculum on the basis of our model-tracing approach.

The ability to attribute segments of the student’s problem-solving behavior to specific production rules has also enabled us to monitor the performance of these rules. We can measure how many errors students make on specific rules and how that error rate decreases with practice on that specific rule. Figure 5 shows some data on this issue from the LISP tutor (Anderson et al., 1989). That figure displays mean number of errors (where the maximum possible is three). We can also see, when students make no errors on a specific rule, how their time to perform the rule decreases with practice. This is displayed in Figure 6 for the LISP tutor. Both figures display average data and data from specific lessons to give a sense of variability. The dependent measure, opportunities, in these figures refers to the number of times that rule has been used in solving problems within that lesson. We look only at production rules new to that lesson.

Figure 5. Errors per Production Made by Students as a Function of Amount of Practice in Lesson in Which Productions Were Introduced. [Plot not reproduced. Horizontal axis: Opportunities (1, 2, 3&4, 5-8); curves: Average, Lesson 2, Lesson 3, Lesson 5.]

These learning curves have a number of interesting features. First, they are plotted on log-log coordinates so that a power function should appear as a linear relationship. There appears to be a dramatic improvement in performance from the first use of a production rule to the second. After that, improvement is quite slow and apparently satisfies a power-law function. Similar data have been obtained with the geometry tutor. This dramatic first-trial improvement may reflect the compilation

of domain-specific production rules. There are problems with advancing this interpretation too forcefully because it rests on the exact way the data are averaged and on relatively strong assumptions about scale. So, a first trial discontinuity remains as an intriguing possibility awaiting further research and analysis.

Figure 6. Time for Correct Coding per Production as a Function of Amount of Practice of Production in Lesson in Which Production Was Introduced. [Plot not reproduced. Vertical axis: Seconds; horizontal axis: Opportunities (1, 2, 3&4, 5-8); curves: Average, Lesson 2, Lesson 3, Lesson 5.]

There are a number of additional points to make about the learning curves found in Figures 5 and 6. If one analyzes the data on the basis of surface-level categories of behavior such as writing variables, the improvement in performance does not seem orderly. Systematic learning functions show up only when defined in terms of production rules. A second point is that these rules appear to be learned independently. We do not find evidence that similar types of rules tend to be learned at the same rate as would be shown by intercorrelations in the learning rates of thematically related productions. Thus, the production rule does appear to be the right unit of analysis.

We have been able to identify some general factors that determine how well subjects perform within the tutor: In the case of LISP, these factors turn out to be (a) the speed with which subjects acquire new rules and (b) the degree to which they retain old rules. In the case of geometry these factors turn out to be (a) the success students have with algebraic rules and (b) the success they have with rules that involve spatial relations (see Anderson, in press, for a review). However, with remedial practice, students of differing abilities can be brought to equivalent levels of performance on these rules. Students brought to equivalent levels perform equally well on various nontutor posttests of ability. Thus, it would appear that acquiring a skill is basically learning each of the individual rules. The number of such rules can be large. For a modest semester’s course in LISP, we estimate that approximately 500 separate production rules must be acquired.

Thus, the production rule is serving much of the same function that had been assigned to the stimulus-response bond in past theories. The skill appears to be nothing more than the sum of these rules. Each rule is learned independently, and individual differences are reflected in the learning of these rules and not the performance of these rules once acquired. Complex cognitive skill reflects the accretion of many specific pieces of knowledge.

Conclusion

I think we are beginning to see rapid and important progress being made with respect to understanding how complex problem-solving skills are learned. This progress has depended on bringing together ideas from problem-solving theory and learning theory. We can understand acquisition of complex problem-solving skills only when we recognize the problem-solving structure that organizes their performance while recognizing the rather simple learning that governs the acquisition and strengthening of the individual problem-solving operators.

REFERENCES

Anderson, J. R. (Ed.). (1981). Cognitive skills and their acquisition. Hillsdale, NJ: Erlbaum.
Anderson, J. R. (1982). Acquisition of cognitive skill. Psychological Review, 89, 369-403.
Anderson, J. R. (1983). The architecture of cognition. Cambridge, MA: Harvard University Press.
Anderson, J. R. (1987). Skill acquisition: Compilation of weak-method problem solutions. Psychological Review, 94, 192-210.
Anderson, J. R. (1989). A theory of human knowledge. Artificial Intelligence, 40, 313-351.
Anderson, J. R. (1990a). The adaptive character of thought. Hillsdale, NJ: Erlbaum.
Anderson, J. R. (1990b). Cognitive psychology and its implications (3rd ed.). New York: Freeman.
Anderson, J. R. (1992). Intelligent tutoring and high school mathematics. In Proceedings of the Second International Conference on Intelligent Tutoring Systems (pp. 1-10). Montreal, Quebec, Canada.
Anderson, J. R. (in press). Rules of the mind. Hillsdale, NJ: Erlbaum.
Anderson, J. R., Boyle, C. F., Corbett, A., & Lewis, M. W. (1990). Cognitive modelling and intelligent tutoring. Artificial Intelligence, 42, 7-49.
Anderson, J. R., Conrad, F. G., & Corbett, A. T. (1989). Skill acquisition and the LISP tutor. Cognitive Science, 13, 467-506.
Anderson, J. R., Farrell, R., & Sauers, R. (1984). Learning to program in LISP. Cognitive Science, 8, 87-130.
Anderson, J. R., & Jeffries, R. (1985). Novice LISP errors: Undetected losses of information from working memory. Human Computer Interaction, 1, 107-131.
Anderson, J. R., & Kushmerick, N. (in press). Tower of Hanoi and goal structures. In J. R. Anderson (Ed.), Rules of the mind. Hillsdale, NJ: Erlbaum.
Anderson, J. R., Kushmerick, N., & Lebiere, C. (in press). Navigation and conflict resolution. In J. R. Anderson (Ed.), Rules of the mind. Hillsdale, NJ: Erlbaum.
Anderson, J. R., & Reder, L. M. (1992). Working memory load and performance in algebra. Manuscript in preparation.
Anderson, J. R., & Schooler, L. J. (1991). Reflections of the environment in memory. Psychological Science, 2, 396-408.

Anderson, J. R., & Thompson, R. (1989). Use of analogy in a production system architecture. In S. Vosniadou & A. Ortony (Eds.), Similarity and analogical reasoning (pp. 267-297). Cambridge, England: Cambridge University Press.
Atwood, M. E., & Polson, P. G. (1976). A process model for water jug problems. Cognitive Psychology, 8, 191-216.
Brown, J. S., & Van Lehn, K. (1980). Repair theory: A generative theory of bugs in procedural skills. Cognitive Science, 4, 379-426.
Burton, R. R. (1982). Diagnosing bugs in a simple procedural skill. In D. Sleeman & J. S. Brown (Eds.), Intelligent tutoring systems (pp. 157-183). San Diego, CA: Academic Press.
Charness, N. (1976). Memory for chess positions: Resistance to inference. Journal of Experimental Psychology: Human Learning and Memory, 2, 641-653.
Chase, W. G., & Simon, H. A. (1973). The mind’s eye in chess. In W. G. Chase (Ed.), Visual information processing (pp. 215-281). San Diego, CA: Academic Press.
Chi, M. T. H., Feltovich, P. J., & Glaser, R. (1981). Categorization and representation of physics problems by experts and novices. Cognitive Science, 5, 121-152.
Chi, M. T. H., Glaser, R., & Farr, M. (Eds.). (1988). The nature of expertise. Hillsdale, NJ: Erlbaum.
Duncker, K. (1945). On problem-solving (L. S. Lees, Trans.). Psychological Monographs, 58 (Whole No. 270).
Fitts, P. M., & Posner, M. I. (1967). Human performance. Monterey, CA: Brooks/Cole.
Guthrie, E. R. (1952). The psychology of learning. New York: Harper & Row.
Hayes, J. R. (1985). Three problems in teaching general skills. In S. Chipman, J. Segal, & R. Glaser (Eds.), Thinking and learning skills (pp. 391-406). Hillsdale, NJ: Erlbaum.
Jeffries, R. P., Polson, P. G., Razran, L., & Atwood, M. E. (1977). A process model for missionaries-cannibals and other river-crossing problems. Cognitive Psychology, 9, 412-440.
Jeffries, R. P., Turner, A. A., Polson, P. G., & Atwood, M. E. (1981). The processes involved in designing software. In J. R. Anderson (Ed.), Cognitive skills and their acquisition (pp. 255-284). Hillsdale, NJ: Erlbaum.
Kaplan, C. A., & Simon, H. A. (1990). In search of insight. Cognitive Psychology, 22, 374-419.
Katz, I. R., & Anderson, J. R. (1988). Debugging: An analysis of bug-location strategies. Human Computer Interaction, 3, 351-399.
Klahr, D. (1978). Goal formation, planning, and learning by pre-school problem solvers, or: My socks are in the dryer. In R. S. Siegler (Ed.), Children’s thinking: What develops? (pp. 181-212). Hillsdale, NJ: Erlbaum.
Köhler, W. (1927). The mentality of apes. New York: Harcourt, Brace.
Larkin, J. H., McDermott, J., Simon, D. P., & Simon, H. A. (1980). Models of competence in solving physics problems. Cognitive Science, 4, 317-345.
Neves, D. (1977). An experimental analysis of strategies of the Tower of Hanoi (C.I.P. Working Paper No. 362). Unpublished manuscript, Carnegie Mellon University.
Newell, A. (1980). Reasoning, problem-solving, and decision processes: The problem space as a fundamental category. In R. Nickerson (Ed.), Attention and performance VIII (pp. 693-718). Hillsdale, NJ: Erlbaum.
Newell, A. (1990). Unified theories of cognition. Cambridge, MA: Harvard University Press.
Newell, A., & Rosenbloom, P. (1981). Mechanisms of skill acquisition and the law of practice. In J. R. Anderson (Ed.), Cognitive skills and their acquisition (pp. 1-55). Hillsdale, NJ: Erlbaum.
Newell, A., & Simon, H. A. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice-Hall.
Nilsson, N. J. (1971). Problem-solving methods in artificial intelligence. New York: McGraw-Hill.
Payne, S. J., & Squibb, H. R. (1990). Algebra mal-rules and cognitive accounts for error. Cognitive Science, 14, 445-481.
Phelps, E. A. (1989). Cognitive skill learning in amnesiacs. Unpublished doctoral dissertation, Princeton University.
Pirolli, P. L. (1985). Problem solving by analogy and skill acquisition in the domain of programming. Unpublished doctoral dissertation, Carnegie Mellon University.
Polson, M., & Richardson, J. (Eds.). (1988). Handbook of intelligent training systems. Hillsdale, NJ: Erlbaum.
Ross, B. H. (1984). Remindings and their effects in learning a cognitive skill. Cognitive Psychology, 16, 371-416.
Simon, H. A. (1975). The functional equivalence of problem solving skills. Cognitive Psychology, 7, 268-288.
Thorndike, E. L. (1898). Animal intelligence: An experimental study of the associative processes in animals. Psychological Review, Monograph Supplement, 2 (2, Whole No. 8).
Tolman, E. C. (1932). Purposive behavior in animals and men. New York: Appleton-Century-Crofts.
Van Lehn, K. (1989). Problem-solving and cognitive skill acquisition. In M. Posner (Ed.), The foundations of cognitive science (pp. 527-580). Cambridge, MA: MIT Press.
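
Illustrative sketch: power-function versus exponential practice curves. The contrast drawn in the article between the power function of practice and the exponential function proposed in most learning theories can be made concrete numerically. The Python below is only a sketch under stated assumptions: the function names and the parameter values (A = 10, b = 0.5) are chosen for illustration and are not taken from the article or from any of the studies it cites.

```python
import math

A = 10.0  # scaling constant (assumed value, for illustration only)
B = 0.5   # learning-rate constant, less than one (assumed value)


def power_law(n: int, a: float = A, b: float = B) -> float:
    """Power function of practice: P = A * N**(-b)."""
    return a * n ** (-b)


def exponential(n: int, a: float = A, b: float = B) -> float:
    """Exponential function of practice: P = A * b**N."""
    return a * b ** n


def loglog_slope(f, n1: int, n2: int) -> float:
    """Slope of f between trials n1 and n2 on log-log coordinates."""
    return (math.log(f(n2)) - math.log(f(n1))) / (math.log(n2) - math.log(n1))


# Tabulate both curves over the same trials of practice.
for n in (1, 2, 4, 8, 16):
    print(f"N={n:2d}  power={power_law(n):7.3f}  exponential={exponential(n):9.5f}")

# A power function has the same log-log slope everywhere; an exponential does not.
print("power slopes:", loglog_slope(power_law, 1, 2), loglog_slope(power_law, 8, 16))
print("exponential slopes:", loglog_slope(exponential, 1, 2), loglog_slope(exponential, 8, 16))
```

Under these assumed values the power function keeps a log-log slope of about -b at every stage of practice, which is why it appears as a straight line on log-log coordinates. The exponential curve shrinks by the constant fraction b on each trial, so its log-log slope steepens with practice and it approaches zero far sooner, which is the more rapid learning that the exponential form predicts.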

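
Illustrative sketch: production rules chosen in proportion to strength. The article describes production rules as condition-action pairs and states that the probability of a rule applying is a function of its strength, which grows as a power function of practice. The Python below is a minimal sketch of that idea, not the ACT* implementation. Everything specific in it is an assumption made for illustration: the `Production` class, the exponent 0.5, the rule that choice probability is directly proportional to strength, the string encoding of facts, and the hypothetical `buggy` rule that competes with the side-side-side rule.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Optional

Facts = FrozenSet[str]


@dataclass
class Production:
    name: str
    condition: Callable[[Facts], bool]  # IF part, tested against the current facts
    action: str                         # THEN part, kept as a description
    practice: int = 1                   # number of times the rule has been used

    def strength(self, d: float = 0.5) -> float:
        # Assumption: strength is a power function of practice with exponent d.
        return float(self.practice) ** d


def choice_probabilities(rules: List[Production], facts: Facts) -> Dict[str, float]:
    """Probability of each matching rule applying, assumed proportional to strength."""
    matching = [r for r in rules if r.condition(facts)]
    if not matching:
        return {}
    total = sum(r.strength() for r in matching)
    return {r.name: r.strength() / total for r in matching}


def select(rules: List[Production], facts: Facts, rng: random.Random) -> Optional[Production]:
    """Sample one matching rule with probability proportional to its strength."""
    matching = [r for r in rules if r.condition(facts)]
    if not matching:
        return None
    return rng.choices(matching, weights=[r.strength() for r in matching], k=1)[0]


SIDES = frozenset({"AB~DE", "BC~EF", "AC~DF"})  # "~" stands for "is congruent to"
GOAL = "prove ABC~DEF"

sss = Production("sss", lambda f: GOAL in f and SIDES <= f,
                 "conclude ABC~DEF by the side-side-side postulate")
# Hypothetical incorrect rule: fires on the goal alone, without checking the sides.
buggy = Production("buggy", lambda f: GOAL in f,
                   "conclude ABC~DEF without checking the sides")

RULES = [sss, buggy]
FACTS = frozenset(SIDES | {GOAL})

print(choice_probabilities(RULES, FACTS))  # equal practice, equal probability
sss.practice = 16                          # the correct rule has now been used 16 times
print(choice_probabilities(RULES, FACTS))  # the correct rule now dominates
print(select(RULES, FACTS, random.Random(0)).name)
```

With equal practice the two rules are equally likely to apply. Once the correct rule has been used 16 times its assumed strength is four times that of the competing rule, so the incorrect rule still fires occasionally but less and less often. This mirrors the account in the article of both the gradual disappearance of errors and the variability in how a problem is solved.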